How To Configure a Multi-Node Cluster with Cassandra on Ubuntu 24.04 or Newer

Welcome back to the Greenhost.cloud blog! Today, we’re diving into the world of distributed databases with a step-by-step guide on how to configure a multi-node Apache Cassandra cluster on Ubuntu 24.04 or newer. As data becomes increasingly crucial for business operations, setting up a resilient and scalable database system like Cassandra can give you the edge you need.

What is Apache Cassandra?

Apache Cassandra is a highly scalable, distributed NoSQL database designed to handle large amounts of data across many commodity servers with no single point of failure. Its architecture allows for continuous availability and offers high write and read throughput, making it an ideal choice for applications that demand high availability and scalability.

Prerequisites

Before we begin, ensure that you have the following:

  • Ubuntu 24.04 or newer installed on all nodes.
  • Java: Cassandra 4.0, the release series installed in this guide, runs on Java 8 or Java 11, not on newer releases. Step 1 installs OpenJDK 11.
  • Sudo privileges on all nodes.
  • Network connectivity between nodes.
  • Hostname resolution: Ensure all nodes can resolve each other’s hostnames.
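
The hostname-resolution prerequisite can be checked before installing anything. The following is a minimal sketch, assuming a glibc-based system where getent is available (as on Ubuntu); the function names resolves and check_peers are illustrative and not part of Cassandra.

```shell
#!/bin/sh
# Return 0 if the given name resolves through the system resolver
# (/etc/hosts or DNS), non-zero otherwise.
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Print one line per peer given as an argument:
# "OK   <name>" if it resolves, "FAIL <name>" if it does not.
check_peers() {
    for peer in "$@"; do
        if resolves "$peer"; then
            echo "OK   $peer"
        else
            echo "FAIL $peer"
        fi
    done
}
```

Run it on every node with your own hostnames as arguments, for example check_peers node1 node2 node3 (placeholder names). Any FAIL line points to a missing /etc/hosts entry or DNS record.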

Step 1: Install Java

Cassandra runs on Java, so we need to install it first. You can install OpenJDK using the following commands:

sudo apt update
sudo apt install openjdk-11-jdk -y

Verify the installation:

java -version
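
If you script the setup, the version check can be automated. This sketch extracts the major version from the banner that java -version prints; the function name java_major is hypothetical, and it assumes the usual banner format with the version in double quotes.

```shell
#!/bin/sh
# Extract the major Java version from `java -version` output passed as $1.
# Handles the legacy scheme ("1.8.0_402" means major 8) and the modern
# scheme ("11.0.22" means major 11).
java_major() {
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    case "$version" in
        1.*) echo "$version" | cut -d. -f2 ;;
        *)   echo "$version" | cut -d. -f1 ;;
    esac
}
```

Since java -version writes to standard error, call it as java_major "$(java -version 2>&1)" and compare the result with the version you expect.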

Step 2: Install Apache Cassandra

We will install Cassandra from the official Apache repository:

  1. Add the Apache Cassandra repository (40x selects the 4.0 release series):
   echo 'deb [signed-by=/etc/apt/keyrings/apache-cassandra.asc] https://debian.cassandra.apache.org 40x main' | sudo tee /etc/apt/sources.list.d/cassandra.sources.list
  2. Add the repository signing keys. apt-key is deprecated, so store the keys in a keyring file that the repository entry references:
   sudo curl -o /etc/apt/keyrings/apache-cassandra.asc https://downloads.apache.org/cassandra/KEYS
  3. Update package lists:
   sudo apt update
  4. Install Cassandra:
   sudo apt install cassandra -y

Step 3: Configure the Cluster

3.1 Modify the cassandra.yaml File

The primary configuration file for Cassandra is located at /etc/cassandra/cassandra.yaml. You need to modify a few parameters to set up your multi-node cluster.

  1. Set the Cluster Name: Open the configuration file:
   sudo nano /etc/cassandra/cassandra.yaml

   Find the cluster_name field and set it to your desired cluster name. The name must be identical on every node:

   cluster_name: 'MyCassandraCluster'
  2. Configure Seed Nodes: Seed nodes are the initial contact points a node uses to discover the rest of the cluster. In the seed_provider section, modify the seeds line to include your seed nodes’ IP addresses, and use the same list on every node:
   seeds: '192.168.1.101,192.168.1.102'  # Replace with your seed nodes
  3. Set the Listen Address: Each node must have a unique listen address, which other nodes use to reach it. Set the listen_address to the IP address of the current node:
   listen_address: '192.168.1.101'  # Replace with the current node's IP
  4. Set the RPC Address: The rpc_address is where the node accepts client connections. Set it to the node’s IP, or to 0.0.0.0 to listen on all interfaces. With 0.0.0.0 you must also set broadcast_rpc_address to the node’s IP, because Cassandra will not start without it:
   rpc_address: '0.0.0.0'
   broadcast_rpc_address: '192.168.1.101'  # Replace with the current node's IP

3.2 Configure Other Nodes

Repeat the above process for each node in your cluster. The cluster_name and seeds values must be the same on every node; only the node-specific addresses, such as listen_address, change from node to node.
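
Editing the file by hand on every node is error-prone, so the edits from 3.1 can be sketched as a small shell function. This is a rough sketch, not an official tool: the name configure_node is hypothetical, it assumes the stock cassandra.yaml layout (cluster_name, listen_address, and rpc_address at the start of a line, and a "- seeds:" entry under seed_provider), and it sets rpc_address to the node’s own IP rather than 0.0.0.0.

```shell
#!/bin/sh
# Apply the per-node settings to a cassandra.yaml file in place.
# Usage: configure_node FILE CLUSTER_NAME SEEDS NODE_IP
configure_node() {
    file=$1 cluster=$2 seeds=$3 ip=$4
    tmp=$(mktemp) || return 1
    # Rewrite only uncommented keys; the seeds entry keeps its indentation.
    sed \
        -e "s|^cluster_name:.*|cluster_name: '$cluster'|" \
        -e "s|^\([[:space:]]*- seeds:\).*|\1 \"$seeds\"|" \
        -e "s|^listen_address:.*|listen_address: $ip|" \
        -e "s|^rpc_address:.*|rpc_address: $ip|" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

On each node, call it as root with the same cluster name and seed list and only the last argument changed, for example configure_node /etc/cassandra/cassandra.yaml MyCassandraCluster 192.168.1.101,192.168.1.102 192.168.1.103. Keep a backup of the original file before running it.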

Step 4: Start Cassandra

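Once you have configured all nodes, restart the Cassandra service on each node so the new configuration takes effect. Start with the seed nodes, then bring up the remaining nodes one at a time:

sudo systemctl restart cassandra

Note that the package normally starts Cassandra right after installation under its default cluster name. If a node refuses to start after you change cluster_name, stop the service, clear the data under /var/lib/cassandra (only on a fresh node that holds no data you need), and start it again.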

You can check the status of the Cassandra service with:

sudo systemctl status cassandra

Step 5: Verify Cluster Status

To check the status of your Cassandra cluster, use the nodetool command:

nodetool status

This command should list every node in the cluster along with its address, load, and state. A healthy node is marked UN (Up/Normal).
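
For scripted checks, the nodetool status output can be filtered for nodes in the Up/Normal state. The sketch below assumes the usual layout in which each node row begins with a two-letter status code such as UN or DN; the function names are illustrative.

```shell
#!/bin/sh
# Count nodes reported as Up/Normal ("UN") in `nodetool status`
# output read from standard input.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Succeed only if exactly $1 nodes are Up/Normal in the output on stdin.
cluster_ready() {
    [ "$(count_up_normal)" -eq "$1" ]
}
```

For a three-node cluster, nodetool status | cluster_ready 3 exits with status 0 once all three nodes have joined.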

Step 6: Configure Firewall (Optional)

If your nodes are behind a firewall, ensure that the necessary ports are open:

  • Cassandra native transport: 9042
  • Cassandra inter-node communication: 7000 (unencrypted) and 7001 (TLS-encrypted), both over TCP

You can configure the firewall using UFW:

sudo ufw allow 9042/tcp
sudo ufw allow 7000/tcp
sudo ufw allow 7001/tcp
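
The rules above open the ports to any source. A tighter alternative is to accept inter-node traffic only from the other cluster members. The following sketch prints the corresponding ufw commands for review instead of running them; the function name print_internode_rules is hypothetical and the peer addresses you pass in are your own.

```shell
#!/bin/sh
# Print (without executing) ufw rules that open the inter-node ports
# 7000 and 7001 only to the peer addresses given as arguments.
print_internode_rules() {
    for peer in "$@"; do
        for port in 7000 7001; do
            echo "sudo ufw allow from $peer to any port $port proto tcp"
        done
    done
}
```

On node 192.168.1.101, for example, print_internode_rules 192.168.1.102 192.168.1.103 prints four rules that you can inspect and then run. Port 9042 can be restricted to your application servers in the same way.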

Conclusion

Congratulations! You have successfully configured a multi-node Cassandra cluster on Ubuntu 24.04 or newer. With Cassandra’s robust features, your application can now scale seamlessly as your data needs grow.

Happy clustering!


For more tech tips and tutorials, keep following our blog at Greenhost.cloud!