Complete Steps for Cassandra Installation on Ubuntu

Apache Cassandra is a highly scalable, fault-tolerant NoSQL database designed for handling large-scale data with high availability and zero downtime.

It excels in environments where speed and horizontal scalability are key, such as big data and real-time analytics.

Running Cassandra on Ubuntu provides a stable, efficient platform with easy maintenance, making it ideal for scaling databases without performance loss.

Prerequisites to Cassandra Installation on Ubuntu

Before you start the Cassandra installation, make sure your system meets the following requirements:

  • A Linux VPS running Ubuntu.
  • A non-root user with sudo privileges.
  • Access to Terminal/Command line.
  • Java OpenJDK 8, which Cassandra needs in order to run (installation is covered in Step 1).

Installing Apache Cassandra on Ubuntu: A Scalable NoSQL Database

Ubuntu’s lightweight and secure nature complements Cassandra’s peer-to-peer architecture, ensuring seamless management of growing data.

With a secured Linux VPS, let’s go through the step-by-step process for a successful Cassandra setup on Ubuntu, ensuring you get the most out of its powerful features.

Step 1: Install Required Packages

Before installing Cassandra, it is essential to ensure your system has Java OpenJDK 8 and the apt-transport-https package.

Before installing Java OpenJDK 8, run the command below to refresh your package index:

sudo apt update

Then, run the command below to install OpenJDK 8:

sudo apt install openjdk-8-jdk -y

Once the installation is done, verify that Java is installed:

java -version

You should see output confirming that Java 8 is installed (the version string begins with 1.8). This matters because Cassandra 4.0 supports only specific Java releases, Java 8 among them, and may fail to start on an unsupported version.
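The version check can also be scripted. The following is a minimal sketch rather than part of the official procedure: java_major is a hypothetical helper name, and it assumes the version string follows the usual 1.8.0_xxx pattern for Java 8 or the 11.0.x pattern for Java 9 and later.

```shell
# Hypothetical helper: reduce a Java version string to its major version.
# Java 8 and earlier report "1.<major>.x"; Java 9+ report "<major>.x.y".
java_major() {
  case "$1" in
    1.*) printf '%s\n' "$1" | cut -d. -f2 ;;
    *)   printf '%s\n' "$1" | cut -d. -f1 ;;
  esac
}

# Only query Java if it is on the PATH; "java -version" prints to stderr.
if command -v java >/dev/null 2>&1; then
  installed=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  echo "Detected Java major version: $(java_major "$installed")"
fi
```

If the reported major version is not 8, revisit the OpenJDK installation before continuing.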

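Finally, the apt-transport-https package lets apt access repositories over HTTPS. Recent Ubuntu releases build HTTPS support into apt itself, so the package may already be present or unnecessary there. If it’s not installed, run the following command: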

sudo apt install apt-transport-https

This ensures your system is ready to access the Cassandra repositories securely.

Step 2: Add Cassandra Repository and Import GPG Key

In this step, you need to add the Apache Cassandra repository to the system and import its GPG key to ensure the packages are trusted.

To add the repository to your system’s sources list, run the following command:

sudo sh -c 'echo "deb http://www.apache.org/dist/cassandra/debian 40x main" > /etc/apt/sources.list.d/cassandra.list'

This adds the repository for Cassandra version 4.0. If you want a different release series, replace 40x in the command with that series’ identifier, for example 311x for version 3.11.

Next, use the command below to download the repository’s GPG key and add it to apt:

wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -

If the key is added successfully, the command prints OK.
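On newer Ubuntu releases, apt-key is deprecated and prints a warning. One alternative is to keep the key in its own file and reference it with apt's signed-by option in the sources entry. The sketch below only prints such an entry and changes nothing on the system. The function name and the key path /etc/apt/keyrings/apache-cassandra.asc are assumptions for illustration, and the repository URL is the one used in the command above.

```shell
# Hypothetical helper: print a sources.list entry that ties the repository
# to a dedicated key file via apt's "signed-by" option.
cassandra_sources_line() {
  series="$1"    # release series, e.g. 40x
  keyring="$2"   # path to the downloaded KEYS file (assumed location)
  printf 'deb [signed-by=%s] http://www.apache.org/dist/cassandra/debian %s main\n' \
    "$keyring" "$series"
}

# Print the entry for the 4.0 series with the assumed key path.
cassandra_sources_line 40x /etc/apt/keyrings/apache-cassandra.asc
```

To use this approach, save the KEYS file at the chosen path and write the printed line to /etc/apt/sources.list.d/cassandra.list instead of the plain deb line.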

Step 3: Install Apache Cassandra

Now that the repository is in place, you can install Cassandra.

First, run the command below to update your package list to include the newly added Cassandra repository:

sudo apt update

With the repository updated, run the following command to install Cassandra:

sudo apt install cassandra -y

Once installation is complete, Cassandra will automatically start, and a dedicated Cassandra user is created to run the service.

Step 4: Verify Apache Cassandra Installation on Ubuntu

After installation, it’s important to verify that Cassandra is running correctly. You can use the nodetool status command to see the status of the Cassandra cluster:

nodetool status

The output should show UN at the start of the node’s row. UN stands for Up/Normal, which means the node is up and running normally.

Note: To keep track of your cluster’s health and performance, nodetool provides commands for checking node status, cleaning up data, and more. Integrating with monitoring tools like Prometheus or Grafana can help visualize Cassandra’s performance in real-time.
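For a quick scripted health check, the status codes in the nodetool status output can be counted. This is a rough sketch: count_not_un is a hypothetical helper, and it assumes each node row begins with the two-letter status code (U or D for Up/Down, followed by N, L, J, or M for the state).

```shell
# Hypothetical helper: read "nodetool status" output on stdin and print how
# many node rows are in a state other than UN (Up/Normal).
count_not_un() {
  awk '/^[UD][NLJM] / && $1 != "UN" { n++ } END { print n + 0 }'
}

# Run the check only where nodetool is available.
if command -v nodetool >/dev/null 2>&1; then
  echo "Nodes not in UN state: $(nodetool status | count_not_un)"
fi
```

A count of 0 is only meaningful if nodetool status itself succeeded. If nodetool cannot connect, there are no node rows to count.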

Alternatively, you can check Cassandra’s service status by running the command below:

sudo systemctl status cassandra

If everything is set up correctly, the status should display as active (running).
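Right after a start or restart, the service may need a little time before it reports as active and before tools such as nodetool can connect. A small retry loop can help in scripts. This is a generic sketch: the retry helper, the attempt count, and the delay are illustrative choices, not Cassandra defaults.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
  return 0
}

# Example: poll the service state, if systemd is present on this machine.
if command -v systemctl >/dev/null 2>&1; then
  retry 3 1 systemctl is-active --quiet cassandra \
    && echo "cassandra is active" \
    || echo "cassandra is not active yet"
fi
```

The same helper can wrap nodetool status when a script must wait for the node to become reachable.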

Step 5: Manage Cassandra Service

You can use the commands below to manually start, restart, or stop Cassandra whenever needed.

Start Cassandra:

sudo systemctl start cassandra

Restart Cassandra:

sudo systemctl restart cassandra

Stop Cassandra:

sudo systemctl stop cassandra

Then, use the following command to ensure Cassandra starts automatically when your system boots up:

sudo systemctl enable cassandra

This is an optional step.

Step 6: Configure Apache Cassandra

By default, Cassandra’s configuration is optimized for single-node operation. If you’re setting up a cluster, you’ll need to modify some settings in the cassandra.yaml file.

Before making any changes, create a backup of the cassandra.yaml file:

sudo cp /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra.yaml.backup

To configure Cassandra, open the configuration file in a text editor:

sudo nano /etc/cassandra/cassandra.yaml

To Edit the Configuration:

  • Change the Cluster Name:

Find the cluster_name field and change it from the default Test Cluster to your preferred name.

  • Add Node IP Addresses (for clusters):

In the seed_provider section, set the seeds value to the IP addresses of your cluster’s seed nodes, separated by commas.

Once done, save and close the file, then restart Cassandra so the changes take effect.
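The same two edits can be scripted instead of made by hand. The sketch below is an illustration under several assumptions: the helper names are hypothetical, the file is assumed to use the stock cluster_name: and - seeds: lines, and the cluster name and the 192.0.2.x addresses are placeholders. It works on a scratch file, not on /etc/cassandra/cassandra.yaml.

```shell
# Hypothetical helpers: edit a cassandra.yaml-style file in place.
# Values must not contain "/" or "&", which are special to sed.
set_cluster_name() {
  sed "s/^cluster_name:.*/cluster_name: '$2'/" "$1" > "$1.tmp" \
    && cat "$1.tmp" > "$1" && rm -f "$1.tmp"
}
set_seeds() {
  sed "s/^\( *- seeds:\).*/\1 \"$2\"/" "$1" > "$1.tmp" \
    && cat "$1.tmp" > "$1" && rm -f "$1.tmp"
}

# Demonstrate on a scratch file containing only the two relevant lines.
demo=$(mktemp)
printf "cluster_name: 'Test Cluster'\n          - seeds: \"127.0.0.1:7000\"\n" > "$demo"
set_cluster_name "$demo" "MyCluster"
set_seeds "$demo" "192.0.2.10,192.0.2.11"
cat "$demo"
```

Pointing the helpers at the real configuration file requires root privileges, and a backup beforehand is still advisable.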

Step 7: Test Cassandra Command-Line Shell

Cassandra comes with a built-in command-line interface, cqlsh, which allows you to run Cassandra Query Language (CQL) commands.

To start the shell, simply type:

cqlsh

This will connect you to the Cassandra instance, and you can start interacting with your database.

Note: CQL is similar to SQL but optimized for Cassandra’s distributed architecture.
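cqlsh can also run statements from a file, which is handy for a first smoke test. The sketch below writes a few CQL statements to a scratch file and runs them only if cqlsh is installed. The keyspace name demo and the replication settings are placeholders suited to a single-node test, not recommendations for production.

```shell
# Write a few CQL statements to a scratch file. The keyspace name "demo"
# and the replication settings are placeholders for a single-node test.
cql_file=$(mktemp)
cat > "$cql_file" <<'EOF'
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
DESCRIBE KEYSPACES;
SELECT release_version FROM system.local;
EOF

# Execute the file only if cqlsh is installed on this machine.
if command -v cqlsh >/dev/null 2>&1; then
  cqlsh -f "$cql_file" || echo "cqlsh could not run the statements"
fi
```

If the statements succeed, the new keyspace appears in the DESCRIBE KEYSPACES output and the last query reports the running Cassandra version.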

You’re all done! Cassandra is widely used in applications requiring high-speed, scalable data processing, such as IoT systems, social media platforms, and recommendation engines for e-commerce.

With your system prepared, the next step is exploring Cassandra Query Language (CQL) and building powerful, high-availability applications.

How Does Cassandra Work on Ubuntu Linux?

Apache Cassandra operates on Ubuntu by leveraging a distributed, peer-to-peer architecture that allows it to handle large volumes of data across multiple nodes seamlessly. Each node in the Cassandra cluster can accept read and write requests, ensuring high availability and resilience against failures. This design allows data to be automatically replicated across various nodes, providing fault tolerance and preventing data loss.

Ubuntu’s lightweight and secure environment complements Cassandra’s requirements, making deployment and maintenance straightforward. Additionally, Cassandra’s architecture is optimized for horizontal scalability, enabling users to easily add more nodes to accommodate growing data needs without sacrificing performance, making it a robust choice for modern data-driven applications.

How to troubleshoot “No hosts are reachable” error in cqlsh?

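This error usually means that cqlsh cannot reach Cassandra over the network, or that Cassandra has not finished starting.

To solve it, confirm that the service is running, then check the cassandra.yaml configuration file to ensure the correct IP address and port are specified, particularly in the listen_address, rpc_address, and native_transport_port fields.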
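To see the relevant settings at a glance, they can be pulled out of the configuration file. This is a small sketch: show_connection_settings is a hypothetical helper, and it assumes the settings appear as uncommented top-level keys, as they do in the stock file.

```shell
# Hypothetical helper: print the address and port settings that cqlsh
# depends on from a cassandra.yaml-style file (commented lines are skipped).
show_connection_settings() {
  grep -E '^(listen_address|rpc_address|native_transport_port):' "$1"
}

# Inspect the installed configuration if it is readable on this machine.
conf=/etc/cassandra/cassandra.yaml
if [ -r "$conf" ]; then
  show_connection_settings "$conf" || echo "no matching settings found"
fi
```

If rpc_address is not localhost, pass the host, and the port if it differs from 9042, to cqlsh explicitly, for example cqlsh followed by the node’s IP address and port.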

Why is Cassandra not starting after installation?

If Cassandra does not start, check the logs located at /var/log/cassandra/system.log for error messages.

Common issues include insufficient memory or Java not being installed correctly.
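Scanning the log for problem entries is often quicker than reading it from the top. The sketch below assumes the default log layout, in which each entry starts with its level (such as ERROR or WARN). The recent_problems helper name and the 20-line default are illustrative.

```shell
# Hypothetical helper: print the last N lines of a log that start with
# ERROR or WARN (N defaults to 20).
recent_problems() {
  grep -E '^(ERROR|WARN)' "$1" | tail -n "${2:-20}"
}

# Scan the Cassandra system log if it is readable on this machine.
log=/var/log/cassandra/system.log
if [ -r "$log" ]; then
  recent_problems "$log" || true
fi
```

Out-of-memory messages or Java-related exceptions in this output point to the common causes mentioned above.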

Conclusion

In this guide, we covered the complete process of installing and configuring Apache Cassandra on Ubuntu, from setting up prerequisites to verifying and managing the service.

Following these steps ensures that your database is up and running efficiently, with the flexibility to scale as your data grows.

Once Cassandra is installed, you can begin optimizing its configuration for your specific use case, whether it’s for a single-node setup or a distributed cluster.

To secure your Cassandra deployment, consider enabling role-based authentication and encrypting both node-to-node and client-to-node traffic with SSL/TLS.
