Set Up Cassandra Using Docker

Databases are among the most fundamental parts of any functional application. The right type of database depends on your app, but if you need to store large volumes of unstructured data across multiple servers, Apache Cassandra is one of the strongest candidates.

If you are unfamiliar with it, Apache Cassandra, commonly known as Cassandra, is a free and open-source, highly scalable, distributed NoSQL database for storing and managing large volumes of unstructured data across multiple servers. It is designed to provide high availability and scalability across the nodes of a cluster.

Docker, for its part, is a versatile containerization platform that lets us package applications and their dependencies into portable containers, which makes deploying and managing software across different environments easier.

In this tutorial, we will walk you through setting up and running Cassandra in a Docker container so that you can combine the benefits of both tools in a few simple steps.

Requirements:

To follow along with this post, ensure that you have the following:

    1. A Unix or Windows host
    2. Docker Engine installed on your target host
    3. Sufficient permissions to run the Docker commands

Running Apache Cassandra on Docker is supported on any machine with the Docker Engine installed.
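Before continuing, it can help to confirm that the second and third requirements are met. The following is a minimal sketch, not an official Docker tool: “check_docker” is a hypothetical helper name, and it only relies on the “docker info” command succeeding when the daemon is reachable with your current permissions.

```shell
# Hypothetical helper: verifies that the docker CLI exists and can reach the daemon.
check_docker() {
  # Is the docker command available on PATH?
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker CLI not found on PATH" >&2
    return 1
  fi
  # "docker info" fails when the daemon is down or permissions are missing.
  if ! docker info >/dev/null 2>&1; then
    echo "docker CLI found, but the daemon is not reachable" >&2
    return 2
  fi
  echo "docker is ready"
}

# Example: check_docker
```

A return code of 2 usually points to a stopped daemon or a user that is not allowed to talk to it.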

Getting the Apache Cassandra Image

The first step is downloading the Apache Cassandra image from the Docker registry. We can use the “docker pull” command as follows:

$ docker pull cassandra:latest

 
This should download the latest version of the Cassandra image on your host.
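To confirm that the pull succeeded, you can ask Docker whether the image exists locally. This is a small sketch under the assumption that the image is tagged “cassandra:latest”; “have_image” is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: reports whether an image is present in the local image store.
have_image() {
  image="${1:-cassandra:latest}"
  # "docker image inspect" exits non-zero when the image is not available locally.
  if docker image inspect "$image" >/dev/null 2>&1; then
    echo "$image is available locally"
  else
    echo "$image is missing; run: docker pull $image" >&2
    return 1
  fi
}

# Example: have_image cassandra:latest
```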

Starting the Cassandra Container

Once we have the image downloaded, we can run the Cassandra container. We start by creating a Docker network that allows us to access the container ports without opening them to the host machine.

Run the following command:

$ docker network create cassandra
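Running “docker network create” a second time fails because the network already exists. If you expect to repeat this tutorial, a guarded version such as the sketch below may be more convenient. “ensure_network” is a hypothetical helper name, and the default network name “cassandra” is taken from the command above.

```shell
# Hypothetical helper: creates the network only when it does not exist yet.
ensure_network() {
  network="${1:-cassandra}"
  # "docker network inspect" exits non-zero for an unknown network.
  if docker network inspect "$network" >/dev/null 2>&1; then
    echo "network '$network' already exists"
  else
    docker network create "$network" >/dev/null && echo "network '$network' created"
  fi
}

# Example: ensure_network cassandra
```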

 
With the network created, we can run the container and attach it to that network using the following command:

$ docker run --rm -d --name cassandra --hostname cassandra --network cassandra cassandra

 
This should start the container in the background, attached to the “cassandra” network. You can check the running container using the “docker ps” command:

 $ docker ps
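A container that shows up in “docker ps” is not necessarily ready to accept queries, since Cassandra can take a while to initialize. The sketch below polls the node until a trivial CQL statement succeeds. It assumes that the official image ships “cqlsh” inside the container and that the container is named “cassandra” as above; “wait_for_cassandra” and the 5-second polling interval are illustrative choices, not part of Docker or Cassandra.

```shell
# Hypothetical helper: waits until cqlsh inside the container can run a statement.
# Arguments: container name (default "cassandra"), timeout in seconds (default 120).
wait_for_cassandra() {
  name="${1:-cassandra}"
  timeout="${2:-120}"
  elapsed=0
  # Retry a harmless statement until it succeeds or the timeout is reached.
  until docker exec "$name" cqlsh -e "DESCRIBE KEYSPACES" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Cassandra in '$name' not ready after ${timeout}s" >&2
      return 1
    fi
    sleep 5
    elapsed=$((elapsed + 5))
  done
  echo "Cassandra in '$name' is accepting CQL connections"
}

# Example: wait_for_cassandra cassandra 120
```

Waiting in this way before loading any data helps avoid connection errors from a node that is still starting.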

 

Testing Cassandra

Once the node is running, we can use the Cassandra Query Language (CQL) to create a schema and add data.

Create a new file called “sample.cql” and add the following CQL script:

-- Create a keyspace
CREATE KEYSPACE IF NOT EXISTS my_keyspace
WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': 3
};
-- Use the keyspace
USE my_keyspace;
-- Create a table
CREATE TABLE IF NOT EXISTS my_table (
  id UUID PRIMARY KEY,
  name TEXT,
  age INT
);
-- Insert data into the table
INSERT INTO my_table (id, name, age) VALUES (uuid(), 'J Doe', 30);
INSERT INTO my_table (id, name, age) VALUES (uuid(), 'J Smith', 25);
INSERT INTO my_table (id, name, age) VALUES (uuid(), 'B Johnson', 40);

 
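The previous example script creates a new keyspace called “my_keyspace” using the SimpleStrategy replication class with a replication factor of 3. On the single node started above, only one replica can actually be placed, so a replication factor of 1 is sufficient for local testing.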

Using the USE statement, we then set the keyspace context to the created keyspace.

Finally, we create a table called “my_table” and insert some sample data into it using the provided INSERT statements.

If you are new to Apache Cassandra, we have a host of tutorials that cover many of its features at the following link:

https://linuxhint.com/category/cassandra/

Loading the Cassandra Script

Once the script is ready, we can use the CQL shell (cqlsh) to load its statements into the database.

Run the command as follows:

$ docker run --rm --network cassandra -v "$(pwd)/sample.cql:/scripts/data.cql" -e CQLSH_HOST=cassandra -e CQLSH_PORT=9042 -e CQLVERSION=3.4.6 nuvo/docker-cqlsh

 
This should load the data from the script into the database.
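As an alternative that avoids a second image, the script can be piped into the “cqlsh” that runs inside the Cassandra container itself. This is a sketch under two assumptions: that the official image bundles “cqlsh”, and that the container is named “cassandra”. “load_cql” is a hypothetical helper name.

```shell
# Hypothetical helper: feeds a local CQL file to cqlsh inside the running container.
# Arguments: path to the CQL file (default "sample.cql"), container name (default "cassandra").
load_cql() {
  file="${1:-sample.cql}"
  name="${2:-cassandra}"
  if [ ! -f "$file" ]; then
    echo "CQL file not found: $file" >&2
    return 1
  fi
  # -i keeps stdin open so the file contents reach cqlsh in the container.
  docker exec -i "$name" cqlsh < "$file"
}

# Example: load_cql sample.cql cassandra
```

Because the script uses IF NOT EXISTS for the keyspace and the table, running it again does not fail, although each run inserts three more rows with new UUIDs.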

Querying the Data

We can log into the CQL Shell and query the data that is stored in the database as follows:

$ docker run --rm -it --network cassandra nuvo/docker-cqlsh cqlsh cassandra 9042 --cqlversion='3.4.5'

 
Query the data:

SELECT * FROM my_keyspace.my_table;
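For a quick check without opening an interactive shell, a single statement can be passed to “cqlsh” with its “-e” option. The sketch below again assumes that “cqlsh” is available inside the container named “cassandra”; “run_cql” is a hypothetical helper name.

```shell
# Hypothetical helper: runs one CQL statement inside the container and prints the result.
# Arguments: the CQL statement, container name (default "cassandra").
run_cql() {
  statement="$1"
  name="${2:-cassandra}"
  if [ -z "$statement" ]; then
    echo "usage: run_cql '<CQL statement>' [container]" >&2
    return 1
  fi
  docker exec "$name" cqlsh -e "$statement"
}

# Example: run_cql "SELECT * FROM my_keyspace.my_table;"
```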

 

Conclusion

This post covered setting up a basic single-node Apache Cassandra cluster using Docker and the Cassandra image.
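Once you are done experimenting, the container and the network can be removed. Since the container was started with the “--rm” flag, stopping it also deletes it. The following is a minimal sketch; “cleanup_cassandra” is a hypothetical helper name, and the defaults match the names used in this tutorial.

```shell
# Hypothetical helper: stops the Cassandra container and removes the Docker network.
# Arguments: container name (default "cassandra"), network name (default "cassandra").
cleanup_cassandra() {
  name="${1:-cassandra}"
  network="${2:-cassandra}"
  # The container was started with --rm, so stopping it also removes it.
  docker stop "$name" >/dev/null || echo "container '$name' was not running" >&2
  # The network can only be removed once no container is attached to it.
  docker network rm "$network" >/dev/null || echo "network '$network' was not removed" >&2
}

# Example: cleanup_cassandra cassandra cassandra
```

Keep in mind that no volume was mounted for the data directory, so the data stored in this container is lost when it is removed.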

About the author

John Otieno

My name is John and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks.