Cassandra Create Snapshot

When working with databases, keeping an up-to-date copy of your data is critical: it gives you a fail-safe in case of data corruption.

Apache Cassandra allows us to take backups of our data using the nodetool command. Join us in this tutorial as we explore how to create snapshots in Apache Cassandra.

NOTE: Before taking a snapshot, make sure that the node has sufficient disk space and that no sessions are active.

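When you take a snapshot, Cassandra flushes the in-memory writes (memtables) to disk and creates hard links to the node's SSTable files. The command acts on a single node, so to snapshot a whole cluster, you need to run it on every node.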
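To see why hard links make a snapshot cheap and durable, here is a minimal sketch that uses plain files in a temporary directory. It is an illustration only, not Cassandra code: the file name sstable-Data.db and the directory layout are stand-ins chosen for this example.

```shell
# Illustration only: a plain file stands in for an SSTable.
workdir="$(mktemp -d)"
mkdir -p "$workdir/table/snapshots/demo"

# Hypothetical stand-in for an SSTable data file.
echo "row data" > "$workdir/table/sstable-Data.db"

# A hard link adds a second directory entry for the same data; nothing is copied.
ln "$workdir/table/sstable-Data.db" "$workdir/table/snapshots/demo/sstable-Data.db"

# Removing the original entry (as compaction eventually does to old SSTables)
# leaves the data reachable through the snapshot link.
rm "$workdir/table/sstable-Data.db"
snapshot_content="$(cat "$workdir/table/snapshots/demo/sstable-Data.db")"
echo "$snapshot_content"   # prints: row data

rm -rf "$workdir"
```

This also hints at why disk space matters: a snapshot takes almost no extra space at first, but it keeps old SSTable files alive after the live table has moved on.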

Cassandra Nodetool Snapshot Command

The following snippet shows the syntax of the nodetool snapshot command:

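       nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
                [(-pp | --print-port)] [(-pw <password> | --password <password>)]
                [(-pwf <passwordFilePath> | --password-file <passwordFilePath>)]
                [(-u <username> | --username <username>)] snapshot
                [(-cf <table> | --column-family <table> | --table <table>)]
                [(-kt <ktlist> | --kt-list <ktlist> | -kc <ktlist> | --kc.list <ktlist>)]
                [(-sf | --skip-flush)] [(-t <tag> | --tag <tag>)] [--ttl <ttl>] [--]
                [<keyspaces...>]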

The following shows the parameters supported by the command:

  1. -h – Specifies the hostname or the IP address of the node to connect to.
  2. -p – Sets the JMX port number of the node.
  3. -pwf – Specifies the path to the password file that is used for authentication.
  4. -pw – Specifies the password for the given username.
  5. -u – Defines the username used to log in.
  6. -cf – Sets the name of the table that you wish to back up. You must specify exactly one keyspace with this option.
  7. -kc – Specifies a list of keyspace.table entries to back up. Same as -kt.
  8. -kt – Specifies a list of keyspace.table entries to back up.
  9. -sf – Skips flushing the memtables before the snapshot, so any unflushed data is not included.
  10. -t – Name (tag) of the snapshot. Defaults to the current timestamp.
  11. keyspaces – Names of the keyspaces to back up. Defaults to all keyspaces.
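As a rough sketch of how the connection options combine with the snapshot options, the helper below wraps a single call. The function name snapshot_remote_keyspace and all the values in the usage line are hypothetical placeholders; only the nodetool flags come from the syntax above.

```shell
# Hypothetical helper: snapshot one keyspace on a remote node that
# requires authentication. Connection options go before the "snapshot"
# subcommand; snapshot options and the keyspace go after it.
snapshot_remote_keyspace() {
  local host="$1" user="$2" password_file="$3" tag="$4" keyspace="$5"
  nodetool -h "$host" -u "$user" -pwf "$password_file" \
    snapshot -t "$tag" "$keyspace"
}

# Example call (placeholder values, adjust to your environment):
#   snapshot_remote_keyspace 10.0.0.5 cassandra /etc/cassandra/jmx.pw nightly linuxhint
```

Using a password file rather than -pw keeps the password out of the shell history and the process list.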

Cassandra Backup All Keyspaces

To create a snapshot of all the keyspaces on a node, we can run the following command:

$ nodetool snapshot -t my_backups

The given command creates a snapshot named my_backups of every keyspace on the node.

Cassandra stores the snapshot files under the data directory, in a snapshots/<tag> subdirectory inside each table's directory. The data directory is set by the data_file_directories option in cassandra.yaml.
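To locate the files of a given snapshot on disk, something like the following sketch can help. The function name find_snapshot_dirs is made up for this example, and the default path /var/lib/cassandra/data is an assumption based on common package installs; use the value from your own configuration.

```shell
# Hypothetical helper: list every directory that holds the snapshot
# with the given tag. The second argument is the data directory;
# /var/lib/cassandra/data is an assumed default, not a guarantee.
find_snapshot_dirs() {
  local tag="$1" data_dir="${2:-/var/lib/cassandra/data}"
  # Snapshots live at <data_dir>/<keyspace>/<table-id>/snapshots/<tag>.
  find "$data_dir" -type d -path "*/snapshots/$tag" | sort
}

# Example call:
#   find_snapshot_dirs my_backups
```

Each directory it prints belongs to one table, so a snapshot of all keyspaces yields one line per table.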

Cassandra Backup Selective Snapshots

We can take snapshots of specific keyspaces by listing them as shown in the following syntax:

$ nodetool snapshot keyspace_1 keyspace_2 keyspace_n

For example, suppose we wish to back up the linuxhint and system_auth keyspaces. We can run the following command:

$ nodetool snapshot linuxhint system_auth

The previous command should return output similar to the following. Since no tag was given with -t, Cassandra names the snapshot after the current timestamp:

Requested creating snapshot(s) for [linuxhint, system_auth] with snapshot name [1663410336447] and options {skipFlush=false}
Snapshot directory: 1663410336447
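If you script your backups, you may want to capture the generated snapshot name for later steps. The sketch below pulls it out of output in the format shown above. The function name snapshot_name_from_output is hypothetical, and the output format may differ between Cassandra versions, so treat the pattern as an assumption.

```shell
# Hypothetical helper: read nodetool snapshot output on stdin and print
# the snapshot name from the "Snapshot directory: <name>" line.
snapshot_name_from_output() {
  sed -n 's/^Snapshot directory: //p'
}

# Example call:
#   name="$(nodetool snapshot linuxhint system_auth | snapshot_name_from_output)"
```

Passing an explicit tag with -t avoids the parsing altogether, which is usually the simpler choice.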

Cassandra Table Snapshot

You can take a snapshot of a single table by passing its name along with exactly one keyspace, as shown in the following syntax:

$ nodetool snapshot --table table_name keyspace_name

For example, suppose we wish to back up the table sample_table from the linuxhint keyspace. We can run the following command:

$ nodetool snapshot --table sample_table linuxhint
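The --table option covers one table at a time. For several tables in one snapshot, the -kt option from the parameter list takes a comma-separated list of keyspace.table entries. Here is a minimal sketch; the function name snapshot_tables, the tag pre_migration, and the table linuxhint.users are hypothetical.

```shell
# Hypothetical helper: snapshot several tables under one tag.
# Usage: snapshot_tables <tag> <keyspace.table> [<keyspace.table> ...]
snapshot_tables() {
  local tag="$1"
  shift
  local kt_list
  # Join the remaining arguments with commas, the list format -kt expects.
  kt_list="$(IFS=,; printf '%s' "$*")"
  # No keyspace argument is passed when -kt is used.
  nodetool snapshot -kt "$kt_list" -t "$tag"
}

# Example call:
#   snapshot_tables pre_migration linuxhint.sample_table linuxhint.users
```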

Conclusion

In this post, you learned how to use the nodetool snapshot command to take snapshots of various objects in your Cassandra cluster.

Thanks for reading!

About the author

John Otieno