If you are just getting started with Apache Cassandra databases, you will need to understand how to customize various parameters for your cluster.
In this post, we will walk you through various important parameters you will need to know when editing your Cassandra configuration file.
Keep in mind that the properties discussed in this post do not reflect the full capabilities of a Cassandra configuration.
cassandra.yaml Configuration File
When customizing your cluster, you will mostly work with the cassandra.yaml file. This file contains properties and values that define how the cluster behaves. The file must follow YAML syntax rules; a malformed file can cause errors when Cassandra loads it.
By default, the cassandra.yaml file is located in the /etc/cassandra directory. However, if Cassandra was installed from an archive, you can find the configuration file in the install_dir/conf directory.
After making any changes to the configuration file, you need to restart the affected nodes in the cluster for the changes to take effect.
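As a quick reminder of the YAML rules that matter most in this file, the following is a minimal sketch using three properties covered later in this post. The values shown are illustrative, not recommendations.

```yaml
# Scalar value: key, colon, a single space, then the value.
cluster_name: 'Test Cluster'

# List value: each item sits on its own line, indented with spaces
# (never tabs) and introduced by "- ".
data_file_directories:
    - /var/lib/cassandra/data

# Boolean value: written as true or false.
incremental_backups: false
```

A missing space after the colon or a tab used for indentation is enough to make the file invalid YAML.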
Let us now dive in and discuss various properties and what they do.
Cassandra Configuration File Properties
The following are some of the properties you need to know for basic Cassandra cluster configuration.
- cluster_name – this property defines the name of your cluster. The default name for any Cassandra cluster is “Test Cluster”. Ensure that all nodes in the cluster use exactly the same cluster name; the setting keeps nodes in one logical cluster from joining another.
- listen_address – this property defines the IP address or hostname that Cassandra binds to when connecting to other nodes. Do not set the address to 0.0.0.0.
- listen_interface – this defines the network interface that Cassandra binds to when connecting to other nodes in the cluster. Set either listen_address or listen_interface, not both.
- listen_interface_prefer_ipv6 – when an interface has both IPv4 and IPv6 addresses, Cassandra uses the IPv4 address by default. If this property is set to true, Cassandra prefers the IPv6 address.
- commitlog_directory – defines the directory where Cassandra will store the commit logs. By default, this value is set to /var/lib/cassandra/commitlog or install_dir/data/commitlog.
- data_file_directories – specifies the locations where SSTable data is stored. By default, this is set to /var/lib/cassandra/data or install_dir/data/data.
- saved_caches_directory – defines the location where the table key and row caches are stored. Defaults to /var/lib/cassandra/saved_caches or install_dir/data/saved_caches.
- cdc_raw_directory – sets the location of the CDC (change data capture) log files. Defaults to /var/lib/cassandra/cdc_raw or install_dir/data/cdc_raw.
- authenticator – allows you to specify the authenticator backend, which is responsible for user authentication. The supported values include:
  - AllowAllAuthenticator – disables user authentication in Cassandra.
  - PasswordAuthenticator – makes Cassandra use username and password authentication, with credentials stored in the system_auth.roles table.
- authorizer – this allows you to specify the authorizer backend, which is responsible for access limits and user/role permissions. Cassandra supports the following authorizer backends:
  - AllowAllAuthorizer – disables authorization, allowing any action to any user in the cluster.
  - CassandraAuthorizer – checks the permissions stored in the system_auth.role_permissions table to determine which permissions are granted to which user/role.
- commit_failure_policy – specifies the policy for commit log disk failures. Accepted values include:
  - die – shuts down gossip and Thrift and kills the JVM so that the node can be replaced.
  - stop – shuts down gossip and Thrift, leaving the node effectively dead but still available for inspection through JMX.
  - stop_commit – shuts down the commit log; writes collect, but the node continues to serve reads.
  - ignore – ignores fatal errors and lets the batches fail.
- disk_failure_policy – sets how Cassandra responds to disk failure. Accepted values include:
  - die – shuts down gossip and Thrift and kills the JVM so that the node can be replaced.
  - stop_paranoid – shuts down gossip and Thrift even for single-SSTable errors.
  - stop – shuts down gossip and Thrift, leaving the node effectively dead but still available for inspection through JMX.
  - best_effort – tells Cassandra to stop using the failed disk and respond to requests from the remaining available SSTables.
  - ignore – ignores fatal errors and lets requests fail.
- rpc_address – defines the address that Cassandra listens on for client connections.
- rpc_interface – specifies the network interface that the Thrift RPC service listens on. Set either rpc_address or rpc_interface, not both.
- enable_user_defined_functions – enables support for user-defined functions (UDFs). This feature is disabled by default.
- incremental_backups – enables incremental backups. When set to true, Cassandra creates a hard link to each flushed SSTable in a backups subdirectory of the keyspace data.
- snapshot_before_compaction – specifies whether Cassandra takes a snapshot before each compaction.
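To show how these properties fit together, here is a minimal sketch of a cassandra.yaml fragment. It is an illustration under stated assumptions rather than a recommended configuration: the IP address 192.168.1.10 and the interface names eth0 and eth1 are hypothetical, the paths are the package-install locations listed above, and property names can differ between Cassandra versions, so check them against the file shipped with your installation.

```yaml
# Illustrative values only; adjust addresses and paths for your environment.
cluster_name: 'Test Cluster'

# Inter-node communication: set listen_address OR listen_interface, not both.
listen_address: 192.168.1.10          # hypothetical node IP
# listen_interface: eth0              # hypothetical interface name
# listen_interface_prefer_ipv6: false

# Client connections: set rpc_address OR rpc_interface, not both.
rpc_address: 192.168.1.10             # hypothetical address
# rpc_interface: eth1                 # hypothetical interface name

# Storage locations (package-install paths).
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
saved_caches_directory: /var/lib/cassandra/saved_caches
cdc_raw_directory: /var/lib/cassandra/cdc_raw

# Require username/password logins and enforce stored permissions.
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer

# Failure handling.
disk_failure_policy: stop
commit_failure_policy: stop

# Optional features, all left disabled here.
enable_user_defined_functions: false
incremental_backups: false
snapshot_before_compaction: false
```

Replacing the authenticator and authorizer values with AllowAllAuthenticator and AllowAllAuthorizer would disable authentication and authorization.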
Conclusion
This post described some of the most common configuration properties you will encounter when working with the cassandra.yaml configuration file. Be sure to check the complete Cassandra configuration documentation to learn more.