Apache Cassandra is one of the leaders in the wide-column NoSQL databases. In addition, it is a free and open-source solution for schemeless data. As a result, Cassandra is a popular choice in environments that need to handle extensive scale data with minimal latency and high-availability features.
Cassandra provides the Cassandra Query Language with a powerful query language for interacting with Cassandra keyspaces. CQL is closely similar to SQL and provides similar commands and syntax for similar operations. This makes Cassandra easily approachable by developers from relational databases.
In this tutorial, we will learn how to use the SELECT command in Cassandra. The SELECT command allows us to fetch the records from a Cassandra table.
Let’s dive in.
Command Syntax
The command follows a relatively simple syntax. The command syntax is as shown in the following snippet code:
FROM [keyspace_name.] table_name
[WHERE partition_value
[AND clustering_filters
[AND static_filters]]]
[ORDER BY PK_column_name ASC|DESC]
[LIMIT N]
[ALLOW FILTERING]
The command then returns the specified rows from the target table:
NOTE: Since Cassandra is designed to handle a large-scale data, it is not recommended to fetch all the rows from a table as you would in a relational database.
Sample Data
Before learning to select the data from a table using the SELECT statement, let’s set up a sample data for illustration.
Start by creating a sample keyspace:
Switch the current keyspace using the following command:
Create a sample table using the following command:
Add a sample data as shown:
cqlsh:linuxhint> INSERT INTO users(id, username, email, active) VALUES (1, 'user2', '[email protected]', 1);
cqlsh:linuxhint> INSERT INTO users(id, username, email, active) VALUES (2, 'user3', '[email protected]', 0);
cqlsh:linuxhint> INSERT INTO users(id, username, email, active) VALUES (3, 'user4', '[email protected]', 0);
cqlsh:linuxhint> INSERT INTO users(id, username, email, active) VALUES (4, 'user5', '[email protected]', 1);
Examples
The following section provides the examples of performing the common select statements in a Cassandra cluster.
Example 1: Select All Record
In Cassandra, we can select all the records from a table using the query as shown:
We use the SELECT statement in this query to select all the user table columns.
The previous query should return the output as shown:
----+--------+---------------+----------
1 | 1 | user2@mail.to | user2
0 | 1 | user1@mail.to | user1
2 | 0 | user3@mail.to | user3
4 | 1 | user5@mail.to | user5
3 | 0 | user4@mail.to | user4
NOTE: As mentioned, this is not recommended for a large dataset.
Example 2: Select Specific Columns
We can select specific columns from a table by specifying the columns as comma-separated values.
For example, to select the id, username, and email columns, we can run the following command:
In this case, the command should return an output as follows:
----+---------------+----------
1 | user2@mail.to | user2
0 | user1@mail.to | user1
2 | user3@mail.to | user3
4 | user5@mail.to | user5
3 | user4@mail.to | user4
(5 rows)
Example 3: Using Column Aliases
To make the output more readable, we can use the column aliases that give custom names to the output result.
For example:
The AS keyword in the query allows you to rename the column to an alias name without making permanent changes to the table.
For example, the previous query returns the result as:
------------+---------------+------------
1 | user2@mail.to | user2
0 | user1@mail.to | user1
2 | user3@mail.to | user3
4 | user5@mail.to | user5
3 | user4@mail.to | user4
(5 rows)
Note that the id, email, and username columns are renamed to user_index, email_address, and identifier, respectively.
Example 4: Limiting the Number of Rows
Using the LIMIT option, we can also limit the number of records returned by the SELECT query. The LIMIT clause takes an integer representing the number of rows to retrieve.
For example, to fetch the first three rows, we can run the following command:
The resulting output is as follows:
----+--------+---------------+----------
1 | 1 | user2@mail.to | user2
0 | 1 | user1@mail.to | user1
2 | 0 | user3@mail.to | user3
(3 rows)
Example 5: Filtering Data
We can also fetch the data that matches a specific condition using the WHERE clause. This allows us to specify a condition that returns true/false and returns the matching values.
For example, to fetch all the rows where the value of the active column is equal to 1, we can do the following:
The previous query should return only the rows where the active column = 1.
----+--------+---------------+----------
1 | 1 | user2@mail.to | user2
0 | 1 | user1@mail.to | user1
4 | 1 | user5@mail.to | user5
(3 rows)
NOTE: The ALLOW FILTERING command allows Cassandra to use the data filtering features.
RECOMMENDATION: Only use the ALLOW FILTERING for development purposes only.
Conclusion
In this article, you learned the fundamentals of selecting the data from a Cassandra table using the CQL SELECT statement.
Thanks for reading!