An aggregate function takes a set of values, performs a specific function, and returns a single value. An example includes calculating the average of a given set of values.
Cassandra provides us with a set of native functions that allows us to perform common aggregation operations. Join us in this tutorial as we explore these types of functions and how we can use them.”
Aggregate Function Syntax
The following shows the simple syntax of a Cassandra aggregate function:
The aggregate function will then take the values of the specified column, perform the operation and return the resulting value.
Cassandra Aggregate Functions
Cassandra supports the following aggregate functions:
AVG Function
The AVG function allows you to calculate the average value of the given column. The function syntax is as shown:
The function will ignore any NULL values in the column. However, the function supports data types such as int, double, float, biging, tinyint, smallint, and decimal.
COUNT Function
As the name suggests, this function returns the number of records that do not contain null values. The function syntax is as shown:
The function will work on Cassandra’s data types.
MIN Function
This function will return the smallest value in a given column. The function syntax is as shown:
Keep in mind that this function will only work on numerical data types.
MAX Function
This function will return the largest value in a given column. Similarly, the function will work on numerical data types and ignore null values.
The function syntax is as shown:
SUM Function
Finally, the sum function will return the sum of numerical values in a given column.
The function syntax is as shown:
Let us now look at examples of how to use these functions.
Suppose we have a table as shown:
id int,
product_name text,
price int,
qty int,
primary key(id)
);
begin batch using timestamp 1664053851
insert into products(id, product_name, price, qty)
values (1, 'product_1', 100, 5640);
insert into products(id, product_name, price, qty)
values (2, 'product_2', 800, 550);
insert into products(id, product_name, price, qty)
values (3, 'product_3', 500, 5550);
insert into products(id, product_name, price, qty)
values (4, 'product_3', 150, 540);
insert into products(id, product_name, price, qty)
values (5, 'product_4', 160, 260);
insert into products(id, product_name, price, qty)
values (6, 'product_5', 130, 5640);
apply batch;
Finding Average Value
We can use the avg function to calculate the average price of a product in the table as shown in the query below:
Find the Number of Rows
To determine the number of non-null records, run the query:
Output:
----------------------------
6
(1 rows)
Find Minimum Value
To determine the lowest price from the products table, run the following:
Find Maximum Value
To find out the most expensive product on the list, run:
Finding Total
Calculate the total price:
Conclusion
Aggregates are some of the most important features that Cassandra allows you to include in your cluster. Instead of manually performing recurring operations, you can define an aggregate that can then be recalled on multiple objects or even used in a different cluster. We hope this tutorial gave you a first foot into Cassandra aggregates.