Without further ado, let’s dive in.
What Are User Defined Functions (UDF)?
User-defined functions (UDFs) are named blocks of code that run inside the Cassandra daemon. A function performs a specific task, such as a computation on the data stored in a keyspace. UDFs can be written in Java and JavaScript by default; other languages, such as Python, Scala, and Ruby, may be available depending on the Cassandra version and configuration.
Cassandra Create Function Syntax
The following snippet shows the syntax of the CREATE FUNCTION statement:
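In rough template form, the statement looks something like the sketch below. Square brackets mark optional parts and a vertical bar separates alternatives. The names keyspace_name and function_name are placeholders, and the exact grammar may differ slightly between Cassandra versions.

```cql
-- Template only: replace the placeholders before running.
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace_name.]function_name (
    variable_name variable_type [, variable_name variable_type ...]
)
    (CALLED | RETURNS NULL) ON NULL INPUT
    RETURNS data_type
    LANGUAGE language_identifier
    AS 'code_block' | $$code_block$$;
```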
We start with the CREATE OR REPLACE FUNCTION statement, which creates the function if it does not exist and overwrites its definition if it does.
If you omit OR REPLACE, creating a function that already exists fails with an error. Add IF NOT EXISTS to suppress that error and leave the existing function unchanged. In short, use OR REPLACE to overwrite an existing function and IF NOT EXISTS to keep it; the two options cannot be combined in one statement.
Each variable_name and variable_type pair defines an input parameter and its data type, and the parameter is passed into the code block. To declare multiple parameters, separate them with commas.
The CALLED ON NULL INPUT clause runs the provided code block even if an input value is null.
The RETURNS NULL ON NULL INPUT clause makes the function return null, without running the code block, when an input value is null.
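The difference between the two clauses can be sketched with a pair of small functions. The function names here are hypothetical, and the sketch assumes that Cassandra passes boxed Java types (such as Integer) to the body when CALLED ON NULL INPUT is declared, which is why the second body can compare its argument with null.

```cql
-- Hypothetical function names, for illustration only.

-- The body is skipped and null is returned when the input is null.
CREATE OR REPLACE FUNCTION double_or_null(input int)
RETURNS NULL ON NULL INPUT
RETURNS int
LANGUAGE java
AS 'return input * 2;';

-- The body runs even for a null input, so it checks for null itself.
CREATE OR REPLACE FUNCTION double_or_zero(input int)
CALLED ON NULL INPUT
RETURNS int
LANGUAGE java
AS 'return input == null ? 0 : input * 2;';
```

RETURNS NULL ON NULL INPUT is usually the simpler choice when a null argument has no meaningful result, since the body never has to deal with it.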
RETURNS data_type specifies the data type of the function's return value. It must be a supported CQL data type.
The LANGUAGE language_identifier section defines the programming language of the function. Cassandra supports Java and JavaScript out of the box. Depending on the Cassandra version, you may be able to add support for other languages, such as Ruby, Python, and Scala.
Finally, the 'code_block' | $$code_block$$ section defines the code for the function. If the code contains special characters, such as single quotes, enclose the code block in double dollar signs. Otherwise, enclose the code block in single quotes.
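As an illustration of the two quoting styles, the sketch below defines the same hypothetical function twice. In the single-quoted form, the apostrophe inside the Java string has to be doubled to escape it in CQL, while the dollar-quoted form takes the Java code as written.

```cql
-- Hypothetical function name, for illustration only.

-- Single-quoted body: the apostrophe inside the Java string is doubled.
CREATE OR REPLACE FUNCTION possessive(name text)
RETURNS NULL ON NULL INPUT
RETURNS text
LANGUAGE java
AS 'return name + "''s";';

-- Dollar-quoted body: the same Java code needs no escaping.
CREATE OR REPLACE FUNCTION possessive(name text)
RETURNS NULL ON NULL INPUT
RETURNS text
LANGUAGE java
AS $$ return name + "'s"; $$;
```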
NOTE: Like ordinary functions, UDFs can throw exceptions, which cause the calling query to fail. You can guard against this by handling errors inside the function body, using the error-handling features of the language you chose.
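A minimal sketch of this idea in a Java body is shown below. The function name safe_to_int is hypothetical, and the sketch assumes that the UDF sandbox allows the java.lang classes used here. Instead of letting a parse error fail the query, the function returns null for text that is not a valid integer.

```cql
-- Hypothetical function name, for illustration only.
CREATE OR REPLACE FUNCTION safe_to_int(input text)
RETURNS NULL ON NULL INPUT
RETURNS int
LANGUAGE java
AS $$
    try {
        // Parse the trimmed text as a base-10 integer.
        return Integer.parseInt(input.trim());
    } catch (NumberFormatException e) {
        // Not a valid integer: return null instead of failing the query.
        return null;
    }
$$;
```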
Example
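The following example creates a function that returns the larger of two integer inputs. The function name max_value is illustrative:
CREATE OR REPLACE FUNCTION max_value(input1 int, input2 int)
CALLED ON NULL INPUT
RETURNS int LANGUAGE java AS
$$return Math.max(input1, input2);$$;
Once the function is defined, you can call it in a query against a table, as shown below, where column1 and column2 stand for two int columns:
SELECT max_value(column1, column2)
FROM UDF_Function_test
WHERE column IN(values);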
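For a more concrete picture, here is an end-to-end sketch. It assumes that UDFs are enabled in cassandra.yaml (they are disabled by default) and that a keyspace has already been selected with USE. The function name max_value and the columns id, score_a, and score_b are made up for illustration.

```cql
-- Hypothetical function, table, and column names; a keyspace must be in use.
CREATE OR REPLACE FUNCTION max_value(input1 int, input2 int)
CALLED ON NULL INPUT
RETURNS int LANGUAGE java AS
$$return Math.max(input1, input2);$$;

CREATE TABLE IF NOT EXISTS udf_function_test (
    id int PRIMARY KEY,
    score_a int,
    score_b int
);

INSERT INTO udf_function_test (id, score_a, score_b) VALUES (1, 10, 25);
INSERT INTO udf_function_test (id, score_a, score_b) VALUES (2, 40, 15);

-- Apply the function to two columns of each selected row.
SELECT id, max_value(score_a, score_b) AS highest
FROM udf_function_test
WHERE id IN (1, 2);
```

With these rows, the query should report 25 for id 1 and 40 for id 2. Because the function is declared CALLED ON NULL INPUT and passes its arguments straight to Math.max, a row with a null score would make the call fail, so a null check in the body or RETURNS NULL ON NULL INPUT is worth considering for real data.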
Conclusion
In this post, we covered the basics of creating user-defined functions in Cassandra and illustrated how to write a simple one in Java.