This article will learn how to create and use array types in Standard SQL. It is good to note that although arrays are part of Standard SQL, databases such as MySQL do not natively support Arrays.
What is an Array?
An array refers to a collection of an ordered list of items. Arrays are very useful and powerful. Learning how to work with an array can help improve performance and provide complex data manipulation techniques.
In database engines such as BigQuery and PostgreSQL, an array is a built-in type that can be used anywhere in the database.
However, unlike PostgreSQL, BigQuery prevents you from creating multidimensional arrays.
SQL Create Array
The simplest way to create an array is to use its literal format. An example is as shown:
[1,
2,
3,
4,
5] AS my_array;
The code above should create an array called my_array with the elements inside the square brackets.
An example output is as shown:
"[1,2,3,4,5]"
The second method you can use to create an array is the generate_array function. This function is only available in BigQuery.
The code below shows how to use the generate_array function to generate an array.
GENERATE_ARRAY(1, 5) AS my_array;
The code above generates an array of items from 1 to 5. The resulting output is as shown:
"[1,2,3,4,5]"
The generate_array function follows the syntax as shown below:
You can use step_expression to set the step size for the generated elements.
Other similar functions include:
- GENERATE_DATE_ARRAY – generate an array of dates
- GENERATE_TIMESTAMP_ARRAY – generates an array of timestamps.
You can check the documentation on array functions to learn more.
Accessing array elements
BigQuery allows us to use either the offset value or the ordinal value to access elements in an array.
The offset is a 0-based value, while the ordinal is 1-based.
Consider an example query below:
my_array[
OFFSET
(0)] AS offset_0,
my_array[ORDINAL(1)] AS ordinal_0
FROM (
SELECT
GENERATE_ARRAY(1, 5) AS my_array );
In the query, we use the generate_array function to generate an array with values starting from 1 to 5.
We then use the offset and ordinal functions to fetch the first element in the array.
The code above should return:
1 1
You can choose any method of array access you wish to use. For example, choose offset if you prefer a 0-based index; otherwise, choose ordinal.
Finding Array Length
To get the length of an array, you can use the array_length function as shown:
ARRAY_LENGTH([1,2,3,4,5]);
This should return the length of the array as:
5
The array’s length refers to the number of elements in the array.
Convert Array into Rows
To convert an array into a set of rows, use the unnest function as shown below:
*
FROM
UNNEST(GENERATE_ARRAY(0, 20, 2)) AS even;
The code above creates an array of even numbers from 0 to 20 and converts them to rows using the unnest function.
The resulting output is shown below:
0
2
--- truncated output---
18
20
You can learn more about the unnest command in the linked tutorial.
Convert Array to String
If you have an array of strings, you can convert it into a string using the array_to_string function.
Example usage is as shown:
ARRAY_TO_STRING(['h','e','l','l','o'], "");
The code above converts the array of strings into a single string. The function syntax is:
Conclusion
This article provided the fundamental knowledge on how to work with arrays in Standard SQL using BigQuery. There is more about the array type beyond this tutorial’s scope. Check the resources below to explore more.
https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays
https://cloud.google.com/bigquery/docs/reference/standard-sql/array_functions
Thanks for reading, and I hope you enjoyed it!!