AWS

Redshift AVG Function

The Redshift AVG function is one of the aggregate functions to perform queries on Redshift. It is used to compute the arithmetic mean or average of the input numeric expression. As the AVG function calculates the arithmetic mean, it can only accept numeric values as input expression. If a NULL value is passed to the AVG function as input, it ignores the NULL value and computes the arithmetic mean of the remaining expressions or values. In this blog, we will discuss how the AVG function can be used in the Redshift to calculate the arithmetic mean of the numeric values passed as input expression.

Syntax to use the AVG function

The syntax to use the Redshift AVG function to calculate the arithmetic mean is as follows:

SELECT AVG ( [ ALL | DISTINCT ] expression )

The DISTINCT parameter is used when it is required to calculate the average of only unique values in a column. It is used to eliminate all repetitive values during calculating the average of the input expression.

The ALL parameter is used when it is required to calculate the average of all input expressions even if values are repetitive. By default, when it is not specified, the ALL parameter is set.

The expression is the targeted column in the table on which you want to apply the AVG function to get the arithmetic mean.

Examples of using the AVG function

In this section, we will take a few examples to fully understand how the AVG function in Redshift works. These are a few use cases examples of the Redshift AVG function:

AVG function using ALL parameter

In this example, we will apply the AVG function to calculate the arithmetic mean using the ALL parameter on the expenses table as shown below. The below table includes three columns week_number, week_day and expense_money.

week_number week_day expense_money
Week 01 monday 10
Week 02 tuesday 20
Week 03 wednesday 30
Week 04 thursday 40
Week 05 friday 50
Week 06 saturday 60
Week 07 sunday 70

If we manually calculate the average or arithmetic mean of the expense_money column from the above table, it will be 40.

In order to calculate the arithmetic, mean by using the AVG function, apply the AVG function on the third column i.e, expense_money.

SELECT AVG(ALL expense_money)

FROM expenses

The AVG function will return the following arithmetic mean of the expense_money column of the expenses table.

Avg

40

AVG function using the DISTINCT parameter

In this example, we will apply the AVG function using the DISTINCT parameter on the expense_money column of the expenses table. Using the DISTINCT parameter, the AVG function will only calculate the arithmetic mean of the unique values ignoring the repetitive values.

week_number week_day expense_money
Week 01 Monday 10
Week 02 Tuesday 10
Week 03 Wednesday 30
Week 04 Thursday 30
Week 05 Friday 50
Week 06 Saturday 50
Week 07 Sunday 70

If we manually calculate the average or arithmetic mean of the distinct values in the expense_money column, we will get 25.

Apply the AVG function on the third column i.e., expense_money to get the arithmetic mean of unique values.

SELECT AVG(distinct expense_money)

FROM expenses

The AVG function will return the following arithmetic mean of the expense_money column of the expenses table.

avg

25

AVG function with Default parameter

In this example, we will apply the AVG function using the default parameter on the expense_money column of the expenses table to get the arithmetic mean.

week_number week_day expense_money
Week 01 monday 23
Week 02 tuesday 47
Week 03 wednesday 53
Week 04 thursday 80
Week 05 Friday 69
Week 06 saturday 21
Week 07 sunday 19

When we apply the AVG function on the third column i.e, expense_money of the expenses table, we will get the following result.

SELECT AVG(expense_money) as avg_price

FROM expenses

The AVG function will return the following arithmetic mean of the expense_money column of the expenses table.

avg_price

44

AVG function with NULL values

In this example, we will see how the AVG function works when NULL values are passed to it as input expression. The AVG function does not consider the null values during calculation and calculates the arithmetic mean of remaining values. Let’s take the example using the null values in the expenses table as shown below.

week_number week_day expense_money
Week 01 monday 23
Week 02 tuesday 47
Week 03 wednesday NULL
Week 04 thursday 80
Week 05 friday NULL
Week 06 saturday 21
Week 07 sunday 19

Apply the AVG function on the third column i.e, expense_money of the expenses table to get the arithmetic mean.

SELECT AVG(expense_money) as avg_price

FROM expenses

The AVG function will return the following arithmetic mean of the expense_money column of the expenses table.

avg

38

AVG function with WHERE statement

In this example, we will see how the AVG function works when you apply a WHERE statement in your query. The expenseweek table in the Redshift cluster is shown below and we will calculate the arithmetic mean or average of the expense_money for week 01 only.

weeknumber weekday expense_money
Week 01 monday 23
Week 01 tuesday 47
Week 01 wednesday 57
Week 01 thursday 80
Week 01 friday 17
Week 01 saturday 21
Week 01 sunday 19
Week 02 monday 123
Week 02 tuesday 97
Week 02 wednesday 89

Now, in order to calculate the arithmetic, mean of the expense_money for only Week 01, we will apply a WHERE statement along with the AVG function.

Apply the AVG function on the third column i.e, expense_money of the expenseweek using the WHERE statement to get the arithmetic mean of only Week 01.

SELECT AVG(expense_money) as avg_price

FROM weekexpense

Where weeknumber= 'Week 01'

The AVG function will return the following arithmetic mean of the expense_money column of the expenseweek table.

avg_price

37

Conclusion

In this article, we have discussed how to use the AVG function in Redshift to calculate the arithmetic mean or average of numeric expressions. The AVG function can be used with the ALL or DISTINCT option to calculate the arithmetic mean of unique or all input expressions. It can also be used with the WHERE statement to get the arithmetic mean of specific entries from the column.

About the author

Zain Abideen

A DevOps Engineer with expertise in provisioning and managing servers on AWS and Software delivery lifecycle (SDLC) automation. I'm from Gujranwala, Pakistan and currently working as a DevOps engineer.