SQL Standard

SQL OVER Clause

One of the most advanced features of SQL is the OVER clause. It is a feature that allows us to perform the calculations and apply the SQL window functions over a specific subset of rows within a given result set.

It is particularly useful when you need to calculate the aggregations or rankings for groups of rows without actually collapsing the entire result set.

Join us in this tutorial as we learn everything about there is to know to get you started working with the OVER clause.

Requirements:

Before we dive into the functionality and workings of the OVER clause, ensure that you have the basics of SQL out of the way. We also assume that you have access to a database that you can use to test out your knowledge.

In our case, we will use the MySQL database with the Sakila sample database. Just ensure that you have sufficient permissions and that your database engine supports the window functions.

Syntax:

As we mentioned earlier, in most cases, we mainly use the OVER clause in conjunction with the window functions.

As such, we can express the syntax of the clause as follows:

<window function>(expression) OVER (

  [PARTITION BY partition_expression, ...]
 
  [ORDER BY sort_expression [ASC | DESC], ...]

  [frame_specification]

)

In the given syntax, we can break down each component as follows:

  1. <window_function> – It refers to the window function that we wish to apply over a specific window of rows such as SUM(), AVG(), ROW_NUMBER(), RANK, etc.
  2. Expression – This specifies a column or expression for which the window function is applied.
  3. PARTITION BY – This is an optional clause that divides the result set into partitions where each partition is like a separate unit where the function is applied. Rows within the same partition share the same values in the specified columns.
  4. ORDER BY – This specifies the order in which the rows in each partition is processed.
  5. frame_specification – This is an optional clause that defines the frame of rows within the partition. Common frame specifications include ROWS BETWEEN <start> AND <end> or RANGE BETWEEN <start> AND <end.

With that out of the way, let us explore some practical examples on how to use it.

Example:

Let us demonstrate how to use the clause using the Sakila sample database. Consider an example where we need to determine the total revenue for each film category.

We can use the sum window function with the OVER clause and a bunch of join statements as shown in the following example:

SELECT
    category.name AS category_name,
    film.title AS film_title,
    film.rental_rate,
    SUM(payment.amount) OVER (PARTITION BY category.name) AS total_revenue
FROM
    film
JOIN
    film_category ON
    film.film_id = film_category.film_id
JOIN
    category ON
    film_category.category_id = category.category_id
JOIN
    inventory ON
    film.film_id = inventory.film_id
JOIN
    rental ON
    inventory.inventory_id = rental.inventory_id
JOIN
    payment ON
    rental.rental_id = payment.rental_id
ORDER BY
    category.name,
    film.title;

In the given query, we start by selecting the film title, rental rate, and use the sum (payment.amount over partition by category.name) expression to determine the sum of each category partition by the category name.

We must use the PARTITION BY clause to ensure that the calculation of the sum restarts at each unique category.

The resulting output is as follows:

There you have it!

Conclusion

In this example, we explored the fundamentals of working with the OVER clause in SQL. This is not a basic clause and requires previous familiarity with other SQL features.

About the author

John Otieno

My name is John and am a fellow geek like you. I am passionate about all things computers from Hardware, Operating systems to Programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to LinuxHint mailing list