SQL Standard

SQL Cumulative Sum

In SQL, a cumulative sum refers to a method that allows us to calculate the running total of a numeric column within a database table. A cumulative sum comes into play in a wide variety of scenarios such as tracking the progress or calculating the rolling averages of a given value.

In this tutorial, we will learn how to implement and utilize the cumulative sum in SQL using various methods and techniques.

Sample Data

To better understand how we can implement and use the cumulative sum, let us start by setting up a basic table with a sample data. This allows us to demonstrate the workings of all the methods quickly and efficiently in this post.

Start by creating a basic table that stores the product information.

CREATE TABLE products (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(255),
    price DECIMAL(10, 2)
);

 

This should create a new table called “products” that stores the product id, the product name, and the price of each product.

We can then proceed to add a sample data as shown in the following example insert statements:

INSERT INTO products (product_id, product_name, price)
VALUES
    (1, 'Visual Studio Code', 10.00),
    (2, 'Sublime Text', 80.00),
    (3, 'PyCharm Professional', 199.00),
    (4, 'Eclipse IDE', 30.00),
    (5, 'IntelliJ IDEA Ultimate', 699.00),
    (6, 'GitHub Desktop', 20.00),
    (7, 'Jira Software', 10.00),
    (8, 'Xcode', 660.00),
    (9, 'NetBeans', 0.00),
    (10, 'Atom', 60.00);

 

NOTE: The given data is entirely fictional. It does not represent the actual pricing of any of the listed items.

The resulting table is as follows:

SQL Cumulative Sum (Self Join)

One of the techniques that we can use to perform a cumulative sum on a given column is using the self-join method. One advantage of this method is that it works in nearly all SQL databases even those that do not support the Window functions.

Take for example the previous “products” table. We can create the cumulative sum of the price column as demonstrated in following query:

SELECT
    p1.product_id,
    p1.product_name,
    p1.price,
    SUM(p2.price) AS cumulative_sum
FROM
    products p1
JOIN
    products p2
ON
    p1.product_id >= p2.product_id
GROUP BY
    p1.product_id,
    p1.product_name,
    p1.price
ORDER BY
    p1.product_id;

 

Did you notice the workings of the query? If not, stick with us as we explain it step by step.

In the given example query, we start by creating two aliases – “p1” and “p2” – for the “products” table which enables us to perform a self-join.

We then proceed to join “p1” and “p2” on the condition that the “product_id” of “p1” is greater than or equal to the “product_id” of “p2”.

In the next step, we call the sum() function which basically calculates the cumulative sum of the prices for each row.

Lastly, we group the results using the “product_id”, “product_name”, and “price” and order the results.

After this operation, we should have a cumulative sum for each record as demonstrated in the resulting table as shown in the following:

As you can see, we get the total sum of all the previous rows. The last row should contain the total sum of all the previous rows.

SQL Cumulative Sum (Window Functions)

A more efficient and practical way of calculating the cumulative sum in SQL is taking advantage of the window functions whenever it is supported.

If you have a database such as the SQL Server, PostgreSQL, or MySQL version 8.0 and above, this is the most useful and recommended method for determining the cumulative sum of a given column.

Take a look at the example as demonstrated in the following:

SELECT
    product_id,
    product_name,
    price,
    SUM(price) OVER (ORDER BY product_id) AS cumulative_sum
FROM
    products;

 

In this case, we start by selecting the “product_id”, “product_name”, and “price” columns from the “products” table.

We then use the SUM() function as a window function using the OVER clause.

In the OVER clause, we specify the ORDER BY clause which defines the order in which the cumulative sum is calculated.

This should return a similar output as shown in the following:

You will notice that using the window functions is more coherent, efficient, and readable compared to using self-joins.

Conclusion

In this tutorial, we learned all about cumulative sums in SQL. We also covered how to use the self-joins and window functions to perform the cumulative sum in SQL.

About the author

John Otieno

My name is John and am a fellow geek like you. I am passionate about all things computers from Hardware, Operating systems to Programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to LinuxHint mailing list