MySQL MariaDB

Using LAG Function in MySQL

MySQL version 8.0 introduced the MySQL window functions, allowing you to perform queries in an easier and organized method. Thereby, increasing the processing and performance. Such functions include: RANK(), ROW_RANK(), LAST_VALUE(), and many more.

In this tutorial, we shall focus on using one of the MySQL functions: LAG(). It is a window function that allows you to access and fetch the value of previous rows from the current row within the same result set.

Basic Syntax

The general syntax for using MySQL LAG() function is:

LAG(Expression, OffSetValue, DefaultVar) OVER (
    PARTITION BY [Expression]
    ORDER BY Expression [ASC|DESC]
);

Let us take a moment to explain some of the parameters in the LAG() function syntax.

They are as follows:

Expression: This is the value returned by the function from the row that leads the current row by the offset value specified.

OffSetValue: This value represents the number of rows preceding the current row from which to get the value. This value must be a 0 or a value higher than 0.

NOTE: The value of 0 represents the current row.

DefaultVar: This value is returned as the default value by the function if no preceding row exists. If the default value is undefined in the function parameter and no preceding row exists, the function returns a NULL value.

PARTITION BY: The PARTITION BY clause divides the rows in a logical partition set. The LAG function is then applied to the divided partitions.

ORDER BY: As usual, this value specifies the order of the rows in the available partitions.

Example Use Cases

Let us look at example use cases of the LAG() function to understand how it works. Start by creating a sample database called sample_db;

DROP DATABASE IF EXISTS sample_database;
CREATE DATABASE sample_database;
USE sample_database;
DROP TABLE IF EXISTS users;
CREATE TABLE users
(
    id          INT PRIMARY KEY AUTO_INCREMENT,
    name        VARCHAR(255),
    Score       INT,
    Enroll_Date DATE
);
INSERT INTO users(id, name, score, enroll_date)
VALUES (1, "Alexandra", 99, '2021-01-10'),
       (2, "Jacob", 81, '2021-05-20'),
       (3, "Leonard", 67, '2020-01-02'),
       (4, "Peter", 88, '2021-03-03'),
       (5, "Amy", 100, '2021-05-05');
 
SELECT * FROM users;

Now that we have a sample database to work with, we can proceed and illustrate how to work with the MySQL LAG function.

Example 1: Lag Function with No Default Value
Consider the example below that applies the Lag function on the Enroll_Date with an offset value of 1.

SELECT *, LAG(Enroll_Date, 1) OVER ( ORDER BY id ASC) as previous_date FROM sample_database.users;

Once we execute the above query, we get a new column previous_date that holds the previous value of the row as specified with an offset value of 1. Since there is no previous value in the first row, the value is null.

NOTE: You can specify the default value if a row does not have a previous value.

The output is as shown below:

Example 2: Lag Function with Default Value
You can also specify a default value for a row where the previous value does not exist. In our example, we will set the default value in the current date.

NOTE: In this example, we will also set the offset value as 2 instead of 1.

Consider the query below:

SELECT *, LAG(Enroll_Date, 2, CURDATE()) OVER ( ORDER BY id ASC) as previous_date FROM sample_database.users;

Once we execute the above query, we are going to get values with an offset value of two and the current date as the default for null values.

The output is as shown below:

Example 3: Lag Function with Partition by
We can use the LAG() function with the partition by clause. This clause first groups the data into various logical subsets and then applies the lag function to the partitions.

Before proceeding, let us see the data in the user’s table. Consider the following query:

INSERT INTO users(id, name, score, enroll_date)
VALUES (1, "Alexandra", 99, '2021-01-10'),
       (2, "Jacob", 81, '2021-05-20'),
       (3, "Leonard", 67, '2020-01-02'),
       (4, "Peter", 88, '2021-03-03'),
       (5, "Amy", 100, '2021-05-05'),
       (6, "Tobias", 100, '2020-06-06'),
       (7, "Kurtzman", 67, '2020-07-10'),
       (8, "Immortal", 50, '2021-03-01'),
       (9, "Anthony", 81, '2021-01-01'),
       (10, "James", 77, '2021-02-03');

Now that we have a table with 10 values, we can partition the data by the score and then apply the lag function.

The above operation is illustrated in the query below:

SELECT *, LAG(Enroll_date, 1, CURDATE()) OVER (PARTITION BY score ORDER BY id ASC) AS previous_date FROM sample_database.users;

In the above query, we start by partitioning the data based on the score and then applying the lag function with an offset value of 1. We also set the default value as the current date. The output result is as shown below:

NOTE: You may also note that the first row of every partition contains the current date, meaning there is no previous value in the set row.

Conclusion

This tutorial has discussed how the LAG() function works to get values of the previous rows in the current row.

To recap:

  • The MySQL function is a window function that gets the value from the previous row based on the offset value specified. Meaning, if the offset value is 1, it gets the value directly above it.
  • By default, the LAG() function uses an offset value of 1, unless explicitly specified.
  • If the data is out of range (no previous value in the offset specified) the value is set to NULL.
  • The LAG() function also accepts the PARTITION BY clause, which groups data into various logical partitions based on the column or condition specified.

Thank you for reading.

About the author

John Otieno

My name is John and am a fellow geek like you. I am passionate about all things computers from Hardware, Operating systems to Programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to LinuxHint mailing list