Grouping Sets in SQL
As discussed above, SQL allows us to group the rows of a table by the values in one or more columns and compute a summary for each group. When a table is large or has many columns, analyzing the raw rows directly becomes a hassle. This is where grouping sets come in handy: they let us summarize the data by the column combinations we need and then analyze the simplified result with ease.
Grouping sets are written with the “GROUPING SETS” clause inside “GROUP BY”. The “UNION ALL” operator can produce the same result, but it needs one sub-query per grouping. It also imposes limitations that make it less convenient than grouping sets: every sub-query must return the same number of columns with compatible data types, so we must write “NULL” for each column that a sub-query does not group by, which increases the number of query lines.
In newer versions of SQL (PostgreSQL has supported it since version 9.5), the “GROUP BY” clause can be followed by “GROUPING SETS”, which computes several groupings in one query without the separate sub-queries that UNION ALL requires. To implement this concept, we follow the general syntax written below:
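SELECT
    column1,
    column2,
    aggregate_function(column3)
FROM
    table_name
GROUP BY
    GROUPING SETS (
        (column1, column2),
        (column1),
        (column2),
        ()
    );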
We used several commands in the previous syntax:
- The first is “SELECT,” in which we choose the columns to return and the aggregate function whose result is displayed in the last column. An aggregate function, such as SUM, AVG, MIN, or MAX, computes a single value from the rows in each group.
- Second, the “FROM” clause names the table in our database that we are querying. After that, the “GROUP BY” clause with “GROUPING SETS” lists the column combinations by which the rows are grouped. It does not determine the order of the output rows; for that, an “ORDER BY” clause is needed.
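To make the comparison with “UNION ALL” concrete, here is a minimal sketch. The temporary table demo_marks, its columns, and its rows are made up for illustration; they are not taken from the examples in this article.

```sql
-- Hypothetical sample table; the table name, columns, and rows are assumptions.
CREATE TEMP TABLE demo_marks (
    roll_no     integer,
    course_name text,
    marks       integer
);

INSERT INTO demo_marks (roll_no, course_name, marks) VALUES
    (1, 'Math',    80),
    (1, 'Physics', 70),
    (2, 'Math',    90);

-- UNION ALL version: one sub-query per grouping, padded with NULL
-- so that every branch returns the same three columns.
SELECT roll_no, NULL::text AS course_name, SUM(marks) AS total_marks
FROM demo_marks
GROUP BY roll_no
UNION ALL
SELECT NULL::integer, course_name, SUM(marks)
FROM demo_marks
GROUP BY course_name;

-- GROUPING SETS version: both groupings requested in a single query.
SELECT roll_no, course_name, SUM(marks) AS total_marks
FROM demo_marks
GROUP BY GROUPING SETS ((roll_no), (course_name));
```

Both statements return one row per roll number and one row per course. In the second statement, PostgreSQL fills the column that a grouping set does not use with NULL, so no manual padding is needed.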
Grouping Sets in PostgreSQL
Now that we are familiar with the syntax of grouping sets, we can look at several examples of implementing this concept in PostgreSQL. We can perform grouping with:
- Two columns of a table
- Three columns of a table
- Order by command
Grouping Two Columns of a Table
This example uses a simple grouping set over only two columns. We will first create a table named Records with the attributes roll number, course name, and marks, using the code in the following snippet:
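A minimal sketch of such a table could look as follows. The column names roll_no, course_name, and marks, as well as their data types, are assumptions chosen to match the attributes described above.

```sql
-- Hypothetical definition of the Records table; column names and types are assumed.
-- PostgreSQL folds the unquoted name Records to lower case (records).
CREATE TABLE records (
    roll_no     integer     NOT NULL,
    course_name varchar(50) NOT NULL,
    marks       integer     NOT NULL
);
```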
After creating the table, we will insert some rows that the grouping sets query will later summarize:
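The rows below are illustrative values, not the original data of this example. The sketch assumes a records table with the hypothetical columns roll_no, course_name, and marks, and creates it if it is not already present.

```sql
-- No-op if the assumed table already exists.
CREATE TABLE IF NOT EXISTS records (
    roll_no     integer     NOT NULL,
    course_name varchar(50) NOT NULL,
    marks       integer     NOT NULL
);

-- Illustrative sample rows (made up for this sketch): one student, three courses.
INSERT INTO records (roll_no, course_name, marks) VALUES
    (101, 'Math',    80),
    (101, 'Physics', 70),
    (101, 'English', 90);
```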
Now, we will write the grouping sets query. The rows will be grouped by the student’s roll number, and the marks across all courses will be summed for each roll number:
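One way to write this query is sketched here, assuming a records table with the hypothetical columns roll_no and marks; the alias total_marks is also an assumption.

```sql
-- Assumes a table records(roll_no, course_name, marks); names are hypothetical.
SELECT
    roll_no,
    SUM(marks) AS total_marks   -- sum of the marks in all courses
FROM records
GROUP BY GROUPING SETS ((roll_no));
```

With a single grouping set, this is equivalent to a plain GROUP BY roll_no. Adding the empty set () as a second grouping set would append one grand-total row across all roll numbers.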
When we execute the previous code, the output contains one row per roll number, with the sum of that student’s marks across all courses in the second column. The query works the same way with more than one roll number: the marks of each student are summed separately and shown against that student’s roll number.
Grouping Three Columns of a Table
In this example, we will create a table with four columns and apply grouping sets to it. First, we will execute the following code to create the table:
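A possible definition is sketched below. The table name course_records and the column names roll_no, course_name, credit_hours, and marks are all hypothetical, chosen to match the four attributes this example describes.

```sql
-- Hypothetical four-column table; the table name, column names, and types are assumed.
CREATE TABLE course_records (
    roll_no      integer     NOT NULL,
    course_name  varchar(50) NOT NULL,
    credit_hours integer     NOT NULL,  -- hours required for the course
    marks        integer     NOT NULL
);
```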
Then, we will insert rows for two different students, each giving a course, the hours required for that course, and the marks obtained in it:
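The following rows are made-up sample data for two students. Roll number 205 is taken from this example; the second roll number, the hours, and all the marks are invented. The sketch assumes the hypothetical course_records table and creates it if it is missing. Math appears twice for roll number 205, so that the grouping has repeated rows to add together.

```sql
-- No-op if the assumed table already exists.
CREATE TABLE IF NOT EXISTS course_records (
    roll_no      integer     NOT NULL,
    course_name  varchar(50) NOT NULL,
    credit_hours integer     NOT NULL,
    marks        integer     NOT NULL
);

-- Illustrative rows (invented values); roll number 205 has two Math entries.
INSERT INTO course_records (roll_no, course_name, credit_hours, marks) VALUES
    (205, 'Math',    3, 40),
    (205, 'Math',    3, 45),
    (205, 'Physics', 4, 70),
    (206, 'Math',    3, 85),
    (206, 'Physics', 4, 60);
```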
After the previous data has been added successfully, we will group it by the roll number of the student and the name of the course. To do that, we will write and execute the code in the following snippet:
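Such a query might look like the sketch below, which assumes a table named course_records with the hypothetical columns roll_no, course_name, and marks.

```sql
-- Assumes course_records(roll_no, course_name, credit_hours, marks); names are hypothetical.
SELECT
    roll_no,
    course_name,
    SUM(marks) AS total_marks   -- repeated (roll_no, course_name) pairs are added together
FROM course_records
GROUP BY GROUPING SETS ((roll_no, course_name));
```

The inner parentheses matter: (roll_no, course_name) is one grouping set over the pair of columns, whereas (roll_no), (course_name) would be two separate groupings.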
As the output of the previous code shows, we have reduced the table to the desired summary by grouping it by roll number and course name and adding up the marks wherever a combination repeats. This happened for roll number “205,” whose Math marks were added together because that course appeared twice.
We can also group the previous columns by only the course name and get the sum of the marks in each course. For that, we will write the following query and then execute it:
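As a sketch under the same assumptions (a hypothetical course_records table with course_name and marks columns), the query could be written like this:

```sql
-- Assumes course_records(roll_no, course_name, credit_hours, marks); names are hypothetical.
SELECT
    course_name,
    SUM(marks) AS total_marks   -- total marks per course, across all students
FROM course_records
GROUP BY GROUPING SETS ((course_name));
```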
In this query, we grouped only by the “Course Name” column, and the “marks” column was aggregated with the “SUM” function. Once this query is executed, the output has one row per course with the total marks for that course.
The result gives a more refined overview of the table, which shows that grouping sets provide a simplified version of the table to help us in our analysis.
Grouping Sets With the “ORDER BY” Command
We can make sets of the columns and get the desired simplified version of the table, but sometimes the result still looks disorganized. We can tidy it further with the “ORDER BY” command, which sorts the output by the attributes we specify. “ORDER BY” is written after the “GROUP BY” clause and is applied after the grouping. To better grasp the “ORDER BY” command with grouping sets, we will apply it to the previous example. As we successfully grouped the columns in the table in the last example, we will order the result by the Course name and the Marks attribute.
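Ordering by the course name can be sketched as follows. The table course_records and the columns roll_no, course_name, and marks are hypothetical names, and the grouping by roll number and course name is assumed to be the one carried over from the previous example.

```sql
-- Assumes course_records(roll_no, course_name, credit_hours, marks); names are hypothetical.
SELECT
    roll_no,
    course_name,
    SUM(marks) AS total_marks
FROM course_records
GROUP BY GROUPING SETS ((roll_no, course_name))
ORDER BY course_name;           -- ascending (alphabetical) order is the default
```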
Ordering by the course name places the output rows in alphabetical order of the course names.
We can also order the output in descending order of the Marks by writing this query:
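A sketch of the descending variant, again assuming the hypothetical course_records table and the alias total_marks for the summed marks:

```sql
-- Assumes course_records(roll_no, course_name, credit_hours, marks); names are hypothetical.
SELECT
    roll_no,
    course_name,
    SUM(marks) AS total_marks
FROM course_records
GROUP BY GROUPING SETS ((roll_no, course_name))
ORDER BY total_marks DESC;      -- highest total first
```

PostgreSQL allows “ORDER BY” to refer to an output column by its alias, which is why total_marks can be used here instead of repeating SUM(marks).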
In this tutorial, we learned about the concept of grouping sets in SQL. We discussed the advantage of using “GROUP BY” with “GROUPING SETS” over the “UNION ALL” command. After that, we implemented grouping sets in the PostgreSQL environment to gain a better understanding of this important concept in the Structured Query Language. Last, we looked at different ways of applying it by grouping on different columns, using a larger table, and ordering the output. We hope you found this article helpful. Check out other Linux Hint articles for more tips and information.