Data comparison in SQL is a common task that every database developer will come across occasionally. Luckily, data comparison comes in a wide variety of formats such as literal comparison, Boolean comparison, etc.
However, one of the real-world data comparison scenarios that you might encounter is comparison between two tables. It plays a crucial role in tasks such as data validation, error identification, duplication, or ensuring data integrity.
In this tutorial, we will explore all the various methods and techniques that we can employ to compare two database tables in SQL.
Sample Data Setup
Before we dive into each of the methods, let us set up a basic data setup for demonstration purposes.
We have two tables with sample data as shown in the example.
Sample Table 1:
The following contains the queries for creating the first table and inserting the sample data in the table:
employee_id INT PRIMARY KEY AUTO_INCREMENT,
first_name VARCHAR(50),
last_name VARCHAR(50),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
INSERT INTO sample_tb1 (first_name, last_name, department, salary)
VALUES
('Penelope', 'Chase', 'HR', 55000.00),
('Matthew', 'Cage', 'IT', 60000.00),
('Jeniffer', 'Davis', 'Finance', 50000.00),
('Kirsten', 'Fawcet', 'IT', 62000.00),
('Cameron', 'costner', 'Finance', 48000.00);
This should create a new table called “sample_tb1” with various information such as names, department, and salary.
The resulting table is as follows:
Sample Table 2:
Let us proceed and create two sample tables. Assume that this is a backup copy of the first table. We can create the table and insert a sample data as shown in the following:
employee_id INT PRIMARY KEY AUTO_INCREMENT,
first_name VARCHAR(50),
last_name VARCHAR(50),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
INSERT INTO sample_tb2 (first_name, last_name, department, salary)
VALUES
('Penelope', 'Chase', 'HR', 55000.00),
('Matthew', 'Cage', 'IT', 60000.00),
('Jeniffer', 'Davis', 'Finance', 50000.00),
('Kirsten', 'Fawcet', 'IT', 62000.00),
('Audrey', 'Dean', 'Finance', 48000.00);
This should create a table and insert the sample data as specified in the previous query. The resulting table is as follows:
Compare Two Tables Using Except
One of the most common ways of comparing two tables in SQL is using the EXCEPT operator. This finds the rows that exists in the first table but not in the second table.
We can use it to perform a comparison with the sample tables as follows:
FROM sample_tb1
EXCEPT
SELECT *
FROM sample_tb2;
In this example, the EXCEPT operator returns all distinct rows from the first query (sample_tb1) that do not appear in the second query (sample_tb2).
Compare Two Tables Using Union
The second method that we can use is the UNION operator in conjunction with the GROUP BY clause. This helps to identify the records that exist in one table, not in the other, while preserving the duplicate records.
Take the query that is demonstrated in the following:
employee_id,
first_name,
last_name,
department,
salary
FROM
(
SELECT
employee_id,
first_name,
last_name,
department,
salary
FROM
sample_tb1
UNION ALL
SELECT
employee_id,
first_name,
last_name,
department,
salary
FROM
sample_tb2
) AS combined_data
GROUP BY
employee_id,
first_name,
last_name,
department,
salary
HAVING
COUNT(*) = 1;
In the given example, we use the UNION ALL operator to combine the data from both tables while keeping the duplicates.
We then use the GROUP BY clause to group the combined data by all the columns. Finally, we use the HAVING clause to ensure that only the records with a count of one (no duplicates) are selected.
Output:
This method is a little more complex but it provides a much better insight as you get the actual data that is missing from both tables.
Compare Two Tables Using INNER JOIN
If you have been thinking, why not use an INNER JOIN? You would be on point. We can use an INNER JOIN to compare the tables and find the common records.
Take the following query for example:
sample_tb1.*
FROM
sample_tb1
INNER JOIN sample_tb2 ON
sample_tb1.employee_id = sample_tb2.employee_id;
In this example, we use an SQL INNER JOIN to find the records that exists in both tables based on a given column. Although this works, it can sometimes be misleading as you are not sure whether the data is actually missing or present in both tables or just in one.
Conclusion
In this tutorial, we learned about all the methods and techniques that we can employ to compare two tables in SQL.