When working with databases, you often encounter instances where you need to find duplicate rows in a given database table. Finding duplicates can allow you to determine the ratio of identical values to all the values in the table.
You can then use the information gathered to act, such as removing duplicate values or moving them to a temporary table, etc.
In this post, we will learn how we can find duplicate values in an Oracle table using Oracle aggregate or analytic functions.
Setting Up Sample Table and Data
Let us setup a sample data for illustration purposes to best understand how to use the Oracle functions to find duplicate values.
If you have an existing table you wish to use, you can skip this section. Otherwise, run the statement below to create a sample table:
(
id number not null,
customer varchar2(50) not null,
order_date date,
product_id number,
quantity number,
price number,
constraint order_info_pk primary key (id)
);
Next, add sample data with duplicate values as shown:
values (1, 'Beatriz', DATE '2022-10-10', 4500, 45, 56);
insert into orders_information(id, customer, order_date, product_id, quantity, price)
values (2, 'Ali', DATE '2022-10-10', 5400, 65, 109);
insert into orders_information(id, customer, order_date, product_id, quantity, price)
values (3, 'Gabriel', DATE '2022-10-11', 6400, 405, 586);
insert into orders_information(id, customer, order_date, product_id, quantity, price)
values (4, 'Beatriz', DATE '2022-10-11', 5800, 55, 550);
insert into orders_information(id, customer, order_date, product_id, quantity, price)
values (5, 'Beatriz', DATE '2022-10-12', 4506, 46, 700);
insert into orders_information(id, customer, order_date, product_id, quantity, price)
values (6, 'Gabriel', DATE '2022-10-11', 9001, 450, 5600);
insert into orders_information(id, customer, order_date, product_id, quantity, price)
values (7, 'Beatriz', DATE '2022-10-10', 4500, 45, 56);
insert into orders_information(id, customer, order_date, product_id, quantity, price)
values (8, 'Ali', DATE '2022-10-10', 5400, 65, 109);
Once the table and sample data are ready, we can learn how to find the duplicate rows.
Oracle Find Duplicate Rows – Analytic Function
In oracle, we can use the count() function in conjunction with over and partition by clause to determine the number of duplicate records in a given table.
Take the example statement shown below.
from orders_information orders;
The query above should return the row and the number of times it occurs in the table as shown:
To get only the records that appear more than once, we can use a where clause as:
from orders_information orders)
select *
from item_count
where number_of_occurence > 1;
Output:
Oracle Find Duplicate Values – Aggregate Function
Another method we can use to find duplicate records in a table is the GROUP BY clause. We can then determine the number of times a given record occurs in each group using the count() function. If a row appears more than once, it is a duplicate.
An example is as shown:
from orders_information
where (customer, order_date) in
(select customer, order_date from orders_information group by customer, order_date having count(*) > 1)
order by customer, order_date;
The resulting information is as shown:
As we can see, the query allows us to get duplicate rows from the target table.
Conclusion
In this tutorial, you discovered two main methods of determining duplicate rows from a given database.