Select or delete duplicate rows from a table

When it comes to database, there are various ways you can select or delete the duplicate rows from a table.

Lets first focus on selecting the duplicate rows:

This can be done in various ways:

Solution 1:

select columnName, count(columnName) as cnt
from tableName
group by columnName
where cnt>1

//columnName is the column that you are checking for duplicates

Solution 2:

select columnName , count(columnName ) as cnt
from tableName
group by columnName
order by cnt desc

Solution 3:

select columnName
from tableName
group by columnName
having count(*) > 1

Now lets see how can we delete the duplicate rows from a table:

Solution 1:

//In this solution we create a new table, with only distinct rows and then
deleting the old table and changing the new table name to the old table
name. However, in this way the primary key gets dropped from the table:

Step 1:

create table new_table as
select * from old_table where 1 group by [column to remove duplicates by];

Step 2: delete delete the old table

drop table old_table;

Step 3: rename the new_table to the name of the old_table

rename table new_table to old_table;

Solution 2:

// This solution doesn't require what all we did and restores
the primary keys

ALTER IGNORE TABLE tableName ADD UNIQUE INDEX(columnName);

If you wish to delete duplicates based on more than one index do:

ALTER IGNORE TABLE tableName ADD UNIQUE INDEX(columnName1,columnName2);
Tags: , , , ,

Leave a Reply

Your email address will not be published. Required fields are marked *

*
*