30 April 2026

Finding duplicate records in SQL Server

 Finding duplicate records in SQL Server

1. The GROUP BY and HAVING Method

This is the most standard approach. It is best used when you only need to identify which values are duplicated and how many times they appear.

SQL
SELECT Email, COUNT(*) as Count
FROM Employee
GROUP BY Email
HAVING COUNT(*) > 1;

2. Using a Common Table Expression (CTE)

A CTE makes the ROW_NUMBER logic much cleaner and is the standard way to delete duplicates. Since a CTE is a "live" view of the data, deleting from the CTE deletes from the underlying table.

SQL
WITH DuplicateCTE AS (
    SELECT Email, 
           ROW_NUMBER() OVER(PARTITION BY Email ORDER BY EmployeeID) as RowNum
    FROM Employee
)
-- To view duplicates:
SELECT * FROM DuplicateCTE WHERE RowNum > 1;

-- To delete duplicates:
-- DELETE FROM DuplicateCTE WHERE RowNum > 1;


3. Using Dense Rank

WITH RankedDuplicates AS (
SELECT EmployeeID, Email, DENSE_RANK() OVER (ORDER BY Email) as GroupID, COUNT(*) OVER (PARTITION BY Email) as OccurrenceCount FROM Employee ) SELECT * FROM RankedDuplicates WHERE OccurrenceCount > 1 ORDER BY GroupID;

No comments:

Post a Comment

Comments Welcome

Finding duplicate records in SQL Server

 Finding duplicate records in SQL Server 1. The GROUP BY and HAVING Method This is the most standard approach. It is best used when you on...