
March 6, 2025 07:04 by
Peter
Using ROW_NUMBER() and PARTITION BY (Preferred Approach)
The ROW_NUMBER() function assigns a unique row number to each record within a partition (group). We can use this to identify and delete duplicates while keeping only the required data.
Query Syntax
WITH CTE AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY id) AS row_num
FROM table_name
WHERE condition -- Apply filtering condition here
)
DELETE FROM CTE
WHERE row_num > 1;
Example
Consider a Customer table with duplicate entries based on Email.
Removing Duplicates While Keeping the First Entry.
;WITH CTE AS (
SELECT ID
FROM (
SELECT ID,
ROW_NUMBER() OVER (PARTITION BY NAME, Email, City ORDER BY ID) AS RN
FROM Customers
WHERE City = 'NY' -- Only NY state filtering condition
) AS sub
WHERE RN > 1
)
DELETE FROM Customers
WHERE ID IN (SELECT ID FROM CTE);
Explanation of the Query
Identifies Duplicates
- ROW_NUMBER() OVER (PARTITION BY Name, Email, City ORDER BY ID)
- Assign a row number (RN) for each duplicate group, keeping the first record (RN = 1).
Filters Out Duplicates (RN > 1): Only marks duplicate records where City = 'NY'.
Deletes Duplicate Records: Deletes only IDs from the CTE that have RN > 1
This syntax will be useful when we are joining more tables and deleting duplicates from one specific table.
HostForLIFE.eu SQL Server 2022 Hosting
HostForLIFE.eu is European Windows Hosting Provider which focuses on Windows Platform only. We deliver on-demand hosting solutions including Shared hosting, Reseller Hosting, Cloud Hosting, Dedicated Servers, and IT as a Service for companies of all sizes.
