SQL Server administration and T-SQL development, Web Programming with ASP.NET, HTML5 and Javascript, Windows Phone 8 app development, SAP Smartforms and ABAP Programming, Windows 7, Visual Studio and MS Office software Kodyaz Development Resources
Development resources, articles, tutorials, samples, codes and tools for .Net, SQL Server, Vista, etc.




download SQL Server 2014



T-SQL ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) to Delete Duplicate Rows in SQL Table


If you have duplicate rows in your MS SQL Server database tables, you may want to delete duplicate records.
T-SQL Row_Number() function can help sql developers to solve this sql problem.
Before removing some of the rows in the SQL database table, you should decide which one of the row duplicates you will keep. And you should decide which rows to delete.

You might want to keep the first entry, which can be assumed as the inserted date is earliest.
Or if your sql table has an identity column, let's say an Id column with data type integer (int).
You may want to keep the duplicate row with the smallest Id value, or with the biggest Id value.
So SQL Row_Number enables tsql developers rank records and table rows over desired columns and table fields even enables partitioning the result set based on table columns.





Before giving an sql example for deleting duplicate rows, let's create sql table and populate table with sample duplicate rows.

CREATE TABLE Suppliers
(
  Id int identity (1,1),
  CompanyTitle nvarchar(1000),
  ContactName nvarchar(100),
  LastContactDate datetime
)

Now insert sample data into Suppliers table for removing duplicate records example.

INSERT Suppliers VALUES (N'Melody Music Instruments',N'James Manning', '20090623 10:15')
INSERT Suppliers VALUES (N'Blue Jazz',N'Mike Clark', '20090720 15:40')
INSERT Suppliers VALUES (N'Top Music',N'Katy Swan', '20090827 18:00')
INSERT Suppliers VALUES (N'Blue Jazz',N'Mike Clark', '20090806 10:00')
INSERT Suppliers VALUES (N'Melody Music Instruments',N'James Brown', '20080121 11:20')
INSERT Suppliers VALUES (N'Top Music',N'Katy Perry', '20090825 14:00')
INSERT Suppliers VALUES (N'Top Music',N'Katy Perry', '20090825 14:00')

Order Rows by Datetime Fields

Let's order table rows according to Last Contact Date within each Company record.

SELECT
  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY LastContactDate DESC),
  Id, CompanyTitle, ContactName, LastContactDate
FROM Suppliers

Now we can develop t-sql code to delete rows where RN is different than 1, which we assume duplicates since they are old records. We have a newer row in the sql table Suppliers.
This sql script uses the T-SQL Row_Number() function with Partition By option.
T-SQL developers will soon realize that they can use Partition Over clause in many cases to solve their problems.

WITH [CTE DUPLICATE] AS
(
SELECT
  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY LastContactDate DESC),
  Id, CompanyTitle, ContactName, LastContactDate
FROM Suppliers
)
DELETE FROM [CTE DUPLICATE] WHERE RN > 1

Order Rows by Identity Field

For this example code, we will order sql table rows according to the Identity field column Id within each Company record.

SELECT
  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY Id DESC),
  Id, CompanyTitle, ContactName, LastContactDate
FROM Suppliers

And within each group that is formed by the Company field, we will only keep the row with biggest Identity field value (Id column value), and remove others by deleting them.

WITH [CTE DUPLICATE] AS
(
SELECT
  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY Id DESC),
  Id, CompanyTitle, ContactName, LastContactDate
FROM Suppliers
)
DELETE FROM [CTE DUPLICATE] WHERE RN > 1

Summary - Delete Duplicate Rows in SQL Table

It is very easy to identify duplicate rows and delete duplicates due to your requirements by using a partition over clause, ROW_NUMBER() OVER (PARTITION BY columnname1, columnname2 ORDER BY columnname3 DESC) and using it within a CTE t-sql statement as shown in the above t-sql examples.
I hope you enjoy the sql examples and find them useful.


Additional Tutorials for Dublicate Rows

If you have identical rows, I mean all column values are identical, how can we identify the duplicate rows and delete duplicate records from database table using t-sql commands. In this case there is still methods for sql administrators and t-sql developers to delete duplicate rows in sql tables. You can find an other sql tutorial titled How to delete duplicate records or rows among identical rows in a table where no primary key exists among Kodyaz SQL articles.






Follow Kodyaz on Twitter

Related SQL Resources

SQL Server Articles

SQL Server 2012

SQL Server Tools

SQL Blog

SQL Server 2008 Blog

Certification Exams Blog

Reporting Services Blog

Analysis Services Blog

MS SQL Server Forums



Free Exam Vouchers









Copyright © 2004 - 2014 Eralper Yilmaz. All rights reserved.
Community Server by Telligent Systems