'Finding duplicate rows in table with many columns

I have a table with 122 columns and ~200K rows. There are duplicate rows in this table. What query can I use to return these duplicate rows? Normally I would GROUP BY all rows and COUNT, but with 122 columns, this becomes unwieldy.

Basically, I'm looking for a query that does this:

SELECT *, COUNT(*) AS NoOfOccurrences
FROM TableName
GROUP BY *
HAVING COUNT(*) > 1
sql


Solution 1:[1]

If you are using SSMS you can right-click on the table and pick "Select Top 1000 rows..." - SSMS will generate the select query for you - then all you have to do is add GROUP BY then copy the column list and paste it after that. Add the HAVING COUNT(*) > 1 and the COUNT(*) AS NoOfOccurrences and run.

I suggest that you include an ORDER BY clause as well so that the duplicated rows are displayed together in your results.

If you are not using SSMS then you can run this dynamic SQL

-- Get the list of columns
-- There are loads of ways of doing this
declare @colList nvarchar(MAX) = '';
 
select  @colList = @colList + c.[name] +','
from sys.columns c
join sys.tables t on c.object_id =t.object_id
where t.[name] = 'tblFees';

-- remove the trailing comma
set @colList = LEFT(@colList,LEN(@colList)-1);

-- generate dynamic SQL
declare @sql1 nvarchar(max) = 'SELECT *, COUNT(*) AS NoOfOccurrences FROM TableName GROUP BY '
declare @sql2 nvarchar(max) = ' HAVING COUNT(*) > 1'
declare @sql nvarchar(max) = CONCAT(@sql1,@colList, @sql2)
--print @sql

-- run the SQL
exec sp_executesql @sql

For other ways of generating comma separated lists see Converting row values in a table to a single concatenated string

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 CHill60