Finding and removing duplicate data in Excel

Posted April 8, 2015

Finding and removing duplicate data in Excel can be a quick process. However, for many people who haven’t learnt how to do this the quick way in Excel, it can be an absolute chore.

One of the many things I love about my job is that I get to work with so many different people from all kinds of organisations. However, it doesn’t matter what industry or region I’m working in the challenges of working with large lists of data are pretty much the same.

Something I see often is the need to be able to quickly locate and remove duplicate rows of data (excellent for VLookups).  Most people just sort the data and then manually skim through the excessive lines of information trying to weed out the annoying duplicates.  This works, but boy-oh-boy it can be a time waster. Wouldn’t it be easier to have Excel quickly display the duplicates for you?  Then all you need to do is deal with them.

In this post I’m going to walk you through how to quickly locate the duplicates and then, once located, super quick ways in which to remove the duplicates.

The steps below cover:

Locating duplicates in a single column of data

The worksheet below holds data that has been exported from an in-house system.  Unfortunately it has a number of duplicate invoice numbers in the ‘Invoice #’ column.  Using the ‘Conditional Formatting’ feature I can apply formatting to any cells with duplicate content.  All other cells will remain as they are.  This is an excellent way to very quickly identify duplicates.

1 – Select the column that holds the duplicate data.  In this case it’s the ‘Invoice #’ column.

Duplicates in Excel

2 – From the Home tab, in the Styles group click Conditional Formatting.  A drop down menu is displayed.  Select Highlight Cell Rules from the menu.

3 – Select Duplicate Values to quickly identify duplicate or unique records.

Duplicate Excel

4 – The dialog box will display the condition to be matched.  In the example below the condition is ‘Duplicate’.  This setting will ensure any duplicate cells within the selected column will have the formatting applied to them.  The option box on the right displays the format that will be applied if the condition is met.  There are a range of pre-set formats available on the drop-down arrow, or if you would prefer, you can create your own by selecting Custom Format from the drop-down list.

Highlight duplicates in Excel

5 – Select your format options and then click OK.  All duplicates will now be formatted and easily identified.

Excel highlight duplicates

Note: You can also use the Quick Analysis tool (Excel 2013 and 2016 only) to quickly apply Conditional Formatting using default formats.  Just select the data range and then click the Quick Analysis button which will be displayed beside the selected range.  Select ‘Duplicate Values’ option.

Excel delete duplicate rows

Locating rows where the content of all cells is duplicated

In the steps above we located rows where there were only duplicate invoice numbers.  But what if you want to find duplicates where ALL of the data in a row is exactly the same as another?  For example, if you were trying to locate duplicates where the invoice number, date, customer name, product name and cost price where all exactly the same.

Here’s a quick way to do this.

1 – The first thing to do is go to an empty cell to the right of the existing data.  Using the CONCATENATE function I’m going to pull all of the data content into this one cell.

Excel duplicate formula

2 – In the Function Arguments dialog box enter each individual cell for the row into the Text boxes.  Note, as you enter a cell reference into one text box another will appear.  You can have up to 255 references within the function.  Once you have added all of the cells on the row click OK.  The content of all of the cells will be pulled into the one cell.

Delete duplicate rows in Excel

3 – Copy the function down the column and then apply to the same range of data conditional formatting to find the duplicates.  Only the cells where all cell content is the same will be formatted.

Delete duplicates in Excel

Removing duplicate rows of data

Once you have located the duplicates in most cases you will be wanting to remove them.  I see two different methods used again and again.  I’ll outline both of them.  You find your favourite.

Using Filter. If your conditional format applied a coloured background or changed the colour of the cell font you can use the Filter command to quickly filter and display only the duplicates.

1 – Select any cell in the data list and then from the Data tab click Filter.

2 – Click the filter drop-down arrow on the column that is currently displaying duplicate formatting.  From the drop-down list select Filter by color and then select the cell colour or font colour depending on your conditional formats.

Excel delete duplicates

3 – Excel will filter out the unique records and only display the duplicates.

4 – Now delete the duplicate rows.  Tip: to do this quickly hold down your CTRL key and click each of the duplicate row numbers.  Once they are selected from the Home tab click Delete, Delete Sheet Rows.  All selected rows will be deleted.

How to identify duplicates in Excel
5 – You will have now removed the duplicates and will only have the unique records remaining.  All formatting should now be removed as you no longer have duplicate values.

Using the ‘Remove Duplicates’ button you can have Excel quickly strip out any rows of data where duplicates are found.

1 – Select any cell in the data list and then from the Data tab select Remove Duplicates.

2 – Select one or more columns that contain duplicate values.

Excel remove duplicate rows

3 – Click OK.  Excel will delete any rows where duplicate values are found in the selected columns.  Please be aware that Excel will indicate how many records it has found with duplicates.  However you aren’t shown which these are prior to their removal.  If you are wanting to see them prior to removal apply conditional formatting first.

Count duplicates in Excel

Conclusion

There is no need to spend large amounts of time locating duplicates.  Using the steps above you should be able to identify and deal with duplicates swiftly.

Was this blog helpful? Please let us know in the Comments below.

Excel removing duplicates

Finding and removing duplicate data in Excel

Wanting to stop manually skimming through excessive lines of information to weed out duplicates in Excel? Follow this step by step guide to learn how to locate and remove duplicates in your Excel worksheet.

Instructions

Locating duplicates in a single column of data using Conditional Formatting

  1. Select the column that holds the duplicate data.
  2. From the Home tab, in the Styles group click Conditional Formatting. A drop down menu is displayed.
  3. Select Highlight Cell Rules from the menu.
  4. Select Duplicate Values to quickly identify duplicate or unique records.
  5. Leave the option 'Duplicate' in the condition to be matched and select the format you would like from the drop down list.
  6. Click OK. Duplicate data will be displayed with the selected format.

Locating rows where the content of all cells is duplicated

  1. Select an empty cell to the right of the existing data.
  2. On the Formulas tab, Select Text, then CONCATENATE.
  3. In the Function Arguments dialog box enter each individual cell for the row into the Text boxes.
  4. Once you have added all of the cells on the row click OK.
  5. Copy the function down the column and then apply Conditional Formatting to the same range of data to find the duplicates. Only the cells where all cell content is the same will be formatted.

Removing duplicate rows of data

Using Filter

  1. Select any cell in the data list and then from the Data tab click Filter.
  2. Click the filter drop-down arrow on the column that is currently displaying duplicate formatting.
  3. From the drop-down list select Filter by Color and then select the cell colour or font colour depending on your conditional formats.
  4. Delete the duplicate rows.

Using the ‘Remove Duplicates’ button

  1. Select any cell in the data list and then from the Data tab select Remove Duplicates.
  2. Select one or more columns that contain duplicate values.
  3. Click OK.

Notes

Tip: to delete rows quickly hold down your CTRL key and click each of the duplicate row numbers. Once they are selected from the Home tab click Delete, Delete Sheet Rows. All selected rows will be deleted.

Did you try this?

If you found this blog helpful please leave a comment below.

If you enjoyed this post check out the related posts below.

Elevate your Excel game and become a pro with our exclusive Insider Group

Be the first to know about new tutorials, videos, and tips for Microsoft 365 products. Join us now and claim your exclusive bonus, your list of Essential Excel Skills to become proficient in Excel!

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}
Skip to Instructions
>