Microsoft Excel - Excel - Remove duplicate data from different sheets in an excel file.

Asked By Subash D on 12-Nov-14 08:46 AM
I have a database file in Excel with 5 sheets. Each sheet contains specific details of the customers. There seems to be some duplication of customer details in more than One sheet. 

How do I delete such duplication of data leaving the data in only one sheet.
Harry Boughen replied to Subash D on 12-Nov-14 10:58 PM
Hello Subash D

This really is a bit too general.  Are you talking about multiple columns containing data and if so is the whole row a duplicate?

If so, then one approach would be to write a vba macro to search  for duplicates of page1 in the others and delete the matching rows and then repeat for Page2 etc.  When all duplicate rows are removed, move all of the remaining data to a single page.

Another way to do it manually would be to move all of the data to one page and then use the Remove Duplicates function on the Data Tab.

Hope this helps but give more detail (and example data) if it doesn't.

Harry
Subash D replied to Harry Boughen on 13-Nov-14 02:52 AM
Hi Harry,

Thanks for your help.

I have a database of our customers in different sheets in an excel file. Each Sheet runs to around 50000 - 60000 customer's details. The details of the customers would be their names, contact numbers (Mobile No.), Email Id, their interest in a specific product and other details.

Each sheet in the excel is product specific like a particular customer would have shown interest in more than one or two products. Hence his details would be repeated in more than one sheet. 

I am trying to consolidate these details. Since the number of customers details are more, it becomes difficult to club all the customers in a single sheet and remove the duplicates.

Instead I was wondering whether we can delete the duplicates in different sheets, assuming that the names and other details might be similar and removing duplicates based on the name or other details would end up deleting different customers with similar name. Hence was looking to delete duplicates from these sheets keeping the Mail Id as a constant for deleting duplicates. 

Can someone help me with a VBA Code or is there any other procedure which would he help me.

Subash


Harry Boughen replied to Subash D on 13-Nov-14 06:24 AM
Hello Subash,

Without a sample fo your data, it is extremely hard to decide what sort of approach would most likely work for you.

Is it possible for you to upload a cutdown (maybe 100 rows per sheet) and sanitised (sensitive data changed) version of your spreadsheet  together with an indication of the resultant output that you require?

From what you have said, I am guessing that you would want to find unique MailId and extract relevant matching company information to a new sheet without concern for the product and other details.

You refer to the file as a database and, if it is properly set up, it might be possible to use SQL to extract the data that you want.  Otherwise it will likely require a search and extract VBA code module.  Unless, of course you have a complete list of the unique MailId values which might allow a formulaic approach.

So, ultimately, we need to have a much better idea of what you have and exactly what you want as an end result.

Regards

Harry
Subash D replied to Harry Boughen on 15-Nov-14 08:14 AM
Hi Harry,

Find attached a sample file.

A specific Mail id has been repeated in all the three sheets. I would like to know if there is any possibility to remove the entire row containing a specific Mail Id, leaving the details in only one sheet.

Subash
Dummy1.zip
Harry Boughen replied to Subash D on 15-Nov-14 07:32 PM
Hello Subash,

The macro <test> should do what you want.

Dummy.zip

Regards

Harry
alex smith replied to Harry Boughen on 05-Dec-14 01:33 AM
Here is vba code that may help you

http://www.ozgrid.com/VBA/RemoveDuplicates.htm