Fuzzy Lookup Add-in for Excel 2010

clean data If you work with data in Excel, you know what a mess it can be. I help my customers clean up data that they've imported from another computer system, or from reports received from another department or group.

Those files can be filled with spelling mistakes, strange abbreviations, extra spaces or missing punctuation. It can take you hours, or days, to match items between lists, and eliminate any duplicates. Your favourite Excel functions, like VLOOKUP or MATCH, can do some of the work, but fail in other areas.

Shaking your fist, or cursing at the data, doesn't help, as I'm sure you know. Fortunately, there is a new Excel add-in, from Microsoft's BI Labs, that looks promising.

Fuzzy Lookup Add-in for Excel 2010

The new add-in is named Fuzzy Lookup Add-in for Excel 2010, and it helps you find best matches, for items that aren't an exact match. The project description says:

It can be used to identify fuzzy duplicate rows within a single table or to fuzzy join similar rows between two different tables. The matching is robust to a wide variety of errors including spelling mistakes, abbreviations, synonyms, and added or missing data.

That sounds useful, so I downloaded the add-in, and installed it for testing.

Set Up the Fuzzy Lookup

When you open Excel 2010, after installing the Fuzzy Lookup add-in, you'll see a Fuzzy Lookup tab on the Ribbon, and it contains the Fuzzy Lookup command.

FuzzyLookup01

Follow these steps to set up the Fuzzy Lookup:

  1. The add-in works with formatted tables, so create one or more Excel tables in your workbook, if you haven't done so already.  The add-in download includes a sample file named Portfolio, which contains Excel tables.
  2. On the Ribbon, click the Fuzzy Lookup command, to open the Fuzzy Lookup pane.
  3. Select the left and right tables for the comparison (to identify matches in a single table, set the left and right tables to be the same)
  4. Columns with the same heading will be automatically joined. You can leave that join, or delete, and create new joins for other fields.
  5. For the Output columns, select the columns in each table that you want in the output table. Include the FuzzyLookup.Similarity column, to see how close the matches are.
  6. Leave the number of matches at the default of 1, or increase it to see more matches for each item in the left table.

FuzzyLookup02

Run the Fuzzy Lookup

After you set up the Fuzzy Lookup, follow these steps to run it:

Select a cell in an empty part of the workbook. The results  will start in the active cell, so make sure there is enough room, below and to the right of the selected cell.

Click the Go button, to output the results.

FuzzyLookup03

Configure the Fuzzy Lookup

You can change the configurations in the Fuzzy Lookup add-in, and create your own.

FuzzyLookup04

Test It Yourself

To do your own testing, you can download Fuzzy Lookup Add-in for Excel 2010. The download page has a couple of videos that you can watch, for an overview of how the add-in works.

____________

Related Posts Plugin for WordPress, Blogger...

Share and Enjoy

  • Facebook
  • Twitter
  • LinkedIn
  • Google Plus
  • Pinterest

17 comments to Fuzzy Lookup Add-in for Excel 2010

  • Jim Cone

    Hi Debra,
    Thanks for posting the information on Fuzzy Lookup.
    I was very curious about it when I ran across it yesterday,
    The MS website says it requires Windows 7 as well as XL2010.
    (both of which I don't have)

    Did you try it on Windows XP?
    Do you know if the code is available and what language is used?

  • Hi Jim,
    I was able to install Fuzzy Lookup on my XP machine, as well as the Win7 machine.
    Not sure about the code though -- you could post a question on the BI blog:
    http://blogs.msdn.com/b/business_intelligence_labs/archive/2011/04/27/fuzzy-lookup-add-in-for-excel.aspx

  • sam

    Works in Excel 2007 as well
    Thanks for the link Debra, at last a productivity tool from MS

  • Thanks Sam! Good to know that it works in Excel 2007 too.

  • Gerald Strever

    Hi Debra
    Thanks for advising us about this new add-in. Really very useful! Keep up the good work!
    Best regards
    Gerald

  • Pob

    Hi Debra,
    Do you know how this compares to some of the custom functions that have been floating around a few years e.g. at methodsinexcel or chandoo?
    Regards,
    Philip

  • @Philip, I haven't used those custom functions, so can't compare them.

  • Pob

    Shame. I'll have a play with this one in that case, and see what I think.
    Regards,
    Philip

  • Joe

    LOVE IT!!!! This add in just saved me hours of manual work. Can't wait to see the next version.

  • KJ

    VERY slow if you are comparing datasets of any size. For 10000+ records, be prepared for times in the hours.

  • allan

    hello all

    does anyone know if one can use the addin through vba. i tried to use the addin as a COM addin in VBA but i was unable to get this to work. does anyone know how one can use the addin and call it from vba?

  • I'm having the same issue as allan ... I need to be able to activate Fuzzy Lookup using Access VBA, and ideally set all of the parameters and run it so my user doesn't have to. I'm finding very little on the web re: this. Anyone have any ideas? I'm using Office 2010. Thanks!

  • Loren

    If this is the site where I complained that a worksheet was missing, Sorry, it was sheet 2
    which I swear was not there before.

  • Camden

    Doesn't work. When I created tables, and selected them int eh dropdown, the column names did not appear as selectable.

  • Vick

    I had the same problem as Camden. When i create tables, the column names do not appear - i still see the sample column headers from the "portfolio" example that was provided with the download of the add-in. Any ideas on how to fix this so that the columns that relate to the selected tables actually appear??

  • Vick

    Looks like you have to shut down excel and re-start in order to see the column names...

  • Ali Kazim

    Hi,

    First of all, I must appreciate smart people at MS who produced this add-in. It's really great add-in and can save hours data cleansing. I've got to report two issues:

    1) After comparing for few times... it stops responding that is, when I press 'go' nothing happens. Based upon the recommendation, I've uninstalled it and then installed it using the setup file. It worked for some time, but it stopped working again. Is it due to some bug in the code?

    I've noticed just before it stopped working, it started giving me peculiar results i.e. two not so same items are returned with very high similarity rating.

    I can certainly try uninstalling and installing again... and it might work... but I was just wondering if there is a way to avoid this hassle all together.

    2) Secondly, when matching rows in the left and right table, I would like the tool to use the row in the right table once... that is once it has been matched (to the best possible) with a row on the left, it shouldnt be reused and appear again infront of another left row. E.g:

    Left Right
    AA A A
    AAA AA

    Combined table after Fuzzy Lookup...
    AA AA
    AAA A A (or may be blank)

    I would really appreciate if any of the fine gentlemen here could answer my queries.

    Regards,

    Ali Kazim
    MS Office Professional Plus 2010
    MS Excel 2010 (14.0.6129.5000) 32-bit
    MS Windows 7 Professional - 64 bits
    Dell i7 1.73GHz, 8GB

Leave a Reply

  

  

  

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>