Clean Excel Data With TRIM and SUBSTITUTE
You have two Excel lists, and you’re trying to find the items that are in both lists. You know there are matching items, but your VLOOKUP formulas can’t find them. In this screenshot, the tiptech.html page is in both lists, but the VLOOKUP formula in cell C2 can’t find it.
Spot the Differences
Working with Excel data can be like one of those “Spot the Difference” puzzles. What’s different between list A and list B? If you’re lucky, the differences are obvious, like the forward slash in column B, and no leading slash in column E. Other times, it’s tougher to find the differences.
A common problem is items with leading or trailing spaces. You can’t see them on the screen, but after you’ve encountered them a few times, you learn to check for them. The LEN function is a great help if you suspect there are hidden space characters.
If you use the LEN function to compare the length of the text in cell B2 and E4, you’d see that there are 2 additional characters in cell B2. One character is the forward slash, and the other character is a trailing space.
Remove Leading and Trailing Spaces
If you want to use a VLOOKUP or MATCH to find column B items, in column E, you’ll have to get rid of any extra characters. First, you can deal with the spaces, by using the TRIM function.
To return the text from cell B2, without any leading or trailing characters, you’d use this formula:
=TRIM(B2)
If you use a formula in cell B12 to check the length of the trimmed text, it’s now 13 characters, instead of 14. The trailing space has been removed.
Remove a Specific Character
Next, you can use the SUBSTITUTE function to remove the forward slash from the text in cell B2.
=SUBSTITUTE(B2,”/”,””)
- The first argument, B2, is the cell that contains the text value.
- The second argument, “/”, is the old text, that you want to replace.
- The third argument, “”, is the new text, that replaces the old text. If you want to remove the old text, without inserting new text, use “” as an empty string, as we did here.
Combine the Functions
The TRIM and SUBSTITUTE functions work well separately, and you can combine them, to remove both the spaces and the forward slash. The order doesn’t matter, so you can use either:
=TRIM(SUBSTITUTE(B2,"/",""))
or:
=SUBSTITUTE(TRIM(B2),"/","")
Add to the VLOOKUP
Now that you know the TRIM and SUBSTITUTE functions will clean up the text in column B, you can add those functions to the VLOOKUP formula.
Instead of using B2 in the VLOOKUP:
=VLOOKUP(B2,$E$2:$F$8,2,FALSE)
use the TRIM and SUBSTITUTE functions:
=VLOOKUP(TRIM(SUBSTITUTE(B2,"/","")),$E$2:$F$8,2,FALSE)
A match for the cleaned up text is found in column E, and the Update column is filled in with the correct date.
Troubleshooting a VLOOKUP Formula
If TRIM and SUBSTITUTE don’t solve your VLOOKUP problems, there are a few more suggestions on the Contextures website: Troubleshoot the VLOOKUP formula. You’ll also find examples of using the IF function or IFERROR function to deal with VLOOKUP function errors.
And if that doesn’t work, you could try some Mr. Clean; according to this commercial from the 1950s, it works on everything, including the dog!
_____________




February 17th, 2010 at 1:23 am
In the "Combine the Functions" section, you wrote...
=TRIM(SUBSTITUTE(B2,"/",""))
or:
=SUBSTITUTE(TRIM(B2),"/","")
implying that these two formulas return the same result. And, of course, this is true for the data you showed us. However, so readers of your blog don't make the mistaken assumption that these two formulas always return the same results, it should be pointed out that, for other types of data in B2, the order could matter. For instance, if the there were one or more spaces in front of the slash **and** one or more spaces immediately after the slash (possibly followed by other text), the second formula will retain the spaces located after the slash whereas the first formula will not. When the TRIM function is applied matters for data such as I just proposed.
February 17th, 2010 at 7:53 am
In addition to TRIM, the CLEAN worksheet function can assist in sanitizing your data set. CLEAN is less common and honestly, I think I have only used CLEAN once when I imported some data that was loaded with lots of garbage characters.
February 17th, 2010 at 10:27 am
Hi Deb
Your readers may also come across that pesky non breaking space character, Char(160) which, whilst not visible as a space is not cleaned with TRIM.
SUBSTITUTE will remove it fine though
=SUBSTITUTE(B2,Char(160),"")
February 17th, 2010 at 12:03 pm
For a macro solution:
http://www.mvps.org/dmcritchie/excel/join.htm#trimall
works from a selection. By using a macro you fix the problem and don't have to keep reworking bad data with a formula.
February 17th, 2010 at 7:19 pm
Rick, Jason, Roger and David, thanks for adding your Excel data cleaning tips.
March 3rd, 2010 at 5:43 am
Hello,
Can someone help with this:
Removing space(s)-regular and non regular, in an excel text or number. For example
1) 123 5697
2) 23145 5
3) 1 98742
4) 569 49 12
Thank you
March 3rd, 2010 at 6:15 am
Hi Oladapo
You can just use the Substitute function
=Substitute(A1," ","")
July 15th, 2010 at 1:44 am
Excellent resource and examples, especially intuitive for those seeking to combine functions.