Cleaning up data imported into Microsoft Excel

By | July 25, 2014

ExcelMany MS Excel users find that they have to work with data originating in other applications, such as database, payroll and accountancy, websites and content management systems. Often problems arise when the data is inserted into MS Excel, resulting in difficulties working with the new spreadsheet.

One solution I’ve found extremely useful is to ‘clean’ the data being imported; stripping away any unwanted hidden characters included in the imported data. An example of this is the inclusion of a non-breaking space at the end of a value imported from a website. On a website, or MS Word document, a non-breaking space is used to prevent text wrapping to the next line. When this character is included at the end of a value Excel stores the value as text, not numeric. So, the value 456.99 imported from a website may actually be 456.99° in the MS Excel cell. The ° is thenon-breaking space. (The best way to determine if your data contains hidden characters is to paste your data into a MS Word document and turn on Show/Hide {¶} in the Paragraph group on the Home tab).

MS Excel has a number of Text category functions that may be used to clean up imported data. You can also use the Find and Replace feature in Excel to remove unwanted content (I often paste my data into a MS Word document and use Word’s Find and Replace feature, as it is more versatile than Excel’s own Find and Replace).

Find and Replace

Replacing unwanted characters and symbols with a correct substitution is often all that is needed to make imported data readable to MS Excel. In the example below the imported numeric values have a hyphen () instead of a decimal point (MS Excel does not view this data as a valid number). By replacing the hyphen with a decimal you can make the data recognisable as numeric, allowing you to format the data and perform calculations on it.

Imported Data Find What (hyphen) Replace With (decimal) Result
456-99 . 456.99

PROPER, TRIM and CLEAN Functions

Excel’s CLEAN function removes non-printing characters from your data. The PROPER function returns a leading capital letter on each word. The TRIM removes unwanted spaces between words. The example below shows imported data which contains multiple spaces, upper case words, and missing capitals. The nested text functions correct use of upper and lower case, remove non-printing characters, and remove multiple spaces.

Imported data Function Function Result
HIRE a     car in LONDON    heathrow =(PROPER(CLEAN(TRIM(A1))) Hire A Car In London Heathrow


View our latest Excel course dates here.




Leave a Reply

Your email address will not be published. Required fields are marked *