Alteryx Tools: Data Cleansing

by Peter Silvester

Its day 3 of our work with Alteryx at the Data School and I’m delighted to report that we have found the Data Cleansing tool. A relatively recent addition to Alteryx this handy tool is a quick way to perform simple data cleansing tasks without the need for using regex or a formula. As someone who hasn’t used regex before the introduction to it as an Alteryx tool I found this to be a much more intuitive way to clean up the data in my workflow. Although the uses for Data Cleansing are more limited than the possibilities of regex it is often an extremely useful tool for simple tasks.

 

Dealing with Nulls

The appeal of this tool is the simplicity of it all. There are a number of ways that you can get rid of the nulls in Alteryx but none are as easy as dragging in the tool and checking the box. Null values will be dealt with differently depending on whether the field type is string or numeric. String fields will be made blank and 0 will be entering into the numeric ones.

 

Unwanted Characters

The next section of the tool deals with unwanted characters within fields including removing whitespace, letters, numbers or punctuation. This is great for getting rid of any rogue spaces in the data or fixing currency fields that contain the symbol and a comma as a thousand separator.

 

Change the Case

The final part of the tool is for modifying the case of your fields. Fields can be set to all uppercase or lowercase or set to title case which will set the first letter of each word to uppercase. This is useful when you are joining data and need to quickly make sure that the fields that you are joining on are matched to the same case.

 

Edit the Macro

The Data Cleansing tool is a macro built of other tools within Alteryx so you can see what is going on behind the scenes. For those at a more advanced level that are doing specific tasks regularly can edit the macro to fit your needs perfectly.

 

data-cleansing-tool

Avatar

Peter Silvester