Data analysis brings huge value to any business, however, the data itself needs to be in a rather “clean” state for analysis to be possible. Unfortunately, often the data that businesses have compiled is in a less than perfect state. Because of this, data preparation is a prerequisite for data analysis. There are a few ways to clean and prepare data for analysis. I will focus on the ways Tableau can help clean and prepare data.
When doing data preparation with Tableau, all the options are accessed in the Data Source Page where your data can be previewed. After locating the column, you would like to change, there are two ways to access the menu. Either right click on the column itself or click on the small downward pointing arrow at the right side in the column. Here are useful options:
Rename and Hide Fields
You can easily hide a column of data that you will not be using in your analysis as well as change the name of a column. Easily accessible in the drop-down menu. These are very simple but come in handy quite a lot. I frequently hide dimensions while working in Tableau as it keeps my data clean. The best part is you can unhide and hide again as you work.
Change data type
Tableau is very good at identifying what data type it’s looking at but sometimes we might want to change the data type. This happens often with dates, for example, we have the years as 2015, 2016 etc and Tableau thinks it’s looking at numbers. Another frequent issue involves ID numbers being shown as an integer type, which should be converted to string type. Data types can be changed by clicking on the small icon on the left in the column header in the Data Source Page. Another way to do it is in the Data Pane by clicking on the same data type icon.
Tableau likes data that is tall, meaning more rows and fewer columns. Wide tables with many columns are harder to work with and require more computing power. Wide, heavy tables combined with a big data set can significantly hinder computing speed. The pivot tables function helps in creating these tall tables by transforming columns into rows. It is also possible to undo or reverse pivot the data into a wide table.
The option is accessible in the Data Source Page after two or more columns are selected in the same way as previously explained, right clicking on a column or clicking on the small downwards arrow.
Merge mismatched fields
The Merge Mismatched Fields is useful when we have the same data split up into multiple columns. This may occur when we union data from different sheets, each containing data for different time periods. In the example shown we have a union of spreadsheets with data from 3 years, the data need to be of the same type.
To remedy this, we can use the “Merge Mismatched Fields” option. The feature is accessible by selecting the same type fields, right click and pick the Merge option.
Splitting fields help to clean up a data fields that are made difficult to analyze because of unnecessary symbols in it. Sometimes two different types of information are combined in the same column. Issues like this can be solved by splitting a single column into 2 or more columns. Our data might have rows looking like this:
In cases like these, we might want to split on the underscore sign, creating 2 fields.
The way to do this is straightforward. Click on the column to get to the menu where the split options are “Split” and “Custom Split”. Either option opens another box asking us to specify what delimiters we want to use. Just type in the delimiter and specify the number of new columns to be created. It’s that easy!
Sometimes we might want to change what is inside the data itself. A simple example is when we have geographic information showing country or city names shortened or abbreviated. If we want the names shown differently we can change this manually if necessary.
This is a quick summary of the data preparation capabilities of Tableau, I hope this will come in handy to