Today we spent all day performing data prep. It was really fun, leading to some frustrations and interesting language by both us and the coaches, but ultimately was very satisfying compiling lovely, clean data sets. We used both Alteryx and Tableau Prep, often attempting to repeat the same cleaning steps in each – to varying successes! The ‘Data Interpreter’ tool within Tableau Prep is pure magic!!

 

Cross Tab

Cross tabbing data can be really useful when performing data prep. It is similar to transposing or pivoting, but takes the data from columns and moves it into rows. Whilst this can be done in Excel (by ‘reverse transposing’ so to speak), in Alteryx the Cross Tab tool allows the data to be ‘grouped’ before it’s cross tabbed. This splits it into columns whilst the data is moved from columns to rows.

This is especially useful if you have, for example, one column of dog breeds and count in another column, but you’d like the breeds to each be a column to help when visualising it. The cross tab tool would group by dog breed and then add the count that was in a column, to the relevant rows under each breed.

 

During our data prep today we used both Alteryx and Tableau Prep….but soon realised you can’t directly cross tab within Tableau Prep.

But, there is a work around that allows you to do this in just a few steps as detailed below.

 

Tableau Prep

I was using Tableau Prep to clean Cancer Waiting Times data, downloaded from NHS England. The data was in multiple sheets and involved a fair amount of pivoting. I’d almost got there when I realised that I needed to cross tab the number of patients seen and the performance score from two columns into three, shown in the image below.

Before and after ‘cross tabbing’ in Tableau Prep

 

I knew how I wanted the data to be, but in Tableau Prep this is a slightly longer process than in Alteryx.

 

How to ‘Cross Tab’ in Tableau Prep

 

1) Ensure your measure names and measure values are the relevant columns. I pivoted the three original columns so that when I joined the multiple sheets containing similar data on each department the measure names would all be in one column, and the values in another.

 

2) Create a calculated field for each of the ‘groups’ in the column that needs to be it’s own individual column using the following formula:

IF [Measure Name]=”Within Standard” then [Measure Value] END

 

Calculated fields and their formulas

 

This will create three columns (or however many you need) with the data ‘filled in’ from the values column. The ‘original’ columns of the measure names and values can now be removed (right click on the field in the step and select ‘Remove field’).

 

However, there will be ‘nulls’ where the rows have no data for that type. These nulls needs to be aggregated out as simply filtering or excluding them will impact the other columns’ data.

Unwanted null values

 

3) Aggregate (or summarise) the data to clean up the resulting nulls by clicking on the plus next to the shelf and selected ‘Add Aggregate’.

 

4)  Click and drag the columns you created in step 2) onto the ‘Aggregated fields’ part of the configuration window. Then click and drag the other fields in the data set onto the ‘Grouped Fields’ part.

Aggregate configuration window in Tableau Prep

 

This will keep your other fields the same, but aggregate (in the case by summing up) the columns.

Resulting data structure after ‘cross tabbing’ in Tableau Prep

 

And voila – the ability to cross tab in Tableau Prep!

 

Thank you Coach Carl for letting me blog about this and letting go of your shotgun on the topic for a blog, hopefully my how-to is useful!

 

Tableau Prep Data Cleaning Workflow and Calculated fields step details