While Sankey charts may seem daunting at first, they’re a great way to show how a measure flows between two or more categories or conditions.
For this example we’re looking at how the sum of sales changes between Regions and Categories in the Tableau Superstore data.
To create the above Sankey chart you will need two copies of your data: one for the categories on the left-hand side of our chart, and another for the regions on the right-hand side. Union your data to itself to get these two copies.
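To make the union step concrete, here is a minimal sketch in Python rather than Tableau. The sample rows are invented, and the copy names `Orders` and `Orders1` are assumptions for illustration; check the names Tableau actually assigns in your own workbook.

```python
# Hypothetical stand-in for a few Superstore rows; the values are invented.
orders = [
    {"Category": "Furniture", "Region": "East", "Sales": 120.0},
    {"Category": "Technology", "Region": "West", "Sales": 300.0},
]


def union_with_self(rows, first_name="Orders", second_name="Orders1"):
    """Return two labelled copies of rows, stacked one after the other."""
    first = [{**row, "Table Name": first_name} for row in rows]
    second = [{**row, "Table Name": second_name} for row in rows]
    return first + second


unioned = union_with_self(orders)
for row in unioned:
    print(row["Table Name"], row["Category"], row["Region"], row["Sales"])
```

Every original row now appears twice, and the only difference between the copies is the table name, which is what the later calculations key on.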
Once unioned, Tableau will create a new field called ‘Table Name’. This field, and its contents, will be used in calculations later on.
Create a new calculated field; I’ve called mine ToPad. This calculated field labels one copy of our data 1 and the other 49.
When this field is added to the view, we can see it has values for 1 and 49, but nothing in between. Using 49 here means that when Tableau draws the curve of our Sankey chart, it has enough points to look smooth rather than jagged.
By creating bins on the ToPad field (the Padding field used in the next steps), we can simulate the missing values from 2 to 48.
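The labelling and padding idea can be sketched in Python as follows. This is an illustration only: the table names and the bin size of 1 are assumptions, and the function names are hypothetical rather than anything Tableau provides.

```python
def to_pad(table_name, first_table="Orders"):
    """Label one copy of the data 1 and the other copy 49."""
    return 1 if table_name == first_table else 49


def padded_bins(low, high, size=1):
    """List every bin from low to high, including the gaps in between."""
    return list(range(low, high + 1, size))


# Only two real values exist in the data...
endpoints = sorted({to_pad("Orders"), to_pad("Orders1")})
# ...but binning fills in every step between them.
padding = padded_bins(endpoints[0], endpoints[-1])

print(endpoints)
print(len(padding), padding[:3], padding[-3:])
```

The two real values become the ends of each line, and the 47 simulated values in between give Tableau points to draw the curve through.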
Create a new calculated field; I’ve called mine T. This field will ensure that our marks are spread out across the view.
Add the calculated field ‘T’ to the Columns shelf and add ‘Padding’ to the details shelf. Then change your mark type to a circle; this will make it easier to visualize the next steps.
As you can see, only one mark is shown. Right click on the T field, and select ‘Compute Using’ and ‘Padding’. This will show all of your marks.
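What T does can be sketched by turning each padded position into an evenly spaced horizontal coordinate. The centre of 25 and the divisor of 4 below are assumptions, chosen so that 49 marks run from -6 to 6 and are centred on zero; your own T field may use different constants.

```python
def t_value(index, centre=25, scale=4):
    """Map a 1-based mark position to a horizontal coordinate centred on 0."""
    return (index - centre) / scale


# One coordinate per padded mark, positions 1 through 49.
t_values = [t_value(i) for i in range(1, 50)]

print(t_values[0], t_values[24], t_values[-1])
```

Computing along Padding is what gives each mark its own position; without it every mark gets the same index, which is why only one mark appears at first.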
Create a new calculated field, Rank 1. Duplicate this field and call the new field Rank 2.
These calculations will help inform Tableau where the lines of the sankey chart should start and end.
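One common way to get start and end positions on a 0 to 1 axis is a running share of total sales, and that is what this Python sketch assumes; it is not necessarily the exact formula behind Rank 1 and Rank 2. The sales figures are invented.

```python
from itertools import accumulate


def running_share(values):
    """Return the cumulative sum of values as a fraction of their total."""
    total = sum(values)
    return [running / total for running in accumulate(values)]


# Invented sales for three categories, in display order.
sales_by_category = [200.0, 300.0, 500.0]
rank_1 = running_share(sales_by_category)

print(rank_1)
```

Each value marks where one group ends and the next begins, so larger groups take up a larger slice of the axis. Running the same calculation over regions instead of categories would give the positions on the other side of the chart.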
The signature S-shaped curve of a Sankey chart is a sigmoid curve, so create a calculated field that will show Tableau how to draw it.
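For reference, the standard logistic sigmoid looks like this in Python. I am assuming this is the form of sigmoid intended here; the sample inputs of -6, 0 and 6 are arbitrary.

```python
import math


def sigmoid(t):
    """Standard logistic function: near 0 for very negative t, near 1 for very positive t."""
    return 1 / (1 + math.exp(-t))


for t in (-6, 0, 6):
    print(t, round(sigmoid(t), 4))
```

The function is almost flat at both ends and steepest in the middle, which is what produces the gentle S shape.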
Using the previous calculated fields, you will need to create one more calculated field that will build out the curve of the Sankey chart.
This calculation starts at Rank 1 and adds the difference (Rank 2 - Rank 1) along a sigmoid curve, so the line finishes at Rank 2.
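Putting the pieces together, a sketch of the curve in Python might look like the following. The way the fields are combined is my reading of the description above, and the start and end values of 0.2 and 0.7, along with the `(i - 25) / 4` spacing, are invented for illustration.

```python
import math


def sigmoid(t):
    """Standard logistic function."""
    return 1 / (1 + math.exp(-t))


def curve(rank_1, rank_2, t):
    """Move from rank_1 towards rank_2 along a sigmoid as t increases."""
    return rank_1 + (rank_2 - rank_1) * sigmoid(t)


# 49 points along one line, running from about 0.2 on the left to about 0.7 on the right.
path = [curve(0.2, 0.7, (i - 25) / 4) for i in range(1, 50)]

print(round(path[0], 4), round(path[24], 4), round(path[-1], 4))
```

At the far left the sigmoid is close to 0, so the line sits at its starting value; at the far right it is close to 1, so the line has covered the full difference and sits at its finishing value.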
Add your new calculated field, curve, to the rows shelf.
Initially, only two marks will appear, one on the extreme left and one on the extreme right. This is because Tableau hasn’t been told how to calculate the curve.
Add Region and Category onto the details shelf. Right click on your curve pill and select edit table calculation. Select Compute Using Specific dimensions, using the dimensions specified below, in that order.
You will need to edit your axis. I’ve made the X-Axis between -5 and 5, and the Y-Axis between 0 and 1. The Y-Axis also needs to be reversed, so the points that appear first alphabetically are at the top of the axis.
Once this is done, hide your headers.
Change your mark type to a line. Initially, the chart will look strange because Tableau hasn’t been told in what order to connect the marks. To fix this, make sure that your Padding field is on the Path shelf.
To size the lines of the chart based on the overall sales, we need to create one final calculated field.
Add this new calculated field onto the Size shelf and compute using ‘Padding’.
Add Category and Region to your colour shelf.
In two separate sheets, create a stacked bar chart for % of total sum of sales, one coloured by Category and the other by Region.
The last step is straightforward. Combine the three sheets you’ve made onto one dashboard.
I hope this blog post has helped you to understand how to create a Sankey chart. I know writing it definitely helped cement my understanding of the process.