Today’s challenge for dashboard week required us to download data from NCDRisC.org about risk factors from noncommunicable diseases, analyse the data and create a visualisation. Full details on the task can be found here.
Download the data
The data is available to download for each country and for each of the medical conditions one file at a time. So 4 conditions x 200 countries = 800 files. Time to fire up Alteryx!
First I downloaded a file from the site that has all the required country names. I then constructed the 4 base URLs and joined these two fields. Then two formulas to a) construct the full URL and b) construct the folder paths to download the files to. Finally the Download Tool to do all the hard work. Here’s the workflow, with snippets of the data along the way (right-click and ‘Open image in a new tab’ for a clearer view):
Once all the data was downloaded, it could be merged in a different workflow (below) and pivoted to create one ‘Thin and Tall’ file.
The resulting file has a few key dimension fields (Country, Gender, Year, Condition) followed by just two columns containing Name – Value pairs. A few examples below:
Having the data in this forms leads to an interesting challenge in Tableau as you do not have direct access to all the measures as you normally would but instead have to use filters (on the Name field) to then get access to corresponding measure (in the Value) field. This can be a bit unintuitive if you are used to dropping different green measure pills into your view.
After working with the data in the above format, I decided to try and re-merge the data in a wide format with all measure field headings side by side in a table to give direct access to the measures in Tableau. Below is the modified merge workflow to create a ‘Short and Wide’ file. The only main differences are removing the Pivot Tools and replacing the Union Tool with Join Multiple tool.