DS24 Dashboard Week - Day 3 - Risk Factors for Non-Communicable Diseases (Height)

by Muneeb Sajjad

Final week of training, dashboard week, day 3.

Today's topic is risk factors for non-communicable diseases. The data comes from NCD RisC.

Requirements for today, as per Andy's blog:

  • Pick one of the six data sets from the available;
  • Download the data for each individual country using Alteryx;
  • Visualise the data on Tableau.

NCD Risk Factor Collaboration (NCDRisC) provides data on six major risk factors for non-communicable diseases for every countries. The major risk factor I selected was height.

Downloading and cleaning the data on Alteryx

I started by searching for the URL to download the data. I right-clicked on the download button, which gave me a link to the .csv download for the selected country (United Kingdom). I removed the final part of the URL ("country".csv) before pasting it into Alteryx. This is the part that differs for each country that we have data for.

Extracting the URL from webpage

To retrieve the list of countries available, I right-clicked on the country drop-down, and selected inspect, edit as HTML, and copied all of the text that contained a list of the 200 countries.

Extracting a list of countries from webpage

I added the URL and the text containing countries into two text input tools in Alteryx, and joined them using an Append Fields tool. Before appending, I needed to split out the names of the countries from the text input. I used a text-to-columns tool and a couple sample tools; to split the data into columns and then keep just the name of each country once. I then used a Formula tool to create a new URL/download link for each country ([Base URL]+[Countries]+".csv"). I could now download the data by connecting to a Download tool and selecting the URL field. I opted to output as a string, so that I clean the data on Alteryx. I deselected all fields except DownloadData which returned all the data that I needed. I split the data into tabular form using a couple Text-to-columns tools and removed quotation marks using a Multi-field formula tool with the REPLACE function (REPLACE([CurrentField], '"', ''). The dynamic rename tool allowed me to take field names from the first row of data. The data cotained a list of recurring field headers for each country - to remove these I used a filter tool. Once the data had been cleaned, I removed the DownloadData field and output the data in a Hyper file.

Alteryx Workflow

The Dashboard

The dashboard I created looks at the evolution of height over time: across the world, and includes bar charts comparing the sexes. I also included a few filters and parameters for interactivity.

Tableau Public Dashboard

Link to the dashboard on Tableau Public: https://public.tableau.com/views/DashboardWeek-Day3EvolutionofHeightOverTime/Dashboard1?:language=en-US&publish=yes&:display_count=n&:origin=viz_share_link