The data set for the second day of Dashboard Week came in the form of beautifully formatted Excel files (in multiple sheets) from the UN Population Division of the Department of Economic and Social Affairs. The data showed the “international migrant stock” as a percentage of the total population across various geographies (national, international regions, and global) across different age groups and gender. The data also contained notes detailing which definitions of migrant were included for the various statistics.
The first task was therefore to strip out all that nice formatting and turn it into something a bit more Tableau-friendly (as opposed to human-friendly). An easy enough job for Alteryx, just took some thinking and a bit of tricky business with headings being contained in merged cells.
Getting the data out of multiple sheets (one per year) was also a challenge. While I had used the Dynamic Input tool once before, my memory was a bit rusty and after a bit of googling and experimentation, I decided my limited time was probably better spent elsewhere on this occasion and I turned to applying brute force to this challenge as there were only 6 sheets and fiddling with a batch macro again was also a potential risk in terms of time management.
The first step of my “manual batch” approach was to get each of the sheets out into .yxdb format for two reasons: 1. Because I thought it might make my workflow slightly faster as .yxdb is optimized for Alteryx, and 2. So that I could keep the Excel file open as a reference as I tried to work out what manipulations needed to be done without Alteryx throwing up an error about the file already being open elsewhere and therefore not being able to open it. The plan (yes I did have one!) was to get my 6 sheets out into .yxdb with only a few simple manipulations, and then to do the rest of the necessary tasks once I could have the Excel workbook open in front of me and I was confident that I had 6 .yxdb files with the same schema that I could then get into the necessary structure and union together.
This worked relatively well, except I didn’t quite enjoy the manual process of inserting the year through a formula tool as I ran each sheet through (doing it manually also left it open to errors, especially as I went through and tweaked things here and there in the workflow). The only place the year could be found was in a page heading that I had skipped in the download process. It looked straightforward enough to get it back in, so I went back to extract it, but that then led to further problems down the line and I ended up having to rebuild most of the workflow and bring in most of the restructuring before it got output as a .yxdb.
From there, I unioned the 6 files, cleaned up the fields a bit more, and output two .tde files (one with only country data, and the other where the countries are grouped together).
Overall, I enjoyed getting more familiar with the Alteryx tools, I was able to try out a couple of new tools, and it was a less frustrating day than yesterday, when I couldn’t quite find where the problem lay with my macro.
The biggest challenge for Day 2 (at least for me) probably lay more in trying to work with percentages (as a data set in Tableau). You don’t really want to sum them, and some of the data was quite granular.
The good news is that DS5 has made it to the halfway point of DASHBOARD WEEK!
(The very cool featured Image is from The European Financial Review)