My first week at the Data School was slightly shorter than usual, as we all started on Tuesday. That meant one day of (repetitive) introductions and only two days of Alteryx training - a tool no one from DS23 had used before. In turn, we had a very interesting first Friday: Carl, our coach, gave us the task of going back to the dashboards from our first applications to DS and adding more insight to them. The trick? In three hours, we needed to find extra data and prep it using Alteryx - the tool we had all only just started to use!
Even though it sounded a bit scary at first, the tips from that week’s training sessions helped a lot. Once you have your raw dataset, you first sketch how you want the data to look at the end. Second, you write a plan of the steps needed to get to that sketch. Finally, you load the raw dataset into Alteryx and follow your own steps. Luckily, the software is very user-friendly and easy to navigate, so two days of training were enough for me to prepare the dataset quickly, and I actually had enough time left to create a mini dashboard in Tableau, too.
So, how do these steps look in practice? For my application to DS, I used multiple datasets to talk about asylum numbers in Europe. For this challenge, I decided to focus on one part of my dashboard that talked about so-called “bogus” refugees. Initially, I used an index of political instability and violence to prove that people fleeing their home countries are doing so for reasons that fall under the refugee definition. This time, I decided to think about a root cause and looked at an index of corruption (which in turn causes political instability and violence). Therefore, my data sketch looked like the table below.
Following this, I wrote down the steps I needed to take in order to clean the new dataset (and then combine it with the previous datasets). For example, you may want to combine two columns - for Month and Year - into one Date column; or pivot a couple of columns; etc. My plan looked like this:
- Remove first two rows:
1) delete first row;
2) make the second row the header
- Remove extra columns that I will not use
- Pivot some columns to get Year column
- Rename all the columns
- Check if data types are valid
- Upload other datasets, prep them
- Join all datasets together
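For readers who do not have Alteryx at hand, the same plan can be sketched in plain Python. This is only an illustration and not the workflow I built: the column names, the country names, the helper functions `prep_corruption` and `join_on_country_year`, and all of the numbers are made up for the example.

```python
import csv
import io

# Hypothetical raw export: a title row, then the real header, then the data.
# All values are invented for illustration.
RAW_CORRUPTION = """\
Corruption index (illustrative values only),,,,
Country,ISO3,Region,2014,2015
Country A,AAA,Europe,40,42
Country B,BBB,Europe,25,
"""

# Hypothetical second dataset, assumed to be prepared already.
ASYLUM = [
    {"country": "Country A", "year": 2014, "applications": 120},
    {"country": "Country A", "year": 2015, "applications": 150},
    {"country": "Country B", "year": 2014, "applications": 900},
]


def prep_corruption(raw_text):
    """Turn the wide raw export into one row per country and year."""
    rows = list(csv.reader(io.StringIO(raw_text)))
    # Remove first two rows: drop the title row, use the second as header.
    header, data = rows[1], rows[2:]
    records = [dict(zip(header, row)) for row in data]

    # Pivot: every column whose name is a year becomes a value of "year".
    year_columns = [name for name in header if name.isdigit()]
    tidy = []
    for record in records:
        for year in year_columns:
            value = record[year]
            if value == "":
                continue  # skip years with no score
            tidy.append({
                # Rename columns and drop the unused ones (ISO3, Region).
                "country": record["Country"],
                # Make the data types valid: year as int, score as float.
                "year": int(year),
                "corruption_score": float(value),
            })
    return tidy


def join_on_country_year(left, right):
    """Inner join of two lists of dicts on (country, year)."""
    lookup = {(row["country"], row["year"]): row for row in right}
    joined = []
    for row in left:
        match = lookup.get((row["country"], row["year"]))
        if match is not None:
            joined.append({**row, **match})
    return joined


corruption = prep_corruption(RAW_CORRUPTION)
combined = join_on_country_year(corruption, ASYLUM)
for row in combined:
    print(row)
```

Each block of the sketch maps to one bullet of the plan, which is the point of writing the plan first: once the steps are on paper, carrying them out in any tool is mostly mechanical.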
I used several different tools in Alteryx to get to where I wanted: Select, Sample, Transpose, Browse, Join, etc. The end result was the table above. Since I had some extra time left, I quickly prepared a very simple viz in Tableau that showed a positive relationship between the level of corruption in a country and the number of citizens fleeing from that country.
All in all, I am very happy with the progress in such a short period of time - I know I would have taken so much more time to get the data ready without Alteryx.