Today we were tasked with assessing the strengths and weaknesses of our two favorite data preparation tools by picking a random Preppin' Data challenge and solving it with each of them. As we had been given a challenge about the NBA this week and I wanted to stay in the field of sport, I chose a challenge about the English Premier League (the challenge of week 13, 2021). To briefly summarize the task, I was given 5 Excel files containing player statistics over the past 5 seasons, and the objective was to create a table displaying the top 20 goal scorers by position, keeping only goals scored from open play (for the football experts out there, that means goals excluding penalties and free kicks). So at first glance, nothing too tricky.
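To make the core of the task concrete, here is a minimal Python sketch of the open-play calculation, summed across seasons. The field names (`goals`, `penalties`, `free_kicks`) and the sample rows are made up for illustration; they are not the column names or figures from the challenge files.

```python
from collections import defaultdict

# Hypothetical rows, one per player per season (invented names and numbers).
rows = [
    {"player": "A", "position": "Forward", "goals": 20, "penalties": 4, "free_kicks": 1},
    {"player": "A", "position": "Forward", "goals": 15, "penalties": 2, "free_kicks": 0},
    {"player": "B", "position": "Midfielder", "goals": 8, "penalties": 0, "free_kicks": 3},
]

def open_play_totals(rows):
    """Sum open-play goals per (player, position) across all seasons."""
    totals = defaultdict(int)
    for r in rows:
        # Open-play goals = all goals minus penalties and free kicks.
        open_play = r["goals"] - r["penalties"] - r["free_kicks"]
        totals[(r["player"], r["position"])] += open_play
    return dict(totals)

totals = open_play_totals(rows)
```

In both tools this boils down to a calculated field followed by an aggregation; the sketch only shows the logic, not how either tool implements it.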
I first tackled the challenge using Tableau Prep Builder, a tool that I had barely used until now. My first observation concerns the difference between the two interfaces. Alteryx has 2 main views, Data and Metadata, with the Data view giving a clear picture of the output table created by each tool in the results pane. Tableau Prep offers the same views and adds an extra one that organizes the tool outputs differently, with 2 tables. The top table displays the field names, the field types, their common values (or ranges of values for continuous fields), the percentage and number of records for each value or range, and a set of options similar to what can be found in Tableau when connecting to a dataset, albeit more comprehensive (adding, removing, renaming and duplicating fields, creating calculated fields and so forth).
I mainly used this view during the challenge, but I find it somewhat overcrowded at times and will probably revert to the view that is similar to Alteryx, since the set of options described earlier is available in both views. However, the extra view may be useful for quick data validation, as it lets you see at a glance the values contained in each field and the number of records for each value. One last point about the interface: the workflow canvas in Prep is less flexible than in Alteryx, where the user can place tools just about anywhere. In Prep, it is still possible to move the tools around, but they sit in some kind of invisible containers, which means that your canvas can get busy more quickly than in Alteryx.
Honestly, it took me a while to get used to the interface, as I think it would any novice, and this was made worse by my Alteryx experience, as I kept looking everywhere for tools that don't exist in Prep. Yet one thing I realized throughout the challenge is that, contrary to Alteryx, a lot of steps can be done with a single tool directly in the view in Prep. For instance, I removed a large number of fields, created a few calculations and applied filters to some fields just by adding a single ‘Clean Step’ tool. A list of all your modifications is displayed on the left of the view, so you can edit and remove them at will. Be aware that calculated fields work the same way as in Tableau, so a good understanding of LODs seems necessary. Besides, I ran into a few issues when creating my calculations that I can’t explain yet (the Min and Max functions not supporting dates and integers being one of them!). A great advantage of Prep for this particular challenge is the built-in rank calculation, which lets the user rank a field according to various parameters. I haven’t fully understood how this rank calculation works, to be honest, but it gave me the output I needed faster than in Alteryx, where I had to use an endless multi-row formula (see above).
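For readers who want to see what "rank by position, then keep the top 20" amounts to, here is a small Python sketch. It assumes standard competition ranking (ties share a rank and the next rank is skipped), which is my assumption about the desired output rather than a statement of what either tool does internally. The row-by-row comparison with the previous record is roughly the kind of logic a multi-row formula has to spell out. The function name and sample data are hypothetical.

```python
from itertools import groupby

def rank_by_position(totals, top_n=20):
    """totals: list of (player, position, open_play_goals) tuples.

    Returns (position, rank, player, goals) for rows ranked <= top_n
    within their position, using competition ranking (1, 1, 3, ...).
    """
    # Sort by position, then by goals descending within each position.
    ordered = sorted(totals, key=lambda t: (t[1], -t[2]))
    ranked = []
    for position, group in groupby(ordered, key=lambda t: t[1]):
        prev_goals, prev_rank = None, 0
        for row_number, (player, _, goals) in enumerate(group, start=1):
            # Same goals as the previous row: reuse its rank.
            # Otherwise the rank is the row number within the position.
            rank = prev_rank if goals == prev_goals else row_number
            prev_goals, prev_rank = goals, rank
            if rank <= top_n:
                ranked.append((position, rank, player, goals))
    return ranked

# Invented sample data; top_n=2 keeps the example short.
sample = [
    ("A", "Forward", 28),
    ("C", "Forward", 28),
    ("D", "Forward", 10),
    ("B", "Midfielder", 5),
]
result = rank_by_position(sample, top_n=2)
```

With this sample, players A and C tie at rank 1 among forwards, D would be rank 3 and is dropped by `top_n=2`, and B is rank 1 among midfielders.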
As far as I can tell, aggregations are pretty similar in both tools and offer the same set of options depending on the data type of your field. Joins also work the same way, with one significant exception: Prep makes it possible to create conditional joins. My expertise on the matter is really limited, so I’ll refrain from going into a detailed explanation here, as it may be more confusing than useful.
Finally, after spending way more time than was needed and, I admit, a couple of mental breakdowns, I succeeded in solving this simple challenge with Prep. I then jumped into Alteryx with the feeling that it would be more straightforward thanks to my experience with the tool. However, as mentioned previously, I realized that the number of steps needed was larger and that some useful functionality (e.g. rank) was absent and therefore required tedious calculations. You can compare my 2 workflows hereafter:
In conclusion, I need to apologize to Carl, as I may have spoken in a derogatory way about his beloved data preparation software while going through my breakdowns. I must recognize that Tableau Prep offered me some tools that proved really useful for solving this particular challenge. I’m confident that, with more practice, Tableau Prep will reveal itself to be a powerful tool and sometimes a better option for preparing my data. The winner of the round for this challenge: Tableau Prep!