Roman Numerals Preppin' Data Challenge in Alteryx and Tableau Prep

by Ollie Matthews

The Preppin’ Data, 2020:Week25  ­– Roman Numerals – challenge asks participants to build a workflow which converts any Roman numeral number into a decimal numeral value (what we would call a normal number).  Just like this:

MMMCDXVII = 3417

To solve this problem, a different approach was required for Alteryx and Tableau Prep and this time, unlike in my last blog post, Alteryx takes the cake 🍰.

Alteryx

Solution:

Steps.

1. Use Regex tokenize (split to rows)  to get one numeral per row. Add a record ID here to make sure we can maintain the order of the numerals later.

2. Join with the Roman numeral lookup table to get the value for each individual numeral.

3. Use the Multirow formula tool to make certain values negative. This depends on their position in relation to other numerals:

4. Reduce everything to one row and sum up the values, and voila!

Tableau Prep

Solution:

Steps.

1. Calculate the length of the input string (which is in this case 9 for 'mmmcdxvii') using the Len() function.

2. Using a conditional join, join with the scaffold table which is just a sequence of numbers, using this join clause:

3. Using the Mid() function, extract the letter that is at the index position of the scaffold:

MID([Number],[Scaffold],1)

So that again we have one numeral per row.

4. Join with the lookup table to get the value for each individual numeral.

5. Calculate the numeral that comes AFTER each numeral like so:

MID([Number],([Scaffold])+1,1)

and join with the lookup table again. To end up with something like this:

6. Use a calculation to determine whether or not the numeric equivalent should be positive or negative depending on the value of the next numeral.

7. Sum up the numbers to get the final output.

Comparison

As you can see from above, Tableau Prep required more steps and a fair bit more thought.

Because Prep lacks a regex tokenize equivalent, a scaffold input was needed and then a calculation to extract the individual numerals from the string. This made the process more tedious and led to a less elegant solution. Alteryx's regex tool is vastly superior in this case.

Perhaps an even bigger loss for Prep here is the lack of multi-row formula functionality. Calculating whether the numeric value needed to be positive or negative was easy in Alteryx. With the data sorted, I could simply use the multi-row formula tool to look at the next row to determine if that value was larger or smaller than the current value. In contrast, with Prep, I had to use the mid() formula to extract the next numeral along and then join with the lookup table again to get the value. More steps, more time.

I have heard that something like Alteryx's multi-row formula is on the way for Tableau Prep and I do think it would be a great addition to the tool and its intuitiveness.