Preppin' Data on Snowflake

by Ali Agah

This Friday, our challenge to complete Preppin' Data challenges on Snowflake using SQL rather than using Tableau Prep as the challenge intended. As Carl gave us this challenges, I assumed abandoning Tableau Prep was not blasphemy.

I attempted challenge 2021 Week 01 as my second challenge of the day:

1) Split the 'Store-Bike' field into 'Store' and 'Bike'

The first challenge was using the right split function. Using SPLIT(), I realised I get one column back with the two values in it (see fig.1).

fig.1 - the SPLIT() function

To solve this I used SPLIT_PART() to only return one of the parts at a time (see fig.2).

fig.2 - Using SPLIT_PART()

2) Clean up the 'Bike' field to leave just three values in the 'Bike' field (Mountain, Gravel, Road)

Fig.3 - Cleaning misspellings

I didn't have the neatest solution but noticed a simple IF statement based on the length of the Bike values will do the trick in cleaning misspellings. Fig.3 shows the results.

3) Create two different cuts of the date field: 'quarter' and 'day of month'

This was very similar to Tableau. I just used the DAY() and Quarter() functions (see fig.4).

Fig.4 - Quarter and Day of Month

4) Remove the first 10 orders as they are test values

A simple line of WHERE based on the ORDER ID which, in this dataset, is nice and chronological.


5) Output a csv

You can see my final SQL code in fig.5 and the results table in fig.6. Note that I've ordered the code so the results view is in the same order as the output example on the challenge page.

Fig.5 : My SQL code for solving the Preppin' Data Challenge 2021 week 1
Fig.6 : Final results table using the SQL above

If I had more time I would have liked to use other functions to use my solution more robust. For example, while the CASE function does well in this case, it wouldn't be a good solution when dealing with real misspellings.

Also, I ended up putting the "Bike" SPLIT_PART() code inside the LEN() of the "Bike Type" code to get a cleaner results table. I didn't do this for Fig.5 as to prevent confusion.