The following will be an Alteryx solution to the Preppin’ Data 2019: Week 3 Challenge. If you haven’t already, go check out the challenge and the associated input data here.
This week’s challenge is a brilliant example of when we need to use scaffolding to create a data structure that allows us to perform the type of analysis we want to.
The input data for this week looked like this:
This is representative of the kind of data format a company that sells subscriptions might have. In this form, it would be impossible to draw a monthly revenue graph, making it much harder for the company to be aware of the revenue it can expect in future months. Our task is to change the form of this data so that it can be used to draw a monthly revenue graph. This will require us to have a separate row for each month that a customer has an active subscription, and have each day of payment associated with each of those rows. By the end, the data structure should look like this:
Notice how there are now 24 rows for Carl – one for each of the months of his subscription.
Step 1: Input and append
For us to create the desired data structure, we need a way of creating the same number of rows for each customer as their contract length. One way to do this is to append the data set to a field containing a list of integers from 1 to the maximum contract length. By doing this, each row of the original data is now repeated the same number of times as the maximum contract length, and each of those rows has a number corresponding to the integer field to which it joined with.
Note: the field named ‘Length’ is that field corresponding to that list of integers to which we appended our original data set.
Step 2: Filter out inactive subscriptions
The problem with the current state of the data is that we don’t want each row of the original data to be represented the same number of times as the maximum contract length. We want each row of the original data to be represented only once for each month of the length of the customer’s contract. To do this we can use a filter and specify that the ‘Contract Length (months)’ field must be greater than or equal to the ‘Length’ field.
So now instead of having 96 records, you should have 54. This makes sense, since the sum of all our contract lengths is 54.
Step 3: Create dates for each month of subscription
To create a summary graph of our monthly revenue, we’ll need each of the remaining rows to have a date associated with it representing the month in which a payment was made. We can do this by taking the ‘Start Date’ field and using a DateTimeAdd() function to add on the ‘Length’ in months. This way we will generate a separate date for each month of the contract. Remember to subtract 1 from our ‘Length’ field when doing this, because we want the very first month for our customer to be the same as the starting date of the contract.
And that’s it! You should now be able to easily draw a monthly revenue graph by placing the new ‘Date’ field on the x-axis and the SUM(monthly cost) on the y-axis.
The graph will look something like this: