How to sort column headers in chronological order in Alteryx

In this blog, I will talk you through a very nice trick I learnt during one of DS clients projects to sort fields in chronological order in Alteryx. This might come extremely handy in case you are trying (for some reason) to create reports in Alteryx.

Setting the scene

Let’s assume that we have in our disposal data about the sales of three products (products A, B and C) from August 2019 until February 2020. The provided data has the structure shown below.

Figure 1 – Initial structure of data

Our client has kindly requested that we bring this data to a structure where the different months will be our column headers and each of the products (A, B and C) will be in a separate row. In addition, our months should be sorted in chronological order, i.e. August 2019 should populate the first column, followed by September 2019, October 2019 and so on. Essentially, what we have to do is to pivot the orientation of data in our table by moving vertical data columns onto a horizontal axis.

Now your first thought might be that we can easily do this by using the Cross Tab tool. However, if we just use a Cross Tab tool, the structure we are going to end up with will be that shown in Figure 2.

Figure 2 – Using the Cross Tab tool would result in the desired table orientation, but our column headers are sorted in an alphabetical and not in a chronological order

As we can clearly see, our column headers aren’t sorted chronologically but alphabetically and that’s because Alteryx can’t understand that these fields represent dates.

The Solution

Step 1 : Create a new column which Alteryx understands as a date

As mentioned earlier, Alteryx cannot sort months in a chronological order simply because it can’t understand that the “Date” field represents a date. By default the data type of this column is set to be string, so Alteryx thinks that the content of this column is just text. By using the DateTime tool, we can convert the “Date” field from string to Date format.

Since the format of the incoming string is not included in the configuration window, we are going to use the custom option and specify the format of the incoming string fields as:

Month yyyy (i.e. August 2019)

We can see that a new field is created, named “Month_Year” which contains data of a date type. The default date format in Alteryx is yyyy-mm-dd, hence “August 2019” will be “2019-08-01”. Sorting by the Month_Year field in an ascending order sorts our rows in the correct order.

Figure 3 – Using the Date Time tool, we have created a date field in a format that Alteryx can understand.

Step 2 : Assign a Record ID to each month

Use a summarise tool to “Group by” the different dates in your “Month_Year” date column. You should end up with a table of a single column where each row would be populated by a different date. Then, use the Record ID tool to assign a unique identifier to each record. You should end up with a table similar to that illustrated in Figure 3.

Figure 4 – Assigning a record ID to each date record

Step 3 : Join your data back together

Use a Join tool to bring your data back together by using your “Month_Year” column as the specified field to join on.

Then, use a Formula tool to concatenate the original “Date” field with your “Record ID” field. In the output of your Formula tool, your table should be similar to the Table shown below.

Figure 5 – Forming the desired column headers name (ID_MONTH_YEAR) through the use of a formula tool

We have set the Formula tool to update the “Date” column by adding in front of each record the respective number of the Record ID field with an underscore :

Tostring ( [RecordID] ) + ‘_’ + [Date]

Since the “Date” field is a string type and the “Record ID” column contains numeric data, we must convert the Record ID into a string before adding the two fields together, hence the use of the expression ToString(). What we’ve achieved so far is to have our original string “Date” field with identifier in front it. Thus, if we pivot our data now, column headers will be sorted based on the record ID in front of the actual month and not in an alphabetical order as before.

Step 4 : Pivot your data with a Cross Tab tool

Now, use the Cross Tab tool to bring your table to the desired orientation. Change the column headers to contain data from the “Date” field and your values populating your new columns should come from the “Sales” column. Finally, group based on the name of each product. Your Cross Tab tool should be configured as shown in Figure 5 and your output should have a similar format.

Figure 6 – Pivoting our table to acquire the desired orientation. Note how our column headers are now in the correct chronological order

As you can see in Figure 6, we have managed to sort our column headers in a chronological order from August 2019 to February 2020.

Step 5 : Rename your column headers

Our final step is to rename our column headers and remove both the identifiers and the underscore. To do that, you can use a Select tool and change each column header name manually. However, this process can be quite labour intensive and rather impractical if your data range for more than a year. A more robust way would be to use a couple of the Developer tools.

Use the Field Info tool to get the metadata about each field. Then, I have used a RegEx tool to parse my data with an appropriate expression. The “Text to Columns” tool would also work, but it would need some extra tools to bring the data back together. For this example, I rather use the Regex tool for brevity.

Figure 7 – Configuration of the RegEx tool

The regular expression I’ve used is the following:

(\d_)?(\w+_?.*)

EDIT : The aforementioned expression will only work if there is a single digit before the underscore. If you have more than one digits, then the following expression should be used :

(\d+)?_?(.*)

We can see that the “Name” column on the left-hand side contains the original names of our column headers, whilst the “Month_Year” column contains the replacements of those fields. To make this replacement happen, we can use a Dynamic Rename Tool, configured to take field names from right input rows. Configuring the tool as shown in Figure 6, we can see that the output is in the desired format.

Figure 8 – Configuration of the Dynamic Rename tool and Final Output of our Flow

If you have made it so far, thank you for reading my blog. I believe this is a very nice trick to learn and also allows you to practice with a couple of tools outside the ordinary. Also, inspiration for this blog is a question posted in the Alteryx community, answered by Phillip Mannering.

Author:

Angelos Pachis

View Profile