There are more than 100 tools in Alteryx. Check out this list with 12 of Alteryx basic tools that can be used as a start point to play around with the software
If you are starting to learn a new language you would not choose the most difficult novel ever published in that language to be your first one, would you? The same way, Chris Love introduced us to Alteryx picking up several tools that are at the same time basic and very handy to carry out data preparation.
Alteryx has in total more than 100 tools (If you are curious about them, you can access the full list here), so it is crucial to choose from where to start the learning process and I think Chris did it very well with us.
For this reason, I decided to select and organise my favourite tools so far in a kind of guide. Before start talking about the tools themselves, a brief note: Alteryx organises them by capabilities and each capability has a colour. All the time you see tools with the same colour, it means they are from the same category. For instance: Select and Filter are both blue because they both are “Preparation” tools.
Besides organising my list by capabilities, I also grouped in pairs of tools that have something similar to highlight the differences between them.
They are used to access the data from locations such as your desktop, a database, the cloud etc.
Input Data: Most of the time this will be where your workflow starts. I like to think it as a door that you will open to allow your data to enter in Alteryx. 😀 The data can be in different formats such as xls, csv, jason, xml, tab and many others.
Output Data: If the “Input Data” was the door to allow your data to get into the workflow, the “Output Data” is the closing door that will finish the Alteryx process. Say goodbye to your workflow! Your data now is ready to be used.
The blue tools are made to facilitate the process of getting data ready for analysis. Instead of worrying about SQL coding our Excel formulas you only need to drag and drop the tools inside the workflow.
Filter: Basically, the filter allows you to split your data into two streams: the records that satisfy your expression (True) and the records that don’t (False). You can access both streams after, but as a good practice it is recommended to build the expression in a way the desired answer will be in the “true” side.
Select: It is like filter in a sense, once it also works within the columns, however, instead of an expression that can result in a true or false answer, here you can select and edit your fields. Useful to rename, resize, change the data type and others.
Formula: It allows you to create or update fields in different ways. For example, we have used formulas to calculate the number of columns in a file, to create a new column with the sum of another two columns…
Multi-Row Formula: This tool allows you to create or update the rows in a single field. Today we used it to fill the “null” values in a column with US states, making the states appear in all rows.
Those are the tools used to blend different sources of data.
Union: Used to combine two or more data streams with a similar structure (same columns), but different data. The result will be a file with all the information of both streams together.
Join: It is useful in those situations in which you need to combine datasets that have one or more common fields. Good examples of situations to use Join are when you have either a unique ID or the name of a city or country that is the same in both datasets.
PARSE & TRANSFORM TOOLS
They help to restructure and re-shape the data to the format needed to the analysis.
Transpose: This turns horizontal fields into vertical fields.
Crosstab: This does the opposite than the transpose tool: it turns vertical fields into horizontal fields.
Text to Column: It splits the text from one field into separate rows or columns. Useful when you have words in a row separated by comma, for instance, and would like to split them into a new column.
Regex: With this tool, you can parse, match or replace data using regular expressions. For instance, while analysing Twitter data we needed to get rid of the punctuation marks. For this we created a simple expression instructing Alteryx to keep only letters, “@” and “#”. This way we kept the words, hashtags and users and got rid of thinks like full-stops, ellipsis, exclamation marks and question marks.