Alteryx Core Certification Preparation: The basic tools you need to know

by Imogen Emmett

Here is an overview of what each basic tool does in Alteryx to help prepare you for the Alteryx Designer Core Certification.

INPUT/OUTPUT

This category includes tools for providing inputs and outputs for workflows.

Browse: displays data ‘Profile’ information, including data type, number of records, data quality, and a variety of statistical information.

Input data: brings data into your workflow by connecting to a file or database - the data can be in different formats such as xls, csv, jason, xml, tab, etc.

Output data: writes results of a workflow to a variety of supported file types or data sources.

Text input: allows you to manually input text you want to use as an input function. From there, you can use all the common Alteryx tools - Filter, Formula, Join and so on.

JOIN

This category includes tools that join two or more data sources.

Join: the most common tool used to combine 2 data sources that have one or more common fields. Joins bring data side-by-side horizontally and create a “short and wide” dataset. Good examples of when to use a Join are when you have either a unique ID or the name of a state/country that is the same in both datasets.

Union: combines 2 or more datasets on column names or positions. In the output, each column contains the rows from each input – in other words unions stack data vertically by field, position, or manually, creating a “tall and skinny” dataset.

Find Replace: this can sometimes be a simpler and more efficient way of joining 2 data sources – it finds a string in one column of a dataset and looks up and replaces it with the specified value from another dataset.

Append: performs a cartesian join on both incoming datasets - this joins every row from one table (source) to every row of another table (target). For example, if the source table has 100 rows, and the target table has 1,000 rows, the Cartesian join of these two tables results in 100,000 rows.

PREPARATION

This category includes tools that prepare data so that it's ready for analysis.

Sample: used to limit the data stream to a specified number, percentage, or random set of rows. You can also apply the selected configuration to the columns selected to group by.

Unique: deals with duplicates – it separates unique and duplicate data by grouping on one or more specified fields and then sorts those fields.

Select: includes/excludes and reorders the columns of data that pass through a workflow. Can also be used to modify the type and size of data, rename a column, or add a description.

Select Records: returns records and ranges of records that are specified. It is very useful for troubleshooting and sampling.

Filter: selects data using a condition. This creates 2 outputs True and False - rows of data that meet the specified condition are output to the True anchor; those that do not meet the condition are output to the False anchor.

Formula: creates a new column, or updates an existing column, using one or more expressions to perform various calculations and operations.

Data Cleansing: fixes common data quality issues – it can replace null values, remove punctuation, modify capitalisation and lots more. Important note: only select the column you want to cleanse when using this tool.

Sort: arranges records in a table in alphanumeric order (A-Z, 1-9 ascending and Z-A, 9-1 descending). The tool also allows you to sort by dictionary order – this means that numbers get sorted alphabetically.

TRANSFORM

This category includes tools that summarise or rearrange data.

Cross tab: pivots the orientation of data in a table by moving vertical data fields onto a horizontal axis and summarising data where specified – this makes data sets “shorter and wider”. Using this tool also allows you to aggregate data.

Transpose: reverse application of the Cross Tab tool - it transforms the data so you can view horizontal data fields on a vertical axis, making data sets “thinner and longer”. Unlike the Cross Tab tool, you cannot aggregate data.

Summarise: conducts various summary processes, including but not limited to grouping, summing, count, string concatenation, etc. This tool works by aggregating data across rows.

PARSE

This category includes tools that help restructure and re-shape the data to the format needed.

DateTime: transforms date-time data to and from a variety of formats, for example transforming Date/Time format to string and vice versa.

Text to Columns: takes text in one column and splits the string value into separate, multiple columns (or rows), based on a single or multiple delimiters.