Combining Data in Alteryx - A crash course in Unions and Joins

by Morgan A Rennie

There are many ways of combining data in Alteryx, allowing you to merge multiple data sources into a cohesive dataset. This can be useful for a variety of reasons, including cleaning and transforming data, identifying trends and patterns, and creating reports and visualizations.

To combine data in Alteryx, you will first need to have your data sources prepared and ready to go. This may involve cleaning the data, ensuring that it is in the correct format, and organizing it into tables or files. Once your data is ready, you can use a variety of tools and techniques to combine it.

Unions

One way to combine data in Alteryx is by using the Union tool - depicted by a DNA image. This tool can be found in the purple "join" tab or within the default "favourites" tab. This tool allows you to merge two or more datasets into one single, central dataset while preserving the original structure and organization of the data. The Union tool is particularly useful when dealing with datasets with different columns or fields, as it allows you to merge the data based on common fields - such as an ID or Location field.

Union Example

You receive two customer information databases from different locations and the central office has asked you to combine this information into one central customer directory - as your locations build the customer lists with unique IDs, you can combine the data using a Union tool and match the tables on these fields.

Remember: A Union aims to help employees from the top to the bottom of a company - Unions expand your data set vertically.

Joins

Another way to combine data in Alteryx is by using the Join tool. The Join tool can be found in the purple "Join" tab, or in the default "Favorites" tab - depicted by a neural web. This tool allows you to combine two or more datasets by matching values in one or more fields or columns. For example, you could use the Join tool to combine two datasets based on a customer ID field, or to merge two datasets based on a common date field. The Join tool is particularly useful when dealing with datasets with a lot of common fields or columns, as it allows you to merge the data quickly and easily.

Join Example

You receive the sales data from a satellite bookstore including information on the book ID, Sales Quantity, and the price of each book sold. At the central office, you have further information on the books, including the Author and Title relating to the sales ID. The stakeholders require an in-depth report combining this information, to see the number of sales, per book, including the book details. To do this, we can union the data. This would allow us to further group the data by Author if we wanted to determine popular authors.

Remember: When two or more people Join hands, they are too wide for the sidewalk - The Join tool expands your data horizontally.

Summary

In addition to the Union and Join tools, Alteryx also offers a variety of other tools and techniques for combining data. For example, you can use the Append tool to add new rows to an existing dataset, or the Intersect tool to create a new dataset that contains only the rows that are common to two or more datasets.

Overall, combining data in Alteryx is a powerful and flexible way to merge multiple data sources into a cohesive dataset. Whether you are dealing with datasets that have different columns or fields, or datasets that have a lot of common fields or columns, Alteryx has the tools and techniques you need to combine your data quickly and easily.