Understanding Alteryx as a SQL user: part 1

by Soha Elghany

SQL is one of the most common programming language used for designing, managing and analyzing data. However, it is one of the most complex programming languages to work with and can be incredibly time-consuming and frustrating especially when your query will not execute because you forgot to add a comma!  

Alteryx differs from alteryx because it has a more simplistic workflow-based environment that allows you to prepare, blend and analyse your data regardless of how many various unstructured data sources you have included. The huge advantage the workflow offers is that it visually shows how you’ve worked with the data, which allows for transparency.

Traditionally when you work with SQL you begin with a SELECT statement, followed by either writing the columns you want to view or viewing everything:

SELECT c.Country,

              r.First_name,

              r.Email

FROM r.Registered

INNER JOIN c.Customer ON c.Customer_ID = r.Customer_ID

Instead of having to do all of this in Alteryx you use the Input tool onto your canvas area and connect it to the Select tool and tick of the columns that you want to view.

The biggest adjustment is re-learning joins in SQL you write the type of join that you want to do whereas, with alteryx you have several different types of join that you need to understand how they operate. I’m so used to way I write joins in SQL and this was further reinforced in Tableau that allowed me to do join in the same way, so it’s quite a learning curve having to re-learn joins and how they work. The Information Lab has a great learning resource that explains joins in alteryx click here to view it.

I made the initial mistake of assuming joins in alteryx would operate the same way as they do in SQL for example in SQL a left join also includes the intersection as well as all of the left data and vice versa for a right join. However, Alteryx does NOT include the intersection.

The advantage of not including the intersection is that it makes it easy to be able to join if you’re joining on the wrong field and its causing duplicates this is due to the way the join tool is build

The Join tool has 3 output L, J, R:

L: this means left join

R: this means right join

J: this means inner join

You can easily view each output that comes from the join to be able to examine your join and make sure there aren’t any errors.

Once you understand how joins in alteryx differ to SQL the journey towards being able to do much more complex things with your data sources becomes a lot easier.

For more resources, I would recommend going to the alteryx website and reading their guide here