I’m four weeks into learning Alteryx and if there’s one thing I have noticed it is this; there is usually a way to use one tool instead of a chunky three (or more) tool workflow you have hacked together.
Removing the unwanted top rows from a data source
In this blog I’ll talk about loading data into Alteryx and removing the first rows of data (when they contain unhelpful information). The example in Figure 1 shows such a case. I have loaded in the excel file and as you can see rows 1 and 2 do not contain any useful information. Furthermore, the field names are not correct. The aim is to convert row 3 to the field names and remove the first two rows from the data.
Figure 1. Data input default menu
How do you do this?
There is a long way and efficient way to do this. In this case the efficient way is actually easier to set up too!
The Long Way
First the long way! Many people clean this data up using a sample tool to remove the first two rows followed by a dynamic rename to get the correct field names, as shown in Figure 2. This way takes a long time and adds unnecessary clutter to your workflow. Check out the much more efficient way to do things below.
Figure 2. The tools required to clean this data via the long way.
The Efficient Way
Instead of using three tools here you can get the data in the right format directly from the text input screen. By changing 7) Start Data Import on Line you can complete all three steps in one. This is shown in Figure 3 where I have set the data to start inputting from line 4. This sets Aruba as the first observation and takes the information for field names from row 3. The result of this selection is shown in Figure 3. When doing this yourself change 7) Start Data Import on Line to whatever line your data begins at. This is a great way to clean up your workflow.
Figure 3. The set up required in the input data tool to remove unwanted rows and yield the correct field names.
Any questions please leave a comment or get in touch with me on twitter @CoalWilson