Using fundamental Spatial analysis in Alteryx is pretty easy, thanks to some of the tools, which do all the work for us 😊.
This time we are going to calculate the distance between two points on the Earth, a basic measure in the Spatial world, which might come handy sometimes. Maybe your company wants to calculate the average distance between the warehouse and the stores it needs to deliver to, or what was the total driving distance your taxies usually do.
For this example, we are going to use the London Bus database, and we will calculate the distance between each stop for each route, which will also allow us to calculate the total distance for the route.
You can download the London Bus dataset here.
The main things we need to have to calculate the distance between two points are… the points! We can call them Centroids as well.
So first we will calculate the Centroids for each stop, and after that, we will look into the distances.
1. Load and Preparation of the data
Under the variable “Run” we have to unique values: “1” and “2”. These two values represent the two ways the bus could go (Each Run has a different final stop).
Because we will only need one way, we will filter out, using the “Filter” tool, Run 2 and keep just Run 1 (or vice versa, it does not matter)
2. Create Centroids (points)
After preparing the data, we will create the centroid for each bus stop using the “Create points” tool, under the Spatial section. In order to create centroids, we need the Latitude and Longitude for each point.
In this case, we do not have the classics Latitude and Longitude, but instead, we have “Location_Easting” and “Location_Northing” (British Latitude and Longitude).
Due to this, we need to change the type of “Projection” to “EPSG:27700”, which is the British National grid.
Once the centroids are created, we will clean the dataset to keep only the variables we will use to calculate the distances.
3. Create a column with the Centroid for the next stop
To calculate distances between stops, we will need to have the centroid for the current stop and the next stop in the same row.
This could be done with the “Multirow Formula” tool, which allows us to return the value for another row in our dataset.
We will create a new field “Next Stop” that will contain the centroid for the next stop in the route. Because we want this to be done by Route (you don’t want the next stop from another bus route), we will group by the Multirow formula by “Route”.
Be aware that you will need to set the new column “Next Stop” as a Spatial Object, as we are returning a Centroid.
4. Create Distances
Now we will use the tool “Distances”, under the Spatial section, to calculate the distance between the two stops.
You will need to select a point of origin or “Centroid Source” and a “Point of destination”. For our example, these will be “Centroid” (Current stop) and “Next Stop” (Destination stop).
We will also untick “output Cardinal points” as we are not interested whether the direction is East or North etc…
You can also choose the unit measure you want. In this case, I have chosen Kilometres that makes more sense for me.
5. What Bus Number has the longest route in London?
Now that we have calculated the distances from stop to stop, we are going to sum all the distances for each route and check which one has the longest one.
It seems that N199 has the longest route with 33.82 km each way!
That is all about the basics on how to calculates distances in Alteryx. In the future, I will post about how to create lines.
Hope you liked the post and comments are welcome. Thanks a lot!