APIs – Open up to a world of data (Part 2)

by Joe Stokes

So yesterday I did a blog post on APIs, and how you can carry out a URL call to bring in each crime along with there counts. This is great and gives us a basic look at how we can utilise APIs.

However, let’s take it a step further. APIs often hold huge amounts of data, and it takes some level of navigation using the URL to get at it. Take for example the NFL arrests data I am using. We have all of the crimes that occurred in the league, a high level of granularity. The API also contains crime by teams, players and more.

So how do we get to it? The answer is in the URL, changing the endpoint to find the area of the database will do it. We can use Alteryx to edit the URL to modify the endpoint to what we need.

In our example, so far we have used this URL: NflArrest.com/api/v1/crime … This brings back this:

If we want to bring back each crime by team, the URL is as so: ‘https://nflarrest.com/api/v1/crime/topTeams/<CRIME TYPE>’, where the placeholder needs to be replaced by one of the categories above. For example in our browser typing in ‘https://nflarrest.com/api/v1/crime/topTeams/DUI’ will bring:

So we need to be able to use the API for each crime in our list. This is where Alteryx comes in…

Essentially we are going to look to append each category to the end of each URL.

So starting from where we ended our last workflow, add another text input and create a new column with header ‘URL’ and the field being ‘http://nflarrest.com/api/v1/crime/topTeams/’

Then, drag on an append field and leave it at default.

The results will be:

Next we want to essentially concatenate the ‘url’ field with the ‘Category’ field. We do this with a formula tool. Drag it on and use the following formula.

Results:

Now we are going to use the download tool to call the API on each row, drag it in and choose the ‘New URL’ (or whatever you called it) field.

Similar to yesterday we now have all of the data we need in the ‘DownloadData’ column.

Let’s use the JSON Parse tool next to parse out this data, again choose the ‘DownloadData’ field.

Next let’s use ‘Text to Columns’ and split into two rows with the delimiter ‘.’

Now it’s time to Cross Tab so we can associate each column header with their value. Drag in the CrossTab tool, group by Category, arrest_count and JSON_Name1. Choose column headers as ‘JSON_Name2’ and Value as ‘JSON_ValueString’. The output is as so:

Next, let’s use a formula tool to pull together the Team City and Team name. Drag it in and here’s your formula:

Last step put in a select tool, remove the unneeded fields and clarify some names:

Fantastic, this our final output:

This now allows us to analyse the NFL arrests at a lower level of granularity. We can take our previous Viz and iterate to show even more analysis.

First we had this:

But now we can have this:

Thanks for reading and happy API-ing!