Today, we are tasked with downloading journey data from the Uber Movement site and visualising it.
The travel time data is displayed as choropleth maps on the website. Once the website is displaying what you want, you can download the data relating to each polygon. There is also geodata available in .json format.
The first step: parse the .json geodata to extract the lat/longs in a form that I can then use to build polygons. The polygons themselves are not readily available to download from the website.
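For anyone doing this step outside Alteryx, here is a minimal Python sketch of the same idea: walk the geo file and emit one row per polygon point. It assumes the file is standard GeoJSON (a FeatureCollection of Polygon or MultiPolygon features). The `MOVEMENT_ID` property name and the tiny sample below are assumptions for illustration, not taken from the real download.

```python
import json

# Tiny made-up sample in GeoJSON layout (coordinates are [lon, lat]).
SAMPLE = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"MOVEMENT_ID": "1"},
   "geometry": {"type": "Polygon", "coordinates":
     [[[31.40, 30.11], [31.42, 30.11], [31.42, 30.13], [31.40, 30.11]]]}},
  {"type": "Feature", "properties": {"MOVEMENT_ID": "2"},
   "geometry": {"type": "MultiPolygon", "coordinates":
     [[[[31.20, 30.00], [31.30, 30.00], [31.30, 30.10], [31.20, 30.00]]]]}}
]}
"""


def polygon_points(geojson_text, id_key="MOVEMENT_ID"):
    """Return one dict per point on each polygon's outer ring."""
    rows = []
    for feature in json.loads(geojson_text)["features"]:
        geom = feature["geometry"]
        if geom["type"] == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom["type"] == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            continue  # points and lines are skipped in this sketch
        for part, rings in enumerate(polygons):
            outer_ring = rings[0]  # holes (rings[1:]) are ignored here
            for order, point in enumerate(outer_ring):
                rows.append({
                    "polygon_id": feature["properties"][id_key],
                    "part": part,
                    "point_order": order,  # drawing order for the polygon
                    "lon": point[0],
                    "lat": point[1],
                })
    return rows


rows = polygon_points(SAMPLE)
```

Keeping `point_order` matters because a tool like Tableau needs to know the order in which to join the dots when it draws each polygon.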
I decided to look at Cairo over the summer holidays! In particular, from the airport, how long does it take to get to a range of destinations and how do the journey durations differ across the city?
Firstly, I brought the .json file into Alteryx. This was then parsed to bring the lat and long for each ‘point’ onto the same row. The steps I used here were absolutely not the best way to do this. I used a series of 3 or 4 preparation (blue) tools, which basically looked for a ‘1’ in the JSON Name and filled a new field, ‘Separate’, with a 1. I then brought the coordinate across into a new field on the same row, filled up using a multi-row formula, and filtered out the 1’s.
What should I have done? CROSS-TAB. I think the reason I missed it was that I started by focusing on the information above each polygon, trying to develop a robust method for bringing out what I wanted, rather than thinking of the data on a larger scale. If I’d had cross-tab in mind, all I’d have needed to do was remove the rest of the information once I’d extracted what I wanted, then perform a cross-tab. Easy.
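To show what the cross-tab is doing, here is a rough Python equivalent. It assumes the parsed JSON arrives as flattened name/value rows, where the last number in the name is 0 for longitude and 1 for latitude. The exact names in the sample are hypothetical stand-ins for what a JSON parser might produce.

```python
from collections import defaultdict

# Hypothetical flattened rows: (name, value) pairs, one per JSON leaf.
flat = [
    ("features.0.properties.MOVEMENT_ID", "1"),
    ("features.0.geometry.coordinates.0.0.0", "31.40"),
    ("features.0.geometry.coordinates.0.0.1", "30.11"),
    ("features.0.geometry.coordinates.0.1.0", "31.42"),
    ("features.0.geometry.coordinates.0.1.1", "30.11"),
]


def cross_tab(rows):
    """Pivot lon/lat rows so each point ends up on a single row."""
    points = defaultdict(dict)
    for name, value in rows:
        if ".geometry.coordinates." not in name:
            continue  # drop everything that is not a coordinate
        # Split off the trailing index: the rest identifies the point.
        point_key, axis = name.rsplit(".", 1)
        column = "lon" if axis == "0" else "lat"
        points[point_key][column] = float(value)
    return [{"point": key, **cols} for key, cols in points.items()]


table = cross_tab(flat)
```

The grouping key (everything before the last index) plays the role of the cross-tab's "group by" field, and the last index becomes the new column header.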
After joining the polygons onto the rest of the data pulled from the Uber website, I ended up with something I could bring into Tableau. Once I got the data into Tableau, I started playing around with the polygons, which was really nice.
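The join itself is simple in principle: match each polygon point to the travel-time row for the same zone. A minimal Python sketch follows, assuming both tables share a zone identifier. The field names (`polygon_id`, `destination_id`, `mean_seconds`) and the values are made up for illustration.

```python
# Made-up polygon points (one row per point) and travel-time rows.
polygon_points = [
    {"polygon_id": "1", "point_order": 0, "lon": 31.40, "lat": 30.11},
    {"polygon_id": "1", "point_order": 1, "lon": 31.42, "lat": 30.11},
    {"polygon_id": "2", "point_order": 0, "lon": 31.20, "lat": 30.00},
]
travel_times = [
    {"destination_id": "1", "mean_seconds": 1500.0},
]


def join_points_to_times(points, times):
    """Inner join: keep only points whose zone has travel-time data."""
    by_destination = {row["destination_id"]: row for row in times}
    joined = []
    for point in points:
        match = by_destination.get(point["polygon_id"])
        if match is not None:
            joined.append({**point, "mean_seconds": match["mean_seconds"]})
    return joined


joined = join_points_to_times(polygon_points, travel_times)
```

An inner join drops zones with no journeys from the airport. A left join would keep them as blank polygons on the map, which may be the better choice depending on the viz.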
Above is the viz I created, showing the mean journey length by colour (choropleth) and the variation in journey length by sized circles. The variation was calculated by subtracting the lower limit of journey length from the upper limit, so it is a range rather than a statistical variance.
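The circle-size calculation is just upper limit minus lower limit per zone. A small Python sketch, with invented zone names and times in seconds:

```python
def travel_time_spread(lower_seconds, upper_seconds):
    """Return the range of journey time: upper limit minus lower limit."""
    if upper_seconds < lower_seconds:
        raise ValueError("upper limit must not be below lower limit")
    return upper_seconds - lower_seconds


# Invented example values, not real Cairo data.
zones = [
    {"zone": "A", "lower": 900.0, "upper": 1500.0},
    {"zone": "B", "lower": 1200.0, "upper": 1350.0},
]

spreads = {
    z["zone"]: travel_time_spread(z["lower"], z["upper"]) for z in zones
}
```

Here zone A would get a much bigger circle than zone B (600 seconds against 150), even though zone B's journeys are at least as long on average.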
On reflection, a completely insane but amazing week. Such a great learning experience. The time pressure, the variation in the tasks, the daily presenting. Now we get to set the challenge for DS16!