Playing with a Big Data Set

by Lorna Eden

Well, this week's task was to download all the data from the United States Department of Transportation, with the intention of uploading it to Tableau Public after the presentations. This is a big data set. The issue is that Tableau Public can only hold 10 million records, and knowing that 2014 alone was 10 million records, it was going to be difficult.

 

The first aim was to download as much data as we possibly could. The website lets you download each month of every year, but only one file at a time. Doing this by hand would have taken us a long time, so as a team we looked for ways to download all the zip files at once. First we downloaded one or two zip files to see the pattern in the file names. We then created an Excel document listing the URL of every zip file on the website, with a second column saying where we wanted each file to be saved. This was then input into Alteryx, where the Download tool fetched every zip file we needed.
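For anyone without Alteryx, the same idea (one row per zip file, with a URL column and a save-location column) can be sketched in Python. This is only a rough sketch: the base URL and the file-name pattern below are made-up placeholders, not the real ones from the website, so you would need to swap in the pattern you spot from your own test downloads.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Hypothetical placeholders: the real base URL and file-name pattern
# are not given in this post and must be replaced with the real ones.
BASE_URL = "https://example.com/downloads"
FILE_PATTERN = "flights_{year}_{month:02d}.zip"


def build_download_list(first_year, last_year, target_dir):
    """Build one row per month, mirroring the two spreadsheet columns:
    the URL of the zip file and where it should be saved."""
    rows = []
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            name = FILE_PATTERN.format(year=year, month=month)
            rows.append({
                "url": f"{BASE_URL}/{name}",
                "target": str(Path(target_dir) / name),
            })
    return rows


def download_all(rows):
    """Fetch every file in the list (the job the Download tool did for us)."""
    for row in rows:
        Path(row["target"]).parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(row["url"], row["target"])


# Build the list for three years; call download_all(rows) once the
# placeholders above point at real files.
rows = build_download_list(2013, 2015, "data")
```

Building the list first and downloading second keeps the two steps separate, so you can eyeball the URLs before fetching anything.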

 

My next aim was to decide on a story. A few different ideas floated around, but I wanted to focus on the American summer holidays, which according to Andy run from June to August. OK, so I had narrowed it down; now let's look at the past five years' worth of delays for those months. Knowing I wanted to do a hub and spoke map, I knew I had to reduce the number of records to fewer than 5 million, because creating the paths for hub and spoke means unioning the origin and destination rows above and below each other, which doubles the record count. So I reduced it down to June to August for 2013 to 2015.
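To show why the union doubles the record count, here is a small Python sketch of the reshaping. It is not my Alteryx workflow, just an illustration: the field names (`origin`, `dest`, `path_id`, `point_order`) and the sample airport codes are made up for the example.

```python
TABLEAU_PUBLIC_LIMIT = 10_000_000  # record limit mentioned at the top of this post


def to_path_rows(flights):
    """Stack origin rows on top of destination rows (a union).

    Each flight becomes two rows sharing a path_id, so a line can be
    drawn from point_order 1 (origin) to point_order 2 (destination).
    """
    origins = [
        {"path_id": i, "point_order": 1, "airport": f["origin"]}
        for i, f in enumerate(flights, start=1)
    ]
    destinations = [
        {"path_id": i, "point_order": 2, "airport": f["dest"]}
        for i, f in enumerate(flights, start=1)
    ]
    return origins + destinations


# Illustrative sample data only.
flights = [
    {"origin": "JFK", "dest": "LAX"},
    {"origin": "ORD", "dest": "ATL"},
]
path_rows = to_path_rows(flights)

# Two rows per flight: 5 million flights would become 10 million path rows.
doubled = 2 * 5_000_000
```

With two rows per flight, anything over 5 million flights would push the unioned data past the Tableau Public limit.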

 

Great, now to clean up the data and create the Tableau workbook.

 

My Alteryx workflow looks like this

workflow week 8

 

I now needed to create my Tableau workbook. I wanted a hub and spoke map, plus departure and arrival delays by origin and destination. During the week I also learnt how to create a calendar within Tableau, which was pretty cool.

Originally this is what my viz looked like.

American Summer Holidays

But during the presentations Paul Chapman (EasyJet) picked up that the planes don't point in the direction they are travelling, which is one of his pet hates. He sent me a folder containing 360 plane shapes, each pointing in a different direction. Now I just had to figure out how to use them. This will be featured in my next blog.

And now here is my final viz. Click to interact.

Plane Directions

Any feedback is always welcome.

 

Thanks
