Tools, Tests and the Importance of 'Bio Breaks'

by Ravi Mistry

The last two weeks have flown past. 
Not only have we been exposed to more aspects of Tableau, we’ve also had a crash course in Alteryx and been able to connect properly with the Tableau community. And boy, was it great fun to be around people as enthusiastic about data’s potential as I am.

Since I last blogged, Alteryx ACE and two-time Grand Prix winner Chris Love spent two or so days teaching us, but it wasn’t Alteryx from top to bottom. We approached it with the idea of using Alteryx as an option, exploring what it could offer us rather than what we could offer it. Using flashcards to build processes was as useful as drawing out visualisations with jumbo crayons. For me at least, it helped me get a picture in my head of how Alteryx worked. So when we actually opened our laptops and got hands-on, we had a good idea of the order in which a process had to be built. We were all amazed at the power of Alteryx, and the simplified complexity was great fun to play with. I called it data laundering, as dirty data was taken, processed through the various tools and spat out the other end as a nice, single, clean data set, prime for manipulation.

My favourite part of Alteryx was that I could break it. Again, and again, and again. The original data stayed the same, and I didn’t need to write out the output until I was ready. The large number of Browse tools we used was definitely helpful here, and a good practice picked up from Chris. The project for Week 2 was simply ‘education’. With the addition of Alteryx and a variety of web-scraping tools to our arsenal, we each took a different area of the educational spectrum to tackle. My project looked at the Guardian League Table 2015/16, including all the subjects offered and the overall university ranking. (Of course, these rankings are arbitrary, and you can find out how the Guardian weights its scores here.)


The trick for me was first to find the HTML which held all the course codes and names, which would then allow me to scrape the data from the tables. I did this using the webpage source viewer, and after a few minutes of hovering and trying to understand it, I found the relevant string of HTML. I then pasted this as text into Alteryx, and then came the struggle of turning a 1-by-1 HTML string into a nice clean table.

This is what I started with:

The top stream from the inputted data was actually pointless, and the eventual output could have been produced with three tools – an XML Parse (the one which goes along the bottom) and a Formula tool to concatenate the results into a format which allowed me to scrape the data using Kimono.
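For readers who think in code rather than in Alteryx tools, the parse-then-concatenate idea can be sketched in a few lines of Python. This is not the workflow itself, just a rough equivalent under some loud assumptions: the sample HTML (a list of `<option>` tags with the code in `value`), the subject codes and the `example.com` base URL are all invented for illustration, and the real page may well be structured differently.

```python
from html.parser import HTMLParser


class OptionParser(HTMLParser):
    """Collect (code, name) pairs from <option value="code">name</option> tags."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._code = None   # code of the <option> we are currently inside
        self._text = []     # text fragments seen inside that <option>

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._code = dict(attrs).get("value")
            self._text = []

    def handle_data(self, data):
        if self._code is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "option" and self._code is not None:
            self.rows.append((self._code, "".join(self._text).strip()))
            self._code = None


def parse_subjects(html):
    """Turn one long HTML string into a list of (code, name) rows."""
    parser = OptionParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def build_urls(rows, base="https://example.com/league-table/"):
    """Concatenate a (hypothetical) base URL with each code, one URL per subject."""
    return [(code, name, base + code) for code, name in rows]


# Made-up sample input: the real HTML string was much longer.
SAMPLE = (
    '<select>'
    '<option value="S010">Accounting &amp; finance</option>'
    '<option value="S020">Anthropology</option>'
    '</select>'
)

if __name__ == "__main__":
    for code, name, url in build_urls(parse_subjects(SAMPLE)):
        print(code, name, url)
```

The resulting list of URLs is the kind of thing you would then hand to a scraper, one page per subject.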

Once I had this, I was able to combine the subject names and totals with the data extracted from Kimono (which itself had to be tidied up and arranged into a format which could be combined). With a little help from Chris and Paul, I was able to create this workflow, which gave me a simple, clean dataset ready for Tableau.
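The combining step is essentially a join on the subject code. As a hedged illustration only (the field names, codes, universities and scores below are all made up, and the real join was done with Alteryx tools rather than Python), it might look like this:

```python
def join_on_code(subjects, scraped):
    """Inner join: keep scraped rows whose subject code has a known name."""
    names = dict(subjects)  # code -> subject name
    joined = []
    for row in scraped:
        code = row["code"]
        if code in names:
            joined.append(
                {
                    "code": code,
                    "subject": names[code],
                    "university": row["university"],
                    "score": row["score"],
                }
            )
    return joined


# Placeholder data standing in for the parsed subject list and the scraper output.
SUBJECTS = [("S010", "Accounting & finance"), ("S020", "Anthropology")]
SCRAPED = [
    {"code": "S010", "university": "University A", "score": 91.0},
    {"code": "S020", "university": "University B", "score": 84.5},
    {"code": "S999", "university": "University C", "score": 70.0},  # no matching subject
]

if __name__ == "__main__":
    for row in join_on_code(SUBJECTS, SCRAPED):
        print(row)
```

Rows without a matching subject code are dropped here; whether that is the right choice depends on how messy the scraped data turns out to be.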

[Image: the Alteryx workflow]


Once in Tableau, I was a bit underwhelmed. I toyed with the idea of hunting down latitude/longitude coordinates for the universities listed, but there was no obvious dataset to take them from, and working out how to scrape them might have taken too long. Next time, baby. I wanted to show whether there were clusters of universities that offered a similar quality across groups of courses, such as humanities or business-related subjects. But without the geographic data, I steered away from that idea.
What I went with was a continuation of the university league table. However, I wanted to show only the top 25 (which would be user-customisable to show the bottom 25, the rest, etc.), and if the user wanted to focus on one university, I wanted to highlight it. That’s where parameters became my best friend.

Parameters not only acted like a filter, they also let me create a dynamic axis which changed with the metric the user had chosen. This was super duper cool, and I’m already thinking ahead as to how I could use this with my football blog. I think a good use case might be dynamically showing a calculation (i.e. defensive on one axis, offensive on the other) and allowing the user to customise it and see what they want. It’s one to think on, and one I will probably revisit as I learn more with the school.
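In Tableau all of this lives in parameters and calculated fields, but the logic behind it is simple enough to sketch in Python. Treat the following as an analogy rather than how the viz is built: the function name, the field names and the numbers are invented, with `metric`, `n` and `highlight` standing in for the three parameters.

```python
def league_view(rows, metric, n=25, highlight=None):
    """Rank rows by the chosen metric, keep the top n, and flag one university."""
    ranked = sorted(rows, key=lambda r: r[metric], reverse=True)[:n]
    return [
        {
            "rank": position,
            "university": r["university"],
            "value": r[metric],  # the 'dynamic axis': whichever metric was picked
            "highlighted": r["university"] == highlight,
        }
        for position, r in enumerate(ranked, start=1)
    ]


# Placeholder rows: not real league table figures.
ROWS = [
    {"university": "University A", "overall": 88.0, "satisfaction": 79.0},
    {"university": "University B", "overall": 92.5, "satisfaction": 75.0},
    {"university": "University C", "overall": 81.0, "satisfaction": 90.0},
]

if __name__ == "__main__":
    for row in league_view(ROWS, metric="overall", n=2, highlight="University A"):
        print(row)
```

Swapping `metric="overall"` for `metric="satisfaction"` re-ranks the same rows, which is roughly what changing the parameter does to the axis.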

So my second week’s viz can be found here, with a screenshot below:

Week 2 viz


I’d appreciate any and all feedback!

The other half of week 2 was spent preparing for the Tableau Certification exam, a test designed for the basic Tableau user who has been using the software for 5 months. Us? We’d used it for three weeks.
Robin Kennedy and Mike Lowe were our coaches alongside Andy Kriebel. We took a practice test and then revised it, along with some key areas and terminology, in parallel with our weekly projects.

So onto week 3, and the conference ft. the Associate Exam.
My exam experience was an interesting one. Coach Kriebel went through some key points in the morning, and after a Tableau Conference lunch we took the exam. Before you go in, the pre-exam material clearly states ‘You will not be able to leave – Please use the restroom before starting’, advice which I clearly did not adhere to. So twenty minutes into a two-hour test in exam conditions, I am bursting to go. So a lesson for you folks: visit the little boys’ room before you sit any exam! But I was delighted to pass nonetheless.

The Tableau conference was simply awesome. The keynote speakers were captivating, the sessions I attended were all excellently presented, and the people I met came from all walks of life and each had a story to tell. Great stuff. As well as the keynote speakers (who included James Eiloart, Francois Ajenstat, Ben Goldacre, John Medina and Hannah Fry, all of whom gave presentations well worth a re-watch), I attended:
> How I Lost My Fear of Clients’ CEOs at mib:con by Filip Dousek, which was really good and included these two slides, which were excellent in my opinion
[Images: the two slides]

> How United Utilities Put Data at the Heart of its Business, which looked at how they worked with Exasol and Tableau to visualise the mass of data they had collated
> To the Moon & Back Eight Times a Day with easyJet, a great presentation by Nik Stoychev, who explained how easyJet were able to map their flight paths to optimise fuel savings etc.
> 100 Years of Visualization Best Practices with Andy Cotgreave, which looked at the fact that we haven’t learnt too much about best practices over time. My favourite quote was this one below
> Beefing Up Your Vizzes with Titles, Labels and Tooltips, a great little tutorial/workshop where I learnt how to create this viz, which showed me the medals visually when I hovered over a mark.

My first conference was a really great experience. It was interesting to be around people who are so enthusiastic about data, as well as those who can effectively show and teach others the tricks and shortcuts. I particularly enjoyed networking and speaking to people, as well as the keynotes, where all the speakers were friendly and approachable too!

The week was rounded off with a treat. Francois Ajenstat came in to do a talk on Thursday morning, where we spoke about how product development works at Tableau and took a look at Elastic (which is supremely intuitive and smart and simply awesome). Then came the amazingly cool George Mathew, President of Alteryx, who talked about the company, its structure and how it has moved and developed within the sphere of data analytics. I was so impressed by his relaxed yet informative presentation style. To have his presence whilst being so laid-back would be so cool.

So weeks 2 and 3 have been eventful, as expected, and the Data School rollercoaster continues!