Friday Project Woes

by Theo Isaac

If you are lucky enough to join the Data School, you will immediately realise that you are in for an incredible, but very intense, first few months! The sheer amount of new information you receive can be quite overwhelming, with training in several pieces of software running simultaneously and a seemingly endless number of data prep challenges being thrown at you along the way. Each week of training culminates in a Friday project: a short four-hour internal project at the start of your training and, later, a weekly project for a real client. In this blog, I will talk about my first three Friday project experiences, and how, although they don’t always go to plan, they provide an invaluable learning experience.

At this point, I would like to highlight the incredible amount of support The Information Lab provides alongside this training – there is always someone to go to if you are struggling, and there is a vast array of helpful tips and information online. I mention this now because, inevitably, some of these early Friday projects will not go as you would have liked, and it is easy to feel like you are falling behind. However, these projects are designed to emulate fast-paced client situations, for example one where you have limited time to produce a dashboard for a stakeholder. The head coaches, Andy and Carl, are not looking for perfectly polished dashboards, but rather for a demonstration of some of the skills you have learnt at the Data School so far.

My first project – re-viz an old application

So my first project – what went wrong? Well, perhaps the question ‘what went right?’ would be more apt here. The first Friday project we were given required us to add an additional data set to one of our application visualisations, then redo the dashboard. Sounds quite straightforward, right? Well, yes and no.

The viz I decided to rework was originally on male grand slam tennis winners, so adding data about female winners and opening it up to all tournament winners seemed the obvious way forward. However, almost immediately I encountered a problem when trying to read my data into Alteryx using the wildcard function (I had 17 CSVs for both male and female players). Even though all the files were in the same format, the wildcard function would only read in some of them. Long story short, I wasted a significant amount of time trying to work this out, leaving me with very little time to finish the rest of the data prep and create a visualisation to present. When the afternoon deadline came calling, I essentially had a viz that looked at more data than my first one, but actually revealed fewer insights.

Obviously, this is not how I had wanted the project to go, so walking up to present I was dreading the negative feedback I was bound to receive. However, once I had gone over my workflow and the problems I encountered, and quickly run through my viz, Andy and Carl were far less critical than I had imagined and gave me some valuable feedback on my Alteryx flow and dashboard presentation.

The second project – air quality in Madrid

The second project was perhaps the best project I have done so far, probably because we were each paired up with a member of DS17 (I was paired with Juliana) for a joint project. The idea was that we could learn from them and see where we would be in two months, while also giving them a chance to mentor us. The project itself required us to use a data set on the air quality in Madrid and create a visualisation to present.

This was the first project where I really appreciated the power of Alteryx and how it can be used on real-life messy data – taking a load of Excel files (half of which were in Spanish) and turning them into one usable data set on which we could do our analysis.

However, as with my first project, things didn’t go to plan in the data prep stage. Without getting into too much detail, Juliana and I got really bogged down in one of the summarise/transpose stages and lost a lot of time. Again this left us with limited time in which to do our visualisation, but it was incredible to see how quickly Juliana could put our charts together into an interactive dashboard. I was still nervous when going up to present, as we should have spent more time on the dashboard, but again Carl and Andy provided interesting and constructive feedback.

Project number 3 – spatial data challenge

My third project, and my most recent at the time of writing, required us to use spatial data and look at rental prices within a city of our choice. Again the project sounded fairly straightforward and I was eager to get going.

However. Problems ensued.

The project started reasonably well. After initially trying and failing to find rental data for cities I was actually interested in, I settled on using Bristol, as we had found clear rental data online for all major British cities. I had found a SHP file for Bristol and it even appeared to have the correct polling ward boundary areas (which we would use to match up our rental data). Everything was going smoothly for once. Then, about two hours into the project and three data set joins in, I realised that the ward boundary numbers in my SHP file were outdated: Bristol had changed them a few years ago, so I had to redo a whole section of my Alteryx flow. Again, long story short, I was now severely behind on the data prep phase, and once again left myself with minimal time for creating a dashboard in Tableau.

The dashboard I ended up presenting was not at all finished and had several flaws that were highlighted by both Andy and Carl. However, while it was of course critiqued, the feedback was again all constructive and actually made me feel better about my dashboard. Personally, I was just happy to have made a map I could show by the end of it.

What did I take away from these projects?

If you have made it this far through the blog, you might think that Friday projects sound like a nightmare from start to finish. However, they are invaluable practice in dealing with difficult situations and working to extremely tight deadlines. I will almost certainly face similar situations at some point in my client placements, so every moment of blind panic and stress during these projects will have been worth it. They are also fun, despite the issues – it’s pretty amazing to see how the skills we have learnt (in just three weeks) can be used on real-life data and to see the insights that can be gained in such a short space of time.