It is the start of the week after dashboard week, and this new week is one of festive joy rather than anticipation of what curveballs Andy could throw at us. During these moments of calm, I wanted to reflect on what I learnt from last week’s madness. The week actually turned out to be rather enjoyable! It was good fun to try a variety of new techniques, work under time pressure, and start each day as a group before branching off into our own projects as the day went on. In this blog I will go through what I learnt on the individual days and end with what I learnt over the week as a whole.
Day 1 – Snow Plough data in NYC, 250 million rows of data
On day 1 of dashboard week it was instantly made clear that working with big datasets can be very slow work. It took us over 2 hours just to download the data, then a little while longer to upload it to Exasol as we had to break the dataset into smaller chunks. After connecting the data to Tableau, every query took at least 10 minutes to run. Here it was important to run a filter at the database level to keep as much unneeded information as possible out of the data actually used in Tableau, which reduced the time each query took. It was also important to think about what you wanted to use the data for, rather than just playing with it until something useful turned up, and to know which queries take the longest to run. Adding a filter on the roads took over 30 minutes to run, but once it had run it saved far more time than that over the rest of the day. Whilst I am not too happy with what I actually produced during the day, focusing in on a certain thing early on was a good idea. Focusing on either a certain period of time or certain roads dramatically decreased the size of the data and made it a lot more usable. Something I learnt as I came to the end of the day was that it made sense to write the blog throughout the day as I was working, rather than doing it all at the end. It is very easy to get caught up in certain areas of the daily project and forget about others, so having some sort of reminder to keep up with the other bits, or just making notes of what you’re doing, is really useful and ends up taking less time.
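The idea of cutting the data down to one time window and a handful of roads before doing anything else can be sketched in a few lines of Python. This is only an illustration of the principle, not what we actually ran in Exasol or Tableau: the column names (`timestamp`, `street`, `plough_id`) and the sample rows are made up for the example.

```python
import csv
import io
from datetime import datetime


def filter_rows(lines, start, end, streets=None):
    """Yield only rows inside [start, end) and, optionally, on the given streets.

    Column names are hypothetical; the real snow plough data may differ.
    """
    reader = csv.DictReader(lines)
    for row in reader:
        seen = datetime.fromisoformat(row["timestamp"])
        if not (start <= seen < end):
            continue  # outside the period we care about
        if streets is not None and row["street"] not in streets:
            continue  # not one of the roads we care about
        yield row


# Made-up sample standing in for one chunk of the full dataset.
sample = io.StringIO(
    "timestamp,street,plough_id\n"
    "2015-01-26T08:00:00,BROADWAY,17\n"
    "2015-01-27T09:30:00,BROADWAY,17\n"
    "2015-01-27T09:45:00,5 AVENUE,22\n"
    "2015-02-03T11:00:00,BROADWAY,17\n"
)

kept = list(
    filter_rows(sample, datetime(2015, 1, 27), datetime(2015, 1, 28), {"BROADWAY"})
)
print(len(kept), "row(s) kept")
```

Because the function is a generator, each chunk can be streamed through it without holding the whole file in memory, and only the rows that survive the filter need to go any further.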
Day 2 – NOPD Body Camera Data, 2.7 million rows of data
This day I learnt that even a dataset about 1% of the size of Monday’s, run without the help of Exasol, could be very slow too. This is definitely true when it comes to density maps in Tableau; with this amount of data they were pretty much unusable. I also learnt how to make a radial bar chart by following a few different blogs online. The big thing is having to duplicate the data so that the two ends of each bar can be drawn. The rest was fairly easy to follow, as I got used to working in polar form (radians and lengths) during my time at university. This was an interesting skill to learn and it was cool to see what techniques are used to achieve radial charts in Tableau. After creating the radial chart I realised that basing a dashboard around one chart was not a good idea in this case. Luckily it was easy to find a very similar dataset, which made it valid to use the same style of chart again.
I started to write the blog as soon as we were given the data. This meant it was easy to jump back to it while things were loading, as I already had it open and started. However, I feel it made the structure of the writing rather odd, as I started the blog in the present tense and ended it in the past tense.
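To show what duplicating the data and working in polar form means in practice, here is a minimal Python sketch of the arithmetic behind a radial bar chart. It is not the exact set of calculations from the blogs I followed: the function name, the `inner_radius` and `max_length` parameters and the example values are all my own assumptions. Each category gets an angle, and each bar becomes two rows, one for its inner end and one for its outer end.

```python
import math


def radial_bar_points(values, inner_radius=1.0, max_length=2.0):
    """Return two (x, y) points per category: the start and end of its bar.

    Assumes `values` maps label -> number, with at least one positive value.
    """
    n = len(values)
    biggest = max(values.values())
    points = []
    for i, (label, value) in enumerate(values.items()):
        angle = 2 * math.pi * i / n            # angle in radians
        length = max_length * value / biggest  # scale bar to the longest one
        # "Duplicate" the row: one copy for each end of the bar.
        for end, radius in (("start", inner_radius), ("end", inner_radius + length)):
            points.append({
                "label": label,
                "end": end,
                # sin for x and cos for y puts the first bar at 12 o'clock
                # and runs the rest clockwise.
                "x": radius * math.sin(angle),
                "y": radius * math.cos(angle),
            })
    return points


# Made-up example values.
points = radial_bar_points({"A": 10, "B": 5, "C": 5, "D": 5})
for p in points[:4]:
    print(p["label"], p["end"], round(p["x"], 2), round(p["y"], 2))
```

Plotting `x` against `y` as a line, with one path per label, gives the bars. As far as I understand it, the duplicated rows play the same role in Tableau, where a line mark needs two points to draw each bar.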
Day 3 – Using Power BI – Large Foreign Gift Amounts to U.S. Universities
I found learning a new piece of software good fun and a really useful exercise, especially one that is such a big rival to Tableau. Whilst I was teaching Tableau the week before, someone in the lesson argued that Power BI is all you really need in order to do data analytics. I found it difficult to structure a good counter-argument as I did not know much about Power BI. I just knew that TIL had made a tested and calculated decision that Tableau is the best software to use and to invest a company in, so I argued for Tableau. I now feel that I have more of an understanding and would be able to give a much better argument. Power BI is very simple to use: it lays out graphs for you and you don’t have to think too much about the data you are using. However, when it comes to exploring the data it is incredibly limited, as are the formatting and the options for moving to new graph types, and I found the online community to be a lot less informative and friendly than Tableau’s. Power BI certainly has strengths, but I really believe now that Tableau is much stronger overall. Because of what I could do within Power BI, it was very easy to finish early in the day with a report I was quite happy with; it was simple but effective. The data was very simple to use and the day was based more around learning Power BI, so this day I mainly just learnt about the new software, and also how much Qatar gives to US universities…
Day 4 – Bicycles in Seattle, 886,752 rows after union of 11 files
From the get-go I started by drawing out an idea of what I wanted to do with the data. I had previously created a dashboard on TfL’s bike data, so I had an idea of what looks good with this kind of data. I drew something out and ended up sticking to it quite well. What I drew was quite simple in design, so it was easy to get to that point and then keep making improvements. I think it was because I started simple and made improvement after improvement that I ended up being quite proud of this viz, which is something I will certainly take forward in my dashboard creations. Small iterations and improvements made it really easy to keep an eye on the time and on what was possible for the day. I also found out that sheets in tooltips are incredibly easy to use and very easy to format the way you want them.
Day 5 – Boy Bands, 56 rows and 236 rows for boys and bands respectively
This day had more of a fun dataset and I really enjoyed it. I started off the day thinking “I am not creative enough for this”, and what I ended up with was something I was quite proud of. I had the idea from the start that I would focus on just the one band and, because the data was so small, the idea quickly turned into an infographic-style dashboard. I found making a more visual style of dashboard very enjoyable, and because I had enjoyed finding out the information and ways to display it, presenting it was really good fun too. When making changes to it over the weekend I found quite a few little tricks for improving the style of the dashboard, such as formatting icons, that will likely come in handy in the future too.
During dashboard week I learnt that the design of dashboards doesn’t have to be as daunting as I was making it out to be. Keeping it simple and making improvements is actually a really good way of working, which allows you to get to a stage that you are proud of. The pressure of the time restrictions was very useful for getting things done. Having a deadline made it important to scope the work well and decide what data would be good to use, and it forced a dashboard to be made in whichever way was possible and looked the best or was most informative. I also found out that I still quite enjoy the pressure of fast-paced work. I found the rush of getting the data into a usable form and finding a use for it really enjoyable, which made dashboard week quite an enjoyable week! It was interesting to see everyone’s different approaches and the different outcomes that everyone achieved. I am really glad that dashboard week is a part of the Data School training and I hope the cohorts to come enjoy it as much as we did.