Today was the first day we had to deal with XML files, which was frustrating and fascinating in equal measure.

Today’s dataset covered UK food hygiene ratings and came from the Food Standards Agency.

For the ETL, I tried various ways to get data for every UK city, but that proved far too costly in both time and disk space. After about 2 hours my XML parse had only gone from 50% to 54%, while the dataset had grown from 2 GB to about 650 GB!

Needless to say, I had to give up on that idea and borrow prepared data from Nils Macher, who had pulled data for London only.
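
For anyone tempted to try the full dataset, a streaming parse might cope better than loading everything at once. The following is only a minimal sketch in Python, not the workflow described above. The element names (EstablishmentDetail, BusinessName, PostCode, RatingValue) and the helper names are assumptions about the file layout and should be checked against the real XML.

```python
import csv
import xml.etree.ElementTree as ET

# Assumed field names; adjust to match the actual XML files.
FIELDS = ["BusinessName", "PostCode", "RatingValue"]


def stream_establishments(source, record_tag="EstablishmentDetail"):
    """Yield one dict per record without building the whole tree in memory."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            # Missing fields come back as empty strings.
            row = {field: elem.findtext(field, default="") for field in FIELDS}
            # Drop the record's children once read; the emptied element
            # itself stays attached to its parent.
            elem.clear()
            yield row


def xml_to_csv(source, out):
    """Write the streamed records to an open text file and return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for row in stream_establishments(source):
        writer.writerow(row)
        count += 1
    return count
```

Calling xml_to_csv with a path to one XML file and an open CSV file writes only the three chosen columns, so the output should stay far smaller than the input rather than ballooning.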

With the prepared data in hand, I tried many different ways of visualising it, and most of them failed completely.

After a lot of iterations, I finally arrived at this visualisation. I’m really looking forward to the new density feature in Tableau, which would have been really useful for the map.

DW3viz