DS6 wrapped up last weekend, so here are some tips, based on my experience, for anyone about to start at, or interested in joining, The Information Lab's Data School.
As a follow-up to my random forest model blog, the next logical step is to explain the theory behind Alteryx’s gradient boosted model (GBM). Both models typically use an ensemble of decision trees to create a strong predictor. However, they differ in how they act at each stage of the ensemble.
For day 3 of dashboard week, Andy gave us the challenge of creating a transport dashboard for a city. As we were not allowed to viz on New York or London, the initial challenge was to find a city with data granular enough to build an insightful viz.
This morning Andy greeted us with alcoholic data, and the choice was between analysing wine or beer datasets. I chose the latter, which turned out to be very challenging due to the lack of detailed statistical datasets.
I spent the morning searching online datasets and APIs for one suitable for Tableau. Eventually, I came across Beeradvocate.com data on data.world that provided 1,586,614 reviews of around 45,000 beers.
Today was our first day of dashboard week and we had to create dashboards with NBA data. I drew lot number 4 and ended up with the task of creating a dashboard that illustrated NBA franchise history.