Today we were instructed to use the National Oceanic and Atmospheric Administration (NOAA) for our data. Andy told us to get the data at weather-station level and look at daily summaries. He then said the task “seemed simple” in a mocking tone, before promptly abandoning us for his holiday in Ireland.

The Data

The data was in an awkward format for us to access (example search of “Virginia” below). Each station was represented by a point, each point could be “added to the cart”, and you could then download data on various weather elements for the time span of your choice.

However, this required manually selecting each point. Moreover, the cart would only let you request 5-10 stations before you exceeded your download limit.

Upon this realisation, a few of us tried the API, but this yielded very little. Accordingly, we resorted to manually downloading each batch of stations that interested us. I selected all the stations within New York, Boston, Atlanta, New Orleans and Philadelphia, and downloaded the data in CSV format. After my selection, I had about 10 separate files of data, so I went into Alteryx to bring them together.

Alteryx

Thankfully, the files were in the same format, so I could simply union them together.
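
For anyone without Alteryx, the same union can be sketched in a few lines of Python. This is only a rough equivalent of what the Union tool did for me, not what I actually ran. The function name `union_csv_files` and the extra `SOURCE_FILE` column are my own inventions for illustration, and the sketch assumes every file has an identical header row.

```python
import csv
from pathlib import Path


def union_csv_files(paths):
    """Stack the rows of several CSV files that share the same header.

    Hypothetical helper: returns a list of dicts, one per data row, and
    tags each row with the file it came from.
    """
    rows = []
    expected_header = None
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            if expected_header is None:
                # The first file defines the header all others must match.
                expected_header = reader.fieldnames
            elif reader.fieldnames != expected_header:
                raise ValueError(f"{path} has a different header")
            for row in reader:
                # Assumed extra column, handy for tracing rows back later.
                row["SOURCE_FILE"] = Path(path).name
                rows.append(row)
    return rows
```

You would call it with the list of downloaded files, for example `union_csv_files(sorted(Path("downloads").glob("*.csv")))`, where the `downloads` folder is a placeholder for wherever the batches were saved.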

I decided that I wanted to add “City” to my data. However, the name of each station was in a different format, and the position of the city name was inconsistent, e.g. “Atlanta 2.3” and “JFK Airport, NY”.

As the city name had no fixed position, I couldn’t use a Left/Right or a form of Regex, so I summarised my data down to only the weather station names and filtered them based upon the city names they contained.

After filtering them into my five city/state groups, I used a formula to add the name of each city/state into a new column, before unioning the data together once more, and joining this back to the original union output which contained the weather data.
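
The summarise, filter, formula and join steps above boil down to a keyword lookup, which can be sketched in Python as follows. The keyword lists are guesses built from the two example names above rather than the filters I actually used, and `label_city`, `add_city_column` and the `NAME` and `CITY` field names are all assumptions made for illustration.

```python
# Hypothetical keyword lists; real station names would need checking by eye.
CITY_KEYWORDS = {
    "New York": ["NEW YORK", "JFK", ", NY"],
    "Boston": ["BOSTON"],
    "Atlanta": ["ATLANTA"],
    "New Orleans": ["NEW ORLEANS"],
    "Philadelphia": ["PHILADELPHIA"],
}


def label_city(station_name, keywords=CITY_KEYWORDS):
    """Return the first city whose keywords appear anywhere in the name."""
    upper = station_name.upper()
    for city, needles in keywords.items():
        if any(needle in upper for needle in needles):
            return city
    return None  # no keyword matched; the station needs a manual look


def add_city_column(rows, name_field="NAME"):
    """Label each distinct station name once, then join the label back."""
    distinct_names = {row[name_field] for row in rows}
    lookup = {name: label_city(name) for name in distinct_names}
    # Build new dicts so the original rows are left untouched.
    return [{**row, "CITY": lookup[row[name_field]]} for row in rows]
```

Labelling the distinct names first mirrors the summarise step: each station name is classified once, and the result is joined back onto every daily row.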

Tableau

Soon after getting into Tableau, Andy decided to make us downgrade to Tableau 9, for little reason other than his personal enjoyment.

After seeing the initial interface and how reminiscent it was of the Windows I used in primary school, I was anticipating its use to be much less palatable than it was. There were a few annoying things, such as not being able to use LODs and it taking me a few minutes to find sheet titles. However, there was one particular issue that led me to briefly want to throw my laptop out of a window.

When looking at some big numbers, I wanted to remove all the borders on the sheet, so it would look half decent when put onto my dashboard. Whilst finding the correct lines to format in recent versions of Tableau is enough to drive you to despair, I found this version even more hellish.

After removing every single line I could find in each of the many windows in the format pane, these lines still remained on my sheets. I rechecked each of the windows several times, ensuring that all of the lines had been removed. They had. At this point, I gave up and put it onto my dashboard, accepting that the lines were here to stay.

However, it turns out that these lines disappear when placed onto the dashboard, as they simply mark the margins of each sheet, which are conveniently coloured exactly the same as chart borders. This confusion is probably down to my own stupidity, but it was annoying nonetheless.

The Dashboard

After creating dozens of sheets and not finding anything too interesting, I made a unit chart of the stations within each city/state between 1894 and 2018, and I noticed several fluctuations in their number: an increase in the 1940s and a decline in the 1990s. After some research, I found that some sources attributed this 1990s drop to organisations such as NOAA and NASA closing weather stations at higher elevations, at higher latitudes and in rural areas, as these are often in cooler areas. It was postulated that NOAA and NASA were culling these cooler stations to overestimate global warming. However, within my data the average elevation of the stations could be seen increasing despite the decrease in the number of stations. Accordingly, I based my dashboard around disputing this theory.
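
The comparison behind that argument is just two numbers per year: how many distinct stations reported, and their average elevation. A minimal Python sketch of that calculation is below. I built the real thing in Tableau, so this is only an illustration, and it assumes each row carries `STATION`, `DATE` (in `YYYY-MM-DD` form) and `ELEVATION` fields, which may not match the actual column names in the download.

```python
from collections import defaultdict


def stations_and_elevation_by_year(rows):
    """Return {year: (station_count, mean_elevation)} from daily rows.

    Each station is counted once per year, however many daily
    readings it has, so busy stations do not skew the average.
    """
    elevation_by_year = defaultdict(dict)
    for row in rows:
        year = int(row["DATE"][:4])  # assumes ISO dates, e.g. 1994-06-30
        elevation_by_year[year][row["STATION"]] = float(row["ELEVATION"])

    summary = {}
    for year, elevations in sorted(elevation_by_year.items()):
        values = list(elevations.values())
        summary[year] = (len(values), sum(values) / len(values))
    return summary
```

If the culling theory held for these cities, the mean elevation should fall as the station count falls; a count that drops while the mean rises is the pattern I saw in my data.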

I’m currently unable to decide which version I prefer, but here are the two finished products:

(I also hate that in this version the words in text boxes are separated, rather than just starting on a new line)

(And can you pad text boxes?)

See my viz here