Today our task was to extract data from NOAA’s daily summaries interface and generate a dashboard of climate data. Lately I’ve been reading a lot about spaceflight, so I decided to keep it simple with a dashboard comparing climate data from the two main launch sites in the US: Cape Canaveral in Florida and Vandenberg Air Force Base (VAFB) in California.

Data Preparation

The NOAA daily summary data is collected from many different sources for thousands of locations around the world; some of those in the United States are managed by volunteers. This becomes apparent when extracting the data, because it’s rather patchy. Extracting files from NOAA is not difficult for one or two stations, but long histories for multiple stations in a given area (e.g. a city) will quickly hit the download limit of the online tool, at which point you need to use the API and parse the results yourself. As I’d had quite enough of that yesterday, I kept it simple and downloaded five files: two for VAFB, one for Cape Canaveral, and one each for the nearest cities to these locations, Titusville FL and Lompoc CA, which have much longer historical records.

NOAA file data processing flow.

As you can see from the number of date conversion tools in the flow, one surprising catch is that the dates in the files are not all in a standard (American) format, despite the files mostly recording the same information. I only noticed this once I got to Tableau. Otherwise, the preparation needed to reach a single file is very simple, and the second data set I used (SpaceX data from Kaggle) required no processing at all.
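For anyone doing this step in code rather than in a visual flow, the date clean-up could look something like the sketch below. The three formats listed are my assumptions for illustration, not a confirmed list of what the NOAA exports contain, and `parse_mixed_date` is a hypothetical helper name.

```python
from datetime import date, datetime

# Assumed candidate formats (illustrative only); adjust to whatever
# actually appears in the downloaded files.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def parse_mixed_date(text: str) -> date:
    """Try each candidate format in turn and return the first match."""
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"Unrecognised date format: {text!r}")
```

Trying formats in a fixed order only works if they cannot be confused with one another, so a file mixing day-first and month-first slashed dates would still need checking by hand.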

Data Visualisation

As an additional challenge today, we had to build our visualisations in an earlier version of Tableau, version 9, than the one we had trained with. It was surprising how new some of the features we had gotten very used to were, given that version 9 was only released in April 2015. The main things I found difficult to work with were the changes in the UI and, since I had a blend in my data, how resistant the data was to further manipulation (e.g. groups made from the second data set in a blend could not be used).

Notice how ‘Launch Site Groups’ is greyed out.

Initially it was quite difficult to work without these ingrained shortcuts and find something useful in the data, and data quality made this worse. Despite the three launch sites arguably having very high meteorological significance, the data was often missing or questionable. One entry in my viz shows this: the maximum recorded temperature is apparently 60 degrees Celsius, which might simply be a Fahrenheit reading that someone did not convert. I narrowed my dashboard down to a specific story comparing weather at the two launch areas on the East and West coasts, which made for a good contrast, and found NASA’s meteorological limits for Space Shuttle flights to give these comparisons greater context, particularly with the addition of SpaceX data since 2010. However, the result is probably more like an infographic than a dashboard. The finished dashboard (in two parts) is below, or you can view it on Tableau Public here.

Comparing the Two Sites

Ideal days per site in recent years, set against SpaceX launches.
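As a footnote on the suspicious 60 degree reading mentioned above, a check like the following rough sketch could flag such values before they reach the viz. The 50 degree cut-off is a hypothetical threshold I picked for illustration, not a NOAA or NASA figure, and the function names are my own.

```python
# Hypothetical plausibility limit for a daily maximum in Celsius
# (an assumption for illustration, not an official figure).
PLAUSIBLE_MAX_C = 50.0


def fahrenheit_to_celsius(value_f: float) -> float:
    """Standard Fahrenheit to Celsius conversion."""
    return (value_f - 32) * 5 / 9


def check_tmax(value_c: float) -> tuple[str, float]:
    """Flag readings above the limit and show what they would be
    if the value had really been recorded in Fahrenheit."""
    if value_c <= PLAUSIBLE_MAX_C:
        return ("ok", value_c)
    return ("suspect_fahrenheit", round(fahrenheit_to_celsius(value_c), 1))
```

Read as Fahrenheit, the 60 degree entry would be a mild day of about 15.6 degrees Celsius, which seems far more believable for either coast, though that alone does not prove the conversion was missed.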