Dashboard Week Day 1 - EPA Environmental Regulations

by Andy Kriebel

When DS17 had Dashboard Week, the focus was primarily on analysis and design, with analysis being the most important. I wanted them finding insights and learning to explore the “why” in the data.

There was some feedback, though (especially from previous cohorts), that it should have been harder and that more data prep would have been useful. So I enlisted Ellie Mason (DS11) and Soha Elghany (DS9) to help me find data sets that provide a good balance of prep and analysis for DS18. They had some evil, devious grins on their faces when I asked for their help. To say they were excited would be a massive understatement. Thank you Soha and Ellie for your help!!

The basic idea for Dashboard Week is that the team is given a data source, they prep the data, they do analysis, create a visualization, and write a blog post. Most of DS18 has done little blogging, so this should be a good way for them to get back into the flow.

Here are the rules that are common across each day:

  1. They MUST work independently.
  2. Everything MUST be done by 5pm (so that I can write a recap blog post).
  3. They MUST leave their laptops at work in the evening.
  4. The next morning, they present back the previous day’s work. The timing of these presentations will change a bit through the week.
  5. No complaining!
  6. Pay attention; requirements may change day-to-day.

The focus for day 1 is big data, and the subject is the information the EPA (in the US) collects about facilities or sites subject to environmental regulation. The data is available on the EPA website.

There are a LOT of fields, so they’d be wise to read the documentation before they get started. I’m most curious to see how they approach analyzing the data. There are lots of rabbit holes and traps to fall into that WILL significantly hinder their progress and the quality of their work. Let’s see who can avoid those.