Day 4? Completed it.
Today we were told to extract data from the Cal Fire site and explore the California wildfires that have become more prevalent in the news over the past couple of years. We were also told to pair up with someone from an earlier cohort or a member of the core team. I have to say my partner, Louisa, came through and really helped keep up morale. It's interesting to work with someone who thinks differently to you or picks up on other elements of the task first. I really enjoyed working with Louisa and learned so much along the way.
We started with a discussion and a plan. From Andy's post, I was immediately drawn to the idea of climate change as a driver of changing weather and, in turn, of the increasing number of forest fires. I also noted down some other questions:
- Length of the fire season - what is the typical length, and how has it changed?
- When are fires occurring?
- Which county has the greatest length of fire season?
- How do you calculate length of a fire season?
- Hours burned? Days? Some fires last for short periods, whilst others last days or even weeks. What scale can encompass all of this?
- How does snowmelt influence this all?
- More intense dry seasons? Seasonal weather data?
- Increased moisture stress on vegetation - what does this truly mean?
- Mapbox - Where are these fires happening (density map)
This was my starting point. It wasn't entirely clear at first what I wanted to look at. Louisa brought up some interesting information on inmates and the percentage of firefighters who happen to be inmates. These inmates got paid pennies, as their criminal records prevented them from getting the state license necessary for employment. This really interested me, but I thought that finding data on it would be incredibly difficult. I then decided to focus on the length of the fire season: is it getting longer and, if so, why?
So first, I needed to prep the data: use an API to get it from the site into Alteryx, then turn it into a reasonable format ready for my analysis.
This was the tidy, documented workflow. Again, I'd have ignored documentation entirely had it not been for Louisa. I learned that it's actually incredibly useful, for yourself and others, to document your workflows. This is a typical way of doing so.
I had to import the API link for both active and inactive fires. The data was downloaded and JSON-parsed, and I then had to clean and restructure it. Next, I realised that the dates had to be converted, as they included T's and Z's, so I did this with a simple DateTime tool. I was ready to output before Louisa reminded me that it may be simpler to do my row-level calculations in Alteryx than in Tableau. After this point, I explored the different outcomes I wanted and how the data needed to be structured for each. You'll notice that there is data by agency involvement, which Louisa did actually find in the end! What's annoying is how little time I had left to explore the idea. I also found some extra data on snow melt anomalies and joined this in my 'untidy' workflow.
Now that you can see both workflows, I'm sure you'll agree that documentation is the way to go.
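For anyone curious what the date conversion is doing outside Alteryx, here is a minimal Python sketch. The T separates the date from the time and the Z means UTC. The field names `Started` and `Extinguished` and the record itself are made up for illustration; I'm not claiming they match the real feed.

```python
from datetime import datetime

def parse_iso_utc(stamp: str) -> datetime:
    """Turn a timestamp like '2018-07-27T12:00:00Z' into a timezone-aware datetime."""
    # Older Python versions reject a trailing 'Z', so swap it for an explicit UTC offset.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))

def days_burned(started: str, extinguished: str) -> float:
    """Row-level calculation: how long a fire lasted, in days."""
    delta = parse_iso_utc(extinguished) - parse_iso_utc(started)
    return delta.total_seconds() / 86400  # 86,400 seconds in a day

# Hypothetical record: the field names and values are assumptions, not real data.
fire = {
    "Name": "Example Fire",
    "Started": "2018-07-27T12:00:00Z",
    "Extinguished": "2018-08-02T00:00:00Z",
}
fire["DaysBurned"] = days_burned(fire["Started"], fire["Extinguished"])
print(fire["DaysBurned"])  # 5.5
```

Measuring in fractional days is one way to put fires that last hours and fires that last weeks on the same scale.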
Data done and into Tableau.
I started off with some simple big numbers for context and a heatmap below them to see whether the fire season is in fact getting longer. To me, the heatmap works very well in depicting this: it presents an almost triangular shape that gets wider at the bottom. This width shows very clearly that the fire season now goes beyond what it is typically known to be and so appears to be getting longer. This means more forest fires. I supplemented this with information and my findings.
Initially, I got so carried away with length that I forgot we don't just care about how long fires have been burning but also about the destruction they cause in acres burned. A fire has many elements to it. I spent some time researching whether there was an established, standardised method for this, but found only spreadsheets with complex equations. So I decided to stick to a scatter graph. I plotted this as you usually would and found a massive cluster to the left and the odd two fires far out to the right. I went back to Louisa and said that I thought it was the best way to present this, but that the graph wasn't the most useful. She suggested that I use a log scale. I hadn't used one before, so I had to read up on it some more and ask her why. This definition helped me.
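The logic behind the heatmap can be sketched in a few lines of Python. This is only an illustration with made-up start dates, and the "season span" here (months between the earliest and latest fire in a year) is one crude definition I'm assuming for the example, not an official measure.

```python
from collections import Counter
from datetime import date

# Made-up fire start dates, purely for illustration.
starts = [
    date(2015, 7, 4), date(2015, 8, 19),
    date(2018, 5, 30), date(2018, 7, 27), date(2018, 11, 8),
]

# One heatmap cell per (year, month), holding the number of fires that started then.
cells = Counter((d.year, d.month) for d in starts)

def season_span_months(year: int) -> int:
    """Months from the earliest to the latest fire start in a year, inclusive."""
    months = [m for (y, m) in cells if y == year]
    if not months:
        return 0  # no fires recorded for that year
    return max(months) - min(months) + 1

for year in sorted({y for (y, _) in cells}):
    print(year, season_span_months(year))
```

A wider span in later years is what shows up as the widening triangle in the heatmap.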
A logarithmic scale (or log scale) is a way of displaying numerical data over a very wide range of values in a compact way—typically the largest numbers in the data are hundreds or even thousands of times larger than the smallest numbers. Such a scale is nonlinear: the numbers 10 and 20, and 60 and 70, are not the same distance apart on a log scale. Rather, the numbers 10 and 100, and 60 and 600 are equally spaced. Thus moving a unit of distance along the scale means the number has been multiplied by 10 (or some other fixed factor). - Wikipedia
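To check the definition for myself, a quick Python sketch helps: on a base-10 log scale, a value sits at its logarithm, so the distance between two values depends on their ratio rather than their difference. The function name is just something I made up for the example.

```python
import math

def log_distance(a: float, b: float) -> float:
    """Distance between two positive values on a base-10 log axis."""
    return abs(math.log10(b) - math.log10(a))

# A tenfold jump always covers the same distance, wherever it starts.
print(log_distance(10, 100))
print(log_distance(60, 600))

# A jump of 10 units is a bigger ratio at 10 than at 60, so it covers more of the axis.
print(log_distance(10, 20))
print(log_distance(60, 70))
```

This is why the cluster of small fires spreads out on a log axis while the couple of huge fires are pulled in closer to the rest.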
I included some interactivity so that when you click on a point in the scatter graph, you can get some key information about each fire.
Finally, I wanted to explore the why. This has been the focus of dashboard week, and I know I lack it most of the time. Today, my why built on my findings above: why could forest fires be increasing? Having looked at land temperature increases over the last few days, I thought that I'd look at snow melt. I found it incredibly interesting that these two polar opposite seasons and temperatures can have drastic effects on the balance of things and, in particular, on the rise in forest fires. The data I found was on anomalies. What was even more interesting to me was that where snow melt decreased, so did the number of forest fires soon after. The same held when snow melt increased, which can be seen in the shaded area: forest fires increase drastically soon after.
I have to say that I have a love-hate relationship with dashboard week. I've learned a lot about myself as well as Tableau and Alteryx. The datasets have been incredibly frightening (the findings more so) but so full of information and interesting too.
Here's my final viz: