Today we started our Dashboard Week! Our task of the day was to work with historical data from the Global Water Quality database operated by the International Centre for Water Resources and Global Change (ICWRGC). We were asked to choose a continent and analyse the different information collected by water stations around the world.
I happily chose South America, where I am originally from. I thought about analysing the change in water quality over the years. The only problem was that the website lets you download information for a maximum of 500 stations, and I found over 2500 stations in South America alone! The next step was to narrow things down by filtering the information I wanted. Instead of looking at water quality in general, I decided to look at the change in water temperature. That brought me down to a little over 1000 stations. No matter what I chose, there were always far too many stations, and I couldn’t find a way to select only a few stations on the map for download.
After almost an hour of trying to figure out the best way to download information from no more than 500 stations, I decided to choose only stations in the Amazon Forest! The majority of the forest lies in my home country (Brazil) and it is something I’m very proud of. So I found it a great topic to analyse!
I initially filtered the website to collect information about water temperature. But after taking a look at it, I didn’t see any exciting trend to analyse. So I went back and requested more information: pH, dissolved chloride, dissolved oxygen, true color and turbidity. That gave me a total of 42 stations collecting this information from 1950 to 2020.
I filled out a form explaining the purpose of downloading the data, and a couple of minutes later I received an email with a zipped file.
The README text document was essential for understanding what the data contained. It explained what each of the fields meant and how each measurement was taken.
The two Excel files contained information about the stations and about the measurements.
I brought both into Alteryx, as they needed some cleaning and organising, especially table 2.
Below is my workflow in Alteryx. The first part is where I collected the water temperature information, and the second is where I collected the other water quality information. Then I unioned the two.
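For readers who prefer code to a visual workflow, here is a rough Python sketch of what a union step like this does: it stacks the rows of two tables and lines up the fields by name. This is only an illustration, not the actual Alteryx workflow. The rows, the field names (`station`, `date`, `parameter`, `value`, `unit`) and the `union_by_name` helper are all made up for the example.

```python
# Hypothetical rows standing in for the two streams; the field names and
# values are assumptions, not the real columns from the download.
temperature_rows = [
    {"station": "ST1", "date": "1990-05-01", "parameter": "TEMP", "value": 27.5},
]
quality_rows = [
    {"station": "ST1", "date": "1990-05-01", "parameter": "pH", "value": 6.8, "unit": "---"},
]


def union_by_name(*tables):
    """Stack tables row-wise, aligning fields by name and filling gaps with None."""
    # Collect every field name in first-seen order.
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    # Rebuild each row with the full set of fields.
    return [
        {name: row.get(name) for name in columns}
        for table in tables
        for row in table
    ]


combined = union_by_name(temperature_rows, quality_rows)
for row in combined:
    print(row)
```

Fields that exist in only one table (here `unit`) simply come through empty for the rows of the other table.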
Bringing it into Tableau, I noticed how much data was missing. Several years had not been reported, which made it difficult to analyse trends and even to build a nice-looking chart.
I had to evaluate which of my measures were the most complete, and decided to keep only water temperature and pH for my final dashboard, analysing how the average values across all monitoring stations in the Amazon changed over the years.
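The same idea can be sketched in a few lines of Python: average each parameter per year across all stations, then check what share of the years has any data at all. I did this in Tableau, so this is just a sketch under assumptions. The sample measurements, the parameter codes and the helper names (`yearly_averages`, `year_coverage`) are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows: (station, year, parameter, value).
measurements = [
    ("ST1", 1990, "TEMP", 27.0),
    ("ST2", 1990, "TEMP", 29.0),
    ("ST1", 1991, "TEMP", 28.0),
    ("ST1", 1990, "pH", 6.5),
    ("ST2", 1991, "pH", 7.0),
    ("ST1", 1991, "TURB", 12.0),
]


def yearly_averages(rows):
    """Average each parameter per year across all stations."""
    grouped = defaultdict(list)
    for _station, year, parameter, value in rows:
        grouped[(parameter, year)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


def year_coverage(averages, first_year, last_year):
    """Share of years in the range that have at least one value, per parameter."""
    total_years = last_year - first_year + 1
    years_seen = defaultdict(set)
    for parameter, year in averages:
        if first_year <= year <= last_year:
            years_seen[parameter].add(year)
    return {p: len(years) / total_years for p, years in years_seen.items()}


averages = yearly_averages(measurements)
coverage = year_coverage(averages, 1990, 1991)
print(averages)
print(coverage)
```

A parameter with low coverage (like the invented `TURB` rows here) is a candidate to drop, which is the kind of call I made when keeping only temperature and pH.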
In the end, I didn’t see any exciting trend like I was expecting. Nothing pointed to global warming or anything like that. Looking at the data by month, I saw no seasonality, and the parameters I kept showed no relation to each other. So I decided to add some fun facts to my dashboard, in an attempt to make it more interesting. I also added information about acidic/basic pH, even though I could not find much information about how much it matters in the Amazon.
Here is my final dashboard. I kind of like it but wish I had a more robust and complete data set to come up with deeper insights!