Dashboard Week - Day 2

by Frankie Benson

The Task

Our task for today was to download the 2002 - 2019 survey data from this website: https://cancercontrol.cancer.gov/brp/tcrb/tus-cps/questionnaires-data and create an insightful dashboard from it. I was excited for this challenge as analysing survey data is always interesting. I downloaded the zip file and was ready to load it into Tableau... but then I realised the data file was a .DAT file, a type I had not come across before.

Data Issues...

I tried several methods to sort this data out. Initially I opened it in a text editor and tried to save it as different file types, but a few attempts failed because the file was so large. I was able to save it as a text file, so I tried loading that into Alteryx. The issue then was that the file isn't delimited by anything, so when it loaded into Alteryx all of the columns merged into one. It was very difficult to do anything truly dynamic at this stage, as the rows didn't look uniform in format. After a few regex tools, I had extracted some columns such as the year and month, and I decided to test these in Tableau before going any further. Immediately I saw one of my years as '0202', which was clearly incorrect.
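For reference, survey files with no delimiters are often fixed-width: each field sits at a set character position on every line, and the positions are listed in the accompanying documentation. Below is a minimal Python sketch of that idea, not what I did on the day. The field names and positions in LAYOUT are made up for illustration and are not taken from the survey's documentation.

```python
# Hypothetical layout: (field name, start, end), 0-based and end-exclusive.
# These positions are invented for illustration; the real ones would have to
# come from the survey's record layout documentation.
LAYOUT = [
    ("year", 0, 4),
    ("month", 4, 6),
    ("answer", 6, 8),
]


def parse_line(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of stripped string fields."""
    return {name: line[start:end].strip() for name, start, end in layout}


def parse_lines(lines, layout=LAYOUT):
    """Parse every line that is long enough to hold all fields in the layout."""
    width = max(end for _, _, end in layout)
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if len(line) < width:
            continue  # skip blank or truncated lines
        records.append(parse_line(line, layout))
    return records


# Made-up sample records that follow the hypothetical layout above.
sample = ["20190701\n", "200202 2\n", "\n"]
rows = parse_lines(sample)
for row in rows:
    print(row)
```

With an approach like this, a start position that is out by a character or two still returns something that looks like a value, which is one possible explanation for a year such as '0202'.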

Realising at this point that even the basic columns weren't working, I went back to googling how to convert .DAT files, but still had no luck, so I found myself quite stuck. I had a data set which wasn't usable and a few hours to create a dashboard with it, so I decided the best way forward was to find a similar data set online in a usable format.

New Data

I went and found a new data set which includes survey and tobacco data: https://www.cdc.gov/tobacco/data_statistics/surveys/nyts/index.htm

The section I chose was survey and demographic data about users of e-cigs in 2020. The data came in Excel format along with a data dictionary, which contained the codes for the question IDs and the answer mappings, since every answer in the Excel file was stored as a number.
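The same code-to-label lookup that a data dictionary provides can also be sketched outside Tableau. The Python below is only an illustration: the codes and labels in ANSWER_LABELS are hypothetical, not the real ones from the data dictionary, and the responses are made up.

```python
from collections import Counter

# Hypothetical mapping for a single question; the real codes and labels
# would come from the data dictionary supplied with the data.
ANSWER_LABELS = {1: "Yes", 2: "No"}


def decode(codes, labels, default="Unknown"):
    """Replace numeric answer codes with labels, flagging unmapped codes."""
    return [labels.get(code, default) for code in codes]


def percentage_share(values):
    """Return each distinct value's share of all responses, as a percentage."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Made-up responses; 9 is deliberately missing from the mapping.
responses = [1, 2, 2, 1, 2, 9]
decoded = decode(responses, ANSWER_LABELS)
shares = percentage_share(decoded)
print(shares)
```

Sending unmapped codes to a visible "Unknown" label, rather than dropping them, makes gaps in the mapping easy to spot before any charts are built.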

Dashboard

Once I had loaded the data into Tableau, the first thing I did was change the aliases of the answers to each question using the data dictionary. With the time I had remaining, I thought it would be nice to do a uniform viz for each question, so I created a pie chart per question showing the percentage of respondents who chose it as a reason for using e-cigs. The overall dashboard was simple, but I think the message is clear and easy to understand. The reason given by the largest percentage of respondents was 'curiosity', which was a surprise. All in all, dashboard day 2 was not at all what I was expecting. It certainly was a challenge, but I was happy to have a dashboard published by 3pm.

