For the first day of our Dashboard week, Andy challenged us to gather and analyse data on the spend over £25K at NHS Trusts.
The data was scattered over 391 different URLs, each containing multiple files (often in different formats)
Given a large amount of data prep required I started by creating an Alteryx workflow. My initial idea was to create an iterative macro which would list all the different trusts URLs and then supplement this with a batch macro that would download the individual files stored in each page.
The process seemed pretty straightforward, at least since I got the list of URLs results. Despite I was expecting 391 results, I kept getting 400, suggesting that something wasn’t quite working in my workflow. Moreover, the URLs weren’t matching what I was expecting and seemed to return random results on the NHS database.
In order to solve this issue, I have tried several different solutions, like manually generating the stating URLs, and even manually copy and paste them (shame on me!) as a last resort.
None of these methods worked, and unfortunately, I was stacked until around 2 pm.
At that point, The Information Lab Alteryx Guru, Ben Moss, explained to us that the problem was caused by a single checkbox in the download tool, Encode URL Text.
Quoting the Alteryx documentation website “When checked, the specified URL will be encoded as needed where unsafe ASCII characters are converted into a format that can be transmitted over the internet. An example of this would be the substitution of %20 for a space.”
So the problem was caused by parsing some special character, which would somehow return a random generated URL.
Once discovered that I could finally start to work on the actual data gathering, I started to build a macro that would download all files. This task was quite time consuming as well, although I have managed to get some data from the site.
Given the situation, around 3 pm, Andy allowed us to focus on a single Trust and to build a dashboard only on that.
I have chosen the Leicestershire Partnership NHS. Analysing these files I have found an interesting insight, over 50% of the spend over £25K were used to pay agency nurses.
Unfortunately, I did not have a lot of time left, so I had to keep the dashboard quite basic. Today’s challenge was quite concentrated on data prep, I hope that during the rest of the week we may have a task more related to design.