A Fool's Guide to Data Preparation

Hello everyone!

I’m Mandy, and I’m a part of the latest cohort of the Data School, DS52! After a couple of days filled with so, so many introductions, tech troubleshooting, and countless cups of coffee, we wasted no time getting stuck into the good stuff: data preparation.

Admittedly, it is not the most glamorous or exciting step of the data analysis process (although some may vehemently disagree). However, whether you like it or not, taking the time to explore the data and plan how you will tackle it will help you understand its ins and outs and, ultimately, derive more meaningful insights later on.

Here are my top tips for making sense of the madness:

  • Before diving in, it can be useful to sketch out the OG dataset that you are dealing with, identifying what is actually in each column and row. Then, sketch out the dataset that you need in order to answer your questions about the data (The Ideal Dataset). At this point, decide which fields (units sold, revenue, product name, etc.), data types (string, integer, boolean, etc.), and values you want. Now you can work out what changes are needed to turn the OG into The Ideal Dataset, and in what order they need to happen. Working step by step makes an otherwise overwhelming task simpler and helps you spot mistakes early, so they don’t cause bigger problems down the road.
  • If you don’t know what a particular field means, ASK! Of course, it never hurts to do a bit of your own research on the subject matter to make sure you’re on the right track, but don't be afraid to ask whoever created/provided the dataset for additional clarification - a quick chat over coffee/tea/hot chocolate can go a long way!
  • Practice makes perfect! A really good resource to test your data preparation brain muscles is Preppin’ Data (run by our very own Carl Allchin), which gives you a new challenge every week in Tableau Prep to keep those skills fresh. Even if you don’t get it on the first try (or the second or the third or…), as long as you continue putting your theory into practice, you too will have that ‘Aha!’ moment everyone talks about. 
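
If it helps to see the first tip written down, here is a rough sketch of the "OG vs. Ideal Dataset" idea in plain Python rather than Tableau Prep. Everything in it is made up for illustration: the column names, the sample rows, and the helper names `plan`, `prepare`, and `check` are all my own assumptions, not part of any real dataset or tool. Treat it as one possible way to write the plan down, not the way to do it.

```python
# Hypothetical "OG" dataset: every value arrives as text, as in a raw export.
og_rows = [
    {"Product": " Widget ", "Units Sold": "12", "Revenue": "£240.00", "Returned": "N"},
    {"Product": "Gadget", "Units Sold": "3", "Revenue": "£89.97", "Returned": "Y"},
]

# The Ideal Dataset: the fields and data types needed to answer the questions.
ideal_schema = {
    "product_name": str,
    "units_sold": int,
    "revenue": float,
    "is_returned": bool,
}

# The plan: for each ideal field, which OG column it comes from
# and which conversion turns the raw text into the wanted type.
plan = {
    "product_name": ("Product", str.strip),
    "units_sold": ("Units Sold", int),
    "revenue": ("Revenue", lambda v: float(v.lstrip("£").replace(",", ""))),
    "is_returned": ("Returned", lambda v: v.strip().upper() == "Y"),
}


def prepare(row):
    """Apply the plan to one OG row and return a row in the ideal shape."""
    return {field: convert(row[source]) for field, (source, convert) in plan.items()}


def check(row):
    """Return the fields whose value does not match the ideal data type."""
    return [f for f, t in ideal_schema.items() if not isinstance(row.get(f), t)]


prepared = [prepare(r) for r in og_rows]

# Collect (row index, mismatched fields) so mistakes show up early.
problems = [(i, check(r)) for i, r in enumerate(prepared) if check(r)]

for row in prepared:
    print(row)
print("Problems found:", problems)
```

Writing the plan as a small table of "ideal field, source column, conversion" keeps the steps in one place, so it is easy to see what still needs changing before any analysis starts.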

Overall, whilst it might look intimidating, data preparation is really not that bad! Taking the time to fully understand the data and making sure it is in the correct format is just as important as whatever analysis you will run - trust me, your future self will thank you!

Author:
Mandy Wan