Three Stages of Normalisation: How Far is Too Far?

The default Superstore dataset in Tableau is a handful of denormalised tables - big spreadsheets with everything in one place. Easy to use, but not always the most efficient or reliable format, especially for storage. That’s where normalisation comes in.


Normalised vs Denormalised

  • A denormalised dataset stores everything in one large table. This makes it easier to analyse in tools like Tableau, but comes with drawbacks: repeated values, harder updates, and more risk of inconsistency.
  • A normalised dataset splits data into smaller related tables using a schema. This improves efficiency, makes updates safer, and avoids repeating the same information in multiple places - e.g. the extract date of a table would be better stored in a separate table rather than duplicated across every row of your main table!
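The extract-date example can be sketched with plain Python dictionaries (a minimal illustration - the table and column names here are made up, not the real Superstore schema):

```python
# Denormalised: the extract date is repeated on every row.
orders_denorm = [
    {"order_id": "CA-001", "sales": 261.96, "extract_date": "2025-01-01"},
    {"order_id": "CA-002", "sales": 731.94, "extract_date": "2025-01-01"},
]

# Normalised: store the extract date once, in a small metadata table.
orders = [
    {"order_id": "CA-001", "sales": 261.96},
    {"order_id": "CA-002", "sales": 731.94},
]
extract_metadata = {"table": "orders", "extract_date": "2025-01-01"}

# A refresh now updates one value instead of every row.
extract_metadata["extract_date"] = "2025-02-01"
print(extract_metadata["extract_date"])  # 2025-02-01
```

Updating one metadata row can never leave half the table stamped with an old date, which is exactly the inconsistency risk the denormalised version carries.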

Normalised data is generally better for transactional, row-focused systems (OLTP). Denormalised data is more suitable for reporting and analytics (OLAP).


The First Three Stages of Normalisation

First Normal Form (1NF)

  • Every column must contain only atomic values - a single, indivisible piece of information per cell.
  • Each row must be unique.
  • Columns must not contain lists or repeated fields.
  • For Superstore: don’t store multiple items in one cell. Product data, for example, should be split clearly and consistently.
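As a rough sketch of the "multiple items in one cell" fix (the field names and IDs here are illustrative, not taken from the actual dataset):

```python
# Not 1NF: several product IDs crammed into a single cell.
order_row = {"order_id": "CA-001", "products": "FUR-BO-10001798, FUR-CH-10000454"}

# 1NF: one row per order/product pair, each cell holding one value.
order_lines = [
    {"order_id": order_row["order_id"], "product_id": p.strip()}
    for p in order_row["products"].split(",")
]
print(order_lines)
```

Once each product sits in its own row, you can filter, join, and aggregate on it directly - none of which works on a comma-separated string.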

Second Normal Form (2NF)

  • Builds on 1NF.
  • All non-key fields must depend on the whole primary key - there should be no partial dependencies on just part of a composite key.
  • For Superstore: in a table keyed by Order ID and Product ID, customer details depend only on Order ID - split them out into their own lookup table rather than repeating them on every order line.
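One way to picture a partial dependency, as a minimal sketch (the names and values are invented for illustration): order lines are keyed by (order_id, product_id), but the customer name depends only on order_id.

```python
# 1NF but not 2NF: customer_name depends only on order_id,
# a partial dependency on the composite key (order_id, product_id).
lines_1nf = [
    {"order_id": "CA-001", "product_id": "FUR-BO-10001798",
     "customer_name": "Claire Gute", "sales": 261.96},
    {"order_id": "CA-001", "product_id": "FUR-CH-10000454",
     "customer_name": "Claire Gute", "sales": 731.94},
]

# 2NF: move customer_name into an orders table keyed by order_id alone.
orders = {r["order_id"]: {"customer_name": r["customer_name"]} for r in lines_1nf}
lines_2nf = [
    {k: v for k, v in r.items() if k != "customer_name"} for r in lines_1nf
]
```

The customer name is now stored once per order instead of once per line, so renaming a customer is a single update.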

Third Normal Form (3NF)

  • Builds on 2NF.
  • No transitive dependencies - non-key fields must not depend on other non-key fields.
  • No hierarchical or derived information stored alongside raw data.
  • For Superstore: Category and Sub-Category are hierarchical. Category should not be stored in the same table as the transactional data. It belongs in a separate product category table.
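The Category/Sub-Category split can be sketched like this (Category and Sub-Category are real Superstore fields; the product IDs and dictionary layout are illustrative):

```python
# Not 3NF: product_id -> sub_category -> category is a transitive
# dependency, so category is repeated for every product.
products = [
    {"product_id": "FUR-BO-10001798", "sub_category": "Bookcases", "category": "Furniture"},
    {"product_id": "FUR-CH-10000454", "sub_category": "Chairs", "category": "Furniture"},
]

# 3NF: hold the hierarchy once, in a category lookup table.
categories = {p["sub_category"]: p["category"] for p in products}
products_3nf = [
    {k: v for k, v in p.items() if k != "category"} for p in products
]
```

If a sub-category is ever reassigned to a different category, the lookup table changes in one place instead of across thousands of product rows.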

When Normalisation Goes Too Far

We created a fully normalised version of Superstore. The result was technically correct, but difficult to use. Even simple tasks became cumbersome, as the data was split into too many pieces.

Normalisation improves structure, but too much separation can hurt usability — especially in tools like Tableau where relationships between tables need to be clearly defined.


Conclusion

The ideal level of normalisation depends on the use case. For analysis and reporting, a lightly denormalised structure is often more practical. Storage costs are low, and usability often matters more than strict efficiency.

A balanced approach works best: reduce redundancy where it makes sense, but don’t make things harder to work with than they need to be.

Author:
Jeffrey Brian Thompson
Powered by The Information Lab
1st Floor, 25 Watling Street, London, EC4M 9BR
© 2025 The Information Lab