Evolutionary Analysis: Reconstructing the LEGO Animal Kingdom with Snowflake and Power BI
For the second day of Dashboard Week, the focus shifted to the intersection of historical data management and advanced UI design. The task involved analysing the Lego’s datasets from Rebrickable, a dataset spanning over five decades of part moulds, set releases, and theme expansions.
The Engineering Challenge: Snowflake and Secure Connectivity
The process began with the technical setup in Snowflake, which presented the first major hurdle of the day: environment configuration. Beyond simply moving data, I had to establish a dedicated schema from scratch to house the various dimensions of the LEGO ecosystem. This involved a manual loading process, transforming raw files into structured tables for Themes, Sets, and Animal Genetics.
The primary technical challenge surfaced when connecting the Snowflake server to Power BI. Without a standard service account and private key already in place, the connection was initially blocked. To resolve this, I had to get a pass key successfully push the data through to Power BI. This stage was critical; it ensured that the data engineering was handled in a secure, scalable environment before any visualisation began.
Data Modelling and Dashboard Architecture
Once the connection was established, I implemented a Star Schema architecture to optimise the model. I separated the core "genetics", the individual part numbers representing unique animal moulds into a Fact table. Surrounding this were Dimension tables for Themes and Sets. This architectural decision was essential for maintaining performance; by pre-structuring these relationships, I ensured that the complex filtering logic in the final dashboard remained responsive even when handling over 50 years of data.
A primary design challenge was avoiding visual clutter while maintaining analytical depth. I designed a dual-state UI using a "Macro vs. Micro" approach:
- Macro-Level Overview: This view illustrates the overall acceleration of LEGO animal production, focusing on the cumulative growth of the ecosystem and broad historical trends.
- Comparative Micro-Analysis: This view is designed for head-to-head comparisons, allowing the user to isolate specific families such as Dinosaurs versus Dragons and see how they competed for "market share" within the catalogue over time.
To achieve this, I utilised ‘Edit Interactions’ to selectively disconnect specific visuals from the global filters. In the "Overview" state, the line charts ignore species-level filters to provide a consistent historical baseline. In the "Comparison" state, these same charts are activated by the Legend well, splitting the data into distinct series for direct analysis.
Technical Refinements and UI Design
The UI was designed to mimic a standalone application, incorporating several high-fidelity refinements to improve the user experience:
- KPI Precision via DAX: To maintain professional standards, I bypassed Power BI’s default rounding (which displayed the species count as "1K"). By using a custom formatting measure, I forced the display of the exact figure (1,014) with proper thousands separators, ensuring the user is presented with the absolute precision of the dataset.
- Dynamic Peak Identification: I developed a calculation to identify the "Peak Evolution Year", the specific point in time where the highest volume of sets was released for a selected species. This provides an immediate focal point, highlighting the "Golden Age" of any given animal family.
- Layered Navigation: The transition between the two dashboard states was handled via the Selection Pane and Bookmarks. By stacking visual groups and toggling visibility, I created a seamless toggle system that allows two entirely different reporting styles to exist on a single canvas.
Final Thoughts and Feedback
The finished dashboard effectively bridges the gap between historical documentation and interactive analysis. A key insight revealed during the build was the clear shift in LEGO’s production strategy: moving from the generic animal moulds of the late 90s to the highly specialised, multi-part figures seen in modern licensed themes like Jurassic World.
This project reinforced the importance of planning for user intent. By engineering the data to support specific analytical states, Overview and Comparison, the dashboard provides a clear narrative while still allowing the user the freedom to explore fifty years of evolutionary history.
