I recently completed the dbt Fundamentals course and found it really engaging. As someone who works primarily in BI tools like Tableau, learning about dbt was a valuable deep dive into how data transformations are handled in the modern data stack.
As with any new tool, there's a learning curve: figuring out the syntax, best practices, and the command line interface takes a little time. So this post is part cheat sheet, part learning log, written for you and, honestly, mostly for me.
dbt CLI Commands
dbt run
Builds (materializes) your models based on the SQL you've written.
dbt test
Runs tests defined in your project (e.g., unique, not null, relationships).
dbt build
Runs tests, snapshots, seeds, and models in DAG (dependency) order. It combines multiple steps: it first runs tests on sources, then builds the model, tests it, builds the next model, and so on.
dbt build --select model_name
OR dbt test --select model_name
Lets you run a command on a specific model. Add + to also include the models upstream (+model_name) or downstream (model_name+) of the selected node.
dbt test --select source:source_name
Runs the tests defined on a specific source (e.g., not_null or unique on its columns). Freshness is checked separately with dbt source freshness.
dbt source freshness
Checks how recently your source tables were loaded, using the freshness thresholds defined in the source's .yml file.
dbt docs generate
Creates a browsable documentation site that visualizes your DAG and includes descriptions of models, sources, and tests.
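To make the + selectors and the DAG ordering of dbt build more concrete, here is a minimal Python sketch of the idea. It is not dbt's implementation: the project graph, the model names, and the select and build helpers are all hypothetical, and it only models "run a model, then test it, and skip anything downstream of a failed test".

```python
from graphlib import TopologicalSorter

# Hypothetical project: each model maps to the models it depends on (its parents).
PARENTS = {
    "stg_customers": set(),
    "stg_orders": set(),
    "customers": {"stg_customers", "stg_orders"},
    "customer_report": {"customers"},
}


def _reachable(start, edges):
    """Collect every node reachable from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def select(selector, parents):
    """Resolve 'name', '+name', 'name+' or '+name+' to a set of model names."""
    name = selector.strip("+")
    # Invert the parent map so we can also walk downstream.
    children = {}
    for node, ups in parents.items():
        for up in ups:
            children.setdefault(up, set()).add(node)
    selected = {name}
    if selector.startswith("+"):
        selected |= _reachable(name, parents)   # upstream models
    if selector.endswith("+"):
        selected |= _reachable(name, children)  # downstream models
    return selected


def build(selected, parents, failing_tests=frozenset()):
    """Log (action, model) steps: run then test each model, parents first."""
    graph = {m: parents[m] & selected for m in selected}
    log, blocked = [], set()
    for model in TopologicalSorter(graph).static_order():
        if graph[model] & blocked:
            # A parent failed its tests (or was skipped), so skip this model too.
            log.append(("skip", model))
            blocked.add(model)
            continue
        log.append(("run", model))
        log.append(("test", model))
        if model in failing_tests:
            blocked.add(model)
    return log
```

For example, sorted(select("+customers", PARENTS)) gives ['customers', 'stg_customers', 'stg_orders'], and passing that selection to build with failing_tests={"stg_orders"} logs a skip for customers instead of running it.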
Syntax & Jinja in dbt
{{ config(materialized='table') }}
Materialize as a table
{{ config(materialized='view') }}
Materialize as a view
{{ ref('model_name') }}
Reference another model
{{ source('source_name', 'table_name') }}
Reference a source (e.g., a raw table from the warehouse)
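Beyond pointing at the right table, ref and source are also how dbt learns which nodes a model depends on. The sketch below illustrates that idea with a hypothetical model and a simple regex scan. dbt itself renders the Jinja rather than pattern-matching it, and the column names and the dependencies helper are made up for illustration.

```python
import re

# Hypothetical model file combining config, ref and source.
MODEL_SQL = """
{{ config(materialized='table') }}

select
    o.order_id,
    p.amount
from {{ ref('stg_orders') }} as o
left join {{ source('stripe', 'payments') }} as p
    on p.order_id = o.order_id
"""

# Simplified patterns: single-quoted arguments only.
REF = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")
SOURCE = re.compile(r"\{\{\s*source\(\s*'([^']+)'\s*,\s*'([^']+)'\s*\)\s*\}\}")


def dependencies(sql):
    """Return the models and sources a model's SQL refers to."""
    return {"refs": REF.findall(sql), "sources": SOURCE.findall(sql)}
```

Calling dependencies(MODEL_SQL) returns one ref (stg_orders) and one source (the stripe / payments pair), which is the information needed to place this model after both of them in the DAG.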
Documentation Blocks
description: "{{ doc('documentation_name') }}"
Adds a reference to a documentation block in your .yml file.
Define the doc block itself in a Markdown (.md) file:
{% docs documentation_name %}
This column shows the current status of an order (e.g., pending, shipped, delivered).
{% enddocs %}
For a doc block named order_status, the reference would be description: "{{ doc('order_status') }}".
Testing Models and Sources
Model tests are defined in .yml files next to your models. Example:
version: 2

models:
  - name: customers
    description: "Customer data from our application"
    columns:
      - name: customer_id
        tests:
          - not_null
          - unique
      - name: email
        tests:
          - not_null
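It can help to see what not_null and unique actually check. dbt compiles each test into a query that returns the failing rows, and the test passes when nothing comes back. The Python sketch below mimics that logic on a few made-up rows. The sample data and function names are hypothetical, and, like dbt's unique test, the duplicate check ignores null values.

```python
from collections import Counter

# Hypothetical rows from the customers model.
ROWS = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 2, "email": "c@example.com"},
]


def not_null_failures(rows, column):
    """Rows where the column is missing a value."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows, column):
    """Non-null values that appear more than once in the column."""
    counts = Counter(
        row[column] for row in rows if row.get(column) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)
```

With these rows, the not_null check on email returns one failing row and the unique check on customer_id returns [2], so both tests would fail.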
To test a source table, define it under sources:
version: 2

sources:
  - name: stripe
    tables:
      - name: payments
        columns:
          - name: id
            tests:
              - not_null
              - unique
Defining Sources
Sources are defined in a .yml file so dbt knows where to find raw data:
version: 2

sources:
  - name: stripe
    database: raw
    schema: stripe_data
    tables:
      - name: payments
        description: "Raw payment data from Stripe"
Source Freshness
Freshness tests ensure your source data is up to date:
version: 2

sources:
  - name: stripe
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: payments
        loaded_at_field: _loaded_at
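The freshness check boils down to comparing the age of the most recent loaded_at_field value against the two thresholds. Here is a rough Python sketch of that comparison, using the 12-hour and 24-hour limits from the config above. The freshness_status function and the timestamps are hypothetical, and this is a simplification rather than dbt's actual code.

```python
from datetime import datetime, timedelta, timezone

WARN_AFTER = timedelta(hours=12)   # warn_after: {count: 12, period: hour}
ERROR_AFTER = timedelta(hours=24)  # error_after: {count: 24, period: hour}


def freshness_status(max_loaded_at, now, warn_after=WARN_AFTER, error_after=ERROR_AFTER):
    """Classify a source by how old its most recent record is."""
    age = now - max_loaded_at
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"


# Hypothetical check time and latest _loaded_at value.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
latest = datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc)  # 13 hours old
status = freshness_status(latest, now)
```

In this example the newest record is 13 hours old, so status is "warn": past the 12-hour warning threshold but still inside the 24-hour error threshold.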
Shortcuts
Typing __ (double underscore) in the dbt Cloud IDE brings up a list of available dbt functions.