dbt-core
dbt Core is the open-source CLI that transforms data in the warehouse: versioned SQL models, tests, docs, and a DAG executed with dbt build. We rate it trial for analytics engineering in the warehouse; pair orchestration with Apache Airflow or Argo Workflows when schedules and cross-system DAGs sit outside dbt.
Blurb
dbt is a SQL-first data transformation workflow for teams who already know SQL.
Summary
Role: transform data inside the warehouse (models, tests, docs). Not a general CI-CD Tools runner; not Apache Airflow (though Airflow often triggers dbt run).
When to trial:
- Warehouse-native analytics engineering (Snowflake, BigQuery, Redshift, Postgres, etc.)
- Git-reviewed SQL models, dbt test, and generated lineage docs
- Team comfortable with Jinja-templated SQL and dbt_project.yml layout
When to skip:
- No warehouse or transforms belong in an app service, not SQL models
- Need only light ETL scripts; a smaller tool may suffice
- Org will not adopt project structure (models/, macros/, seeds/, etc.)
Typical flows:
| Goal | Commands |
|---|---|
| Fresh deps + run | dbt deps then dbt build (or dbt run + dbt test) |
| Docs locally | dbt docs generate then dbt docs serve |
| Clean artifacts | dbt clean before deps when debugging compile state |
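As a rough sketch, the first flow in the table can be wrapped in a small shell function so that a failed dependency install stops the build. The function name fresh_build and the DBT override variable are hypothetical conveniences, not part of dbt.

```shell
# Hypothetical wrapper: DBT lets you point at a specific dbt executable.
DBT="${DBT:-dbt}"

# Install packages, then build the DAG; && skips the build if deps fails.
fresh_build() {
  "$DBT" deps && "$DBT" build
}
```

Call fresh_build from the project root, the directory that holds dbt_project.yml.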
Details
Project layout
The project is the unit of work. It must include dbt_project.yml, which sets model paths, the connection profile to use, and vars. Main artifact types:
| Artifact | Purpose |
|---|---|
| Models | Transforms; nodes in the execution DAG |
| Snapshots | SCD-style history for mutable sources |
| Seeds | CSV loads into the warehouse |
| Tests | Data quality on models and sources |
| Macros | Reusable Jinja/SQL |
| Sources | Upstream tables loaded by other tools |
| Exposures | Downstream consumers of the project |
| Analyses | Ad hoc SQL (not materialized on run) |
(Semantic models, metrics, and saved queries apply when using the metrics layer; see current dbt docs for your version.)
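A minimal sketch of the layout described above, scaffolded by hand. The project name demo_project, the profile name demo_profile, and the start_date variable are illustrative assumptions; dbt init can generate a fuller skeleton.

```shell
# Scaffold a hypothetical project; every name below is illustrative.
mkdir -p demo_project/models demo_project/macros demo_project/seeds \
         demo_project/snapshots demo_project/tests demo_project/analyses

# Minimal dbt_project.yml: project name, profile to use, artifact paths, one var.
cat > demo_project/dbt_project.yml <<'YAML'
name: demo_project
version: "1.0.0"
config-version: 2
profile: demo_profile        # must match an entry in profiles.yml
model-paths: ["models"]
macro-paths: ["macros"]
seed-paths: ["seeds"]
snapshot-paths: ["snapshots"]
test-paths: ["tests"]
analysis-paths: ["analyses"]
vars:
  start_date: "2024-01-01"   # hypothetical variable
YAML
```

The path keys shown match dbt's defaults, so they can be omitted; they are spelled out here only to show how each artifact type maps to a directory.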
CLI reference (common)
- build: run the DAG in order (models, tests, seeds, snapshots as configured)
- run: execute models only
- test: run tests (usually after run)
- compile: render SQL without executing
- deps: install package dependencies
- debug: connection and profile diagnostics
- source freshness: check how recently sources were loaded
- snapshot / seed: run those node types
Incremental models on existing tables
When a table is owned elsewhere but dbt should merge new rows:
- Model name matches the table (case-sensitive per adapter)
- materialized='incremental'
- unique_key set for merge/upsert
- incremental_strategy (append or merge per adapter)
- Use is_incremental() and {{ this }} for watermark logic
See Incremental models.
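The checklist above could look roughly like this as a model file. This is a sketch under assumptions: the model name orders, the columns order_id, status, and updated_at, and the source shop.raw_orders are all hypothetical (the source would also need to be declared in a sources YAML file), and merge is only valid where the adapter supports it.

```shell
# Write a hypothetical incremental model; the file name becomes the model (table) name.
mkdir -p demo_project/models
cat > demo_project/models/orders.sql <<'SQL'
{{ config(
    materialized='incremental',
    unique_key='order_id',
    incremental_strategy='merge'
) }}

select order_id, status, updated_at
from {{ source('shop', 'raw_orders') }}

{% if is_incremental() %}
-- Incremental runs only: keep rows newer than the latest row already in the target.
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
SQL
```

On a first run, or with --full-refresh, is_incremental() is false and the where clause is left out, so the whole source is selected.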
References