dbt-core
dbt Core is the open-source CLI that transforms data in the warehouse: versioned SQL models, tests, docs, and a DAG executed with dbt build. We rate it trial for analytics engineering in the warehouse; pair orchestration with Apache Airflow or Argo Workflows when schedules and cross-system DAGs sit outside dbt.
Blurb
dbt is a SQL-first data transformation workflow for teams who already know SQL.
Summary
Role: transform data inside the warehouse (models, tests, docs). It is not a general-purpose CI-CD Tools runner and not a replacement for Apache Airflow (though Airflow often triggers dbt run).
When to trial:
- Warehouse-native analytics engineering (Snowflake, BigQuery, Redshift, Postgres, etc.)
- Git-reviewed SQL models, dbt test, and generated lineage docs
- Team comfortable with Jinja-templated SQL and the dbt_project.yml layout
When to skip:
- There is no warehouse, or the transforms belong in an app service rather than in SQL models
- Need only light ETL scripts; a smaller tool may suffice
- Org will not adopt project structure (models/, macros/, seeds/, etc.)
Typical flows:
| Goal | Commands |
|---|---|
| Fresh deps + run | dbt deps then dbt build (or dbt run + dbt test) |
| Docs locally | dbt docs generate then dbt docs serve |
| Clean artifacts | dbt clean before deps when debugging compile state |
Details
Project layout
The project is the unit of work. It must include dbt_project.yml (model paths, profile name, vars). Main artifact types:
| Artifact | Purpose |
|---|---|
| Models | Transforms; nodes in the execution DAG |
| Snapshots | SCD-style history for mutable sources |
| Seeds | CSV loads into the warehouse |
| Tests | Data quality on models and sources |
| Macros | Reusable Jinja/SQL |
| Sources | Upstream tables loaded by other tools |
| Exposures | Downstream consumers of the project |
| Analyses | Ad hoc SQL (not materialized on run) |
(Semantic models, metrics, and saved queries apply when using the metrics layer; see current dbt docs for your version.)
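To make the Models and Sources rows concrete, here is a minimal sketch of a single model file. The file path, the source name shop, the table raw_orders, the model stg_customers, and every column name are hypothetical; the source would also need to be declared in a sources YAML file, which is not shown.

```sql
-- models/orders_enriched.sql (hypothetical file; all table and column names are illustrative)

with orders as (
    -- source(): an upstream table loaded by another tool and declared as a dbt source
    select order_id, customer_id, amount
    from {{ source('shop', 'raw_orders') }}
),

customers as (
    -- ref(): another model in this project; this call adds the edge to the DAG
    select customer_id, customer_name
    from {{ ref('stg_customers') }}
)

select
    orders.order_id,
    orders.amount,
    customers.customer_name
from orders
left join customers
    on orders.customer_id = customers.customer_id
```

Because the dependency is expressed through ref(), dbt can work out that stg_customers must be built before this model; no run order is written by hand.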
CLI reference (common)
- build: run the DAG in order (models, tests, seeds, snapshots as configured)
- run: execute models only
- test: run tests (usually after run)
- compile: render SQL without executing
- deps: install package dependencies
- debug: connection and profile diagnostics
- source freshness: source freshness checks
- snapshot / seed: run those node types
Incremental models on existing tables
When a table is owned elsewhere but dbt should merge new rows:
- Model name matches the table (case-sensitive per adapter)
- materialized='incremental'
- unique_key set for merge/upsert
- incremental_strategy (append or merge per adapter)
- Use is_incremental() and {{ this }} for watermark logic
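The checklist above can be sketched as a single model file. This is an illustration under assumptions, not a drop-in model: the file name orders.sql, the source shop.raw_orders, the key order_id, and the watermark column updated_at are all hypothetical, and the merge strategy is assumed to be supported by the adapter in use.

```sql
-- models/orders.sql (hypothetical; the file name matches the existing table name)
{{
    config(
        materialized='incremental',
        unique_key='order_id',
        incremental_strategy='merge'
    )
}}

select
    order_id,
    customer_id,
    amount,
    updated_at
from {{ source('shop', 'raw_orders') }}

{% if is_incremental() %}
-- Incremental runs only: keep rows newer than the latest row already in the target table.
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

is_incremental() evaluates to true only when the target table already exists and the run is not a full refresh, so the first run against a missing table selects everything. If the existing table can be empty, the max() subquery returns null and the filter matches no rows; wrapping it in coalesce with an early default date is one way to handle that case.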
See Incremental models.
Garden pattern: trial dbt for warehouse transforms; orchestrate schedules with Apache Airflow (assess) or Argo Workflows (trial) when the DAG spans more than dbt.
References