
Databricks Delta Live Tables (DLT)

Quick Summary
You are a DLT pipeline architect who builds production-grade medallion architecture pipelines with data quality expectations, CDC processing, streaming tables, and materialized views. You design pipelines that are declarative, testable, and observable.

## Key Points

- **Medallion architecture**: Bronze (raw), Silver (cleaned), Gold (business-ready)
- **Expectations at Silver layer**: Validate data quality before it reaches Gold
- **Use streaming for incremental**: `readStream` instead of `read` for incremental processing
- **SCD Type 2 for dimensions**: Track history with `apply_changes` and `scd_type=2`
- **Materialized views for aggregations**: Self-refreshing aggregated data
- **Auto Loader for file ingestion**: `cloudFiles` format handles new file discovery
- **Separate dev and prod pipelines**: Development mode skips retries and uses smaller clusters
## Common Mistakes

- **`expect_or_fail` in production**: Stops the entire pipeline for a single bad record; prefer `expect_or_drop` so bad rows are quarantined while the pipeline keeps running
- **No expectations at all**: Bad data flows through silently into Gold tables
- **Streaming without checkpointing**: Risks data loss or reprocessing on pipeline restart
- **Over-complex single pipeline**: 50 tables in one pipeline; break it into domain-scoped pipelines
- **Imperative logic in DLT**: Writing loops and conditional processing; DLT is declarative
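The Bronze-to-Silver pattern above (Auto Loader ingestion, incremental `readStream`, expectations before Gold) can be sketched with the DLT Python API. This is a minimal sketch, not the skill's full pipeline: the landing path and column names (`order_id`, `customer_id`, `amount`) are illustrative assumptions, and `spark` is the session the DLT runtime provides.

```python
# Sketch: Bronze ingestion with Auto Loader, Silver with quality expectations.
# Runs inside a Databricks DLT pipeline; path and columns are assumed examples.
import dlt
from pyspark.sql.functions import current_timestamp

@dlt.table(comment="Raw orders ingested incrementally with Auto Loader")
def bronze_orders():
    return (
        spark.readStream.format("cloudFiles")       # Auto Loader handles new-file discovery
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/orders/")               # assumed landing path
        .withColumn("_ingested_at", current_timestamp())
    )

@dlt.table(comment="Cleaned orders; quality enforced before Gold")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop, don't fail
@dlt.expect_or_drop("positive_amount", "amount > 0")
def silver_orders():
    # readStream keeps the Silver table incremental rather than full-refresh
    return dlt.read_stream("bronze_orders").select(
        "order_id", "customer_id", "amount", "_ingested_at"
    )
```

Using `expect_or_drop` here follows the guidance above: invalid rows are dropped and counted in pipeline metrics instead of halting the whole run.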
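For the SCD Type 2 point, one possible shape of an `apply_changes` call is below. The source table and column names (`customers_cdc`, `customer_id`, `_commit_ts`) are assumptions for illustration.

```python
# Sketch: SCD Type 2 history tracking for a dimension table in DLT.
import dlt
from pyspark.sql.functions import col

# Target must be declared before apply_changes can write to it
dlt.create_streaming_table("dim_customers")

dlt.apply_changes(
    target="dim_customers",
    source="customers_cdc",          # assumed upstream CDC feed
    keys=["customer_id"],            # business key identifying a customer
    sequence_by=col("_commit_ts"),   # ordering column for out-of-order events
    stored_as_scd_type=2,            # keep full change history
)
```

With `stored_as_scd_type=2`, DLT maintains validity-window columns on the target so each historical version of a row is preserved rather than overwritten.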
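Finally, a Gold-layer materialized view for aggregations: in the Python API, a `@dlt.table` that performs a batch `dlt.read` over Silver defines an aggregate that DLT keeps refreshed. The table name and grouping columns are assumed, continuing the orders example.

```python
# Sketch: self-refreshing Gold aggregate (materialized view) over Silver data.
import dlt
from pyspark.sql.functions import sum as sum_, to_date

@dlt.table(comment="Daily revenue, aggregated for reporting")
def gold_daily_revenue():
    return (
        dlt.read("silver_orders")                          # batch read of Silver
        .groupBy(to_date("_ingested_at").alias("order_date"))
        .agg(sum_("amount").alias("total_revenue"))
    )
```

Because the read is batch rather than streaming, DLT recomputes the result as upstream data changes, which suits aggregations that streaming cannot express incrementally.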
Full skill (210 lines): `skilldb get databricks-skills/databricks-pipelines`

Install this skill directly: `skilldb add databricks-skills`
