
# Databricks Workflows & Jobs

## Quick Summary
You are a Databricks Workflows expert who orchestrates multi-task jobs with dependencies, retry policies, parameters, monitoring, and alerting. You design job workflows that are reliable, observable, and cost-efficient.

## Key Points

- **Job clusters over all-purpose**: Job clusters start fresh, cost less, and auto-terminate
- **Spot instances with fallback**: Use spot for workers, on-demand for driver
- **Idempotent tasks**: Every task should be safely re-runnable
- **Parameterize dates**: Never hardcode processing dates; pass as parameters
- **Retry with delay**: 2-3 retries with 60-second delays handles transient failures
- **Validate output**: Post-pipeline validation catches data quality issues before consumers see them
- **Tag everything**: Team, environment, SLA for cost tracking and alerting
- **Max concurrent runs = 1**: Prevent overlapping runs for the same pipeline
## Common Pitfalls

- **No retries**: Without retries, transient cloud failures cause unnecessary on-call pages
- **All-purpose clusters for jobs**: 10x more expensive than job clusters
- **No timeout**: Hung jobs run indefinitely, consuming resources
- **Manual backfills**: Re-running jobs by hand-editing dates is error-prone; parameterize the processing date instead
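The practices above map directly onto fields in a Jobs API job specification. Below is a minimal sketch of one, expressed as a Python dict; the job name, notebook paths, node type, and parameter names are illustrative assumptions, not values from this skill.

```python
import json

# Hypothetical job spec sketch applying the key points above:
# job cluster, spot workers with fallback, retries with delay,
# timeout, tags, single concurrent run, and a parameterized date.
job_spec = {
    "name": "daily-sales-pipeline",          # illustrative name
    "max_concurrent_runs": 1,                # prevent overlapping runs
    "timeout_seconds": 7200,                 # kill hung jobs after 2 hours
    "tags": {"team": "data-eng", "env": "prod", "sla": "tier-1"},
    "job_clusters": [{
        "job_cluster_key": "etl_cluster",
        "new_cluster": {
            "spark_version": "14.3.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 4,
            "aws_attributes": {
                # driver on-demand, workers on spot with on-demand fallback
                "first_on_demand": 1,
                "availability": "SPOT_WITH_FALLBACK",
            },
        },
    }],
    "tasks": [
        {
            "task_key": "ingest",
            "job_cluster_key": "etl_cluster",
            "max_retries": 2,                     # absorb transient failures
            "min_retry_interval_millis": 60_000,  # 60-second retry delay
            "notebook_task": {
                "notebook_path": "/pipelines/ingest",
                # pass the processing date in; never hardcode it
                "base_parameters": {"run_date": "{{job.parameters.run_date}}"},
            },
        },
        {
            # post-pipeline validation runs after ingest succeeds
            "task_key": "validate",
            "depends_on": [{"task_key": "ingest"}],
            "job_cluster_key": "etl_cluster",
            "notebook_task": {"notebook_path": "/pipelines/validate"},
        },
    ],
}

print(json.dumps(job_spec, indent=2))
```

A spec like this can be submitted via the Jobs API or checked into source control and deployed with your tool of choice; keeping it in code makes the retry, timeout, and tagging policy reviewable.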
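Idempotency and parameterized dates work together: when a task overwrites its output keyed by the processing date, a retry or backfill for that date replaces the previous result instead of duplicating it. A toy sketch of the pattern (the in-memory `store` stands in for a date-partitioned table):

```python
# Hypothetical illustration: an idempotent, date-parameterized task.
# Writing by partition key means re-runs overwrite, never duplicate.
store: dict[str, list[int]] = {}

def run_task(run_date: str, rows: list[int]) -> None:
    # Overwrite the partition for this date; safe to re-run.
    store[run_date] = rows

run_task("2024-06-01", [1, 2, 3])
run_task("2024-06-01", [1, 2, 3])   # retry of the same date: no duplicates
run_task("2024-06-02", [4, 5])      # backfill another date via the parameter
```

In a real pipeline the same idea is an overwrite of the date partition in the target table, so backfills become "re-run the job with a different `run_date`" rather than hand-editing code.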

Install this skill directly: `skilldb add databricks-skills`
