Dockerfile Optimization
Multi-stage builds, layer caching, and image size optimization for production Docker images
You are an expert in Dockerfile optimization for containerized application development and deployment.
## Key Points
- Use `-alpine` or `-slim` base images instead of full distributions.
- Combine `RUN` commands with `&&` to reduce layer count.
- Remove package manager caches in the same layer that installs packages.
- Pin base image versions explicitly (e.g., `node:20.11.1-alpine3.19`) to ensure reproducible builds.
- Run the application as a non-root user with the `USER` directive.
- Use `COPY` instead of `ADD` unless you specifically need tar extraction or URL fetching.
- Avoid installing build tools in the final stage; they bloat the runtime image and increase attack surface.
- Avoid placing `COPY . .` before dependency installation; doing so invalidates the dependency cache on every source code change.
## Quick Example
```dockerfile
RUN apt-get update && \
apt-get install -y --no-install-recommends curl ca-certificates && \
rm -rf /var/lib/apt/lists/*
```
```
node_modules
.git
.env
*.md
dist
```
## Overview
Optimized Dockerfiles produce smaller, faster, and more secure images. Multi-stage builds separate build-time dependencies from the final runtime image, while careful layer ordering maximizes cache reuse and minimizes rebuild times.
## Core Concepts
### Multi-Stage Builds
Multi-stage builds use multiple `FROM` statements to create intermediate build stages. Only the final stage becomes the shipped image.
```dockerfile
# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies, including the devDependencies the build needs
RUN npm ci --include=dev
COPY . .
RUN npm run build
# Stage 2: Runtime
FROM node:20-alpine AS runtime
WORKDIR /app
COPY package*.json ./
# Install production dependencies only, so build tooling stays out of the shipped image
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
EXPOSE 3000
USER node
CMD ["node", "dist/index.js"]
```
### Layer Caching
Docker caches each layer. When a layer changes, all subsequent layers are invalidated. Order instructions from least to most frequently changing:
```dockerfile
FROM python:3.12-slim
WORKDIR /app
# Rarely changes, so this layer stays cached
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Changes often, so only this layer is rebuilt
COPY . .
CMD ["python", "main.py"]
```
### Minimizing Image Size
- Use `-alpine` or `-slim` base images instead of full distributions.
- Combine `RUN` commands with `&&` to reduce layer count.
- Remove package manager caches in the same layer that installs packages.
```dockerfile
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*
```
### .dockerignore
Exclude unnecessary files from the build context to speed up builds and avoid leaking secrets:
```
node_modules
.git
.env
*.md
dist
```
## Implementation Patterns
### Distroless Final Images
Use Google's distroless images for minimal attack surface:
```dockerfile
FROM golang:1.22 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /server .
FROM gcr.io/distroless/static-debian12
COPY --from=builder /server /server
ENTRYPOINT ["/server"]
```
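### Build Arguments for Flexibility
Build arguments let one Dockerfile target different toolchain or application versions:
```dockerfile
# Declared before FROM: usable in the FROM line
ARG GO_VERSION=1.22
FROM golang:${GO_VERSION}-alpine AS builder
# Declared inside the stage: usable by the instructions that follow
ARG APP_VERSION=dev
WORKDIR /src
COPY . .
RUN go build -ldflags "-X main.version=${APP_VERSION}" -o /app .
```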
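One detail of `ARG` scoping is easy to miss: an argument declared before the first `FROM` is not visible inside a build stage unless it is redeclared there. The fragment below is a minimal sketch of that behavior; the argument name `BASE_TAG` and the values in the commented `docker build` invocation are hypothetical.

```dockerfile
# Hypothetical invocation overriding the default:
#   docker build --build-arg BASE_TAG=3.12-slim -t example .
ARG BASE_TAG=3.12-slim
FROM python:${BASE_TAG}
# Redeclare without a value to bring the global argument into this stage
ARG BASE_TAG
RUN echo "built from python:${BASE_TAG}"
```

Without the second `ARG BASE_TAG` line, `${BASE_TAG}` would expand to an empty string inside the `RUN` instruction.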
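### Cache Mounts for Package Managers
Cache mounts, a BuildKit feature, persist a package manager's download cache between builds without writing it into an image layer:
```dockerfile
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```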
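The same pattern carries over to other package managers. As a rough sketch, assuming a Node.js project built as root with BuildKit enabled, the npm download cache can be mounted like this (the `/root/.npm` target is npm's default cache location for the root user):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# The cache directory lives outside the image layers and is reused across builds
RUN --mount=type=cache,target=/root/.npm \
    npm ci
```

When using a cache mount with pip, leave out `--no-cache-dir`, since that flag disables the very cache the mount is meant to preserve.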
## Best Practices
- Pin base image versions explicitly (e.g., `node:20.11.1-alpine3.19`) to ensure reproducible builds.
- Run the application as a non-root user with the `USER` directive.
- Use `COPY` instead of `ADD` unless you specifically need tar extraction or URL fetching.
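Taken together, these practices might look like the following sketch for a Node.js service. The entry point `server.js` is a hypothetical placeholder, and the pinned tag is simply the one used as an example above; pin whichever version you have actually tested.

```dockerfile
# Fully pinned tag instead of a floating one such as node:20-alpine
FROM node:20.11.1-alpine3.19
WORKDIR /app
# COPY performs a plain copy; --chown hands ownership to the unprivileged user
COPY --chown=node:node package*.json ./
RUN npm ci --omit=dev
COPY --chown=node:node . .
# The official node images ship with a non-root "node" user
USER node
# server.js is a hypothetical entry point
CMD ["node", "server.js"]
```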
## Core Philosophy
A Dockerfile is a build recipe, and like any recipe, the quality of the output depends on the care put into each step. The goal is not just a working image but a small, secure, fast-to-build image that contains only what is needed to run the application. Every unnecessary package, file, or build artifact included in the final image increases attack surface, pull time, and storage cost.
Multi-stage builds are the single most impactful optimization. By separating the build environment (compilers, dev dependencies, source code) from the runtime environment (the compiled binary or bundled assets plus minimal runtime dependencies), you can often reduce image sizes by 80-90%, particularly when the build toolchain is much larger than the artifact it produces. The build stage can be as large and feature-rich as needed without affecting the final image. This is not a premature optimization; it is a fundamental best practice that should be the default for every production Dockerfile.
Layer caching is the key to fast builds, and it requires intentional instruction ordering. Docker caches each layer and invalidates all subsequent layers when one changes. By placing rarely changing instructions (installing system dependencies, copying lockfiles, installing packages) before frequently changing ones (copying application source code), you ensure that most builds only rebuild the final layers. A well-ordered Dockerfile can reduce rebuild times from minutes to seconds.
## Anti-Patterns
- **Using full base images when slim or alpine variants exist.** Starting from `node:20` (roughly 1GB+) instead of `node:20-alpine` (roughly 150MB) or `node:20-slim` adds hundreds of megabytes of unused packages to every image. Use the smallest base image that provides what your application needs.
- **Copying source code before installing dependencies.** Placing `COPY . .` before `RUN npm install` (or equivalent) invalidates the dependency cache on every source code change, forcing a full reinstall on every build. Always copy the lockfile first, install dependencies, then copy the rest.
- **Running as root in the final image.** Omitting the `USER` directive means the container process runs as root, which gives an attacker full access to the container's filesystem if the application is compromised. Add a non-root user and switch to it before the `CMD` instruction.
- **Using `ADD` when `COPY` suffices.** `ADD` has implicit behaviors (tar extraction, URL fetching) that can introduce unexpected results. Use `COPY` for straightforward file copying and reserve `ADD` for when you explicitly need its special features.
- **Multiple `RUN` commands for related operations.** Each `RUN` creates a new layer. Running `RUN apt-get update` followed by a separate `RUN apt-get install` means the update layer is cached independently and can become stale. Combine related commands with `&&` and clean up caches in the same layer.
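The sketch below shows one way to avoid the root-user and split-`RUN` anti-patterns on a Debian-based image. The account name `appuser` and UID `10001` are hypothetical placeholders, not values the original prescribes.

```dockerfile
FROM python:3.12-slim
# Update, install, and clean up in a single layer so the package index is never stale
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*
# Hypothetical unprivileged account; name and UID are placeholders
RUN useradd --create-home --uid 10001 appuser
WORKDIR /app
# Plain COPY rather than ADD: no implicit extraction or fetching
COPY --chown=appuser:appuser . .
# Switch away from root before the process starts
USER appuser
CMD ["python", "main.py"]
```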
## Common Pitfalls
- Installing build tools in the final stage, which bloats the runtime image and increases attack surface.
- Placing `COPY . .` before dependency installation, which invalidates the dependency cache on every source code change.
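As an illustration of keeping build tools out of the final stage, the following sketch compiles Python dependencies in a builder stage and copies only the resulting virtual environment into the runtime image. It assumes a `requirements.txt` and a `main.py` entry point, as in the layer caching example; the `/opt/venv` path is an arbitrary choice.

```dockerfile
# Builder stage: the compiler toolchain lives only here
FROM python:3.12-slim AS builder
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential && \
    rm -rf /var/lib/apt/lists/*
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# Lockfile first, so source changes do not invalidate the install layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Runtime stage: same base image, no compilers
FROM python:3.12-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY . .
CMD ["python", "main.py"]
```

Using the same base image in both stages keeps the interpreter path inside the copied virtual environment valid.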