
Dockerfile Optimization

Multi-stage builds, layer caching, and image size optimization for production Docker images

Quick Summary
You are an expert in Dockerfile optimization for containerized application development and deployment.

## Key Points

- Use `-alpine` or `-slim` base images instead of full distributions.
- Combine `RUN` commands with `&&` to reduce layer count.
- Remove package manager caches in the same layer that installs packages.
- Pin base image versions explicitly (e.g., `node:20.11.1-alpine3.19`) to ensure reproducible builds.
- Run the application as a non-root user with the `USER` directive.
- Use `COPY` instead of `ADD` unless you specifically need tar extraction or URL fetching.
- Avoid installing build tools in the final stage; doing so bloats the runtime image and increases attack surface.
- Avoid placing `COPY . .` before dependency installation; doing so invalidates the dependency cache on every source code change.

## Quick Example

```dockerfile
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*
```

```
node_modules
.git
.env
*.md
dist
```


Overview

Optimized Dockerfiles produce smaller, faster, and more secure images. Multi-stage builds separate build-time dependencies from the final runtime image, while careful layer ordering maximizes cache reuse and minimizes rebuild times.

Core Concepts

Multi-Stage Builds

Multi-stage builds use multiple FROM statements to create intermediate build stages. Only the final stage becomes the shipped image.

# Stage 1: Build (installs dev dependencies needed to compile)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Runtime (production dependencies only)
FROM node:20-alpine AS runtime
WORKDIR /app
ENV NODE_ENV=production
COPY package*.json ./
# Reinstall without dev dependencies instead of copying the builder's node_modules
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
EXPOSE 3000
USER node
CMD ["node", "dist/index.js"]

Layer Caching

Docker caches each layer. When a layer changes, all subsequent layers are invalidated. Order instructions from least to most frequently changing:

FROM python:3.12-slim
WORKDIR /app

# Rarely changes — cached
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Changes often — rebuild only this layer
COPY . .
CMD ["python", "main.py"]

Minimizing Image Size

  • Use -alpine or -slim base images instead of full distributions.
  • Combine RUN commands with && to reduce layer count.
  • Remove package manager caches in the same layer that installs packages.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*
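
The same idea carries over to Alpine-based images. The fragment below is a minimal sketch, assuming an `alpine:3.19` base (chosen here only for illustration) and the same two packages as the apt example:

```dockerfile
FROM alpine:3.19
# --no-cache fetches the package index on the fly and does not store it,
# so no separate cleanup step is needed in this layer
RUN apk add --no-cache curl ca-certificates
```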

.dockerignore

Exclude unnecessary files from the build context to speed up builds and avoid leaking secrets:

node_modules
.git
.env
*.md
dist

Implementation Patterns

Distroless Final Images

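Use Google's distroless images, which ship without a shell or package manager, for minimal attack surface:

FROM golang:1.22 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a statically linked binary that runs on the static image
RUN CGO_ENABLED=0 GOOS=linux go build -o /server .

FROM gcr.io/distroless/static-debian12
COPY --from=builder /server /server
# Distroless images include an unprivileged "nonroot" user
USER nonroot:nonroot
ENTRYPOINT ["/server"]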

Build Arguments for Flexibility

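ARG values can parameterize the base image tag and inject build-time values such as a version string. Override the defaults with --build-arg:

# Declared before FROM, so it is usable only in FROM lines
ARG GO_VERSION=1.22
FROM golang:${GO_VERSION}-alpine AS builder
# Declared after FROM, so it is usable inside this stage
ARG APP_VERSION=dev
WORKDIR /src
COPY . .
RUN go build -ldflags "-X main.version=${APP_VERSION}" -o /app .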
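
Build arguments have scoping rules that are easy to trip over, so a small sketch may help. It assumes an Alpine base, and the names `ALPINE_VERSION` and `APP_VERSION` are illustrative. The example shows a globally declared argument being brought into a stage and persisted for run time:

```dockerfile
# Global scope: ARGs declared before the first FROM are visible to FROM lines only
ARG ALPINE_VERSION=3.19
ARG APP_VERSION=dev
FROM alpine:${ALPINE_VERSION}
# Redeclare without a value to bring the global ARG (and its default) into this stage
ARG APP_VERSION
# ARG values are not available at run time; copy them into ENV or LABEL to keep them
ENV APP_VERSION=${APP_VERSION}
LABEL org.opencontainers.image.version="${APP_VERSION}"
CMD ["sh", "-c", "echo running version $APP_VERSION"]
# Example build (image name is hypothetical):
#   docker build --build-arg APP_VERSION=1.4.0 -t myapp .
```

Build arguments are recorded in the image history, so they are a poor fit for secrets.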

Cache Mounts for Package Managers

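With BuildKit, a cache mount keeps the package manager's download cache between builds without writing it into an image layer:

RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt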
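
Cache mounts are not specific to pip. As a sketch under the assumption of a Go module project built from the official `golang` image (where the module cache lives in `/go/pkg/mod` and the build cache in `/root/.cache/go-build`), the same technique might look like this:

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.22 AS builder
WORKDIR /app
COPY go.mod go.sum ./
# Module downloads persist across builds in the cache mount
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download
COPY . .
# Reuse both the module cache and the compiler's build cache
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -o /server .
```

The cache contents live on the build host, so a fresh CI runner starts with an empty cache unless it is persisted separately.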

Best Practices

  • Pin base image versions explicitly (e.g., node:20.11.1-alpine3.19) to ensure reproducible builds.
  • Run the application as a non-root user with the USER directive.
  • Use COPY instead of ADD unless you specifically need tar extraction or URL fetching.
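
The three practices above can be combined in one short sketch. It assumes a Node.js project with a `package-lock.json` and a hypothetical `index.js` entry point, and it reuses the pinned tag from the list:

```dockerfile
# Pinned to an exact version rather than a moving tag such as "node:20"
FROM node:20.11.1-alpine3.19
WORKDIR /app
# COPY performs a plain file copy with no implicit extraction or download
COPY --chown=node:node package*.json ./
RUN npm ci --omit=dev
COPY --chown=node:node . .
# The official node image ships an unprivileged "node" user
USER node
CMD ["node", "index.js"]
```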

Core Philosophy

A Dockerfile is a build recipe, and like any recipe, the quality of the output depends on the care put into each step. The goal is not just a working image but a small, secure, fast-to-build image that contains only what is needed to run the application. Every unnecessary package, file, or build artifact included in the final image increases attack surface, pull time, and storage cost.

Multi-stage builds are the single most impactful optimization. By separating the build environment (compilers, dev dependencies, source code) from the runtime environment (the compiled binary or bundled assets plus minimal runtime dependencies), you can often reduce image sizes by 80-90%, with the exact saving depending on the language and toolchain. The build stage can be as large and feature-rich as needed without affecting the final image. This is not a premature optimization; it is a fundamental best practice that should be the default for every production Dockerfile.

Layer caching is the key to fast builds, and it requires intentional instruction ordering. Docker caches each layer and invalidates all subsequent layers when one changes. By placing rarely changing instructions (installing system dependencies, copying lockfiles, installing packages) before frequently changing ones (copying application source code), you ensure that most builds only rebuild the final layers. A well-ordered Dockerfile can reduce rebuild times from minutes to seconds.

Anti-Patterns

  • Using full base images when slim or alpine variants exist. Starting from node:20 (roughly 1 GB uncompressed) instead of node:20-alpine (roughly 150 MB) or node:20-slim adds hundreds of megabytes of unused packages to every image. Use the smallest base image that provides what your application needs.

  • Copying source code before installing dependencies. Placing COPY . . before RUN npm install (or equivalent) invalidates the dependency cache on every source code change, forcing a full reinstall on every build. Always copy the lockfile first, install dependencies, then copy the rest.

  • Running as root in the final image. Omitting the USER directive means the container process runs as root, which gives an attacker root privileges inside the container, and a stronger starting point for escaping it, if the application is compromised. Add a non-root user and switch to it before the CMD instruction.

  • Using ADD when COPY suffices. ADD has implicit behaviors (tar extraction, URL fetching) that can introduce unexpected results. Use COPY for straightforward file copying and reserve ADD for when you explicitly need its special features.

  • Multiple RUN commands for related operations. Each RUN creates a new layer. Running RUN apt-get update followed by a separate RUN apt-get install means the update layer is cached independently and can become stale. Combine related commands with && and clean up caches in the same layer.
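
Two of the fixes above, the combined apt layer and the non-root user, can be sketched on a Debian-based image. The user name `appuser`, the UID, and the `./app` binary are hypothetical placeholders:

```dockerfile
FROM debian:bookworm-slim
# Anti-pattern, shown commented out: the update layer is cached on its own
# and can go stale before the install runs
# RUN apt-get update
# RUN apt-get install -y ca-certificates
RUN apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates && \
    rm -rf /var/lib/apt/lists/*
# Create an unprivileged user for the application process
RUN useradd --system --create-home --uid 10001 appuser
WORKDIR /home/appuser
COPY --chown=appuser:appuser . .
# Switch away from root before the container's main process starts
USER appuser
CMD ["./app"]
```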

Common Pitfalls

  • Installing build tools in the final stage, which bloats the runtime image and increases attack surface.
  • Placing COPY . . before dependency installation, which invalidates the dependency cache on every source code change.
