
Fly.io Deployment

Fly.io platform expertise — container deployment, global edge distribution, Dockerfiles, volumes, secrets, scaling, PostgreSQL, and multi-region patterns

Core Philosophy

Fly.io runs containers on bare-metal servers distributed across 30+ regions worldwide. Instead of abstracting away infrastructure, it gives you full control over a lightweight VM (Firecracker) running your Docker image. The platform excels at placing your application close to users — deploy once and replicate globally. Persistent volumes, private networking between apps, and managed Postgres make it a full-stack platform. The key insight: treat your app as a globally distributed system from day one.

Setup

Project Initialization

// Install flyctl CLI
// $ curl -L https://fly.io/install.sh | sh
// $ fly auth login

// Launch a new app (interactive)
// $ fly launch

// fly.toml — generated configuration
// app = "my-app"
// primary_region = "iad"
//
// [build]
//   dockerfile = "Dockerfile"
//
// [env]
//   NODE_ENV = "production"
//   PORT = "3000"
//
// [http_service]
//   internal_port = 3000
//   force_https = true
//   auto_stop_machines = true
//   auto_start_machines = true
//   min_machines_running = 1
//
// [[vm]]
//   cpu_kind = "shared"
//   cpus = 1
//   memory_mb = 512

Dockerfile for Node.js/TypeScript

// Dockerfile — multi-stage build for production
// FROM node:20-slim AS base
// RUN apt-get update && apt-get install -y openssl && rm -rf /var/lib/apt/lists/*
// WORKDIR /app
//
// FROM base AS deps
// COPY package.json package-lock.json ./
// RUN npm ci --omit=dev
//
// FROM base AS build
// COPY package.json package-lock.json ./
// RUN npm ci
// COPY . .
// RUN npm run build
//
// FROM base AS runtime
// COPY --from=deps /app/node_modules ./node_modules
// COPY --from=build /app/dist ./dist
// COPY --from=build /app/package.json ./
//
// ENV NODE_ENV=production
// EXPOSE 3000
// CMD ["node", "dist/server.js"]

Secrets Management

// Set secrets (encrypted, injected as env vars at runtime)
// $ fly secrets set DATABASE_URL="postgres://..." JWT_SECRET="supersecret"
// $ fly secrets list
// $ fly secrets unset OLD_SECRET

// Access in application code — they appear as standard env vars
// src/config.ts
export const config = {
  databaseUrl: process.env.DATABASE_URL!,
  jwtSecret: process.env.JWT_SECRET!,
  region: process.env.FLY_REGION ?? "unknown",
  machineId: process.env.FLY_MACHINE_ID ?? "local",
  appName: process.env.FLY_APP_NAME ?? "dev",
};
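The non-null assertions (`!`) above will silently pass `undefined` through if a secret was never set, and the failure surfaces much later as a confusing runtime error. A small guard can fail fast at boot instead — a sketch, where `requireEnv` is a hypothetical helper, not part of any Fly SDK:

```typescript
// src/env.ts — fail fast on missing configuration instead of relying
// on non-null assertions. Hypothetical helper for illustration.
export function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage mirrors config.ts:
//   databaseUrl: requireEnv("DATABASE_URL"),
//   jwtSecret: requireEnv("JWT_SECRET"),
```

A machine that crashes immediately on boot shows up clearly in `fly logs`, whereas a missing secret hidden behind `!` may only fail on the first database query.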

Key Techniques

Multi-Region Deployment

// Place machines in multiple regions (fly regions set is the legacy
// Nomad-era command; Machines apps place capacity with fly scale)
// $ fly scale count 2 --region iad
// $ fly scale count 1 --region cdg
// $ fly scale count 1 --region nrt
// $ fly scale count 1 --region sin

// fly.toml — multi-region configuration
// [http_service]
//   internal_port = 3000
//   force_https = true
//   auto_stop_machines = true
//   auto_start_machines = true
//   min_machines_running = 1
//
// [[http_service.checks]]
//   interval = "15s"
//   timeout = "5s"
//   grace_period = "10s"
//   method = "GET"
//   path = "/health"

// src/middleware/region.ts — region-aware request handling
import { Request, Response, NextFunction } from "express";

export function regionMiddleware(req: Request, res: Response, next: NextFunction) {
  const flyRegion = process.env.FLY_REGION ?? "local";
  const primaryRegion = process.env.PRIMARY_REGION ?? "iad";

  res.setHeader("X-Fly-Region", flyRegion);

  // Replay write requests to primary region
  if (req.method !== "GET" && flyRegion !== primaryRegion) {
    res.setHeader("fly-replay", `region=${primaryRegion}`);
    return res.status(409).send("Replaying to primary region");
  }

  next();
}
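The replay decision is easy to get subtly wrong — HEAD and OPTIONS are also safe to serve locally, and fly-replay only applies to requests with bodies up to roughly 1 MB — so it can help to extract it into a pure function that is unit-testable without Express. A sketch under those assumptions (`decideReplay` is a hypothetical helper, not a Fly API):

```typescript
// Pure replay rule extracted from the middleware above.
export type ReplayDecision = { replay: false } | { replay: true; header: string };

export function decideReplay(
  method: string,
  currentRegion: string,
  primaryRegion: string,
): ReplayDecision {
  // Reads (GET/HEAD/OPTIONS) are safe to serve from any region;
  // everything else goes to the primary, where the writable DB lives.
  const readMethods = new Set(["GET", "HEAD", "OPTIONS"]);
  if (readMethods.has(method.toUpperCase()) || currentRegion === primaryRegion) {
    return { replay: false };
  }
  return { replay: true, header: `region=${primaryRegion}` };
}
```

The middleware then reduces to calling this function and setting the `fly-replay` header when `replay` is true.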

Volumes (Persistent Storage)

// Create and attach a volume (each volume belongs to one machine in one
// region — create a volume per machine that needs persistent data)
// $ fly volumes create data --region iad --size 10
// $ fly volumes list

// fly.toml — mount the volume
// [mounts]
//   source = "data"
//   destination = "/data"

// src/storage.ts — use the mounted volume
import { readFile, writeFile, mkdir } from "fs/promises";
import { join } from "path";

const DATA_DIR = "/data";

export async function saveUpload(filename: string, content: Buffer): Promise<string> {
  const uploadsDir = join(DATA_DIR, "uploads");
  await mkdir(uploadsDir, { recursive: true });

  const filepath = join(uploadsDir, filename);
  await writeFile(filepath, content);
  return filepath;
}

export async function readUpload(filename: string): Promise<Buffer> {
  return readFile(join(DATA_DIR, "uploads", filename));
}

Managed PostgreSQL

// Create a Postgres cluster
// $ fly postgres create --name my-app-db --region iad --vm-size shared-cpu-1x
// $ fly postgres attach my-app-db --app my-app

// This sets DATABASE_URL automatically as a secret

// Add a read replica in another region — clone an existing machine in
// the *same* Postgres app (a separate cluster would not replicate)
// $ fly machines clone <machine-id> --region cdg --app my-app-db

// src/db.ts — connection with read replica awareness
import { Pool } from "pg";

const primaryPool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
  idleTimeoutMillis: 30000,
});

const replicaPool = process.env.READ_REPLICA_URL
  ? new Pool({
      connectionString: process.env.READ_REPLICA_URL,
      max: 20,
      idleTimeoutMillis: 30000,
    })
  : primaryPool;

export async function query(sql: string, params: unknown[], readOnly = false) {
  const pool = readOnly ? replicaPool : primaryPool;
  const client = await pool.connect();
  try {
    return await client.query(sql, params);
  } finally {
    client.release();
  }
}
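One convention that pairs well with the wrapper above is inferring `readOnly` from the statement itself, so call sites cannot accidentally send a write to the replica. A conservative sketch — `isReadOnlyStatement` is a hypothetical helper, and note that writable CTEs or side-effecting function calls still belong on the primary, which is why anything other than a plain SELECT is routed there:

```typescript
// Route only plain SELECTs to the replica; everything else — including
// WITH ... (which may wrap an INSERT/UPDATE/DELETE) — uses the primary.
export function isReadOnlyStatement(sql: string): boolean {
  const firstWord = sql.trimStart().match(/^[a-zA-Z]+/)?.[0]?.toUpperCase();
  return firstWord === "SELECT";
}

// Usage with the query() wrapper above:
//   const rows = await query(sql, params, isReadOnlyStatement(sql));
```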

Scaling Configuration

// Vertical scaling
// $ fly scale vm shared-cpu-2x --memory 1024

// Horizontal scaling
// $ fly scale count 3 --region iad
// $ fly scale count 2 --region cdg

// Autoscaling via fly.toml
// [http_service]
//   auto_stop_machines = true
//   auto_start_machines = true
//   min_machines_running = 1
//
// [http_service.concurrency]
//   type = "requests"
//   soft_limit = 150
//   hard_limit = 200

// src/health.ts — health check endpoint for scaling decisions
import { Router } from "express";
import { query } from "./db";

const health = Router();

health.get("/health", async (_req, res) => {
  try {
    await query("SELECT 1", []);
    res.json({
      status: "healthy",
      region: process.env.FLY_REGION,
      machine: process.env.FLY_MACHINE_ID,
      uptime: process.uptime(),
    });
  } catch (error) {
    res.status(503).json({ status: "unhealthy", error: String(error) });
  }
});

export default health;
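The checker only gives the endpoint the configured `timeout` to answer; if the pool is exhausted, acquiring a client can block far longer, and the check fails without a useful response body. Racing the DB ping against a shorter deadline keeps `/health` responsive — a generic sketch, where `withTimeout` is a hypothetical helper:

```typescript
// Race a promise against a deadline so a slow dependency cannot hang
// a health check past the platform's check timeout.
export function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  // Clearing the timer on settle prevents a stray late rejection.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// In the try block of the handler above (pingDatabase = your "SELECT 1" probe):
//   await withTimeout(pingDatabase(), 3000);
```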

Private Networking

// Apps in the same organization can communicate over private IPv6
// DNS format: <app-name>.internal

// src/services/internal-api.ts
const INTERNAL_BASE = "http://auth-service.internal:3000";

export async function verifyToken(token: string): Promise<{ userId: string }> {
  const response = await fetch(`${INTERNAL_BASE}/verify`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ token }),
  });

  if (!response.ok) throw new Error("Token verification failed");
  return response.json();
}

// Connect to internal Postgres (no public exposure needed)
// DATABASE_URL=postgres://user:pass@my-app-db.internal:5432/mydb
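One caveat: `.internal` addresses bypass the Fly proxy, so auto-stopped machines will not wake for internal traffic (Fly's `.flycast` private addresses go through the proxy and do wake them). Even for always-running services, internal calls can transiently fail during a rolling deploy, so they benefit from a small retry. A sketch — `withRetry` and the backoff numbers are illustrative, not a Fly API:

```typescript
// Retry an async operation with exponential backoff — useful for
// internal calls that race a rolling deploy or a restarting machine.
export async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        // 100ms, 200ms, 400ms, ... between attempts
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// Usage with the internal call above:
//   const result = await withRetry(() => verifyToken(token));
```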

Deploy and Release Commands

// fly.toml — run migrations before starting the app
// [deploy]
//   release_command = "npx prisma migrate deploy"
//   strategy = "rolling"

// TCP-level check (for apps using [[services]] sections)
// [[services.tcp_checks]]
//   interval = "10s"
//   timeout = "3s"

// Deployment strategies
// $ fly deploy --strategy rolling     # one machine at a time (default)
// $ fly deploy --strategy immediate   # replace all at once
// $ fly deploy --strategy canary      # boot a single new machine first
// $ fly deploy --strategy bluegreen   # boot all new machines, then switch over

Process Groups (Multiple Services)

// fly.toml — run different processes in separate machine groups
// [processes]
//   web = "node dist/server.js"
//   worker = "node dist/worker.js"
//
// [http_service]
//   processes = ["web"]
//   internal_port = 3000
//
// [[vm]]
//   processes = ["web"]
//   cpus = 1
//   memory_mb = 512
//
// [[vm]]
//   processes = ["worker"]
//   cpus = 2
//   memory_mb = 1024
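When Fly stops a machine (deploys, auto-stop, scaling down), it sends the configured `kill_signal` (SIGINT by default) and waits `kill_timeout` (5 seconds by default) before force-killing the process, so the worker group above should finish its in-flight job and exit rather than dying mid-job. A minimal sketch of that drain-on-shutdown shape — the job-pulling `step` function is yours to supply:

```typescript
// src/worker.ts — drain-on-shutdown loop for the "worker" process group.
export async function runLoop(
  step: () => Promise<void>,       // pull and process one job
  isShuttingDown: () => boolean,   // flips when a stop signal arrives
): Promise<void> {
  while (!isShuttingDown()) {
    await step();                  // finish the in-flight job before re-checking
  }
}

// Wiring: flip a flag on the signals Fly sends, then exit cleanly.
//   let stopping = false;
//   process.on("SIGINT", () => { stopping = true; });
//   process.on("SIGTERM", () => { stopping = true; });
//   runLoop(processNextJob, () => stopping).then(() => process.exit(0));
```

Jobs that may run longer than 5 seconds need a larger `kill_timeout` in fly.toml, or they risk being force-killed mid-job anyway.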

Best Practices

  • Use fly-replay for write requests in multi-region — replay non-GET requests to the primary region where the database lives to avoid consistency issues.
  • Enable auto-stop and auto-start — machines scale to zero when idle and wake on incoming requests, saving costs during low traffic.
  • Use release commands for migrations — run prisma migrate deploy or equivalent as a release command so migrations run once per deploy, not per machine.
  • Set meaningful health checks — health endpoints should verify database connectivity and downstream dependencies, not just return 200.
  • Use internal networking — communicate between services over .internal DNS instead of the public internet for lower latency and better security.
  • Pin Dockerfile base images — use node:20.11-slim not node:latest to ensure reproducible builds.
  • Use multi-stage Docker builds — separate dependency installation, build, and runtime stages to produce smaller, more secure images.
  • Monitor with fly logs and fly status — check real-time logs and machine status during and after deployments.

Anti-Patterns

  • Running a database without volumes — container filesystems are ephemeral. Always attach a volume for any persistent data, or use managed Postgres.
  • Deploying to many regions without considering data locality — spreading compute globally while the database sits in one region adds latency to every write. Use read replicas or fly-replay.
  • Ignoring the fly-replay header for writes — without replaying writes to the primary region, you risk writing to a read replica or hitting stale data.
  • Using large Docker images — bloated images slow deployments and increase cold start times. Strip dev dependencies and use slim base images.
  • Not configuring min_machines_running — setting it to 0 means cold starts for the first request after idle. Set to 1 for always-on availability.
  • Hardcoding regions — use FLY_REGION and PRIMARY_REGION environment variables to make region-aware logic portable.
  • Skipping connection pooling — Fly machines can scale rapidly, overwhelming the database. Always use connection pools with sensible limits.
