# Fly.io Deployment

Fly.io platform expertise — container deployment, global edge distribution, Dockerfiles, volumes, secrets, scaling, PostgreSQL, and multi-region patterns
## Core Philosophy
Fly.io runs containers on bare-metal servers distributed across 30+ regions worldwide. Instead of abstracting away infrastructure, it gives you full control over a lightweight VM (Firecracker) running your Docker image. The platform excels at placing your application close to users — deploy once and replicate globally. Persistent volumes, private networking between apps, and managed Postgres make it a full-stack platform. The key insight: treat your app as a globally distributed system from day one.
## Setup

### Project Initialization

```typescript
// Install flyctl CLI
// $ curl -L https://fly.io/install.sh | sh
// $ fly auth login

// Launch a new app (interactive)
// $ fly launch

// fly.toml — generated configuration
// app = "my-app"
// primary_region = "iad"
//
// [build]
//   dockerfile = "Dockerfile"
//
// [env]
//   NODE_ENV = "production"
//   PORT = "3000"
//
// [http_service]
//   internal_port = 3000
//   force_https = true
//   auto_stop_machines = true
//   auto_start_machines = true
//   min_machines_running = 1
//
// [[vm]]
//   cpu_kind = "shared"
//   cpus = 1
//   memory_mb = 512
```
### Dockerfile for Node.js/TypeScript

```typescript
// Dockerfile — multi-stage build for production
// FROM node:20-slim AS base
// RUN apt-get update && apt-get install -y openssl && rm -rf /var/lib/apt/lists/*
// WORKDIR /app
//
// FROM base AS deps
// COPY package.json package-lock.json ./
// RUN npm ci --omit=dev
//
// FROM base AS build
// COPY package.json package-lock.json ./
// RUN npm ci
// COPY . .
// RUN npm run build
//
// FROM base AS runtime
// COPY --from=deps /app/node_modules ./node_modules
// COPY --from=build /app/dist ./dist
// COPY --from=build /app/package.json ./
//
// ENV NODE_ENV=production
// EXPOSE 3000
// CMD ["node", "dist/server.js"]
```
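The build stage runs `COPY . .`, so everything in the project directory ships to the builder. A `.dockerignore` file (a typical sketch — adjust entries to your project) keeps `node_modules`, build output, and local env files out of the build context, which speeds up `fly deploy` and avoids baking local secrets into image layers:

```
node_modules
dist
.git
.env
npm-debug.log
```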
### Secrets Management

```typescript
// Set secrets (encrypted, injected as env vars at runtime)
// $ fly secrets set DATABASE_URL="postgres://..." JWT_SECRET="supersecret"
// $ fly secrets list
// $ fly secrets unset OLD_SECRET

// Access in application code — they appear as standard env vars
// src/config.ts
export const config = {
  databaseUrl: process.env.DATABASE_URL!,
  jwtSecret: process.env.JWT_SECRET!,
  region: process.env.FLY_REGION ?? "unknown",
  machineId: process.env.FLY_MACHINE_ID ?? "local",
  appName: process.env.FLY_APP_NAME ?? "dev",
};
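The non-null assertions above (`process.env.DATABASE_URL!`) only silence the type checker — they still pass `undefined` through at runtime if a secret was never set. A small guard that fails fast at boot is safer; a sketch, where `requireEnv` is a name of our choosing:

```typescript
// src/env.ts — throw at startup if a required secret is missing,
// instead of failing later with an undefined connection string
export function requireEnv(name: string): string {
  const value = process.env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Used as `databaseUrl: requireEnv("DATABASE_URL")`, a missing secret crashes the machine at boot, where `fly logs` makes the cause obvious, rather than on the first query.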
## Key Techniques

### Multi-Region Deployment

```typescript
// Deploy to multiple regions
// $ fly regions set iad cdg nrt sin
// $ fly scale count 2 --region iad
// $ fly scale count 1 --region cdg,nrt,sin

// fly.toml — multi-region configuration
// [http_service]
//   internal_port = 3000
//   force_https = true
//   auto_stop_machines = true
//   auto_start_machines = true
//   min_machines_running = 1
//
//   [[http_service.checks]]
//     interval = "15s"
//     timeout = "5s"
//     path = "/health"
```
```typescript
// src/middleware/region.ts — region-aware request handling
import { Request, Response, NextFunction } from "express";

export function regionMiddleware(req: Request, res: Response, next: NextFunction) {
  const flyRegion = process.env.FLY_REGION ?? "local";
  const primaryRegion = process.env.PRIMARY_REGION ?? "iad";
  res.setHeader("X-Fly-Region", flyRegion);

  // Replay write requests to the primary region; the Fly proxy
  // intercepts the fly-replay header and re-runs the request there
  if (req.method !== "GET" && flyRegion !== primaryRegion) {
    res.setHeader("fly-replay", `region=${primaryRegion}`);
    res.status(409).send("Replaying to primary region");
    return;
  }
  next();
}
```
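The replay decision inside the middleware can be factored into a pure function so it can be unit-tested without Express. A sketch — `shouldReplay` is an illustrative name, not a Fly.io API:

```typescript
// Pure replay decision: only mutating requests issued outside
// the primary region need to be replayed there
export function shouldReplay(
  method: string,
  currentRegion: string,
  primaryRegion: string,
): boolean {
  return method !== "GET" && currentRegion !== primaryRegion;
}
```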
### Volumes (Persistent Storage)

```typescript
// Create and attach a volume
// $ fly volumes create data --region iad --size 10
// $ fly volumes list

// fly.toml — mount the volume
// [mounts]
//   source = "data"
//   destination = "/data"
```
```typescript
// src/storage.ts — use the mounted volume
import { readFile, writeFile, mkdir } from "fs/promises";
import { join } from "path";

const DATA_DIR = "/data";

export async function saveUpload(filename: string, content: Buffer): Promise<string> {
  const uploadsDir = join(DATA_DIR, "uploads");
  await mkdir(uploadsDir, { recursive: true });
  const filepath = join(uploadsDir, filename);
  await writeFile(filepath, content);
  return filepath;
}

export async function readUpload(filename: string): Promise<Buffer> {
  return readFile(join(DATA_DIR, "uploads", filename));
}
```
### Managed PostgreSQL

```typescript
// Create a Postgres cluster
// $ fly postgres create --name my-app-db --region iad --vm-size shared-cpu-1x
// $ fly postgres attach my-app-db --app my-app
// This sets DATABASE_URL automatically as a secret

// Add read replicas for multi-region reads
// $ fly postgres create --name my-app-db-replica --region cdg --vm-size shared-cpu-1x
```
```typescript
// src/db.ts — connection with read replica awareness
import { Pool } from "pg";

const primaryPool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
  idleTimeoutMillis: 30000,
});

const replicaPool = process.env.READ_REPLICA_URL
  ? new Pool({
      connectionString: process.env.READ_REPLICA_URL,
      max: 20,
      idleTimeoutMillis: 30000,
    })
  : primaryPool;

export async function query(sql: string, params: unknown[], readOnly = false) {
  const pool = readOnly ? replicaPool : primaryPool;
  const client = await pool.connect();
  try {
    return await client.query(sql, params);
  } finally {
    client.release();
  }
}
```
### Scaling Configuration

```typescript
// Vertical scaling
// $ fly scale vm shared-cpu-2x --memory 1024

// Horizontal scaling
// $ fly scale count 3 --region iad
// $ fly scale count 2 --region cdg

// Autoscaling via fly.toml
// [http_service]
//   auto_stop_machines = true
//   auto_start_machines = true
//   min_machines_running = 1
//
//   [http_service.concurrency]
//     type = "requests"
//     soft_limit = 150
//     hard_limit = 200
```
```typescript
// src/health.ts — health check endpoint for scaling decisions
import { Router } from "express";
import { query } from "./db";

const health = Router();

health.get("/health", async (req, res) => {
  try {
    // Verify database connectivity, not just process liveness
    await query("SELECT 1", []);
    res.json({
      status: "healthy",
      region: process.env.FLY_REGION,
      machine: process.env.FLY_MACHINE_ID,
      uptime: process.uptime(),
    });
  } catch (error) {
    res.status(503).json({ status: "unhealthy", error: String(error) });
  }
});

export default health;
```
### Private Networking

```typescript
// Apps in the same organization can communicate over private IPv6
// DNS format: <app-name>.internal

// src/services/internal-api.ts
const INTERNAL_BASE = "http://auth-service.internal:3000";

export async function verifyToken(token: string): Promise<{ userId: string }> {
  const response = await fetch(`${INTERNAL_BASE}/verify`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ token }),
  });
  if (!response.ok) throw new Error("Token verification failed");
  return response.json();
}

// Connect to internal Postgres (no public exposure needed)
// DATABASE_URL=postgres://user:pass@my-app-db.internal:5432/mydb
```
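Hardcoding `INTERNAL_BASE` works, but deriving internal URLs from configuration keeps service names out of the code. A tiny builder — a sketch where `internalUrl` is our own name, and `.internal` is Fly's private DNS suffix:

```typescript
// Build a private-network URL for a sibling Fly app
export function internalUrl(appName: string, port: number, path = ""): string {
  return `http://${appName}.internal:${port}${path}`;
}
```

For example: `const INTERNAL_BASE = internalUrl(process.env.AUTH_APP ?? "auth-service", 3000);`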
### Deploy and Release Commands

```typescript
// fly.toml — run migrations before starting the app
// [deploy]
//   release_command = "npx prisma migrate deploy"
//   strategy = "rolling"

// Custom health check
// [[services.tcp_checks]]
//   interval = "10s"
//   timeout = "3s"

// Deployment strategies
// $ fly deploy --strategy immediate   # replace all machines at once
// $ fly deploy --strategy rolling     # one-by-one (default)
// $ fly deploy --strategy canary      # deploy to one machine first
// $ fly deploy --strategy bluegreen   # boot a new set, then switch traffic over
```
### Process Groups (Multiple Services)

```typescript
// fly.toml — run different processes in separate machine groups
// [processes]
//   web = "node dist/server.js"
//   worker = "node dist/worker.js"
//
// [http_service]
//   processes = ["web"]
//   internal_port = 3000
//
// [[vm]]
//   processes = ["web"]
//   cpus = 1
//   memory_mb = 512
//
// [[vm]]
//   processes = ["worker"]
//   cpus = 2
//   memory_mb = 1024
```
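The `worker` group runs `dist/worker.js`, which this skill doesn't show. Its core is typically a loop that pulls jobs and stops cleanly on SIGTERM so Fly can recycle the machine without dropping work mid-job. A minimal synchronous sketch of that draining logic (all names illustrative):

```typescript
// src/worker-core.ts — process jobs until the queue is empty
// or a stop has been requested
export function drainJobs<T>(
  queue: T[],
  handle: (job: T) => void,
  shouldStop: () => boolean,
): number {
  let processed = 0;
  while (!shouldStop() && queue.length > 0) {
    handle(queue.shift() as T);
    processed++;
  }
  return processed;
}

// In a real worker entrypoint, shouldStop would flip on SIGTERM:
//   let stopping = false;
//   process.on("SIGTERM", () => { stopping = true; });
```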
## Best Practices

- **Use `fly-replay` for write requests in multi-region** — replay non-GET requests to the primary region where the database lives to avoid consistency issues.
- **Enable auto-stop and auto-start** — machines scale to zero when idle and wake on incoming requests, saving costs during low traffic.
- **Use release commands for migrations** — run `prisma migrate deploy` or equivalent as a release command so migrations run once per deploy, not per machine.
- **Set meaningful health checks** — health endpoints should verify database connectivity and downstream dependencies, not just return 200.
- **Use internal networking** — communicate between services over `.internal` DNS instead of the public internet for lower latency and better security.
- **Pin Dockerfile base images** — use `node:20.11-slim`, not `node:latest`, to ensure reproducible builds.
- **Use multi-stage Docker builds** — separate dependency installation, build, and runtime stages to produce smaller, more secure images.
- **Monitor with `fly logs` and `fly status`** — check real-time logs and machine status during and after deployments.
## Anti-Patterns

- **Running a database without volumes** — container filesystems are ephemeral. Always attach a volume for any persistent data, or use managed Postgres.
- **Deploying to many regions without considering data locality** — spreading compute globally while the database sits in one region adds latency to every write. Use read replicas or `fly-replay`.
- **Ignoring the `fly-replay` header for writes** — without replaying writes to the primary region, you risk writing to a read replica or hitting stale data.
- **Using large Docker images** — bloated images slow deployments and increase cold start times. Strip dev dependencies and use slim base images.
- **Not configuring `min_machines_running`** — setting it to 0 means cold starts for the first request after idle. Set it to 1 for always-on availability.
- **Hardcoding regions** — use the `FLY_REGION` and `PRIMARY_REGION` environment variables to make region-aware logic portable.
- **Skipping connection pooling** — Fly machines can scale rapidly, overwhelming the database. Always use connection pools with sensible limits.
## Related Skills

- **AWS Lightsail** — a simplified way to launch virtual private servers (VPS), containers, databases, and more. Ideal for developers and small businesses needing easy-to-use, cost-effective cloud resources without deep AWS expertise.
- **Cloudflare Pages Deployment** — Cloudflare Pages and Workers expertise: edge-first deployments, full-stack apps with Workers functions, KV/D1/R2 bindings, preview URLs, custom domains, and global CDN distribution.
- **Coolify Deployment** — Coolify self-hosted PaaS expertise: Docker-based deployments, Git integration, automatic SSL, database provisioning, server management; a Heroku/Netlify alternative on your own hardware.
- **DigitalOcean App Platform** — a fully managed PaaS for building, deploying, and scaling web applications, static sites, APIs, and background services. It integrates with DigitalOcean Managed Databases and Spaces for a streamlined, opinionated deployment experience within the DO ecosystem.
- **Google Cloud Run** — a fully managed serverless platform for containerized applications. Stateless containers scale automatically from zero to thousands of instances based on request load, paying only for resources consumed; a fit for microservices, web APIs, and event-driven functions that need custom runtimes.
- **Kamal** — Kamal (formerly MRSK) deploys web applications to servers over SSH, using Docker and Traefik (or Caddy) for zero-downtime rolling updates. Ideal for containerized apps on a single server or small cluster without Kubernetes complexity.