
Load Balancing

Load balancing patterns, algorithms, and tools for distributing traffic across backend services


Load Balancing — Networking & Infrastructure

You are an expert in load balancing patterns and tools for building reliable networked systems.


Overview

Load balancing distributes incoming network traffic across multiple backend servers so that no single server is overwhelmed, which improves availability, throughput, and fault tolerance. This skill covers load balancing algorithms, L4 vs L7 balancing, health checks, session persistence, and example configurations for Nginx, HAProxy, and AWS ALB (via Terraform).

Core Concepts

L4 vs L7 Load Balancing

L4 (Transport Layer)                    L7 (Application Layer)
┌─────────────────────┐                 ┌─────────────────────┐
│ Operates on TCP/UDP │                 │ Operates on HTTP    │
│ Fast, low overhead  │                 │ Content-aware       │
│ No request inspect  │                 │ Path/header routing │
│ Connection-based    │                 │ SSL termination     │
│ e.g., AWS NLB       │                 │ e.g., AWS ALB, Nginx│
└─────────────────────┘                 └─────────────────────┘
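
The difference can be illustrated with a minimal Python sketch. It is not how any particular product is implemented: the `Connection` and `HttpRequest` types, the CRC32 hash, and the pool names are assumptions chosen for illustration. The L4 function sees only addresses and ports, while the L7 function can inspect the parsed request.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Connection:
    """What an L4 balancer can see: addresses and ports only (hypothetical type)."""
    src_ip: str
    src_port: int
    dst_port: int


@dataclass
class HttpRequest:
    """What an L7 balancer can also see after parsing HTTP (hypothetical type)."""
    path: str
    headers: dict = field(default_factory=dict)


def l4_pick(conn: Connection, backends: list[str]) -> str:
    # Hash the connection tuple so the whole flow maps to one backend.
    key = f"{conn.src_ip}:{conn.src_port}:{conn.dst_port}".encode()
    return backends[zlib.crc32(key) % len(backends)]


def l7_pick_pool(req: HttpRequest) -> str:
    # Content-aware routing: choose a pool from the path and headers.
    if req.path.startswith("/api"):
        return "api_servers"
    if req.headers.get("Upgrade", "").lower() == "websocket":
        return "ws_servers"
    return "web_servers"
```

The L7 decision needs the request to be parsed first, which is where the extra overhead relative to L4 comes from.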

Load Balancing Algorithms

| Algorithm | Behavior | Best For |
| --- | --- | --- |
| Round Robin | Sequential rotation | Equal-capacity servers |
| Weighted Round Robin | Proportional distribution | Mixed-capacity servers |
| Least Connections | Route to server with fewest active connections | Varying request durations |
| IP Hash | Consistent mapping of client IP to server | Simple session affinity |
| Random | Random server selection | Large server pools |
| Least Response Time | Route to fastest-responding server | Latency-sensitive apps |
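
As a rough illustration of the table above, the selection logic for most of these algorithms fits in a few lines of Python. This is a simplified sketch, not the code of any real balancer: the class names are hypothetical, the weighted variant simply repeats servers in the rotation (production balancers usually interleave picks more smoothly), and Least Response Time is omitted because it needs latency measurements.

```python
import itertools
import random
import zlib


class RoundRobin:
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin:
    """Naive weighting: each server appears `weight` times in the rotation."""

    def __init__(self, weights):  # weights: {server: positive int}
        expanded = [s for s, w in weights.items() for _ in range(w)]
        self._cycle = itertools.cycle(expanded)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        # Choose the server with the fewest in-flight requests.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes.
        self.active[server] -= 1


def ip_hash(client_ip, servers):
    # Same client IP maps to the same server while the pool is unchanged.
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


def pick_random(servers):
    return random.choice(servers)
```

Note that the modulo mapping in `ip_hash` remaps most clients whenever the pool size changes, which is one reason it gives only simple session affinity.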

Health Checks

Active Health Check:
  LB ──HTTP GET /health──→ Backend
  LB ←── 200 OK ──────── Backend    → mark healthy
  LB ←── 503 / timeout ─ Backend    → mark unhealthy after N failures

Passive Health Check:
  LB monitors real traffic responses
  Too many 5xx errors → mark unhealthy
  Successful responses resume → mark healthy
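
The "mark unhealthy after N failures" rule can be sketched as a small state machine. The following Python is an illustrative assumption rather than any balancer's actual code: `HealthTracker` is a hypothetical name, and the `fall` and `rise` parameters are borrowed from the HAProxy keywords used later in this document. Results can come from active probes or from passively observed responses.

```python
class HealthTracker:
    """Tracks one backend's health with fall/rise thresholds to avoid flapping."""

    def __init__(self, fall=3, rise=2, healthy=True):
        self.fall = fall          # consecutive failures needed to mark unhealthy
        self.rise = rise          # consecutive successes needed to mark healthy
        self.healthy = healthy
        self._streak = 0          # consecutive results contradicting current state

    def record(self, ok: bool) -> bool:
        """Record one check result and return the resulting health state."""
        if ok == self.healthy:
            # Result agrees with the current state: reset the streak.
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.rise if ok else self.fall
            if self._streak >= threshold:
                self.healthy = ok
                self._streak = 0
        return self.healthy
```

With `fall=3`, a single failed check followed by a success leaves the backend in rotation, so transient errors do not cause it to flap.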

Implementation Patterns

Nginx Load Balancer

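upstream api_backends {
    # Prefer the server with the fewest active connections (weights still apply).
    least_conn;

    # max_fails/fail_timeout act as a passive health check: after 3 failed
    # attempts within 30s, the server is skipped for the next 30s.
    server 10.0.1.10:8080 weight=3 max_fails=3 fail_timeout=30s;
    server 10.0.1.11:8080 weight=2 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:8080 weight=1 max_fails=3 fail_timeout=30s;
    server 10.0.1.13:8080 backup;  # only used when all others are down

    keepalive 32;  # idle persistent connections kept per worker process
}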

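server {
    listen 443 ssl;
    server_name api.example.com;

    # "listen ... ssl" requires a certificate; these paths are placeholders.
    ssl_certificate     /etc/nginx/certs/api.example.com.crt;
    ssl_certificate_key /etc/nginx/certs/api.example.com.key;

    location / {
        proxy_pass http://api_backends;

        # Needed for the upstream "keepalive" pool to be used:
        # HTTP/1.1 to the backend and a cleared Connection header.
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
        # Try one more backend on connection errors, timeouts, 502, or 503.
        proxy_next_upstream error timeout http_502 http_503;
        proxy_next_upstream_tries 2;
    }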

    location /health {
        access_log off;
        return 200 "OK";
    }
}

HAProxy Configuration

global
    maxconn 50000
    log stdout format raw local0

defaults
    mode http
    # Use the log target declared in the global section.
    log global
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    option httplog
    option dontlognull
    retries 3

frontend http_front
    bind *:443 ssl crt /etc/haproxy/certs/example.pem

    # Content-based routing (rules are evaluated in order; first match wins)
    acl is_api path_beg /api
    acl is_websocket hdr(Upgrade) -i websocket
    use_backend api_servers if is_api
    use_backend ws_servers if is_websocket
    # Single fallback for everything else
    default_backend web_servers

backend api_servers
    balance leastconn
    option httpchk GET /health
    http-check expect status 200

    server api1 10.0.1.10:8080 check inter 5s fall 3 rise 2 weight 100
    server api2 10.0.1.11:8080 check inter 5s fall 3 rise 2 weight 100
    server api3 10.0.1.12:8080 check inter 5s fall 3 rise 2 weight 50

backend web_servers
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server web1 10.0.2.10:3000 check cookie web1
    server web2 10.0.2.11:3000 check cookie web2

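backend ws_servers
    balance source
    # Long-lived WebSocket tunnels would otherwise be cut by the 30s
    # client/server timeouts; the 1h value is an illustrative assumption.
    timeout tunnel 1h
    server ws1 10.0.3.10:8080 check
    server ws2 10.0.3.11:8080 check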

listen stats
    bind *:9000
    stats enable
    stats uri /stats
    # Placeholder credentials: replace them and restrict access to this port.
    stats auth admin:secret
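
The cookie-based persistence configured for `web_servers` can be pictured with a short Python sketch. It is a simplified assumption about the general technique, not HAProxy's implementation: `StickyRouter` is a hypothetical class, and it reuses the `SERVERID` cookie name from the config above. A request carrying a cookie for a healthy server stays pinned; otherwise the router falls back to the next healthy server and issues a new cookie.

```python
import itertools


class StickyRouter:
    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)   # updated by health checks elsewhere
        self._rr = itertools.cycle(self.servers)

    def route(self, cookies):
        """Return (server, set_cookie); set_cookie is None if the pin is kept."""
        pinned = cookies.get("SERVERID")
        if pinned in self.healthy:
            return pinned, None
        # No cookie, or the pinned server is down: pick the next healthy one.
        for _ in range(len(self.servers)):
            candidate = next(self._rr)
            if candidate in self.healthy:
                return candidate, candidate
        raise RuntimeError("no healthy backends")
```

When the fallback path is taken, any session state held only in the failed server's memory is gone, which is why externally stored sessions matter.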

AWS ALB with Terraform

resource "aws_lb" "api" {
  name               = "api-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = var.public_subnet_ids

  enable_deletion_protection = true
}

resource "aws_lb_target_group" "api" {
  name     = "api-targets"
  port     = 8080
  protocol = "HTTP"
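  vpc_id   = var.vpc_id

  # Connection draining window in seconds (AWS default is 300).
  # The value here is illustrative; size it to your longest normal request.
  deregistration_delay = 30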

  health_check {
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    interval            = 15
    timeout             = 5
    matcher             = "200"
  }

  stickiness {
    type            = "lb_cookie"
    cookie_duration = 3600
    enabled         = true
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.api.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}

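# Path-based routing
# Note: aws_lb_target_group.api_v2 is not defined in this snippet. It is
# assumed to exist elsewhere, declared like aws_lb_target_group.api above.
resource "aws_lb_listener_rule" "api_v2" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 100

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api_v2.arn
  }

  condition {
    path_pattern {
      values = ["/api/v2/*"]
    }
  }
}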

Best Practices

- **Always configure health checks** with appropriate intervals and thresholds: unhealthy backends should be removed from rotation quickly but should not flap on transient errors.
- **Use connection draining** (deregistration delay) so in-flight requests complete gracefully before a backend is removed during deployments or scale-in events.
- **Terminate TLS at the load balancer** and use plain HTTP to backends over a private network. This simplifies certificate management and offloads CPU-intensive encryption from application servers.
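
Connection draining, mentioned in the list above, can be sketched in Python as follows. This is a single-threaded illustration under stated assumptions, not a production implementation: `DrainingBackend` and its methods are hypothetical names, and a real balancer would track in-flight requests concurrently.

```python
import time


class DrainingBackend:
    def __init__(self, name):
        self.name = name
        self.in_flight = 0
        self.draining = False

    def try_acquire(self):
        """Accept a new request unless the backend is draining."""
        if self.draining:
            return False
        self.in_flight += 1
        return True

    def release(self):
        """Mark one in-flight request as finished."""
        self.in_flight -= 1

    def drain(self, timeout_s, poll_s=0.01):
        """Stop new requests, then wait for in-flight ones up to timeout_s.

        Returns True if the backend drained cleanly, False on timeout.
        """
        self.draining = True
        deadline = time.monotonic() + timeout_s
        while self.in_flight > 0 and time.monotonic() < deadline:
            time.sleep(poll_s)
        return self.in_flight == 0
```

The timeout plays the same role as a deregistration delay: it bounds how long a deployment waits for slow requests before the backend is removed anyway.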

Common Pitfalls

- **Sticky sessions without fallback**: If a server goes down, all sessions pinned to it are lost. Design applications to be stateless, or store session data externally (Redis, database), so any backend can serve any request.
- **Ignoring connection limits**: If your backend servers have limited connection capacity (for example, a fixed-size database connection pool), a load balancer that opens connections without limit will exhaust those resources. Set a per-server connection limit (such as `maxconn` on an HAProxy `server` line) so excess requests wait in a queue at the load balancer instead of overloading the backend.
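
The per-server limit with queueing described in the second pitfall can be sketched in Python. The `LimitedBackend` class, its `max_queue` bound, and the returned status strings are hypothetical choices for illustration; real balancers expose this through their own settings and typically add a queue timeout as well.

```python
from collections import deque


class LimitedBackend:
    def __init__(self, name, maxconn, max_queue):
        self.name = name
        self.maxconn = maxconn        # concurrent requests the backend may hold
        self.max_queue = max_queue    # requests allowed to wait at the balancer
        self.active = set()
        self.queue = deque()

    def submit(self, request_id):
        """Return 'dispatched', 'queued', or 'rejected'."""
        if len(self.active) < self.maxconn:
            self.active.add(request_id)
            return "dispatched"
        if len(self.queue) < self.max_queue:
            self.queue.append(request_id)
            return "queued"
        return "rejected"

    def finish(self, request_id):
        """Complete a request and promote the oldest queued one, if any."""
        self.active.remove(request_id)  # raises KeyError for unknown ids
        if self.queue:
            promoted = self.queue.popleft()
            self.active.add(promoted)
            return promoted
        return None
```

Bounding the queue as well as the active set keeps latency predictable: once both are full, callers get a fast rejection rather than an unbounded wait.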

Anti-Patterns

Over-engineering for hypothetical requirements. Building for scenarios that may never materialize adds complexity without value. Solve the problem in front of you first.

Ignoring the existing ecosystem. Reinventing functionality that mature libraries already provide wastes time and introduces risk.

Premature abstraction. Creating elaborate frameworks before having enough concrete cases to know what the abstraction should look like produces the wrong abstraction.

Neglecting error handling at system boundaries. Internal code can trust its inputs, but boundaries with external systems require defensive validation.

Skipping documentation. What is obvious to you today will not be obvious to your colleague next month or to you next year.
