Load Balancing
Load balancing patterns, algorithms, and tools for distributing traffic across backend services
You are an expert in load balancing patterns and tools for building reliable networked systems.
Overview
Load balancing distributes incoming network traffic across multiple backend servers to ensure no single server is overwhelmed, improving availability, throughput, and fault tolerance. This skill covers load balancing algorithms, L4 vs L7 balancing, health checks, session persistence, and configuration for common tools (HAProxy, Nginx, AWS ALB/NLB, Envoy).
Core Concepts
L4 vs L7 Load Balancing
 L4 (Transport Layer)       L7 (Application Layer)
┌─────────────────────┐    ┌─────────────────────┐
│ Operates on TCP/UDP │    │ Operates on HTTP    │
│ Fast, low overhead  │    │ Content-aware       │
│ No request inspect  │    │ Path/header routing │
│ Connection-based    │    │ SSL termination     │
│ e.g., AWS NLB       │    │ e.g., AWS ALB, Nginx│
└─────────────────────┘    └─────────────────────┘
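The difference in what each layer can see may be easier to follow in code. The following is a minimal Python sketch, not the behavior of any specific product: the function names, pool names, and addresses are hypothetical. The L4-style picker only has connection metadata to work with, while the L7-style picker can look at the parsed request.

```python
import hashlib


def pick_l4(src_ip: str, src_port: int, backends: list[str]) -> str:
    """L4-style choice: only connection metadata is visible, so hash the source address."""
    key = f"{src_ip}:{src_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return backends[digest % len(backends)]


def pick_l7_pool(path: str, headers: dict[str, str]) -> str:
    """L7-style choice: the request is parsed, so routing can depend on its content."""
    if path.startswith("/api"):
        return "api_servers"      # hypothetical pool name
    if headers.get("Upgrade", "").lower() == "websocket":
        return "ws_servers"       # hypothetical pool name
    return "web_servers"          # hypothetical default pool
```

An L4 balancer could not implement `pick_l7_pool` because it never parses the HTTP request; that parsing is also where the extra L7 overhead comes from.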
Load Balancing Algorithms
| Algorithm | Behavior | Best For |
|---|---|---|
| Round Robin | Sequential rotation | Equal-capacity servers |
| Weighted Round Robin | Proportional distribution | Mixed-capacity servers |
| Least Connections | Route to server with fewest active connections | Varying request durations |
| IP Hash | Consistent mapping of client IP to server | Simple session affinity |
| Random | Random server selection | Large server pools |
| Least Response Time | Route to fastest-responding server | Latency-sensitive apps |
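As a rough illustration of how several of these algorithms choose a server, here is a small Python sketch. It is a simplification under stated assumptions: the class and function names are hypothetical, the weighted variant simply repeats servers in the rotation, and real load balancers usually interleave weighted picks more smoothly and track connection counts internally.

```python
import hashlib
import itertools


class RoundRobin:
    """Sequential rotation over a fixed list of servers."""

    def __init__(self, servers: list[str]):
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


def weighted_rotation(weights: dict[str, int]) -> list[str]:
    """Naive weighted round robin: repeat each server `weight` times in the rotation."""
    return [server for server, weight in weights.items() for _ in range(weight)]


def least_connections(active: dict[str, int]) -> str:
    """Pick the server with the fewest active connections (first one wins a tie)."""
    return min(active, key=active.get)


def ip_hash(client_ip: str, servers: list[str]) -> str:
    """Map a client IP to the same server for as long as the server list is unchanged."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Passing `weighted_rotation({"a": 3, "b": 1})` to `RoundRobin` gives server `a` three of every four requests. Note that `ip_hash` as written remaps most clients when the server list changes, which is why it is listed for simple session affinity only.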
Health Checks
Active Health Check:
    LB ──HTTP GET /health──→ Backend
    LB ←── 200 OK ────────── Backend → mark healthy
    LB ←── 503 / timeout ─── Backend → mark unhealthy after N failures

Passive Health Check:
    LB monitors real traffic responses
    Too many 5xx errors → mark unhealthy
    Successful responses resume → mark healthy
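The "after N failures" rule is what keeps a backend from flapping on a single bad response. One way to sketch it in Python is a small state tracker with separate thresholds for leaving and re-entering rotation. The class name and the default thresholds (3 failures, 2 successes) are assumptions chosen for illustration, not values mandated by any tool.

```python
from dataclasses import dataclass


@dataclass
class HealthState:
    fall: int = 3         # consecutive failures before marking unhealthy
    rise: int = 2         # consecutive successes before marking healthy again
    healthy: bool = True
    _streak: int = 0      # consecutive results that contradict the current state

    def record(self, ok: bool) -> bool:
        """Record one check result and return the resulting health status."""
        if ok == self.healthy:
            # Result agrees with the current state, so any contrary streak is broken.
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.rise if ok else self.fall
            if self._streak >= threshold:
                self.healthy = ok
                self._streak = 0
        return self.healthy
```

The same tracker works for active checks (feed it probe results) and passive checks (feed it whether each real response was a 5xx).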
Implementation Patterns
Nginx Load Balancer
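upstream api_backends {
    least_conn;                        # fewest active connections, adjusted by weight

    # Passive health checks: after max_fails failed attempts within fail_timeout,
    # the server is skipped for the next fail_timeout period.
    server 10.0.1.10:8080 weight=3 max_fails=3 fail_timeout=30s;
    server 10.0.1.11:8080 weight=2 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:8080 weight=1 max_fails=3 fail_timeout=30s;
    server 10.0.1.13:8080 backup;      # only used when all others are down

    keepalive 32;                      # idle persistent connections to backends, per worker
}

server {
    listen 443 ssl;
    server_name api.example.com;

    # "listen ... ssl" needs a certificate and key; these paths are placeholders.
    ssl_certificate     /etc/nginx/certs/api.example.com.crt;
    ssl_certificate_key /etc/nginx/certs/api.example.com.key;

    location / {
        proxy_pass http://api_backends;

        # Upstream keepalive requires HTTP/1.1 and a cleared Connection header.
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;

        # Retry once on another backend for connection errors, timeouts, 502, and 503.
        proxy_next_upstream error timeout http_502 http_503;
        proxy_next_upstream_tries 2;
    }

    # Health endpoint of the load balancer itself, not of the backends.
    location /health {
        access_log off;
        return 200 "OK";
    }
}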
HAProxy Configuration
global
    maxconn 50000
    log stdout format raw local0

defaults
    mode http
    log global                  # use the log target from the global section
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    timeout tunnel 1h           # example value; covers upgraded connections such as WebSocket
    option httplog
    option dontlognull
    retries 3

frontend http_front
    bind *:443 ssl crt /etc/haproxy/certs/example.pem

    # Content-based routing; the first matching use_backend rule wins
    acl is_api path_beg /api
    acl is_websocket hdr(Upgrade) -i websocket
    use_backend api_servers if is_api
    use_backend ws_servers if is_websocket
    default_backend web_servers

backend api_servers
    balance leastconn
    # Active health check: probe every 5s, down after 3 failures, up after 2 successes
    option httpchk GET /health
    http-check expect status 200
    server api1 10.0.1.10:8080 check inter 5s fall 3 rise 2 weight 100
    server api2 10.0.1.11:8080 check inter 5s fall 3 rise 2 weight 100
    server api3 10.0.1.12:8080 check inter 5s fall 3 rise 2 weight 50

backend web_servers
    balance roundrobin
    # Cookie-based session persistence
    cookie SERVERID insert indirect nocache
    server web1 10.0.2.10:3000 check cookie web1
    server web2 10.0.2.11:3000 check cookie web2

backend ws_servers
    balance source              # hash of the client IP keeps a client on one server
    server ws1 10.0.3.10:8080 check
    server ws2 10.0.3.11:8080 check

listen stats
    bind *:9000
    stats enable
    stats uri /stats
    stats auth admin:secret     # example credentials; replace and restrict access
AWS ALB with Terraform
resource "aws_lb" "api" {
  name               = "api-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = var.public_subnet_ids

  enable_deletion_protection = true
}

resource "aws_lb_target_group" "api" {
  name     = "api-targets"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  # Healthy after 2 passing checks, unhealthy after 3 failing checks, probed every 15s
  health_check {
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    interval            = 15
    timeout             = 5
    matcher             = "200"
  }

  # Cookie-based session affinity for one hour
  stickiness {
    type            = "lb_cookie"
    cookie_duration = 3600
    enabled         = true
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.api.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}

# Path-based routing
# aws_lb_target_group.api_v2 is assumed to be defined elsewhere, like "api" above.
resource "aws_lb_listener_rule" "api_v2" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 100

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api_v2.arn
  }

  condition {
    path_pattern {
      values = ["/api/v2/*"]
    }
  }
}
Best Practices
- Always configure health checks with appropriate intervals and thresholds — unhealthy backends should be removed from rotation quickly but not flap on transient errors.
- Use connection draining (deregistration delay) so in-flight requests complete gracefully before a backend is removed during deployments or scale-in events.
- Terminate TLS at the load balancer and use plain HTTP to backends over a private network — this simplifies certificate management and offloads CPU-intensive encryption from application servers.
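Connection draining can be sketched as a small state machine: once draining starts, the backend receives no new requests, and it is removed either when in-flight requests reach zero or when a deadline passes. The Python below is an illustrative model only. The class name, the 30-second default, and the `now` parameter (added so the logic can be exercised without waiting) are assumptions, not part of any load balancer's API.

```python
import time
from typing import Optional


class DrainingBackend:
    def __init__(self, name: str, drain_timeout: float = 30.0):
        self.name = name
        self.drain_timeout = drain_timeout          # seconds to wait for in-flight requests
        self.in_flight = 0
        self._drain_started: Optional[float] = None  # set when draining begins

    def accepts_new_requests(self) -> bool:
        return self._drain_started is None

    def start_request(self) -> None:
        if not self.accepts_new_requests():
            raise RuntimeError(f"{self.name} is draining")
        self.in_flight += 1

    def finish_request(self) -> None:
        self.in_flight -= 1

    def start_draining(self, now: Optional[float] = None) -> None:
        self._drain_started = time.monotonic() if now is None else now

    def safe_to_remove(self, now: Optional[float] = None) -> bool:
        """True once draining has started and requests finished or the deadline passed."""
        if self._drain_started is None:
            return False
        now = time.monotonic() if now is None else now
        return self.in_flight == 0 or now - self._drain_started >= self.drain_timeout
```

The timeout is a trade-off: too short cuts off slow requests, too long delays deployments and scale-in.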
Common Pitfalls
- Sticky sessions without fallback: If a server goes down, all sessions pinned to it are lost. Design applications to be stateless or store session data externally (Redis, database) so any backend can serve any request.
- Ignoring connection limits: If your backend servers have limited connection capacity (e.g., database connection pools), a load balancer that forwards an unlimited number of connections will exhaust those resources. Set maxconn per server so that excess requests wait in the load balancer's queue instead of overloading the backend.
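The effect of a per-server connection cap with queueing can be modeled in a few lines of Python. This is a simplified sketch under assumptions: the class name, the `max_queue` bound, and the choice to raise an error when the queue is full are all hypothetical, and a real load balancer would also apply a queue timeout.

```python
from collections import deque
from typing import Optional


class CappedBalancer:
    def __init__(self, maxconn: dict[str, int], max_queue: int = 100):
        self.maxconn = maxconn                       # per-server connection cap
        self.active = {name: 0 for name in maxconn}  # current connections per server
        self.queue: deque[str] = deque()             # requests waiting for a free slot
        self.max_queue = max_queue

    def _pick(self) -> Optional[str]:
        """Least-loaded server that is still under its cap, or None if all are full."""
        candidates = [s for s in self.maxconn if self.active[s] < self.maxconn[s]]
        return min(candidates, key=lambda s: self.active[s]) if candidates else None

    def submit(self, request_id: str) -> Optional[str]:
        """Return the chosen server, or None if the request was queued."""
        server = self._pick()
        if server is None:
            if len(self.queue) >= self.max_queue:
                raise OverflowError("queue full; reject the request, e.g. with a 503")
            self.queue.append(request_id)
            return None
        self.active[server] += 1
        return server

    def release(self, server: str) -> Optional[tuple[str, str]]:
        """Finish one request on `server`; give the freed slot to the oldest queued request."""
        self.active[server] -= 1
        if self.queue:
            request_id = self.queue.popleft()
            self.active[server] += 1
            return request_id, server
        return None
```

With caps of one connection each on two servers, a third request waits in the queue until one of the first two finishes, so the backends never see more connections than they can handle.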
Anti-Patterns
Over-engineering for hypothetical requirements. Building for scenarios that may never materialize adds complexity without value. Solve the problem in front of you first.
Ignoring the existing ecosystem. Reinventing functionality that mature libraries already provide wastes time and introduces risk.
Premature abstraction. Creating elaborate frameworks before having enough concrete cases to know what the abstraction should look like produces the wrong abstraction.
Neglecting error handling at system boundaries. Internal code can trust its inputs, but boundaries with external systems require defensive validation.
Skipping documentation. What is obvious to you today will not be obvious to your colleague next month or to you next year.