
Database Scaling

Database scaling patterns including sharding, replication, and read replicas for high-throughput systems


Database Scaling — System Design

You are an expert in Database Scaling for designing scalable distributed systems.

Core Philosophy

Database scaling should be driven by measured bottlenecks, not anticipated ones. The most expensive mistake in database architecture is sharding too early — it introduces permanent complexity, makes cross-entity queries harder, and locks you into a data distribution model that is painful to change. Vertical scaling and read replicas solve the majority of real-world scaling problems, and they should be exhausted before reaching for horizontal partitioning.

When horizontal scaling does become necessary, the shard key decision is the single most consequential choice in the system. It determines which queries are efficient (single-shard) and which are expensive (scatter-gather). This decision should be driven by actual query patterns, not theoretical uniformity of distribution.

Data gravity is real: once data is distributed across shards, every subsequent feature must account for that distribution. Schema changes, migrations, backups, and debugging all become harder. The goal is to delay this complexity as long as possible and, when it becomes unavoidable, to choose a partitioning strategy that aligns with the system's dominant access patterns.

Overview

Database scaling addresses the problem of a single database becoming a bottleneck as traffic, data volume, or query complexity grows. The two fundamental approaches are vertical scaling (bigger hardware) and horizontal scaling (more nodes). Horizontal scaling involves replication, read replicas, and sharding — each with distinct trade-offs in consistency, complexity, and operational cost.

Core Concepts

Scaling Strategies

Vertical Scaling:
[App] --> [Bigger DB Server (more CPU, RAM, SSD)]

Read Replicas:
[App Writes] --> [Primary DB] --replication--> [Replica 1] <-- [App Reads]
                              --replication--> [Replica 2] <-- [App Reads]

Sharding:
[App] --> [Router/Proxy]
              |
     +--------+--------+
     v        v        v
  [Shard 1] [Shard 2] [Shard 3]
  (users    (users    (users
   A-H)      I-P)      Q-Z)

Replication Models

  • Single-Leader: One primary accepts writes, replicas serve reads. Simple, but replication lag means reads may be stale.
  • Multi-Leader: Multiple nodes accept writes. Better write availability but introduces conflict resolution complexity.
  • Leaderless: Any node accepts reads and writes (Cassandra, DynamoDB). Uses quorum reads/writes for consistency.
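The quorum rule behind leaderless replication can be sketched in a few lines. This is an illustrative helper, not a client for any specific database: with N replicas, a write acknowledged by W nodes and a read querying R nodes are guaranteed to overlap in at least one node whenever R + W > N.

```python
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """Return True when read and write quorums must overlap in >= 1 node.

    n: total replicas; w: nodes that must ack a write; r: nodes queried
    per read. Overlap guarantees a read sees the latest acked write.
    """
    return r + w > n

# Common configuration: N=3, W=2, R=2 -> overlapping quorums.
print(is_strongly_consistent(3, 2, 2))  # True
# W=1, R=1 favors latency and availability but permits stale reads.
print(is_strongly_consistent(3, 1, 1))  # False
```

Note that the quorum condition guarantees overlap, not full linearizability; real systems layer read repair and anti-entropy on top.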

Sharding (Horizontal Partitioning)

Data is split across multiple database instances by a shard key. Each shard holds a subset of the data and handles queries for that subset independently.

  • Range-Based: Shard by value ranges (e.g., user IDs 1-1M on shard 1). Simple but can create hotspots.
  • Hash-Based: Hash the shard key and distribute by hash range. Even distribution but range queries span all shards.
  • Directory-Based: A lookup service maps keys to shards. Flexible but the directory is a single point of failure.
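A minimal hash-based router makes the trade-off above concrete. The key format and shard count here are assumptions for illustration; a stable hash (md5) is used instead of Python's built-in `hash()`, which is randomized per process.

```python
import hashlib

NUM_SHARDS = 4  # assumed fixed shard count

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index via a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# The same key always routes to the same shard...
assert shard_for("user:42") == shard_for("user:42")
# ...but a range scan over user IDs must touch every shard, which is the
# trade-off noted above for hash-based partitioning.
```

Plain modulo also means changing `NUM_SHARDS` remaps most keys; production systems use consistent hashing or pre-split hash ranges so resharding moves only a fraction of the data.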

Connection Pooling

Database connections are expensive. Use connection poolers (PgBouncer, ProxySQL) to multiplex application connections over a smaller set of database connections. Critical at scale where thousands of application instances may each want a connection.
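A toy pool illustrates what PgBouncer and ProxySQL do at much larger scale: multiplex many callers over a small, fixed set of connections. Here `sqlite3` stands in for a real database connection.

```python
import queue
import sqlite3
from contextlib import contextmanager

class ConnectionPool:
    """Fixed-size pool: callers borrow a connection and must return it."""

    def __init__(self, size: int):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets connections move between threads
            self._pool.put(sqlite3.connect(":memory:", check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._pool.get()   # blocks when all connections are in use
        try:
            yield conn
        finally:
            self._pool.put(conn)  # return to the pool instead of closing

pool = ConnectionPool(size=2)
with pool.connection() as conn:
    assert conn.execute("SELECT 1").fetchone() == (1,)
```

A real pooler adds health checks, idle timeouts, and (in PgBouncer's transaction mode) reassigns a server connection per transaction rather than per client session.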

Implementation Patterns

Read/Write Splitting

Route write queries to the primary and read queries to replicas. Use application-level routing or a proxy (ProxySQL, HAProxy). Account for replication lag — critical reads (e.g., immediately after a write) should go to the primary.
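Application-level routing can be sketched as below. The naive statement classification and endpoint names are illustrative, not any proxy's API; the key point is the `critical_read` escape hatch for read-your-writes paths.

```python
import itertools

class Router:
    """Send writes (and lag-sensitive reads) to the primary, other reads
    to replicas via naive round-robin."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def route(self, sql: str, critical_read: bool = False):
        # Naive classification: anything that isn't SELECT is a write.
        is_write = sql.lstrip().split(None, 1)[0].upper() != "SELECT"
        if is_write or critical_read:
            return self.primary          # avoid replication-lag staleness
        return next(self._replicas)

router = Router(primary="primary-db", replicas=["replica-1", "replica-2"])
assert router.route("INSERT INTO users VALUES (1)") == "primary-db"
assert router.route("SELECT * FROM users", critical_read=True) == "primary-db"
assert router.route("SELECT * FROM users") in ("replica-1", "replica-2")
```

Real classification must also handle CTEs, `EXPLAIN`, and functions with side effects, which is one reason proxies like ProxySQL use explicit routing rules instead of parsing heuristics like this one.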

Shard Key Selection

The shard key determines data distribution and query routing. A good shard key has high cardinality, even distribution, and aligns with the most common query patterns. Poor shard key choices lead to hotspots or cross-shard queries.
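One cheap way to evaluate a candidate shard key before committing to it is to measure how real keys would spread across shards. The column names and data below are invented for illustration; skew in this histogram means hotspots later.

```python
from collections import Counter
import hashlib

def stable_shard(key: str, num_shards: int) -> int:
    """Stable hash -> shard index (md5, unlike the salted built-in hash)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big") % num_shards

def distribution(keys, num_shards=4):
    """Rows per shard for a candidate shard key."""
    return Counter(stable_shard(k, num_shards) for k in keys)

# Hypothetical dataset: 90% of users share one country code.
rows = [{"user_id": str(i), "country": "US" if i % 10 else "DE"}
        for i in range(1000)]

# Low cardinality (two distinct values) -> at most two shards ever used.
print(distribution(r["country"] for r in rows))
# High cardinality -> rows spread across all shards.
print(distribution(r["user_id"] for r in rows))
```

Distribution is necessary but not sufficient: the winning key must also keep the most frequent queries on a single shard, per the guidance above.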

Cross-Shard Queries and Joins

Queries that span multiple shards require scatter-gather: send the query to all relevant shards, collect results, merge. This is expensive. Design the schema and shard key to minimize cross-shard operations.
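The scatter-gather pattern can be sketched with invented shard contents: fan the query out to every shard, collect partial results, then merge and re-sort in the application — work a single-node database would do internally for free.

```python
# Three shards, each holding a disjoint subset of rows (contents invented).
shards = [
    [{"user": "alice", "score": 90}, {"user": "bob", "score": 75}],
    [{"user": "carol", "score": 88}],
    [{"user": "dave", "score": 95}],
]

def top_n(n: int):
    """Global top-N: scatter to every shard, gather, merge-sort, truncate."""
    partials = []
    for shard in shards:  # scatter: each shard answers for its own rows
        partials.extend(sorted(shard, key=lambda r: -r["score"])[:n])
    # gather: merge the per-shard top-N lists into a global top-N
    return sorted(partials, key=lambda r: -r["score"])[:n]

print([r["user"] for r in top_n(2)])  # ['dave', 'alice']
```

Note the cost profile: latency is set by the slowest shard, and every shard pays the query cost even when it contributes nothing to the final result.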

Online Schema Migration

At scale, ALTER TABLE can lock a table for hours. Tools like gh-ost (GitHub), pt-online-schema-change (Percona), or LHM create a shadow table, copy data incrementally, then swap. Plan schema changes as carefully as code changes.
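The shadow-table approach these tools automate can be demonstrated in miniature with sqlite. This sketch omits the hard part — replaying writes that land during the copy, which gh-ost does via binlog streaming and pt-online-schema-change via triggers — but shows the create/backfill/swap sequence.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO users (name) VALUES (?)", [("a",), ("b",), ("c",)])

# 1. Shadow table with the new schema (an added column).
db.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

# 2. Backfill in small keyset-paginated batches to avoid long locks.
BATCH, last_id = 2, 0
while True:
    batch = db.execute(
        "SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, BATCH)).fetchall()
    if not batch:
        break
    db.executemany("INSERT INTO users_new (id, name) VALUES (?, ?)", batch)
    last_id = batch[-1][0]

# 3. Swap names so readers see the new schema; keep the old table as a fallback.
db.execute("ALTER TABLE users RENAME TO users_old")
db.execute("ALTER TABLE users_new RENAME TO users")
assert db.execute("SELECT COUNT(*) FROM users").fetchone() == (3,)
```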

Vitess / Citus / CockroachDB

Middleware and NewSQL databases add sharding capabilities on top of existing databases (Vitess for MySQL, Citus for PostgreSQL) or build it natively (CockroachDB, TiDB). These reduce the operational burden of manual sharding.

Trade-offs

Factor               Read Replicas        Sharding            NewSQL
Write scaling        No (single primary)  Yes                 Yes
Read scaling         Yes                  Yes                 Yes
Complexity           Low                  High                Moderate
Cross-shard queries  N/A                  Expensive           Handled internally
Consistency          Eventual (replicas)  Per-shard strong    Configurable
Operational burden   Low                  High (rebalancing)  Moderate

Start with vertical scaling and read replicas. Move to sharding only when write throughput or data volume demands it. Consider NewSQL if you want horizontal scaling with less operational overhead.

Best Practices

  • Exhaust vertical scaling and read replicas before sharding — sharding introduces permanent architectural complexity that is very difficult to undo.
  • Choose the shard key based on query patterns, not just even distribution; the most frequent queries should be servable from a single shard.
  • Monitor replication lag and set up alerts; stale reads from lagging replicas are a common source of subtle bugs in read-heavy systems.
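One common lag-measurement technique, sketched with assumed names and thresholds: write a timestamp heartbeat on the primary, read the last-applied heartbeat on each replica, and alert when the gap exceeds a budget. (PostgreSQL exposes equivalents natively, e.g. `pg_last_xact_replay_timestamp()` on a standby.)

```python
import time

LAG_BUDGET_SECONDS = 5.0  # assumed alert threshold

def replication_lag(primary_heartbeat: float, replica_heartbeat: float) -> float:
    """Seconds the replica's last-applied heartbeat trails the primary's."""
    return max(0.0, primary_heartbeat - replica_heartbeat)

def should_alert(lag_seconds: float) -> bool:
    return lag_seconds > LAG_BUDGET_SECONDS

now = time.time()
assert not should_alert(replication_lag(now, now - 1.0))   # 1 s lag: fine
assert should_alert(replication_lag(now, now - 30.0))      # 30 s lag: alert
```

The `max(0.0, ...)` clamp matters in practice: clock skew between hosts can make a healthy replica appear to lead the primary.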

Common Pitfalls

  • Selecting a shard key with low cardinality (e.g., country code) that creates a few massive shards and many empty ones, defeating the purpose of horizontal scaling.
  • Ignoring the need for a resharding strategy — as data grows, the initial shard count becomes insufficient, and rebalancing without downtime requires careful planning from the start.

Anti-Patterns

  • Premature Sharding: Sharding a database before exhausting vertical scaling and read replicas. This introduces permanent operational complexity for a scaling problem that may not yet exist.

  • Cross-Shard Joins as Standard Practice: Designing schemas that routinely require scatter-gather queries across all shards. If most queries cannot be satisfied by a single shard, the shard key is wrong.

  • Shared Database Between Services: Multiple services reading and writing to the same database tables. This creates hidden coupling and makes independent scaling impossible — one service's query patterns degrade another's performance.

  • Ignoring Replication Lag: Reading from replicas immediately after a write and expecting consistent results. Critical read-after-write paths must be routed to the primary, or the application must tolerate staleness explicitly.

  • Manual Shard Routing in Application Code: Scattering shard-selection logic throughout the codebase instead of centralizing it in a routing layer or using a sharding middleware. This makes shard rebalancing a codebase-wide change.
