Skip to main content
Technology & EngineeringLocal First454 lines

crdt-fundamentals

Teaches Conflict-free Replicated Data Types (CRDTs), the mathematical foundation for local-first sync. Covers how CRDTs guarantee eventual consistency without coordination, the difference between state-based and operation-based CRDTs, and practical implementations of G-Counter, PN-Counter, LWW-Register, OR-Set, G-Set, and RGA (Replicated Growable Array). Includes causal ordering, vector clocks, and guidance on choosing the right CRDT for your data model.

Quick Summary18 lines
Conflict-free Replicated Data Types let distributed replicas converge to the same state without coordination.

## Key Points

- Last-write-wins discards one edit
- Manual conflict resolution burdens users
- Operational transform requires a central server
- All edits are preserved
- Merge is automatic
- No central server required
- States form a **join-semilattice** (a partial order with a least upper bound)
- The merge function is commutative, associative, and idempotent
1. **Tombstone bloat**: Deleted items leave tombstones that grow forever. Implement garbage collection with causal stability.
2. **Clock skew with LWW**: Wall-clock timestamps can cause data loss. Use Hybrid Logical Clocks.
3. **Metadata overhead**: CRDT metadata can exceed the size of the actual data. Choose compact representations.
4. **Not all data fits CRDTs**: Some operations (e.g., "set balance to exactly $100") are inherently non-mergeable. Use server authority for those.
skilldb get local-first-skills/crdt-fundamentalsFull skill: 454 lines
Paste into your CLAUDE.md or agent config

CRDT Fundamentals

Conflict-free Replicated Data Types let distributed replicas converge to the same state without coordination.


What CRDTs Are

A CRDT is a data structure that can be replicated across multiple devices, updated independently and concurrently, and merged automatically into a consistent state — without needing a central coordinator or consensus protocol.

The key guarantee: all replicas that have received the same set of updates will be in the same state, regardless of the order in which updates were received.

Device A: increment counter → value = 1
Device B: increment counter → value = 1
                    |
               Merge states
                    |
         Both devices: value = 2

No conflicts. No lost updates. No coordination needed.


Why CRDTs Matter for Local-First

In a local-first app, every device is a replica that can be modified independently. When devices reconnect, their states must merge. CRDTs make this merge automatic and lossless.

Without CRDTs:

  • Last-write-wins discards one edit
  • Manual conflict resolution burdens users
  • Operational transform requires a central server

With CRDTs:

  • All edits are preserved
  • Merge is automatic
  • No central server required

Two Flavors: State-Based vs Operation-Based

State-Based CRDTs (CvRDTs)

Replicas exchange their full state. A merge function combines two states into one.

Requirements:

  • States form a join-semilattice (a partial order with a least upper bound)
  • The merge function is commutative, associative, and idempotent
// State-based G-Counter: each replica tracks its own count
class GCounter {
  constructor(replicaId) {
    this.id = replicaId;
    this.counts = {}; // { replicaId: count }
  }

  increment() {
    this.counts[this.id] = (this.counts[this.id] || 0) + 1;
  }

  value() {
    return Object.values(this.counts).reduce((sum, n) => sum + n, 0);
  }

  // Merge: take the max of each replica's count
  merge(other) {
    const merged = new GCounter(this.id);
    const allKeys = new Set([
      ...Object.keys(this.counts),
      ...Object.keys(other.counts),
    ]);
    for (const key of allKeys) {
      merged.counts[key] = Math.max(
        this.counts[key] || 0,
        other.counts[key] || 0
      );
    }
    return merged;
  }
}

Operation-Based CRDTs (CmRDTs)

Replicas exchange individual operations. Operations must be delivered reliably (at least once) and in causal order.

// Operation-based G-Counter
class OpGCounter {
  constructor() {
    this.value = 0;
  }

  // Generate operation
  increment() {
    this.value += 1;
    return { type: 'increment' }; // broadcast this op
  }

  // Apply remote operation
  applyRemote(op) {
    if (op.type === 'increment') {
      this.value += 1;
    }
  }
}

Trade-offs:

State-BasedOperation-Based
NetworkSends full state (larger)Sends operations (smaller)
DeliveryTolerates duplicates, reorderingNeeds causal delivery
ComplexitySimpler protocolSimpler data structure

Core CRDT Types

G-Counter (Grow-Only Counter)

A counter that can only be incremented. Each replica has its own slot.

// See GCounter class above
const a = new GCounter('A');
const b = new GCounter('B');

a.increment(); // A's slot: 1
a.increment(); // A's slot: 2
b.increment(); // B's slot: 1

const merged = a.merge(b);
console.log(merged.value()); // 3

Use case: view counts, like counts, event counters.

PN-Counter (Positive-Negative Counter)

A counter that supports both increment and decrement, built from two G-Counters.

class PNCounter {
  constructor(replicaId) {
    this.positive = new GCounter(replicaId);
    this.negative = new GCounter(replicaId);
  }

  increment() {
    this.positive.increment();
  }

  decrement() {
    this.negative.increment();
  }

  value() {
    return this.positive.value() - this.negative.value();
  }

  merge(other) {
    const merged = new PNCounter(this.positive.id);
    merged.positive = this.positive.merge(other.positive);
    merged.negative = this.negative.merge(other.negative);
    return merged;
  }
}

Use case: inventory counts, upvote/downvote, bidirectional counters.

LWW-Register (Last-Writer-Wins Register)

A single value where the most recent write wins, using timestamps.

class LWWRegister {
  constructor() {
    this.value = null;
    this.timestamp = 0;
  }

  set(value, timestamp = Date.now()) {
    if (timestamp > this.timestamp) {
      this.value = value;
      this.timestamp = timestamp;
    }
  }

  merge(other) {
    const merged = new LWWRegister();
    if (this.timestamp >= other.timestamp) {
      merged.value = this.value;
      merged.timestamp = this.timestamp;
    } else {
      merged.value = other.value;
      merged.timestamp = other.timestamp;
    }
    return merged;
  }
}

Warning: LWW discards concurrent writes. Use a Hybrid Logical Clock (HLC) instead of wall-clock timestamps to avoid clock-skew issues.

// Hybrid Logical Clock for safer timestamps
class HLC {
  constructor() {
    this.ts = 0;    // physical time
    this.counter = 0; // logical counter
  }

  now() {
    const pt = Date.now();
    if (pt > this.ts) {
      this.ts = pt;
      this.counter = 0;
    } else {
      this.counter++;
    }
    return { ts: this.ts, counter: this.counter };
  }

  merge(remote) {
    const pt = Date.now();
    if (pt > this.ts && pt > remote.ts) {
      this.ts = pt;
      this.counter = 0;
    } else if (this.ts === remote.ts) {
      this.counter = Math.max(this.counter, remote.counter) + 1;
    } else if (remote.ts > this.ts) {
      this.ts = remote.ts;
      this.counter = remote.counter + 1;
    } else {
      this.counter++;
    }
    return { ts: this.ts, counter: this.counter };
  }
}

Use case: user profile fields, settings, any single-value state.

G-Set (Grow-Only Set)

A set where elements can only be added, never removed.

class GSet {
  constructor() {
    this.items = new Set();
  }

  add(item) {
    this.items.add(item);
  }

  has(item) {
    return this.items.has(item);
  }

  merge(other) {
    const merged = new GSet();
    merged.items = new Set([...this.items, ...other.items]);
    return merged;
  }
}

Use case: tags applied to items, seen message IDs, bookmarks.

OR-Set (Observed-Remove Set)

A set that supports both add and remove. Each add is tagged with a unique ID. Remove only removes the tags that were observed.

class ORSet {
  constructor() {
    this.elements = new Map(); // value -> Set of unique tags
    this.tombstones = new Set(); // removed tags
    this.tagCounter = 0;
  }

  add(value, replicaId) {
    const tag = `${replicaId}:${++this.tagCounter}`;
    if (!this.elements.has(value)) {
      this.elements.set(value, new Set());
    }
    this.elements.get(value).add(tag);
    return tag;
  }

  remove(value) {
    const tags = this.elements.get(value);
    if (tags) {
      for (const tag of tags) {
        this.tombstones.add(tag);
      }
      this.elements.delete(value);
    }
  }

  has(value) {
    const tags = this.elements.get(value);
    if (!tags) return false;
    for (const tag of tags) {
      if (!this.tombstones.has(tag)) return true;
    }
    return false;
  }

  values() {
    const result = [];
    for (const [value, tags] of this.elements) {
      for (const tag of tags) {
        if (!this.tombstones.has(tag)) {
          result.push(value);
          break;
        }
      }
    }
    return result;
  }

  merge(other) {
    const merged = new ORSet();
    merged.tombstones = new Set([...this.tombstones, ...other.tombstones]);

    const allValues = new Set([
      ...this.elements.keys(),
      ...other.elements.keys(),
    ]);

    for (const value of allValues) {
      const tagsA = this.elements.get(value) || new Set();
      const tagsB = other.elements.get(value) || new Set();
      const allTags = new Set([...tagsA, ...tagsB]);
      const liveTags = new Set();
      for (const tag of allTags) {
        if (!merged.tombstones.has(tag)) {
          liveTags.add(tag);
        }
      }
      if (liveTags.size > 0) {
        merged.elements.set(value, liveTags);
      }
    }
    return merged;
  }
}

Use case: todo lists, collaborator lists, any set with add/remove.


Causal Ordering and Version Vectors

CRDTs often need to know the causal relationship between operations: did operation A happen before B, or were they concurrent?

Version Vectors

A version vector tracks the latest operation seen from each replica.

class VersionVector {
  constructor() {
    this.clock = {}; // replicaId -> counter
  }

  increment(replicaId) {
    this.clock[replicaId] = (this.clock[replicaId] || 0) + 1;
  }

  // Does this version vector dominate (happened after) the other?
  dominates(other) {
    for (const [id, count] of Object.entries(other.clock)) {
      if ((this.clock[id] || 0) < count) return false;
    }
    return true;
  }

  // Are these concurrent? (neither dominates)
  isConcurrent(other) {
    return !this.dominates(other) && !other.dominates(this);
  }

  merge(other) {
    const merged = new VersionVector();
    const allKeys = new Set([
      ...Object.keys(this.clock),
      ...Object.keys(other.clock),
    ]);
    for (const key of allKeys) {
      merged.clock[key] = Math.max(
        this.clock[key] || 0,
        other.clock[key] || 0
      );
    }
    return merged;
  }
}

Choosing the Right CRDT

Data ShapeCRDTConcurrent Behavior
Counter (increment only)G-CounterAll increments preserved
Counter (inc + dec)PN-CounterAll changes preserved
Single valueLWW-RegisterLast writer wins
Set (add only)G-SetUnion of all adds
Set (add + remove)OR-SetAdd wins over concurrent remove
Ordered listRGA / Yjs ArrayInterleaves concurrent inserts
Rich textYjs Text / PeritextMerges concurrent edits
Map / ObjectLWW-Map or OR-MapPer-key merge semantics
JSON documentAutomerge / Yjs DocRecursive CRDT structure

Libraries That Implement CRDTs

LibraryLanguageFocus
YjsJavaScriptDocuments, text, general-purpose
AutomergeJS / RustJSON documents, immutable history
LoroRust / JSHigh-perf CRDTs, rich text
Diamond TypesRustText CRDTs, minimal overhead
cr-sqliteSQLite extCRDTs embedded in SQLite

Common Pitfalls

  1. Tombstone bloat: Deleted items leave tombstones that grow forever. Implement garbage collection with causal stability.
  2. Clock skew with LWW: Wall-clock timestamps can cause data loss. Use Hybrid Logical Clocks.
  3. Metadata overhead: CRDT metadata can exceed the size of the actual data. Choose compact representations.
  4. Not all data fits CRDTs: Some operations (e.g., "set balance to exactly $100") are inherently non-mergeable. Use server authority for those.
  5. Testing: Always test with simulated network partitions, concurrent edits, and delayed sync. Property-based testing works well for CRDTs.

Install this skill directly: skilldb add local-first-skills

Get CLI access →

Related Skills

electric-sql

Teaches ElectricSQL, a Postgres-backed local-first sync framework. Covers the Electric architecture where Postgres is the source of truth and data syncs to local SQLite databases on client devices via shape-based partial replication. Includes shape definitions, live queries, offline-first patterns, conflict resolution with rich CRDTs, integration with React and Expo (React Native), deployment patterns, and migration strategies.

Local First433L

indexeddb-patterns

Teaches IndexedDB patterns for local-first web applications, using Dexie.js as the primary wrapper library. Covers schema design and versioning, creating indexes for efficient queries, transaction patterns, performance optimization (bulk operations, pagination, lazy loading), migration strategies for schema evolution, storage quota management, data export and import, and integration patterns with sync engines and reactive frameworks.

Local First556L

local-first-auth

Teaches authentication and authorization patterns for local-first applications that must work offline. Covers offline-capable auth with cached tokens, permission sync and local enforcement, encrypted local storage for sensitive data, key management with device-bound keys, device authorization and revocation, multi-device identity linking, end-to-end encryption for synced data, and secure patterns for handling auth in disconnected environments.

Local First606L

local-first-fundamentals

Teaches the local-first software paradigm where applications store data on the user's device, work fully offline, and sync to peers or servers when connectivity is available. Covers the spectrum from cloud-first to offline-first to local-first, core benefits (instant UX, offline capability, data ownership, privacy), key challenges (conflict resolution, sync complexity, storage limits), architectural patterns, and decision frameworks for when local-first is the right choice.

Local First285L

sync-engine-architecture

Teaches how to design and build a sync engine for local-first applications. Covers the operation log as the foundation, conflict resolution strategies (last-write-wins, operational transform, CRDTs), server reconciliation patterns, partial sync for large datasets, bandwidth optimization techniques, version vectors and causal consistency, clock synchronization, and practical implementation patterns with code examples.

Local First572L

yjs-sync

Teaches building local-first collaborative applications with Yjs, the most widely adopted CRDT library for JavaScript. Covers the Y.Doc document model, shared types (Y.Map, Y.Array, Y.Text, Y.XmlFragment), the awareness protocol for presence and cursors, persistence and sync providers (WebSocket, WebRTC, IndexedDB), integrating with editors like ProseMirror/TipTap/CodeMirror/Monaco, undo/redo management, and performance optimization patterns.

Local First471L