crdt-fundamentals
Teaches Conflict-free Replicated Data Types (CRDTs), the mathematical foundation for local-first sync. Covers how CRDTs guarantee eventual consistency without coordination, the difference between state-based and operation-based CRDTs, and practical implementations of G-Counter, PN-Counter, LWW-Register, OR-Set, G-Set, and RGA (Replicated Growable Array). Includes causal ordering, vector clocks, and guidance on choosing the right CRDT for your data model.
Conflict-free Replicated Data Types let distributed replicas converge to the same state without coordination. ## Key Points - Last-write-wins discards one edit - Manual conflict resolution burdens users - Operational transform requires a central server - All edits are preserved - Merge is automatic - No central server required - States form a **join-semilattice** (a partial order with a least upper bound) - The merge function is commutative, associative, and idempotent 1. **Tombstone bloat**: Deleted items leave tombstones that grow forever. Implement garbage collection with causal stability. 2. **Clock skew with LWW**: Wall-clock timestamps can cause data loss. Use Hybrid Logical Clocks. 3. **Metadata overhead**: CRDT metadata can exceed the size of the actual data. Choose compact representations. 4. **Not all data fits CRDTs**: Some operations (e.g., "set balance to exactly $100") are inherently non-mergeable. Use server authority for those.
skilldb get local-first-skills/crdt-fundamentalsFull skill: 454 linesCRDT Fundamentals
Conflict-free Replicated Data Types let distributed replicas converge to the same state without coordination.
What CRDTs Are
A CRDT is a data structure that can be replicated across multiple devices, updated independently and concurrently, and merged automatically into a consistent state — without needing a central coordinator or consensus protocol.
The key guarantee: all replicas that have received the same set of updates will be in the same state, regardless of the order in which updates were received.
Device A: increment counter → value = 1
Device B: increment counter → value = 1
|
Merge states
|
Both devices: value = 2
No conflicts. No lost updates. No coordination needed.
Why CRDTs Matter for Local-First
In a local-first app, every device is a replica that can be modified independently. When devices reconnect, their states must merge. CRDTs make this merge automatic and lossless.
Without CRDTs:
- Last-write-wins discards one edit
- Manual conflict resolution burdens users
- Operational transform requires a central server
With CRDTs:
- All edits are preserved
- Merge is automatic
- No central server required
Two Flavors: State-Based vs Operation-Based
State-Based CRDTs (CvRDTs)
Replicas exchange their full state. A merge function combines two states into one.
Requirements:
- States form a join-semilattice (a partial order with a least upper bound)
- The merge function is commutative, associative, and idempotent
// State-based G-Counter: each replica tracks its own count
class GCounter {
constructor(replicaId) {
this.id = replicaId;
this.counts = {}; // { replicaId: count }
}
increment() {
this.counts[this.id] = (this.counts[this.id] || 0) + 1;
}
value() {
return Object.values(this.counts).reduce((sum, n) => sum + n, 0);
}
// Merge: take the max of each replica's count
merge(other) {
const merged = new GCounter(this.id);
const allKeys = new Set([
...Object.keys(this.counts),
...Object.keys(other.counts),
]);
for (const key of allKeys) {
merged.counts[key] = Math.max(
this.counts[key] || 0,
other.counts[key] || 0
);
}
return merged;
}
}
Operation-Based CRDTs (CmRDTs)
Replicas exchange individual operations. Operations must be delivered reliably (at least once) and in causal order.
// Operation-based G-Counter
class OpGCounter {
constructor() {
this.value = 0;
}
// Generate operation
increment() {
this.value += 1;
return { type: 'increment' }; // broadcast this op
}
// Apply remote operation
applyRemote(op) {
if (op.type === 'increment') {
this.value += 1;
}
}
}
Trade-offs:
| State-Based | Operation-Based | |
|---|---|---|
| Network | Sends full state (larger) | Sends operations (smaller) |
| Delivery | Tolerates duplicates, reordering | Needs causal delivery |
| Complexity | Simpler protocol | Simpler data structure |
Core CRDT Types
G-Counter (Grow-Only Counter)
A counter that can only be incremented. Each replica has its own slot.
// See GCounter class above
const a = new GCounter('A');
const b = new GCounter('B');
a.increment(); // A's slot: 1
a.increment(); // A's slot: 2
b.increment(); // B's slot: 1
const merged = a.merge(b);
console.log(merged.value()); // 3
Use case: view counts, like counts, event counters.
PN-Counter (Positive-Negative Counter)
A counter that supports both increment and decrement, built from two G-Counters.
class PNCounter {
constructor(replicaId) {
this.positive = new GCounter(replicaId);
this.negative = new GCounter(replicaId);
}
increment() {
this.positive.increment();
}
decrement() {
this.negative.increment();
}
value() {
return this.positive.value() - this.negative.value();
}
merge(other) {
const merged = new PNCounter(this.positive.id);
merged.positive = this.positive.merge(other.positive);
merged.negative = this.negative.merge(other.negative);
return merged;
}
}
Use case: inventory counts, upvote/downvote, bidirectional counters.
LWW-Register (Last-Writer-Wins Register)
A single value where the most recent write wins, using timestamps.
class LWWRegister {
constructor() {
this.value = null;
this.timestamp = 0;
}
set(value, timestamp = Date.now()) {
if (timestamp > this.timestamp) {
this.value = value;
this.timestamp = timestamp;
}
}
merge(other) {
const merged = new LWWRegister();
if (this.timestamp >= other.timestamp) {
merged.value = this.value;
merged.timestamp = this.timestamp;
} else {
merged.value = other.value;
merged.timestamp = other.timestamp;
}
return merged;
}
}
Warning: LWW discards concurrent writes. Use a Hybrid Logical Clock (HLC) instead of wall-clock timestamps to avoid clock-skew issues.
// Hybrid Logical Clock for safer timestamps
class HLC {
constructor() {
this.ts = 0; // physical time
this.counter = 0; // logical counter
}
now() {
const pt = Date.now();
if (pt > this.ts) {
this.ts = pt;
this.counter = 0;
} else {
this.counter++;
}
return { ts: this.ts, counter: this.counter };
}
merge(remote) {
const pt = Date.now();
if (pt > this.ts && pt > remote.ts) {
this.ts = pt;
this.counter = 0;
} else if (this.ts === remote.ts) {
this.counter = Math.max(this.counter, remote.counter) + 1;
} else if (remote.ts > this.ts) {
this.ts = remote.ts;
this.counter = remote.counter + 1;
} else {
this.counter++;
}
return { ts: this.ts, counter: this.counter };
}
}
Use case: user profile fields, settings, any single-value state.
G-Set (Grow-Only Set)
A set where elements can only be added, never removed.
class GSet {
constructor() {
this.items = new Set();
}
add(item) {
this.items.add(item);
}
has(item) {
return this.items.has(item);
}
merge(other) {
const merged = new GSet();
merged.items = new Set([...this.items, ...other.items]);
return merged;
}
}
Use case: tags applied to items, seen message IDs, bookmarks.
OR-Set (Observed-Remove Set)
A set that supports both add and remove. Each add is tagged with a unique ID. Remove only removes the tags that were observed.
class ORSet {
constructor() {
this.elements = new Map(); // value -> Set of unique tags
this.tombstones = new Set(); // removed tags
this.tagCounter = 0;
}
add(value, replicaId) {
const tag = `${replicaId}:${++this.tagCounter}`;
if (!this.elements.has(value)) {
this.elements.set(value, new Set());
}
this.elements.get(value).add(tag);
return tag;
}
remove(value) {
const tags = this.elements.get(value);
if (tags) {
for (const tag of tags) {
this.tombstones.add(tag);
}
this.elements.delete(value);
}
}
has(value) {
const tags = this.elements.get(value);
if (!tags) return false;
for (const tag of tags) {
if (!this.tombstones.has(tag)) return true;
}
return false;
}
values() {
const result = [];
for (const [value, tags] of this.elements) {
for (const tag of tags) {
if (!this.tombstones.has(tag)) {
result.push(value);
break;
}
}
}
return result;
}
merge(other) {
const merged = new ORSet();
merged.tombstones = new Set([...this.tombstones, ...other.tombstones]);
const allValues = new Set([
...this.elements.keys(),
...other.elements.keys(),
]);
for (const value of allValues) {
const tagsA = this.elements.get(value) || new Set();
const tagsB = other.elements.get(value) || new Set();
const allTags = new Set([...tagsA, ...tagsB]);
const liveTags = new Set();
for (const tag of allTags) {
if (!merged.tombstones.has(tag)) {
liveTags.add(tag);
}
}
if (liveTags.size > 0) {
merged.elements.set(value, liveTags);
}
}
return merged;
}
}
Use case: todo lists, collaborator lists, any set with add/remove.
Causal Ordering and Version Vectors
CRDTs often need to know the causal relationship between operations: did operation A happen before B, or were they concurrent?
Version Vectors
A version vector tracks the latest operation seen from each replica.
class VersionVector {
constructor() {
this.clock = {}; // replicaId -> counter
}
increment(replicaId) {
this.clock[replicaId] = (this.clock[replicaId] || 0) + 1;
}
// Does this version vector dominate (happened after) the other?
dominates(other) {
for (const [id, count] of Object.entries(other.clock)) {
if ((this.clock[id] || 0) < count) return false;
}
return true;
}
// Are these concurrent? (neither dominates)
isConcurrent(other) {
return !this.dominates(other) && !other.dominates(this);
}
merge(other) {
const merged = new VersionVector();
const allKeys = new Set([
...Object.keys(this.clock),
...Object.keys(other.clock),
]);
for (const key of allKeys) {
merged.clock[key] = Math.max(
this.clock[key] || 0,
other.clock[key] || 0
);
}
return merged;
}
}
Choosing the Right CRDT
| Data Shape | CRDT | Concurrent Behavior |
|---|---|---|
| Counter (increment only) | G-Counter | All increments preserved |
| Counter (inc + dec) | PN-Counter | All changes preserved |
| Single value | LWW-Register | Last writer wins |
| Set (add only) | G-Set | Union of all adds |
| Set (add + remove) | OR-Set | Add wins over concurrent remove |
| Ordered list | RGA / Yjs Array | Interleaves concurrent inserts |
| Rich text | Yjs Text / Peritext | Merges concurrent edits |
| Map / Object | LWW-Map or OR-Map | Per-key merge semantics |
| JSON document | Automerge / Yjs Doc | Recursive CRDT structure |
Libraries That Implement CRDTs
| Library | Language | Focus |
|---|---|---|
| Yjs | JavaScript | Documents, text, general-purpose |
| Automerge | JS / Rust | JSON documents, immutable history |
| Loro | Rust / JS | High-perf CRDTs, rich text |
| Diamond Types | Rust | Text CRDTs, minimal overhead |
| cr-sqlite | SQLite ext | CRDTs embedded in SQLite |
Common Pitfalls
- Tombstone bloat: Deleted items leave tombstones that grow forever. Implement garbage collection with causal stability.
- Clock skew with LWW: Wall-clock timestamps can cause data loss. Use Hybrid Logical Clocks.
- Metadata overhead: CRDT metadata can exceed the size of the actual data. Choose compact representations.
- Not all data fits CRDTs: Some operations (e.g., "set balance to exactly $100") are inherently non-mergeable. Use server authority for those.
- Testing: Always test with simulated network partitions, concurrent edits, and delayed sync. Property-based testing works well for CRDTs.
Install this skill directly: skilldb add local-first-skills
Related Skills
electric-sql
Teaches ElectricSQL, a Postgres-backed local-first sync framework. Covers the Electric architecture where Postgres is the source of truth and data syncs to local SQLite databases on client devices via shape-based partial replication. Includes shape definitions, live queries, offline-first patterns, conflict resolution with rich CRDTs, integration with React and Expo (React Native), deployment patterns, and migration strategies.
indexeddb-patterns
Teaches IndexedDB patterns for local-first web applications, using Dexie.js as the primary wrapper library. Covers schema design and versioning, creating indexes for efficient queries, transaction patterns, performance optimization (bulk operations, pagination, lazy loading), migration strategies for schema evolution, storage quota management, data export and import, and integration patterns with sync engines and reactive frameworks.
local-first-auth
Teaches authentication and authorization patterns for local-first applications that must work offline. Covers offline-capable auth with cached tokens, permission sync and local enforcement, encrypted local storage for sensitive data, key management with device-bound keys, device authorization and revocation, multi-device identity linking, end-to-end encryption for synced data, and secure patterns for handling auth in disconnected environments.
local-first-fundamentals
Teaches the local-first software paradigm where applications store data on the user's device, work fully offline, and sync to peers or servers when connectivity is available. Covers the spectrum from cloud-first to offline-first to local-first, core benefits (instant UX, offline capability, data ownership, privacy), key challenges (conflict resolution, sync complexity, storage limits), architectural patterns, and decision frameworks for when local-first is the right choice.
sync-engine-architecture
Teaches how to design and build a sync engine for local-first applications. Covers the operation log as the foundation, conflict resolution strategies (last-write-wins, operational transform, CRDTs), server reconciliation patterns, partial sync for large datasets, bandwidth optimization techniques, version vectors and causal consistency, clock synchronization, and practical implementation patterns with code examples.
yjs-sync
Teaches building local-first collaborative applications with Yjs, the most widely adopted CRDT library for JavaScript. Covers the Y.Doc document model, shared types (Y.Map, Y.Array, Y.Text, Y.XmlFragment), the awareness protocol for presence and cursors, persistence and sync providers (WebSocket, WebRTC, IndexedDB), integrating with editors like ProseMirror/TipTap/CodeMirror/Monaco, undo/redo management, and performance optimization patterns.