sync-engine-architecture
Teaches how to design and build a sync engine for local-first applications. Covers the operation log as the foundation, conflict resolution strategies (last-write-wins, operational transform, CRDTs), server reconciliation patterns, partial sync for large datasets, bandwidth optimization techniques, version vectors and causal consistency, clock synchronization, and practical implementation patterns with code examples.
How to design and implement the sync layer that keeps local-first replicas consistent.
## Key Points
A sync engine answers three questions:
1. **What changed?** Detecting local mutations since the last sync
2. **What do I need?** Determining which remote changes are missing locally
3. **How do I merge?** Resolving conflicts when the same data was modified on multiple replicas

Syncing full state has problems:
- You cannot tell which fields changed
- You overwrite concurrent edits
- Bandwidth grows with total data, not with change volume

Syncing operations solves them:
- Each op describes exactly one change
- Concurrent ops can be merged intelligently
- Bandwidth is proportional to activity, not data size
## Quick Example
```typescript
// Instead of sending the full document on every change
{ id: '1', title: 'Updated', content: '...10KB of text...', status: 'draft' }
// Send only what changed
{ documentId: '1', fields: { title: 'Updated' } }
```

# Building a Sync Engine
## What a Sync Engine Does
A sync engine moves changes between replicas (devices, servers, peers) and ensures they converge to the same state. It answers three questions:
1. **What changed?** Detecting local mutations since the last sync
2. **What do I need?** Determining which remote changes are missing locally
3. **How do I merge?** Resolving conflicts when the same data was modified on multiple replicas
┌──────────┐                           ┌──────────┐
│ Device A │                           │ Device B │
│          │   ┌───────────────────┐   │          │
│  Local   │──►│    Sync Engine    │◄──│  Local   │
│  Store   │◄──│                   │──►│  Store   │
│          │   │ - Detect changes  │   │          │
│  Op Log  │   │ - Exchange ops    │   │  Op Log  │
│          │   │ - Merge state     │   │          │
└──────────┘   └───────────────────┘   └──────────┘
## The Operation Log
The operation log (op log) is the foundation of every sync engine. Instead of syncing the current state, you sync the operations that produced that state.
### Why Ops, Not State
Syncing full state has problems:
- You cannot tell which fields changed
- You overwrite concurrent edits
- Bandwidth grows with total data, not with change volume
Syncing operations solves these:
- Each op describes exactly one change
- Concurrent ops can be merged intelligently
- Bandwidth is proportional to activity, not data size
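To make the contrast concrete, here is a minimal sketch. The `TodoDoc` and `FieldOp` types and both helper functions are illustrative assumptions, not part of any library. Two replicas start from the same document; one edits `title`, the other edits `status`.

```typescript
type TodoDoc = Record<string, string>;
interface FieldOp { field: string; value: string }

// State sync: the last full document to arrive replaces whatever was there.
function syncFullState(_current: TodoDoc, incoming: TodoDoc): TodoDoc {
  return { ...incoming };
}

// Op sync: each op touches only the field it names.
function applyFieldOps(current: TodoDoc, ops: FieldOp[]): TodoDoc {
  const next = { ...current };
  for (const op of ops) next[op.field] = op.value;
  return next;
}

const base: TodoDoc = { title: 'Draft', status: 'open' };

// Replica A renames the document; replica B closes it. Both started from `base`.
const stateFromA: TodoDoc = { ...base, title: 'Final' };
const stateFromB: TodoDoc = { ...base, status: 'closed' };

// Full-state sync: B's copy arrives last and wipes out A's rename.
const viaState = syncFullState(syncFullState(base, stateFromA), stateFromB);

// Op sync: both changes survive because the ops do not overlap.
const viaOps = applyFieldOps(base, [
  { field: 'title', value: 'Final' },
  { field: 'status', value: 'closed' },
]);

console.log(viaState); // { title: 'Draft', status: 'closed' }
console.log(viaOps);   // { title: 'Final', status: 'closed' }
```

When two ops do touch the same field, a conflict resolution strategy still has to pick a winner; ops only make the overlap visible.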
### Op Log Structure
interface Operation {
  id: string;          // Globally unique (UUID or ULID)
  replicaId: string;   // Which device created this op
  timestamp: number;   // Hybrid logical clock or wall clock
  collection: string;  // Which data collection (e.g., 'todos')
  documentId: string;  // Which document within the collection
  type: 'create' | 'update' | 'delete';
  fields?: Record<string, any>; // For create/update: the changed fields
  version: number;     // Monotonically increasing per replica
}
### Writing to the Op Log
Every local mutation creates an op before touching the local store.
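class SyncEngine {
  // Must be restored from the persisted op log on startup, or versions would repeat
  private localVersion = 0;

  // The clock, op log, and sync queue are collaborators this sketch assumes exist
  constructor(
    readonly replicaId: string, // readable from outside: the sync request sends it as clientId
    private clock: HLC,
    private opLog: OpLog,
    private syncQueue: SyncQueue,
  ) {}

  mutate(collection: string, docId: string, type: Operation['type'], fields?: Record<string, any>) {
    const op: Operation = {
      id: generateULID(),
      replicaId: this.replicaId,
      timestamp: this.clock.now(),
      collection,
      documentId: docId,
      type,
      fields,
      version: ++this.localVersion,
    };
    // 1. Persist the op
    this.opLog.append(op);
    // 2. Apply to local store
    this.applyOp(op);
    // 3. Queue for sync
    this.syncQueue.enqueue(op);
  }
}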
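The `applyOp` step is referenced above but never shown. One possible shape, as a pure function over a single document's state, is sketched below. The `DocState` and `DocOp` types are assumptions made for this example, and a real engine would also have to decide what an update to a missing document means.

```typescript
type DocState = Record<string, unknown> | null; // null: the document does not exist

interface DocOp {
  type: 'create' | 'update' | 'delete';
  fields?: Record<string, unknown>;
}

function applyOp(state: DocState, op: DocOp): DocState {
  if (op.type === 'delete') return null;
  if (op.type === 'create') return { ...op.fields };
  // 'update': merge the changed fields; an update to a missing document is ignored here
  return state === null ? null : { ...state, ...op.fields };
}

const opHistory: DocOp[] = [
  { type: 'create', fields: { title: 'Buy milk', done: false } },
  { type: 'update', fields: { done: true } },
];

// Replaying the log from nothing reproduces the current state.
const current = opHistory.reduce<DocState>(applyOp, null);
console.log(current); // { title: 'Buy milk', done: true }

const afterDelete = applyOp(current, { type: 'delete' });
console.log(afterDelete); // null
```

Keeping the function pure means the same code can serve local writes, remote ops, and log replay.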
### Op Log Compaction
The op log grows forever unless compacted. Compaction merges old ops into snapshots.
async function compactOpLog(opLog: OpLog, threshold: number) {
  // Assumes getOpsOlderThan returns ops in timestamp order
  const ops = await opLog.getOpsOlderThan(threshold);
  // Group by document
  const byDoc = groupBy(ops, op => `${op.collection}:${op.documentId}`);
  for (const docOps of Object.values(byDoc)) {
    const last = docOps[docOps.length - 1];
    // Replay ops to get final state
    const finalState = docOps.reduce((state, op) => applyOp(state, op), {});
    // Replace many ops with one snapshot op
    const snapshotOp: Operation = {
      id: generateULID(),
      replicaId: 'compaction',
      timestamp: last.timestamp,
      collection: last.collection,
      documentId: last.documentId,
      // A document whose last op was a delete must stay deleted
      type: last.type === 'delete' ? 'delete' : 'create',
      fields: last.type === 'delete' ? undefined : finalState,
      // Version 0 never passes a "newer than what I have seen" check,
      // so snapshots need their own delivery path (e.g. initial bootstrap)
      version: 0,
    };
    await opLog.replaceOps(docOps.map(o => o.id), snapshotOp);
  }
}
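Compacting ops that some replica has not yet received would leave that replica unable to catch up. One way to pick a safe cut-off, assuming every replica reports the highest version it has seen from each other replica, is the element-wise minimum of those reports. `safeCompactionPoint` is a hypothetical helper, not something the op log above provides.

```typescript
type VersionVector = Record<string, number>;

// For each replica, the highest version that every reporter has already seen.
function safeCompactionPoint(acked: VersionVector[]): VersionVector {
  const replicas = new Set<string>();
  for (const vv of acked) {
    for (const r of Object.keys(vv)) replicas.add(r);
  }
  const safe: VersionVector = {};
  replicas.forEach(r => {
    // A reporter that has never heard of r counts as version 0
    safe[r] = Math.min(...acked.map(vv => vv[r] || 0));
  });
  return safe;
}

const compactBelow = safeCompactionPoint([
  { 'device-A': 10, 'device-B': 4 }, // reported by device A
  { 'device-A': 7, 'device-B': 9 },  // reported by device B
  { 'device-A': 10, 'device-B': 9 }, // reported by the server
]);
console.log(compactBelow); // { 'device-A': 7, 'device-B': 4 }
```

Ops at or below these versions have reached everyone in the example and are candidates for compaction; anything newer should stay in the log.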
## Conflict Resolution Strategies
### Last-Write-Wins (LWW)
The simplest strategy. The operation with the latest timestamp wins.
function mergeLastWriteWins(local: Operation, remote: Operation): Operation {
  if (remote.timestamp > local.timestamp) {
    return remote;
  }
  if (remote.timestamp === local.timestamp) {
    // Tiebreaker: lexicographic comparison of replica IDs
    return remote.replicaId > local.replicaId ? remote : local;
  }
  return local;
}
Pros: Simple to implement, easy to understand, deterministic.
Cons: Silently discards one edit. If Alice renames a file to "Report" and Bob renames it to "Summary" at the same time, one rename is lost without either user knowing.
Use when: Data is not collaboratively edited, or losing one concurrent edit is acceptable (settings, preferences, status fields).
### Last-Write-Wins Per Field
A refinement: apply LWW at the field level, not the document level. If Alice changes title and Bob changes status, both edits survive.
interface FieldTimestamp {
  [field: string]: { value: any; timestamp: number; replicaId: string };
}

function mergePerField(local: FieldTimestamp, remote: FieldTimestamp): FieldTimestamp {
  const merged = { ...local };
  for (const [field, remoteEntry] of Object.entries(remote)) {
    const localEntry = merged[field];
    if (!localEntry ||
        remoteEntry.timestamp > localEntry.timestamp ||
        (remoteEntry.timestamp === localEntry.timestamp &&
         remoteEntry.replicaId > localEntry.replicaId)) {
      merged[field] = remoteEntry;
    }
  }
  return merged;
}
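The Alice and Bob scenario can be run end to end. The sketch below repeats the merge rule so it runs on its own and adds `stampFields`, a hypothetical helper that tags every field of an edit with the edit's timestamp and replica. The timestamps are made-up values.

```typescript
interface FieldEntry { value: unknown; timestamp: number; replicaId: string }
type FieldTimestamp = Record<string, FieldEntry>;

// Same rule as above: newer timestamp wins, replica ID breaks ties.
function mergePerField(local: FieldTimestamp, remote: FieldTimestamp): FieldTimestamp {
  const merged: FieldTimestamp = { ...local };
  for (const [field, remoteEntry] of Object.entries(remote)) {
    const localEntry = merged[field];
    if (!localEntry ||
        remoteEntry.timestamp > localEntry.timestamp ||
        (remoteEntry.timestamp === localEntry.timestamp &&
         remoteEntry.replicaId > localEntry.replicaId)) {
      merged[field] = remoteEntry;
    }
  }
  return merged;
}

// Hypothetical helper: stamp each field of an edit with its origin.
function stampFields(fields: Record<string, unknown>, timestamp: number, replicaId: string): FieldTimestamp {
  const stamped: FieldTimestamp = {};
  for (const [field, value] of Object.entries(fields)) {
    stamped[field] = { value, timestamp, replicaId };
  }
  return stamped;
}

const base = stampFields({ title: 'Untitled', status: 'draft' }, 50, 'alice');
const aliceEdit = stampFields({ title: 'Report' }, 100, 'alice');
const bobEdit = stampFields({ status: 'done' }, 105, 'bob');

// Each replica applies its own edit first, then the other's.
const onAlice = mergePerField(mergePerField(base, aliceEdit), bobEdit);
const onBob = mergePerField(mergePerField(base, bobEdit), aliceEdit);

console.log(onAlice.title.value, onAlice.status.value); // Report done
console.log(onBob.title.value, onBob.status.value);     // Report done
```

Both replicas end with the same document regardless of the order in which the edits arrived.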
### Operational Transform (OT)
Transforms operations so they apply correctly regardless of the order they arrive. Originally designed for collaborative text editing.
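// Two concurrent inserts into a text document
// Alice inserts "X" at position 3
// Bob inserts "Y" at position 1

// Without OT:
// "abcde" + insert(3, "X") = "abcXde"
// "abcXde" + insert(1, "Y") = "aYbcXde" (correct in this order)
//
// "abcde" + insert(1, "Y") = "aYbcde"
// "aYbcde" + insert(3, "X") = "aYbXcde" (wrong position!)

// With OT: transform Alice's op against Bob's before applying it after his.
// The shape of InsertOp is assumed here; replicaId is only used to break ties.
interface InsertOp { position: number; text: string; replicaId: string }

function transformInsert(op: InsertOp, against: InsertOp): InsertOp {
  // On equal positions both replicas must agree on who goes first,
  // otherwise they would each keep their own op in front and diverge
  const staysPut =
    op.position < against.position ||
    (op.position === against.position && op.replicaId < against.replicaId);
  if (staysPut) {
    return op; // No change needed
  }
  return { ...op, position: op.position + against.text.length };
}
// Alice's insert(3, "X") becomes insert(4, "X") after Bob's insert(1, "Y")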
Pros: Preserves all edits, well-understood for text.
Cons: Requires a central server to determine operation ordering. Transform functions are complex and bug-prone for anything beyond plain text.
Use when: Building collaborative text editing with a central server (Google Docs model).
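The convergence claim can be checked by applying the two inserts in both orders. This is a sketch for plain-text inserts only; the `InsertOp` shape, the `applyInsert` helper, and the replica-ID tie-break are assumptions of the example rather than a complete OT system (deletes and server ordering are left out).

```typescript
interface InsertOp { position: number; text: string; replicaId: string }

function applyInsert(doc: string, op: InsertOp): string {
  return doc.slice(0, op.position) + op.text + doc.slice(op.position);
}

// Shift `op` right when `against` landed before it (replica ID breaks position ties).
function transformInsert(op: InsertOp, against: InsertOp): InsertOp {
  const staysPut =
    op.position < against.position ||
    (op.position === against.position && op.replicaId < against.replicaId);
  return staysPut ? op : { ...op, position: op.position + against.text.length };
}

const alice: InsertOp = { position: 3, text: 'X', replicaId: 'alice' };
const bob: InsertOp = { position: 1, text: 'Y', replicaId: 'bob' };

// Alice's replica: her op first, then Bob's op transformed against hers.
const onAlice = applyInsert(applyInsert('abcde', alice), transformInsert(bob, alice));
// Bob's replica: his op first, then Alice's op transformed against his.
const onBob = applyInsert(applyInsert('abcde', bob), transformInsert(alice, bob));

console.log(onAlice); // aYbcXde
console.log(onBob);   // aYbcXde
```

The tie-break matters when both users insert at the same position: without it, each replica would place its own text first and the two copies would differ.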
### CRDTs (Conflict-Free Replicated Data Types)
Mathematical data structures that merge deterministically without coordination. No central server needed.
// G-Counter: a grow-only counter that merges correctly
class GCounter {
  private counts: Map<string, number> = new Map();

  constructor(private replicaId: string) {}

  increment() {
    const current = this.counts.get(this.replicaId) || 0;
    this.counts.set(this.replicaId, current + 1);
  }

  value(): number {
    let sum = 0;
    for (const count of this.counts.values()) {
      sum += count;
    }
    return sum;
  }

  merge(other: GCounter) {
    for (const [replica, count] of other.counts) {
      const local = this.counts.get(replica) || 0;
      this.counts.set(replica, Math.max(local, count));
    }
  }
}

// Two replicas can increment independently and merge
const a = new GCounter('A');
const b = new GCounter('B');
a.increment(); a.increment(); // A sees 2
b.increment();                // B sees 1
a.merge(b);                   // A sees 3
b.merge(a);                   // B sees 3: converged!
Pros: No central server, guaranteed convergence, works offline indefinitely.
Cons: Higher storage overhead (metadata per field per replica). Some data types are complex to implement as CRDTs. Deletes require tombstones.
Use when: Peer-to-peer sync, offline-heavy apps, or when you cannot guarantee a central server.
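The tombstone requirement is easiest to see in a set. Below is a minimal sketch of a two-phase set (2P-Set), one of the simplest CRDT sets: removals are recorded in a second, grow-only set instead of erasing the element. The class and method names are chosen for this example.

```typescript
class TwoPhaseSet<T> {
  private added = new Set<T>();
  private removed = new Set<T>(); // tombstones: never shrinks

  add(item: T) {
    this.added.add(item);
  }

  remove(item: T) {
    if (this.added.has(item)) this.removed.add(item);
  }

  has(item: T): boolean {
    return this.added.has(item) && !this.removed.has(item);
  }

  // Merge is a union of both sets, so it is order-independent and idempotent.
  merge(other: TwoPhaseSet<T>) {
    other.added.forEach(item => this.added.add(item));
    other.removed.forEach(item => this.removed.add(item));
  }
}

const listA = new TwoPhaseSet<string>();
const listB = new TwoPhaseSet<string>();

listA.add('milk');
listB.merge(listA);   // both replicas now contain 'milk'
listA.remove('milk'); // A deletes it while B is offline

// B still holds 'milk'. Without the tombstone, merging B into A would bring it back.
listA.merge(listB);
listB.merge(listA);

console.log(listA.has('milk'), listB.has('milk')); // false false
```

The trade-off in this particular design is that a removed element can never be added again, and the tombstone set only grows. Richer set CRDTs such as the OR-Set relax the first restriction at the cost of more metadata.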
### Choosing a Strategy
              Central server?
               /           \
             Yes            No
             /                \
   Need text editing?        CRDTs
       /        \        (Yjs, Automerge)
     Yes         No
     /             \
    OT          LWW per field
(Google Docs)   (simpler, good enough
                 for most apps)
## Server Reconciliation
When a server is in the loop, it acts as the canonical ordering authority.
### Push-Pull Pattern
// Client pushes local ops, server responds with missed remote ops
async function sync(engine: SyncEngine, serverUrl: string) {
  // 1. Gather unsent local ops
  const localOps = await engine.getUnsentOps();
  const lastServerVersion = await engine.getLastServerVersion();

  // 2. Send to server and receive missed ops
  const response = await fetch(`${serverUrl}/sync`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      clientId: engine.replicaId,
      ops: localOps,
      since: lastServerVersion,
    }),
  });
  // fetch() resolves on HTTP errors too; stop here so the ops stay queued for the next attempt
  if (!response.ok) {
    throw new Error(`Sync failed with HTTP ${response.status}`);
  }
  const { serverOps, serverVersion } = await response.json();

  // 3. Apply remote ops locally
  for (const op of serverOps) {
    await engine.applyRemoteOp(op);
  }

  // 4. Update sync cursor
  await engine.setLastServerVersion(serverVersion);

  // 5. Mark local ops as sent
  await engine.markOpsSent(localOps.map(o => o.id));
}
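The server half of this exchange is not shown above. A minimal in-memory sketch of what a `/sync` handler might do with the same request and response shape follows. The class, its types, and the choice to use the log length as the server version are assumptions for illustration; a real server would persist the log and authenticate clients.

```typescript
interface WireOp { id: string; replicaId: string }
interface StoredOp extends WireOp { serverVersion: number }
interface SyncRequest { clientId: string; ops: WireOp[]; since: number }
interface SyncResponse { serverOps: StoredOp[]; serverVersion: number }

class InMemorySyncServer {
  private log: StoredOp[] = [];
  private seenIds = new Set<string>();

  handleSync(req: SyncRequest): SyncResponse {
    // Ops the client has not seen yet, skipping anything it wrote itself
    const serverOps = this.log.filter(
      op => op.serverVersion > req.since && op.replicaId !== req.clientId,
    );
    for (const op of req.ops) {
      if (this.seenIds.has(op.id)) continue; // retried request: already accepted
      this.seenIds.add(op.id);
      // The server's position in the log is the canonical order
      this.log.push({ ...op, serverVersion: this.log.length + 1 });
    }
    return { serverOps, serverVersion: this.log.length };
  }
}

const server = new InMemorySyncServer();
const first = server.handleSync({ clientId: 'A', ops: [{ id: 'a1', replicaId: 'A' }], since: 0 });
const second = server.handleSync({ clientId: 'B', ops: [{ id: 'b1', replicaId: 'B' }], since: 0 });
const third = server.handleSync({ clientId: 'A', ops: [], since: first.serverVersion });

console.log(first.serverOps.length, first.serverVersion); // 0 1
console.log(second.serverOps.map(o => o.id));             // [ 'a1' ]
console.log(third.serverOps.map(o => o.id));              // [ 'b1' ]
```

Deduplicating by op ID makes the endpoint safe to retry, which matters because the client only marks ops as sent after a successful response.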
### Server Rejection
The server may reject ops that violate business rules. The client must then roll back the rejected ops locally and tell the user.
// Server-side validation
function processClientOps(ops: Operation[], db: Database): SyncResult {
  const accepted: Operation[] = [];
  const rejected: RejectedOp[] = [];
  for (const op of ops) {
    try {
      validateBusinessRules(op, db);
      db.applyOp(op);
      accepted.push(op);
    } catch (err) {
      // A caught value is not guaranteed to be an Error
      const reason = err instanceof Error ? err.message : String(err);
      rejected.push({ op, reason });
    }
  }
  return { accepted, rejected };
}

// Client-side rollback: undo rejected ops
async function handleRejections(rejections: RejectedOp[], engine: SyncEngine) {
  for (const { op, reason } of rejections) {
    await engine.rollbackOp(op);
    engine.notifyUser(`Change rejected: ${reason}`);
  }
}
## Partial Sync
Not every device needs every document. Partial sync limits what each replica stores.
### Subscription-Based Sync
Clients subscribe to subsets of data. The server only sends ops matching the subscription.
interface SyncSubscription {
  collections: string[];          // Which collections to sync
  filter?: Record<string, any>;   // e.g., { workspaceId: '123' }
  fields?: string[];              // Which fields to include
}

// Client subscribes
const subscription: SyncSubscription = {
  collections: ['todos', 'projects'],
  filter: { workspaceId: currentWorkspace },
};

// Server filters ops before sending.
// Update and delete ops carry only the changed fields, so the filter is checked
// against the stored document; getDoc is a hypothetical server-side lookup.
function filterOpsForClient(
  ops: Operation[],
  sub: SyncSubscription,
  getDoc: (collection: string, documentId: string) => Record<string, any> | undefined,
): Operation[] {
  return ops.filter(op => {
    if (!sub.collections.includes(op.collection)) return false;
    if (sub.filter) {
      const doc = { ...getDoc(op.collection, op.documentId), ...op.fields };
      for (const [key, value] of Object.entries(sub.filter)) {
        if (doc[key] !== value) return false;
      }
    }
    return true;
  });
}
### Eviction and Re-fetch
When a device runs low on storage, evict least-recently-used documents. Re-fetch them on demand.
async function evictOldDocuments(store: LocalStore, targetBytes: number) {
  const docs = await store.getDocsByLastAccess();
  let freedBytes = 0;
  for (const doc of docs) {
    if (freedBytes >= targetBytes) break;
    if (doc.pinned) continue; // Never evict pinned docs
    freedBytes += doc.sizeBytes;
    await store.evict(doc.id);
    await store.markNeedsRefetch(doc.id);
  }
}
## Bandwidth Optimization
### Delta Encoding
Send only changed fields, not entire documents.
// Instead of sending the full document on every change
{ id: '1', title: 'Updated', content: '...10KB of text...', status: 'draft' }
// Send only what changed
{ documentId: '1', fields: { title: 'Updated' } }
### Batching and Compression
async function syncBatch(ops: Operation[], serverUrl: string) {
  // Batch multiple ops into one request
  const payload = JSON.stringify(ops);
  // Compress with gzip for large batches.
  // compress() is an assumed helper that returns the gzip-encoded bytes.
  const compressed = await compress(payload);
  await fetch(`${serverUrl}/sync`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // The server must be set up to decode gzip request bodies
      'Content-Encoding': 'gzip',
    },
    body: compressed,
  });
}
### Debouncing
Do not sync on every keystroke. Debounce rapid mutations.
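class DebouncedSync {
  private timer: ReturnType<typeof setTimeout> | null = null;
  private pendingOps: Operation[] = [];

  constructor(private serverUrl: string) {}

  enqueue(op: Operation) {
    this.pendingOps.push(op);
    if (this.timer) clearTimeout(this.timer);
    this.timer = setTimeout(() => void this.flush(), 500);
  }

  private async flush() {
    this.timer = null;
    const ops = this.pendingOps;
    this.pendingOps = [];
    try {
      await syncBatch(ops, this.serverUrl);
    } catch {
      // Put the batch back in front so a failed request does not drop ops;
      // they go out with the next flush
      this.pendingOps = [...ops, ...this.pendingOps];
    }
  }
}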
## Version Vectors
A version vector tracks the latest known version from each replica. It answers: "Have I seen all operations from replica X up to version Y?"
type VersionVector = Record<string, number>;
// Example: { 'device-A': 42, 'device-B': 17, 'device-C': 5 }
// Means: I have seen all ops from A up to 42, B up to 17, C up to 5

function needsOp(localVV: VersionVector, op: Operation): boolean {
  const knownVersion = localVV[op.replicaId] || 0;
  return op.version > knownVersion;
}

function updateVersionVector(vv: VersionVector, op: Operation): VersionVector {
  return {
    ...vv,
    [op.replicaId]: Math.max(vv[op.replicaId] || 0, op.version),
  };
}

// Sync negotiation: exchange version vectors to find what each side needs
function computeMissingOps(
  localVV: VersionVector,
  remoteVV: VersionVector
): { localNeeds: VersionVector; remoteNeeds: VersionVector } {
  const allReplicas = new Set([...Object.keys(localVV), ...Object.keys(remoteVV)]);
  const localNeeds: VersionVector = {};
  const remoteNeeds: VersionVector = {};
  for (const replica of allReplicas) {
    const localV = localVV[replica] || 0;
    const remoteV = remoteVV[replica] || 0;
    if (remoteV > localV) {
      localNeeds[replica] = localV; // I need ops from localV+1 to remoteV
    }
    if (localV > remoteV) {
      remoteNeeds[replica] = remoteV; // They need ops from remoteV+1 to localV
    }
  }
  return { localNeeds, remoteNeeds };
}
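Version vectors also tell you whether two states are causally ordered or concurrent, which is what decides whether a conflict resolution step is needed at all. A small comparison sketch follows; `compareVV` and the `Ordering` type are names invented for this example.

```typescript
type VersionVector = Record<string, number>;
type Ordering = 'equal' | 'before' | 'after' | 'concurrent';

// Compares a against b: 'before' means a has seen nothing that b has not.
function compareVV(a: VersionVector, b: VersionVector): Ordering {
  let aAhead = false;
  let bAhead = false;
  for (const replica of Object.keys({ ...a, ...b })) {
    const av = a[replica] || 0;
    const bv = b[replica] || 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return 'concurrent'; // each side has ops the other lacks
  if (aAhead) return 'after';
  if (bAhead) return 'before';
  return 'equal';
}

console.log(compareVV({ A: 2, B: 1 }, { A: 2, B: 3 })); // before
console.log(compareVV({ A: 3, B: 1 }, { A: 2, B: 3 })); // concurrent
console.log(compareVV({ A: 2 }, { A: 2, B: 0 }));       // equal
```

Only the `concurrent` case represents a true conflict; in the `before` and `after` cases one side can simply fast-forward to the other.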
## Hybrid Logical Clocks
Wall clocks are unreliable across devices. Hybrid logical clocks (HLCs) combine wall-clock time with a logical counter to provide a monotonically increasing, causally consistent timestamp.
class HLC {
  private logical = 0;
  private lastWall = 0;

  now(): number {
    const wall = Date.now();
    if (wall > this.lastWall) {
      this.lastWall = wall;
      this.logical = 0;
    } else {
      this.logical++;
    }
    return this.encode();
  }

  receive(remoteTimestamp: number) {
    const remoteWall = Math.floor(remoteTimestamp / 1000);
    const remoteLogical = remoteTimestamp % 1000;
    const wall = Date.now();
    if (wall > this.lastWall && wall > remoteWall) {
      this.lastWall = wall;
      this.logical = 0;
    } else if (remoteWall > this.lastWall) {
      this.lastWall = remoteWall;
      this.logical = remoteLogical + 1;
    } else if (remoteWall === this.lastWall) {
      this.logical = Math.max(this.logical, remoteLogical) + 1;
    } else {
      this.logical++; // Local wall component is already ahead of the remote one
    }
  }

  // Decimal packing: wall-clock milliseconds * 1000 + logical counter (0 to 999)
  private encode(): number {
    if (this.logical > 999) {
      // Carry into the wall component instead of spilling into its digits
      this.lastWall++;
      this.logical = 0;
    }
    return this.lastWall * 1000 + this.logical;
  }
}
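The benefit over a plain wall clock shows up when the wall clock stalls or jumps backwards. The sketch below is a variant of the local half of the clock with an injectable time source so that behaviour can be demonstrated deterministically. `SketchHLC` and `fakeTime` are names made up for this example, and the `receive` half is omitted.

```typescript
class SketchHLC {
  private lastWall = 0;
  private logical = 0;
  private readWall: () => number;

  constructor(readWall: () => number) {
    this.readWall = readWall;
  }

  now(): number {
    const wall = this.readWall();
    if (wall > this.lastWall) {
      this.lastWall = wall;
      this.logical = 0;
    } else {
      this.logical++; // wall clock did not advance: count logically instead
    }
    // Same decimal packing as above: milliseconds * 1000 + logical counter
    return this.lastWall * 1000 + this.logical;
  }
}

let fakeTime = 1000;
const clock = new SketchHLC(() => fakeTime);

const t1 = clock.now(); // 1000000
const t2 = clock.now(); // 1000001 (same millisecond)
fakeTime = 990;         // wall clock jumps backwards
const t3 = clock.now(); // 1000002 (still increasing)
fakeTime = 1001;
const t4 = clock.now(); // 1001000 (wall clock caught up, counter resets)

console.log(t1 < t2 && t2 < t3 && t3 < t4); // true
```

A raw `Date.now()` would have produced a smaller timestamp at `t3`, and a last-write-wins merge would then have discarded the newer write.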
## Common Pitfalls
| Pitfall | Why It Happens | Fix |
|---|---|---|
| Clock skew breaks LWW | Device clocks are unreliable | Use hybrid logical clocks (HLC) |
| Tombstones grow forever | Every delete leaves a marker | Compact tombstones after all replicas have synced past them |
| Op log fills disk | No compaction strategy | Compact old ops into snapshots periodically |
| Sync loops | A sends to B, B sends back to A | Include origin replicaId, skip ops from self |
| Lost deletes | Delete op arrives before create op | Use causal ordering or version vectors |
| Bandwidth spikes on reconnect | Device was offline for days, sends everything | Paginate sync, send in batches with backpressure |
| Schema conflicts | Devices on different app versions | Version your op format, handle unknown fields gracefully |
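
For the "lost deletes" row, one simple form of ordering is to hold back an op until every earlier op from the same replica has been applied. The sketch below does that using the per-replica `version` counter. `InOrderBuffer` and `VersionedOp` are hypothetical names, and this only orders ops that come from the same replica; ordering across replicas would need each op to carry its dependencies (for example a version vector).

```typescript
interface VersionedOp { replicaId: string; version: number; type: string }

class InOrderBuffer {
  private applied: Record<string, number> = {}; // highest version applied per replica
  private waiting: VersionedOp[] = [];
  private deliver: (op: VersionedOp) => void;

  constructor(deliver: (op: VersionedOp) => void) {
    this.deliver = deliver;
  }

  receive(op: VersionedOp) {
    this.waiting.push(op);
    let progressed = true;
    while (progressed) {
      progressed = false;
      for (let i = 0; i < this.waiting.length; i++) {
        const candidate = this.waiting[i];
        const seen = this.applied[candidate.replicaId] || 0;
        if (candidate.version > seen + 1) continue; // gap: an earlier op is still missing
        this.waiting.splice(i, 1);
        if (candidate.version === seen + 1) {
          this.applied[candidate.replicaId] = candidate.version;
          this.deliver(candidate);
        } // otherwise it is a duplicate of an applied op and is dropped
        progressed = true;
        break; // rescan: delivering one op may unblock others
      }
    }
  }
}

const delivered: string[] = [];
const buffer = new InOrderBuffer(op => delivered.push(op.type));

buffer.receive({ replicaId: 'A', version: 2, type: 'delete' }); // arrives early, held back
buffer.receive({ replicaId: 'A', version: 1, type: 'create' }); // unblocks the delete

console.log(delivered); // [ 'create', 'delete' ]
```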