Projections
Build and maintain read model projections from event streams for optimized query performance
Read Model Projections: Event Sourcing
You are an expert in read model projections for building event-sourced systems.
Overview
Projections transform raw domain events into denormalized read models optimized for specific query patterns. A single event stream can feed many projections, each shaped for a different consumer — a dashboard, a search index, a report, or an API response. Projections are the bridge between the append-only event store and the queryable views that users and services depend on.
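As a rough illustration of one stream feeding several read models, the sketch below folds the same list of events into two differently shaped views. The event fields and both function names are assumptions made for this example; a real projector would write to a database rather than return a dict.

```python
from collections import defaultdict

# Hypothetical events, in the order they were appended to the stream.
events = [
    {"event_type": "OrderPlaced",
     "data": {"order_id": "o-1", "customer_id": "c-1", "total": 40}},
    {"event_type": "OrderPlaced",
     "data": {"order_id": "o-2", "customer_id": "c-1", "total": 60}},
    {"event_type": "OrderCancelled", "data": {"order_id": "o-1"}},
]


def project_order_status(events):
    """Read model 1: current status per order (e.g. for an order page)."""
    status = {}
    for event in events:
        order_id = event["data"]["order_id"]
        if event["event_type"] == "OrderPlaced":
            status[order_id] = "placed"
        elif event["event_type"] == "OrderCancelled":
            status[order_id] = "cancelled"
    return status


def project_customer_spend(events):
    """Read model 2: total of non-cancelled orders per customer (e.g. for a report)."""
    open_orders = {}
    for event in events:
        data = event["data"]
        if event["event_type"] == "OrderPlaced":
            open_orders[data["order_id"]] = (data["customer_id"], data["total"])
        elif event["event_type"] == "OrderCancelled":
            open_orders.pop(data["order_id"], None)
    spend = defaultdict(int)
    for customer_id, total in open_orders.values():
        spend[customer_id] += total
    return dict(spend)


order_status = project_order_status(events)
customer_spend = project_customer_spend(events)
```

Both views are derived from the same events, so either one can be thrown away and recomputed without touching the other.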
Core Concepts
Projector: A function or class that handles events and writes to a read model store. Each projector subscribes to one or more event types and maintains its own checkpoint (the last global position it has processed).
Checkpoint: A persisted position marker that records how far a projector has read through the event stream. On restart, the projector resumes from its checkpoint.
Idempotency: Projectors must tolerate replayed events. If the same event is delivered twice, the resulting read model must be identical.
Rebuild: Because the event store is the source of truth, any projection can be deleted and rebuilt from scratch by replaying all events. This enables schema migrations and bug fixes in projection logic.
Live vs. Catch-up: A live projector processes events in near-real-time as they are appended. A catch-up projector replays historical events to build or rebuild a read model.
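The idempotency requirement can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in read store and guards every write with the event's global position, so an event that was already applied changes nothing. The table layout, the `last_position` column, and the `apply` helper are assumptions for illustration only.

```python
import sqlite3


def make_read_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE order_summaries (
            order_id TEXT PRIMARY KEY,
            total INTEGER NOT NULL,
            last_position INTEGER NOT NULL
        )
        """
    )
    return db


def apply(db: sqlite3.Connection, event: dict) -> None:
    """Apply one event; a row changes only if the event is newer than
    the last event already applied to that row."""
    data, position = event["data"], event["global_position"]
    if event["event_type"] == "OrderPlaced":
        db.execute(
            """
            INSERT INTO order_summaries (order_id, total, last_position)
            VALUES (?, ?, ?)
            ON CONFLICT (order_id) DO UPDATE SET
                total = excluded.total,
                last_position = excluded.last_position
            WHERE order_summaries.last_position < excluded.last_position
            """,
            (data["order_id"], data["total"], position),
        )
    elif event["event_type"] == "OrderItemAdded":
        db.execute(
            """
            UPDATE order_summaries
            SET total = total + ?, last_position = ?
            WHERE order_id = ? AND last_position < ?
            """,
            (data["price"], position, data["order_id"], position),
        )
    # Unknown event types fall through and are ignored.


events = [
    {"event_type": "OrderPlaced", "global_position": 1,
     "data": {"order_id": "o-1", "total": 100}},
    {"event_type": "OrderItemAdded", "global_position": 2,
     "data": {"order_id": "o-1", "price": 25}},
]

db = make_read_db()
for event in events + events:  # deliver every event twice
    apply(db, event)

total = db.execute(
    "SELECT total FROM order_summaries WHERE order_id = 'o-1'"
).fetchone()[0]
print(total)  # 125, not 150: the second delivery is a no-op
```

A plain `total = total + ?` without the position guard would double-count on redelivery, which is why increment-style handlers need some record of what they have already applied.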
Implementation Patterns
Basic Projector
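```python
class OrderSummaryProjector:
    """Projects order events into an order_summaries read model.

    Assumes order_summaries has a last_position column that stores the
    global position of the last event applied to each row. Handlers use
    it to skip events that were already applied, so replays are safe.
    """

    def __init__(self, read_db, checkpoint_store):
        self._db = read_db
        self._checkpoints = checkpoint_store
        self._handlers = {
            "OrderPlaced": self._on_order_placed,
            "OrderItemAdded": self._on_item_added,
            "OrderCancelled": self._on_order_cancelled,
        }

    def handle(self, event: dict) -> None:
        # Event types without a handler are ignored rather than raising.
        handler = self._handlers.get(event["event_type"])
        if handler:
            handler(event)

    def _on_order_placed(self, event):
        data = event["data"]
        # Upsert: a replayed OrderPlaced must not reset a row that later
        # events have already modified, hence the WHERE guard.
        self._db.execute(
            """
            INSERT INTO order_summaries
                (order_id, customer_id, status, total, created_at, last_position)
            VALUES (%s, %s, 'placed', %s, %s, %s)
            ON CONFLICT (order_id) DO UPDATE SET
                customer_id = EXCLUDED.customer_id,
                status = EXCLUDED.status,
                total = EXCLUDED.total,
                last_position = EXCLUDED.last_position
            WHERE order_summaries.last_position < EXCLUDED.last_position
            """,
            (data["order_id"], data["customer_id"], data["total"],
             event["created_at"], event["global_position"]),
        )

    def _on_item_added(self, event):
        data = event["data"]
        position = event["global_position"]
        # An unguarded "total = total + price" would double-count on replay.
        self._db.execute(
            """
            UPDATE order_summaries
            SET total = total + %s, last_position = %s
            WHERE order_id = %s AND last_position < %s
            """,
            (data["price"], position, data["order_id"], position),
        )

    def _on_order_cancelled(self, event):
        # Setting a fixed value is idempotent without a guard.
        self._db.execute(
            "UPDATE order_summaries SET status = 'cancelled' WHERE order_id = %s",
            (event["data"]["order_id"],),
        )
```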
Subscription Loop with Checkpointing
```python
import time


class ProjectionRunner:
    def __init__(self, event_store: "EventStore", projector, checkpoint_store):
        self._store = event_store
        self._projector = projector
        self._checkpoints = checkpoint_store
        self._projector_name = type(projector).__name__

    def run(self, batch_size: int = 200, poll_interval: float = 0.5):
        # Resume from the last saved checkpoint, or from the start.
        position = self._checkpoints.get(self._projector_name) or 0
        while True:
            # Assumes read_all returns events strictly AFTER from_position;
            # with an inclusive bound the last event would be re-read forever.
            events = self._store.read_all(from_position=position, batch_size=batch_size)
            if not events:
                # Caught up: wait before polling again.
                time.sleep(poll_interval)
                continue
            for event in events:
                self._projector.handle(event)
                position = event["global_position"]
            # Checkpoint once per batch, not once per event.
            self._checkpoints.save(self._projector_name, position)
```
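To try the checkpointing logic without a real event store, in-memory stand-ins are enough. The sketch below is an assumption-laden test harness, not a real API: `InMemoryEventStore`, `InMemoryCheckpointStore`, `CountingProjector`, and `drain` are hypothetical names, and `read_all` is assumed to return only events after `from_position`. `drain` performs the same steps as the polling loop but returns once it has caught up.

```python
class InMemoryEventStore:
    """Hypothetical stand-in; global positions start at 1."""

    def __init__(self):
        self._events = []

    def append(self, event_type: str, data: dict) -> int:
        position = len(self._events) + 1
        self._events.append(
            {"event_type": event_type, "data": data, "global_position": position}
        )
        return position

    def read_all(self, from_position: int, batch_size: int) -> list:
        # Exclusive lower bound: position n lives at index n - 1, so the
        # events after from_position start at index from_position.
        return self._events[from_position:from_position + batch_size]


class InMemoryCheckpointStore:
    def __init__(self):
        self._positions = {}

    def get(self, name: str):
        return self._positions.get(name)

    def save(self, name: str, position: int) -> None:
        self._positions[name] = position


class CountingProjector:
    """Records the positions it has handled, for inspection."""

    def __init__(self):
        self.seen = []

    def handle(self, event: dict) -> None:
        self.seen.append(event["global_position"])


def drain(store, projector, checkpoints, name: str, batch_size: int = 2) -> int:
    """One catch-up pass: process batches until no events remain."""
    position = checkpoints.get(name) or 0
    while True:
        events = store.read_all(from_position=position, batch_size=batch_size)
        if not events:
            return position
        for event in events:
            projector.handle(event)
            position = event["global_position"]
        checkpoints.save(name, position)  # once per batch


store = InMemoryEventStore()
for i in range(5):
    store.append("OrderPlaced", {"order_id": f"o-{i}"})

checkpoints = InMemoryCheckpointStore()
projector = CountingProjector()
first = drain(store, projector, checkpoints, "CountingProjector")

# A later pass resumes from the checkpoint and sees only the new event.
store.append("OrderCancelled", {"order_id": "o-0"})
second = drain(store, projector, checkpoints, "CountingProjector")
```

After both passes, `projector.seen` holds each position exactly once, which is the behavior the exclusive `from_position` contract is meant to guarantee.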
Rebuilding a Projection
```python
def rebuild_projection(event_store, projector, read_db, checkpoint_store):
    """Drop and rebuild a projection from scratch.

    Stop the live runner for this projector first so the two do not
    write to the read model at the same time.
    """
    projector_name = type(projector).__name__

    # 1. Clear the read model table (hard-coded here for the order
    #    summary projector; other projectors own other tables).
    read_db.execute("TRUNCATE TABLE order_summaries")

    # 2. Reset the checkpoint
    checkpoint_store.save(projector_name, 0)

    # 3. Replay all events
    position = 0
    while True:
        events = event_store.read_all(from_position=position, batch_size=1000)
        if not events:
            break
        for event in events:
            projector.handle(event)
            position = event["global_position"]

    # 4. Save the final position so the live runner resumes from here
    checkpoint_store.save(projector_name, position)
    print(f"Rebuilt {projector_name} up to position {position}")
```
Multi-Table Projection (TypeScript / Knex)
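```typescript
import { Knex } from "knex";

// Assumed event envelope; adjust to match your event store.
interface DomainEvent {
  eventType: string;
  createdAt: string;
  data: Record<string, any>;
}

class CustomerOrdersProjector {
  constructor(private db: Knex) {}

  async handle(event: DomainEvent): Promise<void> {
    switch (event.eventType) {
      case "CustomerRegistered":
        // On replay the row already exists; skip instead of failing.
        await this.db("customers")
          .insert({
            customer_id: event.data.customerId,
            name: event.data.name,
            order_count: 0,
          })
          .onConflict("customer_id")
          .ignore();
        break;

      case "OrderPlaced":
        await this.db.transaction(async (trx) => {
          // Assumes PostgreSQL and a unique constraint on order_id:
          // a replayed event inserts nothing and returns no rows.
          const inserted = await trx("customer_orders")
            .insert({
              customer_id: event.data.customerId,
              order_id: event.data.orderId,
              placed_at: event.createdAt,
            })
            .onConflict("order_id")
            .ignore()
            .returning("order_id");

          // Count the order only the first time it is seen.
          if (inserted.length > 0) {
            await trx("customers")
              .where("customer_id", event.data.customerId)
              .increment("order_count", 1);
          }
        });
        break;
      // Other event types are ignored.
    }
  }
}
```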
Best Practices
- One projector per read model concern. Keep projectors focused. An order summary projector should not also maintain a customer leaderboard.
- Use upserts (INSERT ... ON CONFLICT) for idempotency. This ensures replayed events converge to the same state.
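- Batch checkpoint writes. Save the checkpoint after processing a batch of events, not after every single event, to reduce I/O overhead. If the process crashes mid-batch, the batch is replayed from the last checkpoint, which is safe only when handlers are idempotent.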
- Run projections in separate processes from the write side. This prevents slow projections from blocking command processing.
- Design read model schemas for the query, not the domain. Denormalize aggressively. Duplicate data across read models to avoid joins.
- Monitor projection lag. Track the gap between the latest global position and each projector's checkpoint. Alert when lag grows beyond acceptable thresholds.
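The lag check in the last point can be sketched in a few lines. The function names, the projector names, and the threshold are illustrative assumptions; in practice the latest position would come from the event store and the checkpoints from the checkpoint store. Lag here is measured in events, not in time.

```python
def projection_lag(latest_position: int, checkpoints: dict) -> dict:
    """Events each projector still has to process (never negative)."""
    return {
        name: max(latest_position - position, 0)
        for name, position in checkpoints.items()
    }


def lagging_projectors(lags: dict, threshold: int) -> list:
    """Names of projectors whose lag exceeds the threshold, sorted."""
    return sorted(name for name, lag in lags.items() if lag > threshold)


lags = projection_lag(
    latest_position=1_000,
    checkpoints={"OrderSummaryProjector": 990, "CustomerOrdersProjector": 400},
)
alerts = lagging_projectors(lags, threshold=100)
print(lags)    # {'OrderSummaryProjector': 10, 'CustomerOrdersProjector': 600}
print(alerts)  # ['CustomerOrdersProjector']
```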
Common Pitfalls
- Not handling unknown event types gracefully. As the system evolves, new event types will appear. Projectors should ignore event types they have no handler for rather than crash.
- Performing side effects in projectors. Projections must be pure data transformations. Sending emails or calling external APIs from a projector breaks rebuild safety.
- Sharing a database transaction between the event store write and a projection update. This couples write and read models and defeats the purpose of CQRS.
- Forgetting to rebuild after fixing a projector bug. If the projection logic was wrong, existing read model data is also wrong. You must rebuild to correct it.
- Using auto-incrementing IDs in read models. During rebuilds, auto-increment IDs will differ. Use deterministic, business-meaningful keys instead.
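One way to get deterministic keys when no natural business key exists is to derive a name-based UUID from values that are stable across replays. The sketch below assumes a hypothetical line-item read model keyed by order ID and the event's global position; the namespace string is made up for the example and only needs to stay fixed.

```python
import uuid

# Hypothetical namespace for this read model. Any fixed UUID works,
# as long as it never changes between rebuilds.
READ_MODEL_NAMESPACE = uuid.uuid5(
    uuid.NAMESPACE_DNS, "order-line-items.read-model.example"
)


def line_item_key(order_id: str, global_position: int) -> uuid.UUID:
    """Derive the row key from the event itself, not from insert order."""
    return uuid.uuid5(READ_MODEL_NAMESPACE, f"{order_id}:{global_position}")


first_build = line_item_key("o-1", 42)
after_rebuild = line_item_key("o-1", 42)
other_row = line_item_key("o-1", 43)
```

Because the key depends only on the event, a rebuild produces the same IDs, so links or caches that refer to read model rows stay valid.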
Anti-Patterns
Over-engineering for hypothetical scale. Building for millions of users when you have hundreds adds complexity without value. Solve today's problems first.
Ignoring the existing ecosystem. Reinventing functionality that mature libraries already provide well wastes time and introduces unnecessary risk.
Premature abstraction. Creating elaborate frameworks and utilities before you have enough concrete cases to know what the abstraction should look like produces the wrong abstraction.
Neglecting error handling at boundaries. Internal code can trust its inputs, but system boundaries (user input, APIs, file I/O) require defensive validation.
Skipping documentation for obvious code. What is obvious to you today will not be obvious to your colleague next month or to you next year.