Chat Rooms
Chat room architecture covering message routing, history, moderation, and scalable room-based real-time systems
Chat Room Architecture — WebSockets & Real-Time
You are an expert in designing chat room systems for real-time applications.
Overview
Chat rooms are a foundational pattern in real-time applications — from team messaging (Slack, Discord) to live event chat and customer support. A well-designed chat system handles message delivery, persistence, ordering, history retrieval, user management, and moderation while remaining responsive under load.
Core Concepts
Room Model
A room (or channel) is a logical grouping where messages are visible to all members. Rooms can be:
- Public — discoverable and open to join
- Private — invite-only, membership controlled
- Direct messages — two-party or small-group, often treated as a special room type
- Ephemeral — no message persistence, used for live events
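As a rough illustration of how these room types might translate into a join check, here is a minimal sketch. It uses the type values that appear in the rooms table later in this document (public, private, dm). The `canJoin` name and the `invitedUserIds` and `memberIds` fields are hypothetical and exist only for this example.

```javascript
// Hypothetical join policy; invitedUserIds and memberIds are assumed fields for this sketch.
function canJoin(room, userId) {
  switch (room.type) {
    case 'public':
      return true; // discoverable and open to join
    case 'private':
      return room.invitedUserIds.includes(userId); // invite-only
    case 'dm':
      return room.memberIds.includes(userId); // membership is fixed at creation
    default:
      return false; // unknown types stay closed by default
  }
}
```

Ephemeral rooms are left out because persistence is independent of who may join.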
Message Ordering
Messages need a consistent order. Options include:
- Server timestamps — simple but clock skew across servers can cause issues
- Sequence numbers per room — monotonically increasing and, as long as every assigned number is persisted, gap-free; ideal for pagination
- Hybrid IDs — combine a timestamp with a sequence (e.g., Snowflake IDs) for sortable, unique identifiers
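To make the hybrid option more concrete, the sketch below packs a millisecond timestamp, a node ID, and a per-millisecond counter into one sortable BigInt, in the spirit of Snowflake IDs. The 41/10/12 bit split, the custom epoch, and the `createIdGenerator` name are assumptions chosen for illustration and are not taken from any particular library.

```javascript
// Hypothetical Snowflake-style generator: 41 bits of time, 10 bits of node ID, 12 bits of counter.
const EPOCH_MS = 1704067200000n; // assumed custom epoch: 2024-01-01T00:00:00Z

function createIdGenerator(nodeId, now = () => Date.now()) {
  if (nodeId < 0 || nodeId > 1023) throw new RangeError('nodeId must fit in 10 bits');
  let lastMs = -1;
  let counter = 0;

  return function nextId() {
    let ms = now();
    if (ms < lastMs) ms = lastMs; // clock moved backwards: hold at the last timestamp
    if (ms === lastMs) {
      counter += 1;
      if (counter > 4095) {
        // Counter exhausted for this millisecond: borrow the next one
        ms += 1;
        counter = 0;
      }
    } else {
      counter = 0;
    }
    lastMs = ms;
    // BigInt is needed because the packed value exceeds Number.MAX_SAFE_INTEGER
    return ((BigInt(ms) - EPOCH_MS) << 22n) | (BigInt(nodeId) << 12n) | BigInt(counter);
  };
}
```

IDs from a single generator are strictly increasing. Across nodes they are only ordered to the extent that the nodes' clocks agree, so they do not replace a per-room sequence when exact ordering matters.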
Message Delivery Guarantees
- At-most-once — fire and forget, acceptable for typing indicators
- At-least-once — retry until acknowledged, may need deduplication
- Exactly-once — requires idempotency keys and server-side deduplication
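In practice, exactly-once behavior is usually approximated by combining at-least-once retries with idempotent processing on the server. A minimal sketch of that deduplication step follows. The `createDeduplicator` helper and its in-memory Map are assumptions for illustration; a multi-server deployment would need shared storage, for example a unique database constraint on the idempotency key.

```javascript
// Hypothetical in-memory idempotency cache keyed by sender and client-generated ID.
function createDeduplicator(ttlMs = 60000, now = () => Date.now()) {
  const seen = new Map(); // key -> { result, expiresAt }

  return function handleOnce(senderId, clientId, process) {
    const key = `${senderId}:${clientId}`;
    const t = now();
    // Drop expired entries so the cache does not grow without bound
    for (const [k, entry] of seen) {
      if (entry.expiresAt <= t) seen.delete(k);
    }
    const hit = seen.get(key);
    if (hit) return hit.result; // a retry of a message that was already processed
    const result = process();
    seen.set(key, { result, expiresAt: t + ttlMs });
    return result;
  };
}
```

A retried send with the same client ID returns the original result instead of creating a second message, which is what lets the client retry safely until it receives an acknowledgement.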
Implementation Patterns
Room and Message Data Model
-- Rooms
CREATE TABLE rooms (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name        VARCHAR(100) NOT NULL,
  type        VARCHAR(20) NOT NULL DEFAULT 'public', -- public, private, dm
  created_by  UUID REFERENCES users(id),
  created_at  TIMESTAMPTZ DEFAULT now(),
  metadata    JSONB DEFAULT '{}'
);

-- Room membership
CREATE TABLE room_members (
  room_id        UUID REFERENCES rooms(id) ON DELETE CASCADE,
  user_id        UUID REFERENCES users(id) ON DELETE CASCADE,
  role           VARCHAR(20) DEFAULT 'member', -- owner, admin, member
  joined_at      TIMESTAMPTZ DEFAULT now(),
  last_read_seq  BIGINT DEFAULT 0,
  PRIMARY KEY (room_id, user_id)
);

-- Messages with per-room sequence numbers
CREATE TABLE messages (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  room_id     UUID REFERENCES rooms(id) ON DELETE CASCADE,
  sender_id   UUID REFERENCES users(id),
  seq         BIGINT NOT NULL, -- per-room sequence number
  content     TEXT NOT NULL,
  type        VARCHAR(20) DEFAULT 'text', -- text, image, system, reply
  reply_to    UUID REFERENCES messages(id),
  created_at  TIMESTAMPTZ DEFAULT now(),
  edited_at   TIMESTAMPTZ,
  deleted     BOOLEAN DEFAULT false,
  UNIQUE (room_id, seq)
);

CREATE INDEX idx_messages_room_seq ON messages(room_id, seq DESC);
Server-Side Message Handling
import { randomUUID } from 'node:crypto';

// Assumes io (a Socket.IO server), redis, db (a PostgreSQL client), and the helpers
// getUserRooms, checkMembership, and saveMessage are defined elsewhere.

// In-memory sequence counter: safe only for single-process development.
// The send-message handler below uses Redis INCR, which is atomic across servers.
const roomSequences = new Map();

function getNextSeq(roomId) {
  const current = roomSequences.get(roomId) || 0;
  const next = current + 1;
  roomSequences.set(roomId, next);
  return next;
}

io.on('connection', (socket) => {
  const userId = socket.data.user.id;

  // Join rooms the user belongs to
  socket.on('join-rooms', async () => {
    const rooms = await getUserRooms(userId);
    for (const room of rooms) {
      socket.join(`room:${room.id}`);
    }
    socket.emit('rooms-joined', rooms.map((r) => r.id));
  });

  // Send a message
  socket.on('send-message', async ({ roomId, content, type, replyTo, clientId }, ack) => {
    try {
      // Verify membership
      const isMember = await checkMembership(roomId, userId);
      if (!isMember) return ack({ error: 'Not a member of this room' });

      // Assign sequence number atomically.
      // Note: if saveMessage fails below, this number is skipped and leaves a gap.
      const seq = await redis.incr(`room:${roomId}:seq`);

      const message = {
        id: randomUUID(), // matches the UUID column in the messages table
        roomId,
        senderId: userId,
        senderName: socket.data.user.name,
        seq,
        content,
        type: type || 'text',
        replyTo: replyTo || null,
        createdAt: new Date().toISOString(),
        clientId, // For client-side deduplication
      };

      // Persist
      await saveMessage(message);

      // Broadcast to room
      io.to(`room:${roomId}`).emit('new-message', message);

      // Acknowledge to sender
      ack({ ok: true, message });
    } catch (err) {
      ack({ error: 'Failed to send message' });
    }
  });

  // Fetch history with cursor-based pagination
  socket.on('fetch-history', async ({ roomId, beforeSeq, limit }, ack) => {
    // Only members may read a room's history
    const isMember = await checkMembership(roomId, userId);
    if (!isMember) return ack({ error: 'Not a member of this room' });

    const messages = await db.query(
      `SELECT * FROM messages
       WHERE room_id = $1 AND seq < $2 AND deleted = false
       ORDER BY seq DESC LIMIT $3`,
      [roomId, beforeSeq || Number.MAX_SAFE_INTEGER, Math.min(limit || 50, 100)]
    );
    // Rows come back newest first; reverse so the client renders oldest to newest
    ack({ messages: messages.rows.reverse() });
  });

  // Mark messages as read
  socket.on('mark-read', async ({ roomId, seq }) => {
    const result = await db.query(
      `UPDATE room_members SET last_read_seq = GREATEST(last_read_seq, $1)
       WHERE room_id = $2 AND user_id = $3`,
      [seq, roomId, userId]
    );
    if (result.rowCount === 0) return; // Not a member: nothing to update or announce

    // Notify others about read receipt
    socket.to(`room:${roomId}`).emit('read-receipt', { userId, roomId, seq });
  });

  // Edit a message (only the sender may edit, and only if it is not deleted)
  socket.on('edit-message', async ({ messageId, content }, ack) => {
    const result = await db.query(
      `UPDATE messages SET content = $1, edited_at = now()
       WHERE id = $2 AND sender_id = $3 AND deleted = false
       RETURNING room_id, seq`,
      [content, messageId, userId]
    );
    if (result.rowCount === 0) return ack({ error: 'Cannot edit' });

    const { room_id, seq } = result.rows[0];
    io.to(`room:${room_id}`).emit('message-edited', { messageId, content, seq });
    ack({ ok: true });
  });

  // Delete a message (soft delete)
  socket.on('delete-message', async ({ messageId }, ack) => {
    const result = await db.query(
      `UPDATE messages SET deleted = true
       WHERE id = $1 AND sender_id = $2 AND deleted = false
       RETURNING room_id, seq`,
      [messageId, userId]
    );
    if (result.rowCount === 0) return ack({ error: 'Cannot delete' });

    const { room_id, seq } = result.rows[0];
    io.to(`room:${room_id}`).emit('message-deleted', { messageId, seq });
    ack({ ok: true });
  });
});
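The cursor logic in the fetch-history handler can be modeled without a database. The sketch below is an in-memory stand-in, assuming a hypothetical `fetchPage` helper and a plain array of messages, that shows how the `beforeSeq` cursor stays stable when new messages arrive between page requests.

```javascript
// In-memory stand-in for the messages table, used only to illustrate the cursor logic.
function fetchPage(allMessages, beforeSeq = Number.MAX_SAFE_INTEGER, limit = 50) {
  const pageSize = Math.min(Math.max(1, limit), 100); // same upper cap as the handler
  const page = allMessages
    .filter((m) => m.seq < beforeSeq && !m.deleted)
    .sort((a, b) => b.seq - a.seq) // newest first, like ORDER BY seq DESC
    .slice(0, pageSize)
    .reverse(); // oldest to newest for rendering
  // The next cursor is the smallest seq on this page; null means nothing older was returned
  const nextBeforeSeq = page.length > 0 ? page[0].seq : null;
  return { messages: page, nextBeforeSeq };
}
```

Because the second request asks for messages with `seq` below the cursor rather than skipping a fixed number of rows, a message that arrives in between does not shift or duplicate any entries on the older page.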
Client-Side Message Handling
import { nanoid } from 'nanoid';

// Optimistic UI with deduplication.
// Assumes socket, currentUser, and the UI helpers addToUI, replaceInUI,
// markAsFailed, and findByClientId are defined elsewhere.
const messageStore = new Map(); // messageId -> message

function sendMessage(roomId, content) {
  const clientId = nanoid(); // Temporary client-side ID
  const optimistic = {
    id: clientId,
    clientId, // lets findByClientId match the server echo
    roomId,
    senderId: currentUser.id,
    senderName: currentUser.name,
    content,
    createdAt: new Date().toISOString(),
    pending: true,
  };

  // Show immediately
  addToUI(optimistic);

  socket.timeout(10000).emit(
    'send-message',
    { roomId, content, clientId },
    (err, response) => {
      if (err || response.error) {
        markAsFailed(clientId);
      } else if (!messageStore.has(response.message.id)) {
        // Replace optimistic message with server-confirmed version
        messageStore.set(response.message.id, response.message);
        replaceInUI(clientId, response.message);
      }
    }
  );
}

// Deduplicate incoming messages
socket.on('new-message', (message) => {
  if (messageStore.has(message.id)) return; // Already have it
  messageStore.set(message.id, message);

  if (message.clientId && findByClientId(message.clientId)) {
    // Server echo of our optimistic message arrived before the ack: confirm it here
    replaceInUI(message.clientId, message);
    return;
  }
  addToUI(message);
});
Unread Counts
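// Server: compute unread counts.
// This is an approximation: soft-deleted messages below the latest seq still count as unread.
async function getUnreadCounts(userId) {
  const result = await db.query(
    `SELECT rm.room_id,
            COALESCE(
              (SELECT MAX(seq) FROM messages
               WHERE room_id = rm.room_id AND deleted = false),
              0
            ) - rm.last_read_seq AS unread
     FROM room_members rm
     WHERE rm.user_id = $1`,
    [userId]
  );
  return result.rows.reduce((acc, row) => {
    // Some drivers (node-postgres, for example) return BIGINT values as strings
    acc[row.room_id] = Math.max(0, Number(row.unread) || 0);
    return acc;
  }, {});
}

// Send unread counts on connection.
// Register this inside the io.on('connection') handler, where socket and userId are in scope.
socket.on('join-rooms', async () => {
  const unreads = await getUnreadCounts(userId);
  socket.emit('unread-counts', unreads);
});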
Best Practices
- Use per-room sequence numbers for pagination and read tracking. Timestamps alone are insufficient for gap-free ordering.
- Implement optimistic UI — display the message immediately and reconcile when the server acknowledges. This makes the app feel instant.
- Use cursor-based pagination (beforeSeq) rather than offset-based (OFFSET N). Offset pagination breaks when new messages arrive.
- Soft-delete messages — set a deleted flag rather than removing the row. This preserves sequence continuity and allows audit trails.
- Debounce read receipts — do not send a read event for every message. Batch them, sending the highest sequence number seen after a short delay.
- Separate message storage from real-time delivery — persist first, then broadcast. If broadcast fails, the message is still saved and will appear when the client fetches history.
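The read-receipt batching advice in the list above could look roughly like this on the client. The `createReadReceiptBatcher` helper is hypothetical; it remembers the highest sequence number seen per room and sends one event per room after a delay.

```javascript
// Hypothetical batching helper: tracks the highest seq seen per room and flushes once after a delay.
function createReadReceiptBatcher(send, delayMs = 1000) {
  const highest = new Map(); // roomId -> highest seq seen since the last flush
  let timer = null;

  function flush() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
    for (const [roomId, seq] of highest) send({ roomId, seq });
    highest.clear();
  }

  function markSeen(roomId, seq) {
    highest.set(roomId, Math.max(seq, highest.get(roomId) ?? 0));
    // Start the timer on the first unsent receipt; later calls ride along with it
    if (timer === null) timer = setTimeout(flush, delayMs);
  }

  return { markSeen, flush };
}
```

Wired to the mark-read event used earlier, this might be created as `createReadReceiptBatcher((p) => socket.emit('mark-read', p))`, with `markSeen` called as messages scroll into view and `flush` called when the user leaves the room.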
Common Pitfalls
- Race conditions on sequence numbers — use Redis INCR or a database sequence to assign sequence numbers atomically. Do not compute them in application code.
- Unbounded room joins — if a user is in 500 rooms, joining all of them at connection time is expensive. Lazy-join: only subscribe to rooms the user is actively viewing.
- Missing deduplication — the sender will receive their own message back via the room broadcast. Without deduplication, it appears twice.
- Loading full history on room open — always paginate. Load the most recent 50 messages and fetch older ones on scroll.
- Ignoring message edits and deletes in cached state — if the client caches messages locally, it must also process message-edited and message-deleted events to stay consistent.
- Not indexing by room + sequence — queries like "fetch the last 50 messages in room X" are extremely frequent. Without a composite index on (room_id, seq DESC), each such query has to scan and sort the room's messages, so latency grows with room size.
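For the cached-state pitfall, a small reducer over the local cache is often enough. The sketch below assumes a hypothetical `applyEvent` helper and a Map keyed by message ID; the event names match the ones emitted by the server handlers earlier in this document.

```javascript
// Hypothetical reducer over a local cache (Map of messageId -> message).
function applyEvent(cache, event) {
  const existing = cache.get(event.messageId);
  if (!existing) return cache; // message is not cached locally, so there is nothing to update
  switch (event.type) {
    case 'message-edited':
      cache.set(event.messageId, { ...existing, content: event.content, edited: true });
      break;
    case 'message-deleted':
      cache.delete(event.messageId);
      break;
  }
  return cache;
}
```

On a client this might be wired as `socket.on('message-edited', (e) => applyEvent(cache, { type: 'message-edited', ...e }))`, with a matching listener for message-deleted.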
Core Philosophy
A chat system's reliability is measured by what users never notice: messages that always arrive in order, history that loads instantly on scroll, and read receipts that update without manual refresh. The foundation of this invisible reliability is a robust message ordering system built on per-room sequence numbers assigned atomically. Timestamps alone are insufficient because clock skew across servers creates ordering ambiguities; sequence numbers are gap-free, monotonic, and perfect for pagination.
Persist first, broadcast second. When a message arrives, write it to the database before broadcasting to the room. If the broadcast fails, the message is still saved and will appear when clients fetch history. If you broadcast first and the persist fails, users see a message that vanishes on the next page load — a trust-destroying experience.
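The ordering described above can be made explicit by injecting both steps into one function. This is a minimal sketch, assuming hypothetical `persist` and `broadcast` callbacks rather than any specific database or Socket.IO call.

```javascript
// Hypothetical pipeline: persist and broadcast are injected so the ordering is explicit.
async function deliver(message, { persist, broadcast }) {
  await persist(message); // if this throws, nothing is broadcast
  try {
    await broadcast(message);
  } catch (err) {
    // The message is already durable; clients will pick it up when they fetch history
    return { ok: true, broadcast: false };
  }
  return { ok: true, broadcast: true };
}
```

A failed broadcast is reported but not treated as a failed send, while a failed persist rejects before any client has seen the message.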
Implement optimistic UI on the client. Display the user's own message immediately with a "sending" indicator, and reconcile when the server acknowledges. This makes the chat feel instant even on slow connections. But always deduplicate: the sender will receive their own message back via the room broadcast, and without deduplication it appears twice.
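One way to keep optimistic messages and deduplication consistent is a small store that accepts the server-confirmed message from either the acknowledgement or the room broadcast, whichever arrives first. The `createMessageStore` helper below is a hypothetical sketch, not tied to any UI framework.

```javascript
// Hypothetical client-side store that tolerates the ack and the broadcast arriving in either order.
function createMessageStore() {
  const byId = new Map();    // server id -> confirmed message
  const pending = new Map(); // clientId -> optimistic message

  return {
    addOptimistic(msg) {
      pending.set(msg.clientId, { ...msg, pending: true });
    },
    // Call this for both the ack payload and every new-message event
    confirm(serverMsg) {
      if (byId.has(serverMsg.id)) return false; // duplicate delivery
      if (serverMsg.clientId) pending.delete(serverMsg.clientId);
      byId.set(serverMsg.id, serverMsg);
      return true;
    },
    // Confirmed messages in seq order, followed by messages still in flight
    list() {
      const confirmed = [...byId.values()].sort((a, b) => a.seq - b.seq);
      return [...confirmed, ...pending.values()];
    },
  };
}
```

The return value of `confirm` tells the caller whether the UI needs to change, so the second arrival of the same message becomes a no-op.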
Anti-Patterns
- Computing sequence numbers in application code — incrementing a counter in JavaScript or reading MAX(seq) and adding 1 creates race conditions under concurrent writes; use Redis INCR or a database sequence for atomic assignment.
- Loading full message history when opening a room — fetching all messages at once is slow and wasteful; always paginate with cursor-based navigation (beforeSeq) and load older messages on scroll.
- Sending read receipts for every individual message — emitting a read event on each message creates unnecessary network traffic; debounce and batch read receipts by sending the highest sequence number after a short delay.
- Not handling message edits and deletes in cached state — if the client caches messages locally, ignoring message-edited and message-deleted events leaves stale content visible indefinitely.
- Joining all rooms eagerly at connection time — if a user belongs to 500 rooms, subscribing to all at once is expensive; lazy-join only the rooms the user is actively viewing and defer the rest.
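As an alternative to eager joins, a client can keep only the most recently viewed rooms subscribed. The sketch below assumes a hypothetical `createLazySubscriptions` helper whose `join` and `leave` callbacks would wrap whatever subscribe and unsubscribe events the server exposes; the cap of five active rooms is arbitrary.

```javascript
// Hypothetical subscription manager: join and leave would wrap socket.emit calls in a real client.
function createLazySubscriptions({ join, leave, maxActive = 5 }) {
  const active = []; // roomIds, least recently viewed first

  return {
    view(roomId) {
      const idx = active.indexOf(roomId);
      if (idx !== -1) {
        active.splice(idx, 1); // already subscribed: only refresh its recency
      } else {
        join(roomId);
      }
      active.push(roomId);
      // Evict the least recently viewed rooms once the cap is exceeded
      while (active.length > maxActive) leave(active.shift());
    },
    activeRooms() {
      return [...active];
    },
  };
}
```

Rooms that are not subscribed can still show activity through periodically fetched unread counts, so the user sees that something is new without receiving every message in real time.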