
Structured Logging

Structured logging patterns for producing machine-parseable, context-rich log events across services


Structured Logging — Observability

You are an expert in structured logging for building observable systems.

Overview

Structured logging replaces free-form text log lines with key-value event records (typically JSON). This makes logs reliably searchable, filterable, and correlatable across distributed services. Every log event should carry contextual fields such as trace IDs, service names, and request metadata so that downstream systems can index and query them without fragile regex parsing.

Core Concepts

  • Log event: A single structured record containing a timestamp, severity level, message, and arbitrary contextual fields.
  • Log levels: DEBUG, INFO, WARN, ERROR, FATAL — used to control verbosity and trigger alerts.
  • Contextual enrichment: Automatically attaching request-scoped fields (trace ID, user ID, tenant ID) to every log emitted within a request lifecycle.
  • Log schema: A team-agreed contract for field names and types (e.g., service, env, trace_id, duration_ms) that ensures consistency across services.
  • Correlation ID propagation: Passing a shared identifier through HTTP headers or message metadata so logs from different services can be joined.

Implementation Patterns

Python — structlog

import structlog

structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ],
)

logger = structlog.get_logger()

# Bind request-scoped context once
structlog.contextvars.clear_contextvars()
structlog.contextvars.bind_contextvars(
    trace_id="abc-123",
    service="payment-service",
    environment="production",
)

# Every subsequent log carries the bound context
logger.info("payment.processed", amount=49.99, currency="USD", customer_id="cust-42")

Output:

{
  "event": "payment.processed",
  "level": "info",
  "timestamp": "2026-03-17T10:22:01.123456Z",
  "trace_id": "abc-123",
  "service": "payment-service",
  "environment": "production",
  "amount": 49.99,
  "currency": "USD",
  "customer_id": "cust-42"
}

Node.js — pino

const pino = require("pino");

const logger = pino({
  level: "info",
  formatters: {
    level(label) {
      return { level: label };
    },
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

const reqLogger = logger.child({
  traceId: "abc-123",
  service: "order-service",
});

reqLogger.info({ orderId: "ord-99", itemCount: 3 }, "order.created");

Go — slog (standard library, Go 1.21+)

package main

import (
    "log/slog"
    "os"
)

func main() {
    handler := slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{Level: slog.LevelInfo})
    logger := slog.New(handler)

    logger.Info("payment.processed",
        slog.String("trace_id", "abc-123"),
        slog.String("service", "payment-service"),
        slog.Float64("amount", 49.99),
        slog.String("currency", "USD"),
    )
}

Middleware pattern — automatic request logging

# FastAPI middleware example
from starlette.middleware.base import BaseHTTPMiddleware
import structlog, time, uuid

class RequestLoggingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        trace_id = request.headers.get("x-trace-id", str(uuid.uuid4()))
        structlog.contextvars.clear_contextvars()
        structlog.contextvars.bind_contextvars(
            trace_id=trace_id,
            method=request.method,
            path=request.url.path,
        )
        start = time.perf_counter()
        response = await call_next(request)
        duration_ms = (time.perf_counter() - start) * 1000

        logger = structlog.get_logger()
        logger.info("http.request", status=response.status_code, duration_ms=round(duration_ms, 2))
        return response
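The middleware above binds the trace ID for logging; the same ID must also be forwarded on every downstream call so logs across services can be joined. A minimal sketch using only the standard library's contextvars — the header name `x-trace-id` and the helper names are illustrative, not from any particular framework:

```python
# Carrying a correlation ID to downstream calls with stdlib contextvars.
import contextvars
import uuid

# Request-scoped trace ID, set once at the ingress point (e.g. middleware).
trace_id_var: contextvars.ContextVar[str] = contextvars.ContextVar("trace_id")

def start_request(incoming_headers: dict) -> str:
    """Adopt the caller's trace ID, or mint a new one at the service boundary."""
    trace_id = incoming_headers.get("x-trace-id") or str(uuid.uuid4())
    trace_id_var.set(trace_id)
    return trace_id

def outgoing_headers() -> dict:
    """Headers to attach to every downstream HTTP call or queue message."""
    return {"x-trace-id": trace_id_var.get()}

# Ingress: upstream already supplied a trace ID, so we adopt it.
start_request({"x-trace-id": "abc-123"})
# Egress: the same ID is forwarded so downstream logs join on trace_id.
print(outgoing_headers())  # {'x-trace-id': 'abc-123'}
```

An HTTP client wrapper or queue producer can call `outgoing_headers()` on every request, so propagation never depends on individual call sites remembering to do it.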

Core Philosophy

Structured logging is the practice of treating log events as data, not prose. A free-form log line like "User 42 logged in from 10.0.0.1 at 14:02" is readable by humans but opaque to machines. A structured event with fields {"event": "user.login", "user_id": 42, "ip": "10.0.0.1"} is parseable, indexable, filterable, and aggregatable without fragile regex patterns. The shift from text to structured data is what makes log aggregation, automated alerting, and cross-service correlation possible at scale. Every team that starts with printf-style logging eventually migrates to structured logging; the only question is whether they do it proactively or under the pressure of an incident they cannot debug.
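The text-versus-data shift can be demonstrated with nothing beyond the standard library. A minimal sketch of a JSON formatter — the field names and the `fields` convention via `extra=` are illustrative choices, not a standard:

```python
# Minimal JSON log formatter, contrasting printf-style lines with
# structured events. Field names here are illustrative.
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        event = {
            "event": record.getMessage(),
            "level": record.levelname.lower(),
            "timestamp": self.formatTime(record),
        }
        # Structured fields passed via `extra=` land as attributes on the record.
        event.update(getattr(record, "fields", {}))
        return json.dumps(event)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Free-form: the data is trapped inside the string.
#   logger.info("User 42 logged in from 10.0.0.1")
# Structured: each field is independently queryable.
logger.info("user.login", extra={"fields": {"user_id": 42, "ip": "10.0.0.1"}})
```

Dedicated libraries like structlog add context binding and processor pipelines on top, but the core idea is exactly this: emit a dict, not a sentence.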

Context is what transforms a log event from a data point into a diagnostic tool. A log line that says "payment failed" is nearly useless during an incident. A log event that carries the trace ID, user ID, payment amount, error code, service name, and deployment version gives the on-call engineer everything needed to understand what happened, to whom, and in what context — without running a single additional query. The key insight is that context should be bound once at the request boundary (middleware, interceptor) and automatically attached to every subsequent log event within that request's lifecycle, not manually added to each log call.

Schema consistency across services is the unglamorous work that makes observability actually work. If one service logs traceId, another logs trace_id, and a third logs request_id, cross-service log correlation requires knowing the naming convention of every service — knowledge that does not exist during a 3 AM incident. A shared logging schema, published as a team standard and enforced through library wrappers or linting, ensures that {trace_id="abc-123"} returns results from every service in the request path, not just the ones that happened to use the right field name.
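One way to enforce such a schema through a library wrapper is to reject unknown field names before an event ever reaches the sink. A sketch, assuming an illustrative allowed-field set and wrapper API:

```python
# Sketch of schema enforcement via a thin logger wrapper; the allowed
# field names and the class API are illustrative, not a standard.
ALLOWED_FIELDS = {"trace_id", "service", "env", "duration_ms",
                  "user_id", "status", "error_code"}

class SchemaEnforcedLogger:
    """Rejects fields outside the team schema before they reach the sink."""

    def __init__(self, sink):
        self._sink = sink  # any callable accepting the final event dict

    def info(self, event: str, **fields):
        unknown = set(fields) - ALLOWED_FIELDS
        if unknown:
            raise ValueError(f"fields not in logging schema: {sorted(unknown)}")
        self._sink({"event": event, "level": "info", **fields})

events = []
logger = SchemaEnforcedLogger(events.append)
logger.info("payment.processed", trace_id="abc-123", duration_ms=12.4)
# logger.info("payment.processed", traceId="abc-123")  # raises ValueError
```

In practice a team might warn rather than raise in production, or run the same check as a CI lint over log call sites; the point is that `traceId` versus `trace_id` gets caught before the 3 AM incident, not during it.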

Anti-Patterns

  • String interpolation instead of structured fields. Writing logger.info(f"Order {order_id} created for user {user_id}") embeds data inside a string where it cannot be extracted without regex. Write logger.info("order.created", order_id=order_id, user_id=user_id) so each field is independently searchable and indexable.

  • Inconsistent field names across services. One service logs userId, another logs user_id, a third logs uid. This breaks cross-service queries and makes log aggregation dashboards unreliable. Publish a shared logging schema and enforce it through shared library configuration.

  • Logging inside hot loops. Emitting a log event for every iteration of a tight loop (every item in a batch, every row in a query result) generates massive volume that overwhelms the log pipeline and obscures meaningful events. Log at boundaries (start/end of the batch, summary of results), not inside loops.

  • Including sensitive data in logs. Logging full request bodies, authentication tokens, or personally identifiable information creates compliance violations and security exposure. Build a sanitization layer into your logging pipeline that redacts or hashes sensitive fields before they reach storage.

  • Logging as the only observability signal. Relying exclusively on logs for monitoring, alerting, and performance analysis means every question requires a full-text search through potentially terabytes of data. Use metrics for aggregates and trends, traces for latency analysis, and logs for detailed forensics. Each signal has a purpose; none is sufficient alone.
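For the sensitive-data case above, a sanitization layer can be a single processor in the logging pipeline. A sketch in the shape structlog processors take — `(logger, method_name, event_dict) -> event_dict` — with an illustrative key list:

```python
# Redaction processor in the structlog processor shape; the key list
# is illustrative and should come from your team's compliance rules.
SENSITIVE_KEYS = {"password", "token", "authorization", "card_number", "ssn"}

def redact_sensitive(logger, method_name, event_dict):
    """Mask sensitive values before any renderer or sink sees them."""
    for key in event_dict:
        if key.lower() in SENSITIVE_KEYS:
            event_dict[key] = "[REDACTED]"
    return event_dict

# Usage: place it in the processor chain ahead of the renderer, e.g.
#   structlog.configure(processors=[redact_sensitive, ..., JSONRenderer()])
event = redact_sensitive(None, "info",
                         {"event": "user.login", "token": "eyJhbGciOi..."})
print(event)  # {'event': 'user.login', 'token': '[REDACTED]'}
```

Because the processor runs inside the pipeline, redaction holds for every log call in the service rather than relying on each call site to remember it.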

Best Practices

  • Agree on a schema early. Publish a shared logging schema (field names, types, enums for levels) and enforce it with linting or library wrappers.
  • Use structured fields, not string interpolation. Write logger.info("user.login", user_id=uid) rather than logger.info(f"User {uid} logged in").
  • Always propagate trace/correlation IDs. Attach them at the ingress point and carry them through every downstream call.
  • Log at boundaries, not inside tight loops. Log at HTTP/gRPC entry and exit, queue consumer start and finish, and job completion — not inside hot loops.
  • Keep payloads bounded. Truncate or omit large request/response bodies; log a content length or hash instead.
  • Separate log transport from log generation. Write to stdout/stderr and let the platform (Docker, Kubernetes, sidecar) handle shipping.
  • Include units in numeric field names. Use duration_ms, size_bytes, retry_count so consumers know the unit without documentation.
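The "keep payloads bounded" practice can be sketched as a small helper that logs a size, a digest, and a short preview instead of the body itself; the field names and the 256-byte cap are illustrative choices:

```python
# Bounding a payload before logging: record size and a digest rather
# than the raw body. Names and the preview cap are illustrative.
import hashlib

MAX_PREVIEW_BYTES = 256

def bounded_body_fields(body: bytes) -> dict:
    """Loggable summary of a request/response body instead of the body itself."""
    return {
        "body_size_bytes": len(body),
        "body_sha256": hashlib.sha256(body).hexdigest(),
        "body_preview": body[:MAX_PREVIEW_BYTES].decode("utf-8", errors="replace"),
    }

fields = bounded_body_fields(b'{"order_id": "ord-99"}' * 1000)
print(fields["body_size_bytes"])  # 22000
```

The digest still lets an engineer confirm two services saw the same payload, and the unit-suffixed `body_size_bytes` follows the naming rule above.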

Common Pitfalls

  • Logging sensitive data. PII, tokens, passwords, and credit card numbers must be redacted or excluded. Build a sanitization processor into your logging pipeline.
  • Inconsistent field names across services. One service logs traceId, another logs trace_id, a third logs requestId. This breaks cross-service queries. Standardize and lint.
  • Over-logging at INFO level. Excessive volume increases storage costs and makes signal harder to find. Audit log volumes regularly and demote noisy events to DEBUG.
  • Treating logs as the only observability signal. Logs are rich but expensive to query at scale. Use metrics for aggregates and trends, traces for latency analysis, and logs for detailed forensics.
  • Stringifying complex objects. Logging str(request) produces unparseable blobs. Extract the specific fields you need.
