Datadog Logging
Datadog log management — agent setup, library integration, log pipelines, facets, monitors, and APM correlation
You are an expert in integrating Datadog for application logging and observability.
Core Philosophy
Datadog's power lies in correlation. Logs alone tell you what happened. Metrics tell you the magnitude. Traces tell you the call chain. Datadog connects all three so that when an alert fires, you can jump from a metric spike to the correlated traces to the exact log lines that explain the root cause -- all within a single platform. The single most impactful configuration is enabling logInjection in dd-trace, which automatically stamps every log line with trace_id and span_id. Without this, logs and traces are disconnected silos.
Index only what you need to search. Datadog charges by indexed log volume, and the most common cost mistake is indexing everything at debug level. Use exclusion filters to prevent noisy logs (health checks, debug output, access logs for static assets) from being indexed. These excluded logs still appear in Live Tail for real-time debugging -- they just do not count against your indexing budget. Generate log-based metrics for KPIs (error rates, latency distributions) so you can alert on patterns without querying raw logs.
Structured JSON logging is the foundation of a good Datadog integration. When your application outputs structured JSON, Datadog auto-parses attributes into searchable fields without custom pipelines. When your application outputs unstructured text, you need grok parsers, regex extractors, and ongoing pipeline maintenance. Invest in structured logging at the application level and save yourself from a growing pile of parsing rules.
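As a minimal illustration of the idea (plain Node.js, no libraries, names invented for the sketch), a structured log is just one JSON object per line, so every top-level key becomes a searchable attribute without any pipeline work:

```javascript
// Emit one JSON object per line ("JSON Lines"); Datadog auto-parses
// each top-level key into a searchable log attribute.
function logEvent(level, message, attrs = {}) {
  const entry = {
    timestamp: new Date().toISOString(),
    level,
    message,
    ...attrs,
  };
  console.log(JSON.stringify(entry));
  return entry; // returned so callers/tests can inspect the entry
}

logEvent('info', 'Order processed', { order_id: 'abc-123', amount: 49.99 });
```

Contrast with `console.log('Order abc-123 processed for 49.99')`, which would require a grok parser to recover `order_id` and `amount`.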
Anti-Patterns
- Not enabling `logInjection` in dd-trace -- Without automatic trace-log correlation, debugging requires manually matching timestamps between the Logs and APM views. This is the highest-value Datadog feature and costs nothing to enable.
- Indexing all debug-level logs in production -- Debug logs generate massive volume that balloons indexing costs. Use exclusion filters and only index warn-level and above in production indexes.
- Logging sensitive data without Sensitive Data Scanner rules -- PII, API tokens, and credentials in indexed logs create compliance violations. Configure Scanner rules to redact or hash sensitive patterns before indexing.
- Forgetting `logs_enabled: true` in the agent config -- The Datadog Agent does not collect logs by default. Deploying without this flag results in zero logs with no error message -- a silent failure.
- Setting up overlapping pipelines without correct ordering -- Multiple pipelines matching the same log source process in order, and the first match wins. Mis-ordered pipelines produce unpredictable parsing results.
Overview
Datadog is a SaaS monitoring and analytics platform that unifies logs, metrics, and traces under a single pane of glass. Its log management product ingests logs from agents, libraries, or direct API calls, then indexes, parses, and correlates them with APM traces and infrastructure metrics. Datadog excels at high-cardinality search, log pipelines for parsing unstructured data, and alerting on log patterns.
Setup & Configuration
Datadog Agent (Infrastructure-Level Collection)
Install the Datadog Agent on your host or as a sidecar in Kubernetes. The agent tails log files and forwards them to Datadog.
```yaml
# /etc/datadog-agent/datadog.yaml
api_key: <YOUR_API_KEY>
site: datadoghq.com
logs_enabled: true
```

```yaml
# /etc/datadog-agent/conf.d/myapp.d/conf.yaml
logs:
  - type: file
    path: /var/log/myapp/*.log
    service: myapp
    source: nodejs
    tags:
      - env:production
```
Node.js Direct Integration (dd-trace + winston)
```bash
npm install dd-trace winston
```

```javascript
// instrumentation.js — load before anything else
const tracer = require('dd-trace').init({
  logInjection: true, // auto-injects trace_id and span_id into logs
  service: 'myapp',
  env: 'production',
});

module.exports = tracer;
```

```javascript
// logger.js
const winston = require('winston');

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  defaultMeta: { service: 'myapp' },
  transports: [
    new winston.transports.Console(),
    // The Datadog Agent tails stdout/files; alternatively use
    // a transport that ships directly to the Datadog Logs API.
  ],
});

module.exports = logger;
```
Python Integration
```bash
pip install ddtrace
```

```python
import logging
from ddtrace import tracer, patch_all

patch_all()

FORMAT = '%(asctime)s %(levelname)s [dd.service=%(dd.service)s dd.trace_id=%(dd.trace_id)s dd.span_id=%(dd.span_id)s] %(message)s'
logging.basicConfig(format=FORMAT, level=logging.INFO)
logger = logging.getLogger(__name__)

logger.info("Order processed", extra={"order_id": "abc-123", "amount": 49.99})
```
Docker / Kubernetes
```yaml
# docker-compose.yml snippet
services:
  datadog-agent:
    image: gcr.io/datadoghq/agent:7
    environment:
      - DD_API_KEY=${DD_API_KEY}
      - DD_LOGS_ENABLED=true
      - DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
```
For Kubernetes, use the Datadog Operator or Helm chart:
```bash
helm install datadog-agent datadog/datadog \
  --set datadog.apiKey=$DD_API_KEY \
  --set datadog.logs.enabled=true \
  --set datadog.logs.containerCollectAll=true
```
Core Patterns
Log Pipelines and Processors
Datadog pipelines parse and enrich raw logs. Configure them in the Datadog UI under Logs > Configuration > Pipelines.
- Grok parser — extract structured fields from unstructured text.
- Date remapper — map a field to the official log timestamp.
- Status remapper — map a field to the log severity.
- Service remapper — override the service tag from a log attribute.
Example grok pattern for an Nginx access log:
```
%{ip:network.client.ip} - - \[%{date("dd/MMM/yyyy:HH:mm:ss Z"):date}\] "%{word:http.method} %{notSpace:http.url} HTTP/%{number:http.version}" %{integer:http.status_code} %{integer:network.bytes_written}
```
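For intuition about what that grok pattern extracts, here is a rough plain-JavaScript regex equivalent (illustration only; the real parsing runs server-side in the Datadog pipeline, and the sample log line is made up):

```javascript
// Approximate regex equivalent of the Nginx grok pattern above.
const NGINX_LINE =
  /^(?<client_ip>\S+) - - \[(?<date>[^\]]+)\] "(?<method>\S+) (?<url>\S+) HTTP\/(?<version>[\d.]+)" (?<status>\d+) (?<bytes>\d+)/;

function parseAccessLog(line) {
  const m = NGINX_LINE.exec(line);
  if (!m) return null; // unparsed lines keep only the raw message
  const g = m.groups;
  return {
    'network.client.ip': g.client_ip,
    'http.method': g.method,
    'http.url': g.url,
    'http.version': g.version,
    'http.status_code': Number(g.status),
    'network.bytes_written': Number(g.bytes),
  };
}

const sample =
  '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /health HTTP/1.1" 200 512';
console.log(parseAccessLog(sample));
```

Each named group corresponds to one grok matcher; the attribute names mirror Datadog's standard `http.*` and `network.*` attributes so facets work out of the box.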
Facets and Measures
Create facets on frequently searched attributes (e.g., @user_id, @order_id) to enable fast filtering. Create measures on numeric attributes (e.g., @response_time_ms) for log-based metrics and analytics.
Correlating Logs with APM Traces
When logInjection is enabled in dd-trace, every log line includes dd.trace_id and dd.span_id. In the Datadog UI, clicking a trace shows correlated logs, and clicking a log navigates to the parent trace.
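A sketch of what an injected winston log line looks like (the nested `dd` object is the shape dd-trace typically adds for winston; the ID values here are invented):

```javascript
// A winston JSON log line after dd-trace log injection.
const line =
  '{"level":"info","message":"Order processed","service":"myapp",' +
  '"dd":{"trace_id":"1234567890123456789","span_id":"987654321"},' +
  '"timestamp":"2024-10-10T13:55:36.000Z"}';

const entry = JSON.parse(line);
// Datadog uses dd.trace_id to link this log to its APM trace.
console.log(entry.dd.trace_id); // prints 1234567890123456789
```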
Log-Based Metrics
Generate custom metrics from log data without increasing indexing costs:
```
# In Datadog UI: Logs > Generate Metrics
count of logs where @http.status_code:5* grouped by service, env
```
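The query above counts 5xx logs per service and env. For intuition, a local JavaScript sketch of the same aggregation over in-memory log objects (sample data made up; Datadog computes this at ingestion time):

```javascript
// Group-and-count equivalent of the log-based metric query above.
function count5xxByGroup(logs) {
  const counts = {};
  for (const log of logs) {
    const status = log['http.status_code'];
    if (status >= 500 && status <= 599) {
      const key = `${log.service}/${log.env}`;
      counts[key] = (counts[key] || 0) + 1;
    }
  }
  return counts;
}

const logs = [
  { service: 'myapp', env: 'production', 'http.status_code': 500 },
  { service: 'myapp', env: 'production', 'http.status_code': 200 },
  { service: 'checkout', env: 'production', 'http.status_code': 503 },
];
console.log(count5xxByGroup(logs));
```

Because the metric is computed at ingestion, the underlying logs can still be excluded from indexes without losing the alertable signal.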
Exclusion Filters
Reduce indexing volume by filtering out noisy logs at the index level:
```
# Exclude health check logs from indexing (they still appear in Live Tail)
source:nginx @http.url:"/health" @http.status_code:200
```
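Conceptually, an exclusion filter is just a predicate over log attributes; a hypothetical JavaScript sketch of the filter above (Datadog evaluates this server-side at the index level):

```javascript
// Local sketch of the exclusion filter: drop successful nginx
// health-check logs from indexing.
function isExcluded(log) {
  return (
    log.source === 'nginx' &&
    log['http.url'] === '/health' &&
    log['http.status_code'] === 200
  );
}

console.log(isExcluded({ source: 'nginx', 'http.url': '/health', 'http.status_code': 200 })); // true
console.log(isExcluded({ source: 'nginx', 'http.url': '/api/orders', 'http.status_code': 200 })); // false
```

Note that a failing health check (non-200 status) is still indexed, which is exactly the behavior you want.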
Best Practices
- Always set `service`, `source`, and `env` tags on every log to enable seamless correlation across logs, metrics, and traces.
- Use structured JSON logging in your application so Datadog can auto-parse attributes without custom pipelines.
- Enable `logInjection` in dd-trace to get automatic trace-log correlation — this is the single highest-value Datadog logging feature.
- Use exclusion filters aggressively to control indexing costs; noisy health-check and debug logs should be excluded from indexes but remain visible in Live Tail.
- Create log-based metrics for KPIs (error rates, latency percentiles) so you can alert on them without querying raw logs.
Common Pitfalls
- Forgetting to set `logs_enabled: true` in the agent config — logs will silently not be collected.
- Logging sensitive data (PII, tokens) without configuring Sensitive Data Scanner rules, leading to compliance violations.
- Over-indexing: sending all debug-level logs to Datadog indexes causes cost to balloon; use exclusion filters and only index warn+ in production.
- Not using `logInjection` and then manually trying to correlate logs with traces by timestamp — this is fragile and error-prone.
- Setting up multiple pipelines that match the same log source without ordering them correctly, causing unpredictable parsing results.
Install this skill directly: `skilldb add logging-services-skills`
Related Skills
- Better Stack / Logtail — structured log ingestion, live tail, SQL-based querying, alerting, and uptime monitoring
- Fluentd — unified logging: input/output plugins, routing with tags, buffering, Kubernetes DaemonSet, and Fluent Bit
- Logstash / ELK Stack — Logstash pipelines, Elasticsearch indexing, Kibana dashboards, and Filebeat shippers
- Papertrail — cloud logging: syslog forwarding, live tail, search, alerts, and integration with app frameworks
- Pino Logger — fast JSON logger for Node.js: child loggers, serializers, transports (pino-pretty, pino-http), redaction, Next.js integration, and log levels
- Structured Logging Patterns — structured logging for TypeScript: correlation IDs, request context, log levels, error serialization, sensitive data redaction, and observability best practices