Datadog Logging
Datadog log management — agent setup, library integration, log pipelines, facets, monitors, and APM correlation
You are an expert in integrating Datadog for application logging and observability.
Core Philosophy
Datadog's power lies in correlation. Logs alone tell you what happened. Metrics tell you the magnitude. Traces tell you the call chain. Datadog connects all three so that when an alert fires, you can jump from a metric spike to the correlated traces to the exact log lines that explain the root cause -- all within a single platform. The single most impactful configuration is enabling logInjection in dd-trace, which automatically stamps every log line with trace_id and span_id. Without this, logs and traces are disconnected silos.
Index only what you need to search. Datadog charges by indexed log volume, and the most common cost mistake is indexing everything at debug level. Use exclusion filters to prevent noisy logs (health checks, debug output, access logs for static assets) from being indexed. These excluded logs still appear in Live Tail for real-time debugging -- they just do not count against your indexing budget. Generate log-based metrics for KPIs (error rates, latency distributions) so you can alert on patterns without querying raw logs.
Structured JSON logging is the foundation of a good Datadog integration. When your application outputs structured JSON, Datadog auto-parses attributes into searchable fields without custom pipelines. When your application outputs unstructured text, you need grok parsers, regex extractors, and ongoing pipeline maintenance. Invest in structured logging at the application level and save yourself from a growing pile of parsing rules.
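As a minimal illustration of the idea (plain Node.js, no libraries, names invented for the sketch), a structured log is just one JSON object per line, so every top-level key becomes a searchable attribute without any pipeline work:

```javascript
// Emit one JSON object per line ("JSON Lines"); Datadog auto-parses
// each top-level key into a searchable log attribute.
function logEvent(level, message, attrs = {}) {
  const entry = {
    timestamp: new Date().toISOString(),
    level,
    message,
    ...attrs,
  };
  console.log(JSON.stringify(entry));
  return entry; // returned so callers/tests can inspect the entry
}

logEvent('info', 'Order processed', { order_id: 'abc-123', amount: 49.99 });
```

Contrast with `console.log('Order abc-123 processed for 49.99')`, which would require a grok parser to recover `order_id` and `amount`.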
Anti-Patterns
- Not enabling `logInjection` in dd-trace -- Without automatic trace-log correlation, debugging requires manually matching timestamps between the Logs and APM views. This is the highest-value Datadog feature and costs nothing to enable.
- Indexing all debug-level logs in production -- Debug logs generate massive volume that balloons indexing costs. Use exclusion filters and only index warn-level and above in production indexes.
- Logging sensitive data without Sensitive Data Scanner rules -- PII, API tokens, and credentials in indexed logs create compliance violations. Configure Scanner rules to redact or hash sensitive patterns before indexing.
- Forgetting `logs_enabled: true` in the agent config -- The Datadog Agent does not collect logs by default. Deploying without this flag results in zero logs with no error message -- a silent failure.
- Setting up overlapping pipelines without correct ordering -- Multiple pipelines matching the same log source process in order, and the first match wins. Mis-ordered pipelines produce unpredictable parsing results.
Overview
Datadog is a SaaS monitoring and analytics platform that unifies logs, metrics, and traces under a single pane of glass. Its log management product ingests logs from agents, libraries, or direct API calls, then indexes, parses, and correlates them with APM traces and infrastructure metrics. Datadog excels at high-cardinality search, log pipelines for parsing unstructured data, and alerting on log patterns.
Setup & Configuration
Datadog Agent (Infrastructure-Level Collection)
Install the Datadog Agent on your host or as a sidecar in Kubernetes. The agent tails log files and forwards them to Datadog.
```yaml
# /etc/datadog-agent/datadog.yaml
api_key: <YOUR_API_KEY>
site: datadoghq.com
logs_enabled: true
```

```yaml
# /etc/datadog-agent/conf.d/myapp.d/conf.yaml
logs:
  - type: file
    path: /var/log/myapp/*.log
    service: myapp
    source: nodejs
    tags:
      - env:production
```
Node.js Direct Integration (dd-trace + winston)
```bash
npm install dd-trace winston
```

```javascript
// instrumentation.js — load before anything else
const tracer = require('dd-trace').init({
  logInjection: true, // auto-injects trace_id and span_id into logs
  service: 'myapp',
  env: 'production',
});

module.exports = tracer;
```

```javascript
// logger.js
const winston = require('winston');

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  defaultMeta: { service: 'myapp' },
  transports: [
    new winston.transports.Console(),
    // The Datadog Agent tails stdout/files; alternatively use
    // a transport that ships directly to the Datadog Logs API.
  ],
});

module.exports = logger;
```
Python Integration
```bash
pip install ddtrace
```

```python
import logging
from ddtrace import tracer, patch_all

patch_all()

FORMAT = '%(asctime)s %(levelname)s [dd.service=%(dd.service)s dd.trace_id=%(dd.trace_id)s dd.span_id=%(dd.span_id)s] %(message)s'
logging.basicConfig(format=FORMAT, level=logging.INFO)
logger = logging.getLogger(__name__)

logger.info("Order processed", extra={"order_id": "abc-123", "amount": 49.99})
```
Docker / Kubernetes
```yaml
# docker-compose.yml snippet
services:
  datadog-agent:
    image: gcr.io/datadoghq/agent:7
    environment:
      - DD_API_KEY=${DD_API_KEY}
      - DD_LOGS_ENABLED=true
      - DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
```
For Kubernetes, use the Datadog Operator or Helm chart:
```bash
helm install datadog-agent datadog/datadog \
  --set datadog.apiKey=$DD_API_KEY \
  --set datadog.logs.enabled=true \
  --set datadog.logs.containerCollectAll=true
```
Core Patterns
Log Pipelines and Processors
Datadog pipelines parse and enrich raw logs. Configure them in the Datadog UI under Logs > Configuration > Pipelines.
- Grok parser — extract structured fields from unstructured text.
- Date remapper — map a field to the official log timestamp.
- Status remapper — map a field to the log severity.
- Service remapper — override the service tag from a log attribute.
Example grok pattern for an Nginx access log:
```
%{ip:network.client.ip} - - \[%{date("dd/MMM/yyyy:HH:mm:ss Z"):date}\] "%{word:http.method} %{notSpace:http.url} HTTP/%{number:http.version}" %{integer:http.status_code} %{integer:network.bytes_written}
```
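For intuition about what that grok pattern extracts, here is a rough plain-JavaScript regex equivalent (illustration only; the real parsing runs server-side in the Datadog pipeline, and the sample log line is made up):

```javascript
// Approximate regex equivalent of the Nginx grok pattern above.
const NGINX_LINE =
  /^(?<client_ip>\S+) - - \[(?<date>[^\]]+)\] "(?<method>\S+) (?<url>\S+) HTTP\/(?<version>[\d.]+)" (?<status>\d+) (?<bytes>\d+)/;

function parseAccessLog(line) {
  const m = NGINX_LINE.exec(line);
  if (!m) return null; // unparsed lines keep only the raw message
  const g = m.groups;
  return {
    'network.client.ip': g.client_ip,
    'http.method': g.method,
    'http.url': g.url,
    'http.version': g.version,
    'http.status_code': Number(g.status),
    'network.bytes_written': Number(g.bytes),
  };
}

const sample =
  '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /health HTTP/1.1" 200 512';
console.log(parseAccessLog(sample));
```

Each named group corresponds to one grok matcher; the attribute names mirror Datadog's standard `http.*` and `network.*` attributes so facets work out of the box.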
Facets and Measures
Create facets on frequently searched attributes (e.g., @user_id, @order_id) to enable fast filtering. Create measures on numeric attributes (e.g., @response_time_ms) for log-based metrics and analytics.
Correlating Logs with APM Traces
When logInjection is enabled in dd-trace, every log line includes dd.trace_id and dd.span_id. In the Datadog UI, clicking a trace shows correlated logs, and clicking a log navigates to the parent trace.
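A sketch of what an injected winston log line looks like (the nested `dd` object is the shape dd-trace typically adds for winston; the ID values here are invented):

```javascript
// A winston JSON log line after dd-trace log injection.
const line =
  '{"level":"info","message":"Order processed","service":"myapp",' +
  '"dd":{"trace_id":"1234567890123456789","span_id":"987654321"},' +
  '"timestamp":"2024-10-10T13:55:36.000Z"}';

const entry = JSON.parse(line);
// Datadog uses dd.trace_id to link this log to its APM trace.
console.log(entry.dd.trace_id); // prints 1234567890123456789
```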
Log-Based Metrics
Generate custom metrics from log data without increasing indexing costs:
```
# In Datadog UI: Logs > Generate Metrics
count of logs where @http.status_code:5* grouped by service, env
```
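The query above counts 5xx logs per service and env. For intuition, a local JavaScript sketch of the same aggregation over in-memory log objects (sample data made up; Datadog computes this at ingestion time):

```javascript
// Group-and-count equivalent of the log-based metric query above.
function count5xxByGroup(logs) {
  const counts = {};
  for (const log of logs) {
    const status = log['http.status_code'];
    if (status >= 500 && status <= 599) {
      const key = `${log.service}/${log.env}`;
      counts[key] = (counts[key] || 0) + 1;
    }
  }
  return counts;
}

const logs = [
  { service: 'myapp', env: 'production', 'http.status_code': 500 },
  { service: 'myapp', env: 'production', 'http.status_code': 200 },
  { service: 'checkout', env: 'production', 'http.status_code': 503 },
];
console.log(count5xxByGroup(logs));
```

Because the metric is computed at ingestion, the underlying logs can still be excluded from indexes without losing the alertable signal.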
Exclusion Filters
Reduce indexing volume by filtering out noisy logs at the index level:
```
# Exclude health check logs from indexing (they still appear in Live Tail)
source:nginx @http.url:"/health" @http.status_code:200
```
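Conceptually, an exclusion filter is just a predicate over log attributes; a hypothetical JavaScript sketch of the filter above (Datadog evaluates this server-side at the index level):

```javascript
// Local sketch of the exclusion filter: drop successful nginx
// health-check logs from indexing.
function isExcluded(log) {
  return (
    log.source === 'nginx' &&
    log['http.url'] === '/health' &&
    log['http.status_code'] === 200
  );
}

console.log(isExcluded({ source: 'nginx', 'http.url': '/health', 'http.status_code': 200 })); // true
console.log(isExcluded({ source: 'nginx', 'http.url': '/api/orders', 'http.status_code': 200 })); // false
```

Note that a failing health check (non-200 status) is still indexed, which is exactly the behavior you want.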
Best Practices
- Always set `service`, `source`, and `env` tags on every log to enable seamless correlation across logs, metrics, and traces.
- Use structured JSON logging in your application so Datadog can auto-parse attributes without custom pipelines.
- Enable `logInjection` in dd-trace to get automatic trace-log correlation — this is the single highest-value Datadog logging feature.
- Use exclusion filters aggressively to control indexing costs; noisy health-check and debug logs should be excluded from indexes but remain visible in Live Tail.
- Create log-based metrics for KPIs (error rates, latency percentiles) so you can alert on them without querying raw logs.
Common Pitfalls
- Forgetting to set `logs_enabled: true` in the agent config — logs will silently not be collected.
- Logging sensitive data (PII, tokens) without configuring Sensitive Data Scanner rules, leading to compliance violations.
- Over-indexing: sending all debug-level logs to Datadog indexes causes cost to balloon; use exclusion filters and only index warn+ in production.
- Not using `logInjection` and then manually trying to correlate logs with traces by timestamp — this is fragile and error-prone.
- Setting up multiple pipelines that match the same log source without ordering them correctly, causing unpredictable parsing results.
Install this skill directly: `skilldb add logging-services-skills`
Related Skills
- Better Stack / Logtail — structured log ingestion, live tail, SQL-based querying, alerting, and uptime monitoring
- Fluentd — unified logging: input/output plugins, routing with tags, buffering, Kubernetes DaemonSet, and Fluent Bit
- Logstash / ELK Stack — Logstash pipelines, Elasticsearch indexing, Kibana dashboards, and Filebeat shippers
- Papertrail — cloud logging: syslog forwarding, live tail, search, alerts, and integration with app frameworks
- Pino Logger — fast JSON logger for Node.js: child loggers, serializers, transports (pino-pretty, pino-http), redaction, Next.js integration, and log levels
- Structured Logging Patterns — structured logging for TypeScript: correlation IDs, request context, log levels, error serialization, sensitive data redaction, and observability best practices