
Fluentd

Fluentd unified logging — input/output plugins, routing with tags, buffering, Kubernetes DaemonSet, and Fluent Bit

Quick Summary
You are an expert in integrating Fluentd for unified log collection, transformation, and routing.

## Key Points

- **Use file-based buffers in production (`@type file`)** -- memory buffers lose all buffered logs when Fluentd restarts or crashes.
- **Always set `pos_file` on tail inputs** -- without a position file, Fluentd re-reads entire log files from the beginning on restart, causing massive duplicate ingestion.
- **Avoid a catch-all `<match **>` early in the config** -- Fluentd uses first-match routing, so a broad match prevents all subsequent match blocks from ever executing.
- **Set `retry_forever true` on critical outputs** -- without it, Fluentd permanently drops log chunks after exhausting its default retry limit.
- Deploy Fluent Bit on application hosts as a lightweight forwarder and Fluentd as a centralized aggregator -- running the full Ruby runtime at the edge wastes roughly 10x the memory.
- Use the `copy` output plugin to send logs to multiple destinations (e.g., Elasticsearch for search and S3 for archival) from a single pipeline.
- Apply `record_transformer` filters early in the pipeline to enrich logs with environment, cluster, and service metadata.

## Quick Example

```bash
# Node.js client library
npm install fluent-logger
```

```bash
# Python client library
pip install fluent-logger
```

Fluentd — Logging Integration

You are an expert in integrating Fluentd for unified log collection, transformation, and routing.

Core Philosophy

Fluentd is infrastructure plumbing, not application logic. Its job is to collect logs from diverse sources, normalize them into structured JSON, and route them reliably to one or more destinations. Applications should not know or care where their logs end up -- they write to stdout or a local file, and Fluentd handles everything downstream. This separation means you can change your log storage backend (swap Elasticsearch for Datadog, add S3 archival) without modifying a single line of application code.

Reliability is the non-negotiable requirement for a log pipeline. Logs that are lost during a downstream outage are exactly the logs you needed most. File-based buffers, retry policies with exponential backoff, and retry_forever on critical outputs ensure that Fluentd absorbs temporary failures and delivers every log eventually. Memory buffers are faster but lose all buffered data on process restart -- never use them in production.

Deploy the right tool at each layer. Fluent Bit is a lightweight forwarder (roughly 10 MB memory footprint) designed to run on application hosts, edge devices, and sidecar containers. Fluentd is a full-featured aggregator (a heavyweight Ruby process, typically 100+ MB) designed to run as a centralized service that receives logs from many Fluent Bit instances, applies complex transformations, and routes to multiple backends. Running full Fluentd on every application pod wastes resources; running only Fluent Bit as your aggregator limits your transformation capabilities.

Anti-Patterns

  • Using memory buffers in production -- Memory buffers lose all buffered logs when Fluentd restarts or crashes. Always use @type file buffers for production outputs.
  • Running full Fluentd as a sidecar on every pod -- Fluentd's Ruby runtime consumes significantly more memory than Fluent Bit. Use Fluent Bit on application hosts and Fluentd as a centralized aggregator.
  • Catching all tags with <match **> early in the config -- Fluentd uses first-match routing. A broad match early in the config prevents all subsequent match blocks from ever executing.
  • Omitting pos_file on tail inputs -- Without a position file, Fluentd re-reads entire log files from the beginning on restart, causing massive duplicate log ingestion.
  • Not setting retry_forever true on critical outputs -- Without it, Fluentd permanently drops log chunks after exhausting the default retry limit, silently losing data during extended downstream outages.

Overview

Fluentd is a CNCF-graduated open-source data collector that unifies log collection and routing. It decouples log producers from consumers: applications write logs in any format, and Fluentd normalizes them into structured JSON, then routes them to one or more destinations (Elasticsearch, S3, Datadog, BigQuery, Kafka, and 600+ plugins). Fluentd is the standard log collector in Kubernetes environments. Fluent Bit is its lightweight sibling, designed for edge and resource-constrained environments.

Setup & Configuration

Installation

# Ubuntu/Debian (fluent-package is the LTS distribution of Fluentd, successor to td-agent)
curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-noble-fluent-package5-lts.sh | sh

# macOS
brew install fluentd

# Docker
docker run -v /path/to/fluent.conf:/fluentd/etc/fluent.conf fluent/fluentd:v1.16

Basic Configuration

# /etc/fluent/fluent.conf

# Collect from application log files
<source>
  @type tail
  path /var/log/myapp/*.log
  pos_file /var/log/fluentd/myapp.log.pos
  tag myapp.logs
  <parse>
    @type json
    time_key timestamp
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>

# Accept logs over HTTP
<source>
  @type http
  port 9880
  bind 0.0.0.0
</source>

# Accept forward protocol (from Fluent Bit or other Fluentd instances)
<source>
  @type forward
  port 24224
</source>

# Route to Elasticsearch
<match myapp.**>
  @type elasticsearch
  host elasticsearch
  port 9200
  index_name myapp-logs
  logstash_format true
  logstash_prefix myapp

  <buffer>
    @type file
    path /var/log/fluentd/buffer/elasticsearch
    flush_interval 5s
    chunk_limit_size 8m
    retry_max_interval 30s
    retry_forever true
  </buffer>
</match>
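With the sources above running, the pipeline can be smoke-tested by posting an event to the HTTP input; the `myapp.test` tag here is arbitrary (for `in_http`, the URL path segment becomes the tag):

```shell
# Post a structured event to the in_http source on port 9880;
# the URL path segment ("myapp.test") becomes the Fluentd tag
curl -X POST -d 'json={"level":"info","message":"hello fluentd"}' \
  http://localhost:9880/myapp.test
```

If the event reaches the `myapp.**` match block, it should appear in Elasticsearch after the next buffer flush.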

Kubernetes DaemonSet

Fluentd runs as a DaemonSet to collect logs from every node:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          effect: NoSchedule
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1.16-debian-elasticsearch8-1
          env:
            - name: FLUENT_ELASTICSEARCH_HOST
              value: "elasticsearch.kube-logging.svc.cluster.local"
            - name: FLUENT_ELASTICSEARCH_PORT
              value: "9200"
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: containers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: containers
          hostPath:
            path: /var/lib/docker/containers

Sending Logs from Node.js

npm install fluent-logger
const fluent = require('fluent-logger');

fluent.configure('myapp', {
  host: 'localhost',
  port: 24224,
  timeout: 3.0,
  reconnectInterval: 600000,
});

fluent.emit('request', {
  method: 'POST',
  path: '/api/orders',
  status: 201,
  durationMs: 87,
  userId: 'usr-456',
});

Sending Logs from Python

pip install fluent-logger
from fluent import sender, event

sender.setup('myapp', host='localhost', port=24224)

event.Event('request', {
    'method': 'POST',
    'path': '/api/orders',
    'status': 201,
    'duration_ms': 87,
    'user_id': 'usr-456',
})

Core Patterns

Tag-Based Routing

Fluentd routes logs using dot-separated tags and glob-style matching:

# Route web logs to Elasticsearch
<match web.**>
  @type elasticsearch
  ...
</match>

# Route audit logs to S3 for compliance
<match audit.**>
  @type s3
  s3_bucket my-audit-logs
  s3_region us-east-1
  path audit/
  <buffer time>
    @type file
    path /var/log/fluentd/buffer/s3
    timekey 3600
    timekey_wait 10m
  </buffer>
</match>

# Copy error logs to both Elasticsearch and Slack
<match error.**>
  @type copy
  <store>
    @type elasticsearch
    ...
  </store>
  <store>
    @type slack
    webhook_url https://hooks.slack.com/services/...
    channel #alerts
  </store>
</match>
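The first-match semantics behind these blocks can be illustrated with a small Python sketch. The helper names are hypothetical, and the `**` handling only approximates Fluentd's real matcher (e.g., real Fluentd also lets `a.**` match the bare tag `a`):

```python
import re

def tag_matches(pattern: str, tag: str) -> bool:
    """Approximate Fluentd tag matching: '*' matches one dot-separated
    part, '**' matches one or more parts (illustrative sketch only)."""
    regex = re.escape(pattern)
    regex = regex.replace(r'\*\*', r'(?:[^.]+(?:\.[^.]+)*)?')
    regex = regex.replace(r'\*', r'[^.]+')
    return re.fullmatch(regex, tag) is not None

def route(tag, match_blocks):
    """First-match routing: the first matching <match> wins."""
    for pattern, destination in match_blocks:
        if tag_matches(pattern, tag):
            return destination
    return None  # unmatched events are dropped with a warning

# A broad '**' placed first shadows everything after it
bad_order = [('**', 'catch_all'), ('audit.**', 's3')]
print(route('audit.login', bad_order))   # catch_all -- s3 never fires

good_order = [('audit.**', 's3'), ('**', 'catch_all')]
print(route('audit.login', good_order))  # s3
```

This is exactly why a catch-all `<match **>` belongs at the very end of the config, never near the top.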

Filtering and Transformation

# Add Kubernetes metadata
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

# Parse nested JSON in the log field
<filter myapp.**>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>

# Remove sensitive fields
<filter **>
  @type record_transformer
  remove_keys password, ssn, credit_card
</filter>

# Add static fields
<filter myapp.**>
  @type record_transformer
  <record>
    environment "#{ENV['ENVIRONMENT']}"
    cluster "#{ENV['CLUSTER_NAME']}"
  </record>
</filter>
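What the parser filter with `reserve_data true` does to a record can be sketched in a few lines of Python (the function name is hypothetical; this approximates the plugin's behavior, ignoring options like `remove_key_name_field`):

```python
import json

def parse_nested_log(record, key_name='log', reserve_data=True):
    """Sketch of the parser filter: parse the JSON string under
    `key_name` and merge its fields into the record. With
    reserve_data=True the original fields are kept alongside."""
    try:
        parsed = json.loads(record[key_name])
    except (KeyError, ValueError):
        return record  # unparseable records pass through unchanged
    if not reserve_data:
        return parsed
    merged = dict(record)
    merged.update(parsed)
    return merged

rec = {'log': '{"level":"error","msg":"boom"}', 'stream': 'stdout'}
print(parse_nested_log(rec))
# -> {'log': '{"level":"error","msg":"boom"}', 'stream': 'stdout',
#     'level': 'error', 'msg': 'boom'}
```

This is the typical shape of container logs in Kubernetes: the runtime wraps each line in a JSON envelope, and the application's own JSON payload sits as a string inside the `log` field.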

Fluent Bit as a Lightweight Forwarder

Use Fluent Bit on application hosts to forward to a central Fluentd aggregator:

# fluent-bit.conf
[INPUT]
    Name              tail
    Path              /var/log/myapp/*.log
    Parser            json
    Tag               myapp.*
    Refresh_Interval  5

[OUTPUT]
    Name          forward
    Match         *
    Host          fluentd-aggregator
    Port          24224

Buffering and Reliability

Fluentd's buffer system ensures no log loss during downstream outages:

<buffer>
  @type file                    # persist buffer to disk (survives restarts)
  path /var/log/fluentd/buffer
  flush_interval 5s             # flush every 5 seconds
  chunk_limit_size 8m           # max chunk size
  total_limit_size 2g           # max total buffer size
  retry_max_interval 60s        # backoff cap on retries
  retry_forever true            # never drop chunks
  overflow_action block         # block input when buffer is full
</buffer>
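A quick back-of-the-envelope check shows how long a buffer of this size can absorb a downstream outage; the 5 MB/s ingest rate below is an assumed figure, not from the config:

```python
# How long can a 2 GB file buffer absorb a downstream outage?
total_limit_bytes = 2 * 1024**3        # total_limit_size 2g
ingest_rate_bytes_per_s = 5 * 1024**2  # assumed 5 MB/s of log traffic

seconds = total_limit_bytes / ingest_rate_bytes_per_s
print(f"buffer absorbs ~{seconds / 60:.0f} minutes of outage")  # ~7 minutes
```

If your expected outage windows are longer than this, raise `total_limit_size` (and provision the disk accordingly) or accept that `overflow_action block` will back-pressure your inputs once the buffer fills.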

Best Practices

  • Use file-based buffers in production (@type file) — memory buffers lose data on process restart or crash.
  • Deploy Fluent Bit on application hosts as a lightweight forwarder, and Fluentd as a centralized aggregator — this minimizes resource usage at the edge.
  • Use the copy output plugin to send logs to multiple destinations (e.g., Elasticsearch for search and S3 for archival) from a single pipeline.
  • Always set pos_file for tail inputs so Fluentd remembers where it left off after a restart and does not re-send or skip logs.
  • Apply record_transformer filters to enrich logs with environment, cluster, and service metadata early in the pipeline.

Common Pitfalls

  • Using memory buffers in production and losing all buffered logs when Fluentd restarts — always use file buffers.
  • Not setting retry_forever true for critical outputs, causing Fluentd to permanently drop log chunks after exhausting retries.
  • Running Fluentd (full Ruby runtime) instead of Fluent Bit on resource-constrained edge nodes or sidecar containers, wasting 10x the memory.
  • Forgetting to set pos_file on tail inputs — without it, Fluentd re-reads entire log files from the beginning on restart, causing duplicate logs.
  • Over-matching with <match **> early in the config, which catches all tags and prevents subsequent match blocks from ever executing (Fluentd uses first-match routing).

Install this skill directly: skilldb add logging-services-skills
