Honeycomb

Integrate Honeycomb for event-driven observability with high-cardinality tracing.

Honeycomb Integration

You are an expert in Honeycomb observability. You help developers instrument applications with structured events, explore high-cardinality data, use BubbleUp for anomaly detection, define SLOs, and configure triggers for alerting.

Core Philosophy

Events Over Metrics

Honeycomb stores wide structured events, not pre-aggregated metrics. Every request becomes a rich event with dozens of fields, enabling ad-hoc exploration without knowing questions in advance.
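
As a rough illustration of what "wide" means here, the sketch below collapses nested request context into one flat record with dotted field names. The flattenEvent helper and the sample values are hypothetical and are not part of any Honeycomb SDK.

```typescript
type Scalar = string | number | boolean;
interface Nested {
  [key: string]: Scalar | Nested;
}

// Hypothetical helper: collapse nested context into one flat, wide event
// whose keys are dotted paths such as "user.id".
function flattenEvent(input: Nested, prefix = ''): Record<string, Scalar> {
  const out: Record<string, Scalar> = {};
  for (const key of Object.keys(input)) {
    const value = input[key];
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === 'object') {
      // Recurse into nested context and merge its flattened fields
      Object.assign(out, flattenEvent(value, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Sample values are invented for illustration
const wideEvent = flattenEvent({
  name: 'POST /checkout',
  duration_ms: 412,
  user: { id: 'u_18423', plan: 'pro' },
  cart: { item_count: 3, total_cents: 5997 },
  build: { sha: '9f2c1ab' },
});
```

Every key in the resulting record becomes a field you can filter, group, or heatmap on at query time.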

High-Cardinality is Welcome

Unlike metrics systems, Honeycomb handles high-cardinality fields (user IDs, request IDs, build SHAs) natively. Query any field without cardinality explosions.

Explore, Then Alert

Start with BubbleUp to discover what changed, then codify findings into SLOs and triggers. Observability is about asking new questions, not just monitoring known failures.
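
The idea behind that first step can be sketched in a few lines: take the events you selected (say, the slow ones), treat the rest as a baseline, and look for field values that are over-represented in the selection. This is only a conceptual illustration with hypothetical names (valueShares, compareField, the sample events); it is not Honeycomb's actual BubbleUp implementation.

```typescript
type Evt = Record<string, string | number | boolean>;

interface ValueDiff {
  value: string;
  selection: number; // share of selected events with this value
  baseline: number;  // share of baseline events with this value
  diff: number;      // selection minus baseline
}

// Fraction of events that carry each value of `field`.
function valueShares(events: Evt[], field: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of events) {
    if (!(field in e)) continue;
    const key = String(e[field]);
    counts[key] = (counts[key] ?? 0) + 1;
  }
  const shares: Record<string, number> = {};
  for (const key of Object.keys(counts)) {
    shares[key] = counts[key] / events.length;
  }
  return shares;
}

// Rank values of one field by how over-represented they are in the selection.
function compareField(selection: Evt[], baseline: Evt[], field: string): ValueDiff[] {
  const sel = valueShares(selection, field);
  const base = valueShares(baseline, field);
  const values = Array.from(new Set([...Object.keys(sel), ...Object.keys(base)]));
  return values
    .map((value) => {
      const s = sel[value] ?? 0;
      const b = base[value] ?? 0;
      return { value, selection: s, baseline: b, diff: s - b };
    })
    .sort((a, b) => b.diff - a.diff);
}

// Invented sample: three of four slow requests come from one build
const slow: Evt[] = [
  { 'build.sha': '9f2c1ab' }, { 'build.sha': '9f2c1ab' },
  { 'build.sha': '9f2c1ab' }, { 'build.sha': '41d07e3' },
];
const fast: Evt[] = [
  { 'build.sha': '41d07e3' }, { 'build.sha': '41d07e3' },
  { 'build.sha': '41d07e3' }, { 'build.sha': '9f2c1ab' },
];
const ranked = compareField(slow, fast, 'build.sha');
```

Running the same comparison across every field is what makes high-cardinality attributes such as build SHAs and user IDs worth sending.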

Setup

Install the Honeycomb OpenTelemetry SDK:

npm install @honeycombio/opentelemetry-node @opentelemetry/auto-instrumentations-node

Initialize early in your application:

// tracing.ts - import BEFORE any other module
import { HoneycombSDK } from '@honeycombio/opentelemetry-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

const sdk = new HoneycombSDK({
  apiKey: process.env.HONEYCOMB_API_KEY,
  serviceName: process.env.OTEL_SERVICE_NAME || 'my-service',
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
process.on('SIGTERM', () => sdk.shutdown());

Or use the Honeycomb HTTP API directly for custom events:

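const HONEYCOMB_API_KEY = process.env.HONEYCOMB_API_KEY!;
const DATASET = 'my-service';

// Send a single event. The request body is the flat set of event fields;
// the timestamp travels in the X-Honeycomb-Event-Time header, not the body.
async function sendEvent(data: Record<string, unknown>) {
  const res = await fetch(`https://api.honeycomb.io/1/events/${DATASET}`, {
    method: 'POST',
    headers: {
      'X-Honeycomb-Team': HONEYCOMB_API_KEY,
      'X-Honeycomb-Event-Time': new Date().toISOString(),
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(data),
  });
  if (!res.ok) {
    throw new Error(`Honeycomb rejected the event: HTTP ${res.status}`);
  }
}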

Key Patterns

Do: Send wide events with business and technical context

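import { trace } from '@opentelemetry/api';

// Illustrative request shape; adapt it to your framework's request type.
interface CheckoutRequest {
  userId: string;
  userPlan: string;
  cart: { items: unknown[]; totalCents: number };
}

// isEnabled and processCheckout stand in for your own feature-flag and payment code.
async function handleCheckout(req: CheckoutRequest) {
  // Enrich the span that auto-instrumentation opened for this request
  const span = trace.getActiveSpan();
  if (span) {
    span.setAttribute('user.id', req.userId);
    span.setAttribute('user.plan', req.userPlan);
    span.setAttribute('cart.item_count', req.cart.items.length);
    span.setAttribute('cart.total_cents', req.cart.totalCents);
    span.setAttribute('feature_flags.new_checkout', isEnabled('new_checkout'));
    span.setAttribute('build.sha', process.env.GIT_SHA || 'unknown');
  }

  const result = await processCheckout(req.cart);

  // Record outcome fields once the work has finished
  if (span) {
    span.setAttribute('checkout.payment_method', result.paymentMethod);
    span.setAttribute('checkout.duration_ms', result.durationMs);
  }
  return result;
}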

Not: Sending narrow events with only a message string

// WRONG - loses all queryability and BubbleUp power
sendEvent({ message: 'checkout completed' });
// No fields to group by, filter, or heatmap

Do: Add derived columns for computed fields in the Honeycomb UI

In the Honeycomb UI, create derived columns for reusable calculations:

// Derived column: is_slow
IF(GTE($duration_ms, 1000), true, false)

// Derived column: error_category
IF(STARTS_WITH($error.message, "timeout"), "timeout",
  IF(STARTS_WITH($error.message, "connection"), "connection",
    "other"))
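
To sanity-check derived-column logic before saving it, it can help to mirror the expression in application code. The sketch below restates the two expressions above as TypeScript functions; isSlow and errorCategory are hypothetical names used only for illustration, and the derived columns themselves still live in Honeycomb.

```typescript
// Mirrors: IF(GTE($duration_ms, 1000), true, false)
function isSlow(durationMs: number): boolean {
  return durationMs >= 1000;
}

type ErrorCategory = 'timeout' | 'connection' | 'other';

// Mirrors the nested IF(STARTS_WITH(...)) expression for error_category.
// Matching is prefix-based and case-sensitive, as in the expression above.
function errorCategory(errorMessage: string): ErrorCategory {
  if (errorMessage.startsWith('timeout')) return 'timeout';
  if (errorMessage.startsWith('connection')) return 'connection';
  return 'other';
}
```

Note that a message such as "upstream timeout" falls into "other" because the match is on the prefix only; if that is not what you want, a contains-style check may be a better fit.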

Common Patterns

Honeycomb Query via API

This request saves a query specification and returns its ID. Running the query and fetching results is a separate step through the query results endpoint (/1/query_results/my-service), which takes that ID.

curl -X POST "https://api.honeycomb.io/1/queries/my-service" \
  -H "X-Honeycomb-Team: $HONEYCOMB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "calculations": [
      { "op": "COUNT" },
      { "op": "P99", "column": "duration_ms" },
      { "op": "AVG", "column": "duration_ms" }
    ],
    "filters": [
      { "column": "http.status_code", "op": ">=", "value": 500 }
    ],
    "breakdowns": ["http.route", "service.name"],
    "time_range": 3600,
    "granularity": 60
  }'

SLO Definition

An SLO references its SLI by the alias of a derived column. Create that derived column first; it should return true for good events, false for bad ones, and null for events the SLO should ignore:

// Derived column: checkout_fast_enough
IF(EQUALS($name, "POST /checkout"), LT($duration_ms, 2000))

Then create the SLO against that alias:

curl -X POST "https://api.honeycomb.io/1/slos/my-service" \
  -H "X-Honeycomb-Team: $HONEYCOMB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Checkout Latency SLO",
    "description": "99% of checkout requests complete in under 2 seconds",
    "sli": { "alias": "checkout_fast_enough" },
    "target_per_million": 990000,
    "time_period_days": 30
  }'
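
The target_per_million value translates directly into an error budget. The following is a minimal sketch of that arithmetic, assuming a hypothetical errorBudget helper and invented traffic figures; Honeycomb computes budget and burn rate for you, so this is only to make the numbers concrete.

```typescript
interface BudgetStatus {
  allowedFailures: number; // failing events the SLO tolerates in the window
  remaining: number;       // failures still available before the budget is spent
  burnedFraction: number;  // share of the budget already consumed
}

// Hypothetical helper: derive the error budget from an SLO target.
function errorBudget(
  targetPerMillion: number,
  eligibleEvents: number,
  failedEvents: number,
): BudgetStatus {
  const allowedFailures =
    (eligibleEvents * (1_000_000 - targetPerMillion)) / 1_000_000;
  const burnedFraction =
    allowedFailures === 0
      ? (failedEvents > 0 ? Infinity : 0)
      : failedEvents / allowedFailures;
  return {
    allowedFailures,
    remaining: allowedFailures - failedEvents,
    burnedFraction,
  };
}

// 99% target, 2,000,000 checkout requests in the window, 5,000 slower than 2s
const budget = errorBudget(990_000, 2_000_000, 5_000);
```

With these assumed figures the SLO tolerates 20,000 slow checkouts over the 30-day window, and 5,000 failures consume a quarter of that budget.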

Trigger (Alert) Configuration

curl -X POST "https://api.honeycomb.io/1/triggers/my-service" \
  -H "X-Honeycomb-Team: $HONEYCOMB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High Error Rate",
    "query": {
      "calculations": [{ "op": "COUNT" }],
      "filters": [
        { "column": "error", "op": "exists" }
      ],
      "time_range": 600
    },
    "frequency": 300,
    "threshold": {
      "op": ">",
      "value": 100
    },
    "recipients": [
      { "type": "slack", "target": "#oncall-alerts" }
    ]
  }'

Batch Event Ingestion

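// Reuses HONEYCOMB_API_KEY and DATASET from the single-event example.
async function sendBatch(events: Record<string, unknown>[]) {
  const res = await fetch(`https://api.honeycomb.io/1/batch/${DATASET}`, {
    method: 'POST',
    headers: {
      'X-Honeycomb-Team': HONEYCOMB_API_KEY,
      'Content-Type': 'application/json',
    },
    // Each batch entry wraps its fields under "data" and carries its own "time".
    // The time here is stamped at send; pass the real event time if you buffer events.
    body: JSON.stringify(events.map((data) => ({
      time: new Date().toISOString(),
      data,
    }))),
  });
  if (!res.ok) {
    throw new Error(`Honeycomb rejected the batch: HTTP ${res.status}`);
  }
}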
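
When a buffer grows large, it is usually safer to split it into smaller requests than to post everything at once. The sketch below shows one way to do that; the chunk helper and the batch size of 500 in the usage note are assumptions for illustration, so check Honeycomb's current request size limits before picking a number.

```typescript
// Hypothetical helper: split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

A caller would then loop over chunk(events, 500) and await sendBatch on each batch in turn.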

Anti-Patterns

  • Pre-aggregating before sending - Honeycomb works best with raw events; do not average or count before ingestion. Let Honeycomb aggregate at query time.
  • Ignoring BubbleUp - When investigating slowness, use BubbleUp to automatically compare slow vs. fast requests instead of manually guessing dimensions.
  • Not setting SLOs - Without SLOs, you lack burn rate alerts and budget tracking. Define SLOs for every critical user journey.
  • Low-cardinality-only attributes - Honeycomb is designed for high-cardinality data. Include user IDs, request IDs, and build SHAs to unlock powerful debugging.

When to Use

  • You need to debug complex distributed systems with high-cardinality trace data.
  • You want BubbleUp-style anomaly detection that automatically surfaces what changed.
  • You are adopting SLO-based reliability practices with error budget tracking.
  • You prefer an OpenTelemetry-native backend with first-class OTel SDK support.
  • You need to answer novel questions about production without pre-defining dashboards.
