
Honeycomb

Honeycomb is an observability platform designed for complex, distributed systems. It allows you to analyze high-cardinality, wide event data to understand system behavior, debug issues, and explore unknown unknowns in production with unparalleled depth and speed.

## Quick Summary
You are an expert in modern observability practices, highly proficient in leveraging Honeycomb to instrument, query, and interpret the behavior of distributed services. You advocate for "debug with data" and empower teams to rapidly diagnose and resolve issues by interactively exploring rich, high-cardinality events flowing from production systems.

## Quick Example

```bash
npm install @honeycombio/opentelemetry-node @opentelemetry/auto-instrumentations-node @opentelemetry/api
```

```env
HONEYCOMB_API_KEY=YOUR_HONEYCOMB_API_KEY
HONEYCOMB_SERVICE_NAME=my-cool-webapp
```

## Core Philosophy

Honeycomb's core philosophy centers on observability through rich, structured events, moving beyond traditional metrics, logs, and traces as separate pillars. Instead, it encourages sending "wide events"—single, comprehensive data points that capture all relevant context for a specific operation or request, including attributes that might have millions of unique values (high cardinality). This approach is particularly powerful for modern, distributed architectures where issues often span multiple services and components, and traditional monitoring falls short in providing the necessary context.

The platform excels at helping you ask arbitrary questions of your data after it's been collected, rather than requiring you to define dashboards and alerts beforehand. This capability is crucial for debugging "unknown unknowns"—problems you didn't anticipate and therefore couldn't pre-configure metrics or logs for. By embracing wide events and interactive query capabilities, Honeycomb transforms debugging from a frantic log-grepping and dashboard-hopping exercise into a guided, data-driven exploration that leads to faster root cause analysis and a deeper understanding of your system's actual behavior in production. Choose Honeycomb when you need to understand why things are happening, not just what is happening.
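As an illustration, a single wide event for one HTTP request might look like the following. The field names are hypothetical; the point is that request, infrastructure, and business context all live on one record, including high-cardinality fields:

```typescript
// A hypothetical "wide event" for one HTTP request: one record carrying
// timing, HTTP, infrastructure, and business context together.
const wideEvent = {
  'name': 'HTTP POST /api/orders',
  'duration_ms': 187.4,
  'http.method': 'POST',
  'http.status_code': 201,
  'service.name': 'checkout',
  'host.name': 'web-7f9c',
  'app.user.id': 'u_91834',            // high cardinality: millions of users
  'app.order.id': 'order_1712',        // high cardinality: unique per order
  'app.feature_flag.new_checkout': true,
  'db.query_count': 4,
};

// Because every field lives on the same event, you can later group or filter
// by any combination (e.g. p99 duration where new_checkout = true) without
// having pre-defined a metric for it.
```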

## Setup

To integrate Honeycomb into your web application, the recommended path is OpenTelemetry, using Honeycomb's OpenTelemetry distribution for Node.js. (Honeycomb's older Beeline SDKs still exist, but OpenTelemetry is the current standard.) The distribution automatically instruments common frameworks and handles exporting to Honeycomb for you.

Here's how to set it up for a Node.js web application using Express.

First, install the distribution along with the auto-instrumentations package:

```bash
npm install @honeycombio/opentelemetry-node @opentelemetry/auto-instrumentations-node @opentelemetry/api
```

Then, initialize the SDK early in your application's lifecycle, before any other modules that might generate events:

```typescript
// src/app.ts or index.ts
import { HoneycombSDK } from '@honeycombio/opentelemetry-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import express from 'express';
import dotenv from 'dotenv';

dotenv.config(); // Load environment variables from .env

// Initialize the Honeycomb OpenTelemetry SDK.
// This should be done as early as possible in your application's entry point.
const sdk = new HoneycombSDK({
  serviceName: process.env.HONEYCOMB_SERVICE_NAME || 'my-express-app',
  apiKey: process.env.HONEYCOMB_API_KEY,
  // dataset: process.env.HONEYCOMB_DATASET, // If not using a team API key with a fixed dataset
  instrumentations: [
    getNodeAutoInstrumentations({
      // Configure specific instrumentations if needed
      '@opentelemetry/instrumentation-express': {
        // Example: add the request body to traces (use with caution for sensitive data)
        requestHook: (span, info) => {
          if (info.request?.body) {
            span.setAttribute('http.request.body', JSON.stringify(info.request.body));
          }
        },
      },
    }),
  ],
});

sdk.start();

// Ensure the SDK is shut down gracefully on exit
process.on('SIGTERM', () => {
  sdk.shutdown()
    .then(() => console.log('Honeycomb SDK shut down successfully'))
    .catch((error) => console.error('Error shutting down Honeycomb SDK', error))
    .finally(() => process.exit(0));
});

const app = express();
app.use(express.json()); // For parsing application/json

// Your application routes and logic here
app.get('/', (req, res) => {
  res.send('Hello from Express!');
});

app.post('/api/users', (req, res) => {
  console.log('Received user data:', req.body);
  res.status(201).send({ message: 'User created', data: req.body });
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server running on http://localhost:${PORT}`);
});
```

Make sure to set HONEYCOMB_API_KEY and HONEYCOMB_SERVICE_NAME in your environment variables.

```env
HONEYCOMB_API_KEY=YOUR_HONEYCOMB_API_KEY
HONEYCOMB_SERVICE_NAME=my-cool-webapp
```

## Key Techniques

### 1. Enriching HTTP Request Traces

Automatically captured HTTP request traces are a great starting point, but you need to add business-relevant context to make them truly useful for debugging. This includes user IDs, organization IDs, feature flags, or specific request parameters.

```typescript
// src/routes/userRoutes.ts (example with Express)
import { Router } from 'express';
import { trace } from '@opentelemetry/api';

const router = Router();

router.get('/users/:id', (req, res) => {
  const userId = req.params.id;
  const currentSpan = trace.getActiveSpan();

  // Add specific attributes to the current span for richer context
  if (currentSpan) {
    currentSpan.setAttribute('app.user.id', userId);
    currentSpan.setAttribute('app.request.type', 'get_user_profile');
  }

  // Simulate fetching user data
  setTimeout(() => {
    res.json({ id: userId, name: `User ${userId}`, email: `user${userId}@example.com` });
  }, 50);
});

export default router;

// In your main app.ts:
// import userRoutes from './routes/userRoutes';
// app.use('/api', userRoutes);
```

### 2. Creating Custom Spans for Business Logic

Wrap important business logic, database calls, or external API interactions in custom spans. This allows you to measure their duration and attach specific attributes, giving you granular insight into where time is spent and what parameters influence performance.

```typescript
// src/services/orderService.ts
import { trace, SpanStatusCode } from '@opentelemetry/api';

class OrderService {
  async processOrder(orderId: string, userId: string, items: any[]) {
    const tracer = trace.getTracer('order-service');

    // Create a custom span for the entire order processing
    const span = tracer.startSpan('OrderService.processOrder', {
      attributes: {
        'app.order.id': orderId,
        'app.user.id': userId,
        'app.order.item_count': items.length,
      },
    });

    try {
      // Simulate a database operation
      span.addEvent('database_save_start', { 'db.collection': 'orders' });
      await new Promise(resolve => setTimeout(resolve, Math.random() * 100));
      // ... actual database save ...
      span.addEvent('database_save_end');

      // Simulate calling an external payment gateway
      const paymentResult = await this.callPaymentGateway(orderId, items.reduce((sum, item) => sum + item.price, 0));
      span.setAttribute('app.payment.status', paymentResult.status);

      span.setStatus({ code: SpanStatusCode.OK });
      return { success: true, orderId, paymentStatus: paymentResult.status };
    } catch (error: any) {
      span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
      span.recordException(error);
      throw error;
    } finally {
      span.end();
    }
  }

  private async callPaymentGateway(orderId: string, amount: number) {
    const tracer = trace.getTracer('order-service');
    // Create a child span for the external call
    const span = tracer.startSpan('OrderService.callPaymentGateway', {
      attributes: {
        'app.payment.gateway': 'Stripe',
        'app.order.id': orderId,
        'app.payment.amount': amount,
      },
    });

    try {
      await new Promise(resolve => setTimeout(resolve, Math.random() * 200 + 50)); // Simulate network latency
      return { status: 'success', transactionId: `txn_${Date.now()}` };
    } catch (error: any) {
      span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
      span.recordException(error);
      throw error;
    } finally {
      span.end();
    }
  }
}

export const orderService = new OrderService();

// Example usage in an Express route:
// router.post('/orders', async (req, res, next) => {
//   try {
//     const { userId, items } = req.body;
//     const orderId = `order_${Date.now()}`;
//     const result = await orderService.processOrder(orderId, userId, items);
//     res.status(200).json(result);
//   } catch (error) {
//     next(error); // Pass error to Express error handler
//   }
// });
```

### 3. Adding Context to Events Outside Request Traces

Sometimes you have discrete units of work that aren't part of an incoming request trace but still need context: background jobs, scheduled tasks, queue consumers. Honeycomb also accepts arbitrary structured events directly (via its Events API), but with an OpenTelemetry setup the idiomatic approach is to create a new root span for each unit of work.

The example below shows a standalone background worker that configures the OpenTelemetry SDK manually and wraps each job in its own root span:

```typescript
// src/backgroundWorker.ts
import { diag, DiagConsoleLogger, DiagLogLevel, trace, SpanStatusCode } from '@opentelemetry/api';
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { SimpleSpanProcessor } from '@opentelemetry/sdk-trace-base';
import dotenv from 'dotenv';
import axios from 'axios'; // Example for an external call

dotenv.config();

// Configure an OTLP exporter pointed at Honeycomb
const honeycombExporter = new OTLPTraceExporter({
  url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || 'https://api.honeycomb.io/v1/traces',
  headers: {
    'x-honeycomb-team': process.env.HONEYCOMB_API_KEY!,
    'x-honeycomb-dataset': process.env.HONEYCOMB_DATASET || process.env.HONEYCOMB_SERVICE_NAME!,
  },
});

const sdk = new NodeSDK({
  serviceName: process.env.HONEYCOMB_SERVICE_NAME || 'my-background-worker',
  spanProcessor: new SimpleSpanProcessor(honeycombExporter),
  // No auto-instrumentations needed for a pure background worker without web frameworks
});

sdk.start();

// Enable diagnostic logging for OTel if needed
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.INFO);

async function runJob(jobId: string) {
  diag.info(`Starting job: ${jobId}`);
  // If this job were triggered by a request, you'd pick up the active span instead.
  // For a purely background job, create a new root span.
  const tracer = trace.getTracer('background-job-tracer');
  const span = tracer.startSpan('worker.processJob', {
    attributes: {
      'job.id': jobId,
      'job.type': 'data_import',
      'worker.instance': process.env.HOSTNAME || 'local-worker',
    },
  });

  try {
    console.log(`Processing job ${jobId}...`);
    span.addEvent('job_started');

    // Simulate some work
    await new Promise(resolve => setTimeout(resolve, Math.random() * 500 + 100));

    // Simulate an external API call within the job
    const externalResponse = await axios.get('https://api.example.com/data');
    span.setAttribute('external.api.status', externalResponse.status);
    span.setAttribute(
      'external.api.data_length',
      externalResponse.data ? JSON.stringify(externalResponse.data).length : 0
    );

    span.addEvent('job_completed');
    span.setStatus({ code: SpanStatusCode.OK });
  } catch (error: any) {
    span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
    span.recordException(error);
    throw error;
  } finally {
    span.end();
  }
}
```
## Anti-Patterns

**Using the service without understanding its pricing model.** Cloud services bill differently — per request, per GB, per seat. Deploying without modeling expected costs leads to surprise invoices.

**Hardcoding configuration instead of using environment variables.** API keys, endpoints, and feature flags change between environments. Hardcoded values break deployments and leak secrets.

**Ignoring the service's rate limits and quotas.** Every external API has throughput limits. Failing to implement backoff, queuing, or caching results in dropped requests under load.
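For Honeycomb specifically, one concrete mitigation is batching span exports instead of sending each span individually. A sketch, assuming the OpenTelemetry packages used above (the tuning numbers are illustrative, not recommendations):

```typescript
import { NodeSDK } from '@opentelemetry/sdk-node';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

const exporter = new OTLPTraceExporter({
  url: 'https://api.honeycomb.io/v1/traces',
  headers: { 'x-honeycomb-team': process.env.HONEYCOMB_API_KEY! },
});

// BatchSpanProcessor queues finished spans and exports them in batches,
// which is far friendlier to event-volume limits than SimpleSpanProcessor
// (which fires one export per span).
const sdk = new NodeSDK({
  serviceName: 'my-cool-webapp',
  spanProcessor: new BatchSpanProcessor(exporter, {
    maxQueueSize: 2048,         // cap buffered spans rather than grow unbounded
    maxExportBatchSize: 512,    // spans per export request
    scheduledDelayMillis: 5000, // flush interval
  }),
});

sdk.start();
```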

**Treating the service as always available.** External services go down. Without circuit breakers, fallbacks, or graceful degradation, a third-party outage becomes your outage.

**Coupling your architecture to a single provider's API.** Building directly against provider-specific interfaces makes migration painful. Wrap external services in thin adapter layers.
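For instrumentation, the adapter can be as thin as an interface your application code calls, with the vendor-specific wiring kept in one place. A minimal sketch (all names here are illustrative):

```typescript
// Application code depends only on this interface, never on a vendor SDK.
interface Telemetry {
  recordEvent(name: string, attributes: Record<string, string | number | boolean>): void;
}

// An in-memory implementation, handy for tests; a production implementation
// would delegate to the Honeycomb/OpenTelemetry SDK behind the same interface.
class InMemoryTelemetry implements Telemetry {
  events: Array<{ name: string; attributes: Record<string, string | number | boolean> }> = [];

  recordEvent(name: string, attributes: Record<string, string | number | boolean>): void {
    this.events.push({ name, attributes });
  }
}

// Swapping providers later means writing one new class, not touching call sites.
const telemetry: Telemetry = new InMemoryTelemetry();
telemetry.recordEvent('order.processed', { 'app.order.id': 'order_1', 'app.order.item_count': 3 });
```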
