Honeycomb
Integrate Honeycomb for event-driven observability with high-cardinality tracing.
You are an expert in Honeycomb observability. You help developers instrument applications with structured events, explore high-cardinality data, use BubbleUp for anomaly detection, define SLOs, and configure triggers for alerting.
## Key Points
Anti-patterns to avoid:
- **Pre-aggregating before sending** - Honeycomb works best with raw events; do not average or count before ingestion. Let Honeycomb aggregate at query time.
- **Ignoring BubbleUp** - When investigating slowness, use BubbleUp to automatically compare slow vs. fast requests instead of manually guessing dimensions.
- **Not setting SLOs** - Without SLOs, you lack burn rate alerts and budget tracking. Define SLOs for every critical user journey.
- **Low-cardinality-only attributes** - Honeycomb is designed for high-cardinality data. Include user IDs, request IDs, and build SHAs to unlock powerful debugging.

Use Honeycomb when:
- You need to debug complex distributed systems with high-cardinality trace data.
- You want BubbleUp-style anomaly detection that automatically surfaces what changed.
- You are adopting SLO-based reliability practices with error budget tracking.
- You prefer an OpenTelemetry-native backend with first-class OTel SDK support.
- You need to answer novel questions about production without pre-defining dashboards.
## Quick Example
```bash
npm install @honeycombio/opentelemetry-node @opentelemetry/auto-instrumentations-node
```
```typescript
// WRONG - loses all queryability and BubbleUp power
sendEvent({ message: 'checkout completed' });
// No fields to group by, filter, or heatmap
```

# Honeycomb Integration
## Core Philosophy
### Events Over Metrics
Honeycomb stores wide structured events, not pre-aggregated metrics. Every request becomes a rich event with dozens of fields, enabling ad-hoc exploration without knowing questions in advance.
### High-Cardinality is Welcome
Unlike metrics systems, Honeycomb handles high-cardinality fields (user IDs, request IDs, build SHAs) natively. Query any field without cardinality explosions.
### Explore, Then Alert
Start with BubbleUp to discover what changed, then codify findings into SLOs and triggers. Observability is about asking new questions, not just monitoring known failures.
## Setup
Install the Honeycomb OpenTelemetry SDK:
```bash
npm install @honeycombio/opentelemetry-node @opentelemetry/auto-instrumentations-node
```
Initialize early in your application:
```typescript
// tracing.ts - import BEFORE any other module
import { HoneycombSDK } from '@honeycombio/opentelemetry-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

const sdk = new HoneycombSDK({
  apiKey: process.env.HONEYCOMB_API_KEY,
  serviceName: process.env.OTEL_SERVICE_NAME || 'my-service',
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();

// Flush pending spans before the process exits.
process.on('SIGTERM', () => sdk.shutdown());
```
Or use the Honeycomb HTTP API directly for custom events:
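```typescript
const HONEYCOMB_API_KEY = process.env.HONEYCOMB_API_KEY!;
const DATASET = 'my-service';

// Send one event. The JSON body is the event itself; the timestamp
// travels in the X-Honeycomb-Event-Time header, not in the body.
async function sendEvent(data: Record<string, unknown>): Promise<void> {
  const res = await fetch(`https://api.honeycomb.io/1/events/${DATASET}`, {
    method: 'POST',
    headers: {
      'X-Honeycomb-Team': HONEYCOMB_API_KEY,
      'X-Honeycomb-Event-Time': new Date().toISOString(),
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(data),
  });
  if (!res.ok) {
    throw new Error(`Honeycomb rejected the event: HTTP ${res.status}`);
  }
}
```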
## Key Patterns
### Do: Send wide events with business and technical context
```typescript
import { trace } from '@opentelemetry/api';

// CheckoutRequest, isEnabled, and processCheckout are application-defined.
async function handleCheckout(req: CheckoutRequest) {
  const span = trace.getActiveSpan();
  if (span) {
    // Business and technical context known before the work starts.
    span.setAttribute('user.id', req.userId);
    span.setAttribute('user.plan', req.userPlan);
    span.setAttribute('cart.item_count', req.cart.items.length);
    span.setAttribute('cart.total_cents', req.cart.totalCents);
    span.setAttribute('feature_flags.new_checkout', isEnabled('new_checkout'));
    span.setAttribute('build.sha', process.env.GIT_SHA || 'unknown');
  }

  const result = await processCheckout(req.cart);

  if (span) {
    // Outcome fields, added to the same span once the result is known.
    span.setAttribute('checkout.payment_method', result.paymentMethod);
    span.setAttribute('checkout.duration_ms', result.durationMs);
  }
  return result;
}
```
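
When the context already lives in nested objects, the repeated `setAttribute` calls above can be collapsed. The following is a minimal sketch, assuming a hypothetical `flattenAttributes` helper (not part of Honeycomb or OpenTelemetry) that turns a nested object into the dotted keys used in the example; the result can then be passed to the span's `setAttributes` method in one call.

```typescript
// Scalar values that can be attached to a span as attributes.
type AttrValue = string | number | boolean;

// Nested context object (hypothetical shape, for illustration only).
interface NestedContext {
  [key: string]: AttrValue | NestedContext | undefined;
}

// Flatten { cart: { item_count: 3 } } into { 'cart.item_count': 3 }.
// Undefined values are skipped so they never reach the span.
function flattenAttributes(ctx: NestedContext, prefix = ''): Record<string, AttrValue> {
  const out: Record<string, AttrValue> = {};
  for (const key of Object.keys(ctx)) {
    const value = ctx[key];
    if (value === undefined) continue;
    const name = prefix ? `${prefix}.${key}` : key;
    if (typeof value === 'object') {
      Object.assign(out, flattenAttributes(value, name));
    } else {
      out[name] = value;
    }
  }
  return out;
}

const attrs = flattenAttributes({
  user: { id: 'u_123', plan: 'pro' },
  cart: { item_count: 3, total_cents: 4599 },
});
// In a handler: trace.getActiveSpan()?.setAttributes(attrs);
```

Keeping the flattening in one place also makes it easier to keep field names consistent across services, which matters when you group or filter by them later.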
### Don't: Send narrow events with only a message string
```typescript
// WRONG - loses all queryability and BubbleUp power
sendEvent({ message: 'checkout completed' });
// No fields to group by, filter, or heatmap
```
### Do: Add derived columns for computed fields in Honeycomb UI
In the Honeycomb UI, create derived columns for reusable calculations:
```
// Derived column: is_slow
IF(GTE($duration_ms, 1000), true, false)

// Derived column: error_category
IF(STARTS_WITH($error.message, "timeout"), "timeout",
  IF(STARTS_WITH($error.message, "connection"), "connection",
    "other"))
```
## Common Patterns
### Honeycomb Query via API
This request creates a query specification and returns its ID; it does not return results by itself.
```bash
curl -X POST "https://api.honeycomb.io/1/queries/my-service" \
  -H "X-Honeycomb-Team: $HONEYCOMB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "calculations": [
      { "op": "COUNT" },
      { "op": "P99", "column": "duration_ms" },
      { "op": "AVG", "column": "duration_ms" }
    ],
    "filters": [
      { "column": "http.status_code", "op": ">=", "value": 500 }
    ],
    "breakdowns": ["http.route", "service.name"],
    "time_range": 3600,
    "granularity": 60
  }'
```
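
If the same query body is built from application code, typing it catches misspelled keys before the request is sent. This is a sketch under stated assumptions: the interfaces model only the fields and operators used in the curl example, `serverErrorQuery` is a hypothetical helper, and the granularity check is a local sanity check rather than a documented Honeycomb rule.

```typescript
// Shapes mirror the JSON body in the curl example; only the
// operators used there are modelled.
type CalculationOp = 'COUNT' | 'AVG' | 'P99';

interface Calculation {
  op: CalculationOp;
  column?: string; // COUNT takes no column
}

interface Filter {
  column: string;
  op: string;
  value?: string | number | boolean;
}

interface QuerySpec {
  calculations: Calculation[];
  filters: Filter[];
  breakdowns: string[];
  time_range: number; // seconds
  granularity: number; // seconds
}

// Hypothetical helper: server errors broken down by route and service.
function serverErrorQuery(timeRangeSeconds: number, granularitySeconds: number): QuerySpec {
  // Local sanity check (an assumption, not an API rule quoted from Honeycomb).
  if (granularitySeconds > timeRangeSeconds) {
    throw new RangeError('granularity must not exceed time_range');
  }
  return {
    calculations: [
      { op: 'COUNT' },
      { op: 'P99', column: 'duration_ms' },
      { op: 'AVG', column: 'duration_ms' },
    ],
    filters: [{ column: 'http.status_code', op: '>=', value: 500 }],
    breakdowns: ['http.route', 'service.name'],
    time_range: timeRangeSeconds,
    granularity: granularitySeconds,
  };
}

// JSON string suitable as the POST body.
const queryBody = JSON.stringify(serverErrorQuery(3600, 60));
```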
### SLO Definition
An SLO references its SLI by the alias of an existing derived column that evaluates to true or false for each qualifying event. The request below assumes a derived column named `checkout_fast_enough` already exists, defined along the lines of `IF(EQUALS($name, "POST /checkout"), LT($duration_ms, 2000))`.
```bash
curl -X POST "https://api.honeycomb.io/1/slos/my-service" \
  -H "X-Honeycomb-Team: $HONEYCOMB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Checkout Latency SLO",
    "description": "99% of checkout requests complete in under 2 seconds",
    "sli": {
      "alias": "checkout_fast_enough"
    },
    "target_per_million": 990000,
    "time_period_days": 30
  }'
```
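
To make `target_per_million` concrete, the arithmetic behind an error budget can be sketched as follows. The function names and the event counts are illustrative assumptions; Honeycomb computes the budget for you, so this only shows what a target of 990000 per million means in terms of allowed failures.

```typescript
// Number of qualifying events that may fail the SLI before the budget is spent.
// Integer arithmetic first, division last, to avoid floating-point drift.
function allowedFailures(targetPerMillion: number, qualifyingEvents: number): number {
  return Math.floor(((1000000 - targetPerMillion) * qualifyingEvents) / 1000000);
}

// Fraction of the budget still unspent (1 = untouched, 0 = exhausted, negative = overspent).
function budgetRemaining(targetPerMillion: number, qualifyingEvents: number, failedEvents: number): number {
  const allowed = allowedFailures(targetPerMillion, qualifyingEvents);
  if (allowed === 0) return failedEvents === 0 ? 1 : 0;
  return (allowed - failedEvents) / allowed;
}

// Hypothetical traffic: 2,000,000 checkout requests in the 30-day window.
const budget = allowedFailures(990000, 2000000);
const remaining = budgetRemaining(990000, 2000000, 5000);
```

With these hypothetical numbers, a 99% target leaves room for 20,000 slow checkouts in the window, and 5,000 slow requests would consume a quarter of that budget.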
### Trigger (Alert) Configuration
```bash
curl -X POST "https://api.honeycomb.io/1/triggers/my-service" \
  -H "X-Honeycomb-Team: $HONEYCOMB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High Error Rate",
    "query": {
      "calculations": [{ "op": "COUNT" }],
      "filters": [
        { "column": "error", "op": "exists" }
      ],
      "time_range": 600
    },
    "frequency": 300,
    "threshold": {
      "op": ">",
      "value": 100
    },
    "recipients": [
      { "type": "slack", "target": "#oncall-alerts" }
    ]
  }'
```
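### Batch Event Ingestion
```typescript
// Reuses HONEYCOMB_API_KEY and DATASET from the single-event example.
async function sendBatch(events: Record<string, unknown>[]): Promise<void> {
  const res = await fetch(`https://api.honeycomb.io/1/batch/${DATASET}`, {
    method: 'POST',
    headers: {
      'X-Honeycomb-Team': HONEYCOMB_API_KEY,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(events.map((data) => ({
      // Stamped at send time; use the event's own timestamp if you have one.
      time: new Date().toISOString(),
      data,
    }))),
  });
  if (!res.ok) {
    throw new Error(`Honeycomb rejected the batch: HTTP ${res.status}`);
  }
}
```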
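
For large backlogs it is usually safer to send several modest batches than one very large request. A minimal sketch, assuming a hypothetical `chunk` helper and an arbitrary batch size of 100 (an illustration, not a limit taken from Honeycomb's documentation):

```typescript
// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Hypothetical backlog of 250 events.
const backlog: Record<string, unknown>[] = [];
for (let i = 0; i < 250; i++) {
  backlog.push({ 'request.id': `req_${i}`, duration_ms: 100 + i });
}

const groups = chunk(backlog, 100);
// Usage with the batch sender: for (const group of groups) await sendBatch(group);
```

Sending groups sequentially keeps failures easy to attribute to a specific slice of the backlog; whether to retry or parallelize is left to the caller.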
## Anti-Patterns
- **Pre-aggregating before sending** - Honeycomb works best with raw events; do not average or count before ingestion. Let Honeycomb aggregate at query time.
- **Ignoring BubbleUp** - When investigating slowness, use BubbleUp to automatically compare slow vs. fast requests instead of manually guessing dimensions.
- **Not setting SLOs** - Without SLOs, you lack burn rate alerts and budget tracking. Define SLOs for every critical user journey.
- **Low-cardinality-only attributes** - Honeycomb is designed for high-cardinality data. Include user IDs, request IDs, and build SHAs to unlock powerful debugging.
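
The pre-aggregation point can be illustrated with a few numbers. This sketch uses five made-up request durations to show what a client-side average throws away compared with the raw values; it is an illustration of the idea, not a model of how Honeycomb computes anything.

```typescript
// Five hypothetical request durations in milliseconds; one is an outlier.
const durationsMs = [120, 130, 110, 125, 4800];

// What a pre-aggregating client would send: a single number.
const averageMs = durationsMs.reduce((sum, d) => sum + d, 0) / durationsMs.length;

// What raw events still allow at query time: isolating the slow requests
// and seeing what a typical request looks like.
const slowRequests = durationsMs.filter((d) => d >= 1000);
const sorted = durationsMs.slice().sort((a, b) => a - b);
const medianMs = sorted[Math.floor(sorted.length / 2)];
```

The average (1057 ms) suggests every request is slow, while the raw values show a typical request near 125 ms and a single 4800 ms outlier, which is exactly the kind of split BubbleUp is meant to explain.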
## When to Use
- You need to debug complex distributed systems with high-cardinality trace data.
- You want BubbleUp-style anomaly detection that automatically surfaces what changed.
- You are adopting SLO-based reliability practices with error budget tracking.
- You prefer an OpenTelemetry-native backend with first-class OTel SDK support.
- You need to answer novel questions about production without pre-defining dashboards.