Checkly
"Checkly: synthetic monitoring, API checks, browser checks, Playwright-based E2E monitoring, monitoring-as-code CLI"
You are an expert in integrating Checkly for synthetic application monitoring.
## Key Points
- **Define checks as code next to the feature they monitor**: colocating check files with application routes ensures checks are updated when endpoints change, preventing stale monitors.
- **Run checks in CI before deploying**: use `checkly test` in your pipeline to catch broken endpoints or degraded pages before the deploy reaches production.
- **Use multiple locations for availability**: running checks from at least three geographically distributed regions distinguishes between genuine outages and regional network issues.
- **Keep secrets out of check files**: use Checkly environment variables (`{{VAR_NAME}}`) instead of inline credentials; hardcoded secrets end up in version control and the Checkly dashboard.
## Quick Example
```bash
npm install --save-dev checkly
npx checkly init
```
```bash
# .env (for local development and CI)
CHECKLY_API_KEY=cu_xxx
CHECKLY_ACCOUNT_ID=xxx
```
Checkly: Application Monitoring
## Core Philosophy
### Overview
Checkly is a synthetic monitoring platform that runs API checks and Playwright-based browser checks from multiple global locations on a schedule. It follows a monitoring-as-code approach through its CLI, allowing check definitions to live alongside application source code. Checkly integrates with CI/CD pipelines to validate deployments before promoting them to production.
## Setup & Configuration
### Install the Checkly CLI
```bash
npm install --save-dev checkly
npx checkly init
```
This creates a `checkly.config.ts` at the project root:
```typescript
// checkly.config.ts
import { defineConfig } from "checkly";
import { Frequency } from "checkly/constructs";

export default defineConfig({
  projectName: "My App Monitoring",
  logicalId: "my-app-monitoring",
  repoUrl: "https://github.com/org/my-app",
  // Defaults applied to every check in the project
  checks: {
    frequency: Frequency.EVERY_5M,
    locations: ["us-east-1", "eu-west-1", "ap-southeast-1"],
    runtimeId: "2024.02",
    tags: ["production"],
    // Files that define checks through constructs
    checkMatch: "**/__checks__/**/*.check.ts",
    browserChecks: {
      // Playwright spec files picked up as browser checks
      testMatch: "**/__checks__/**/*.spec.ts",
    },
  },
  cli: {
    // Location used by `checkly test` runs
    runLocation: "us-east-1",
  },
});
```
### Environment Variables
```bash
# .env (for local development and CI)
CHECKLY_API_KEY=cu_xxx
CHECKLY_ACCOUNT_ID=xxx
```
## Core Patterns
### API Check
```typescript
// __checks__/api/health.check.ts
import { ApiCheck, AssertionBuilder } from "checkly/constructs";

new ApiCheck("api-health-check", {
  name: "API Health Check",
  request: {
    method: "GET",
    url: "https://api.example.com/health",
    assertions: [
      AssertionBuilder.statusCode().equals(200),
      AssertionBuilder.jsonBody("$.status").equals("ok"),
      AssertionBuilder.responseTime().lessThan(2000),
    ],
    followRedirects: true,
  },
  // Response times in milliseconds: above 1000 is degraded, above 5000 is failed
  degradedResponseTime: 1000,
  maxResponseTime: 5000,
  tags: ["api", "critical"],
});
```
### API Check with Authentication and Setup Script
```typescript
// __checks__/api/orders.check.ts
import { ApiCheck, AssertionBuilder } from "checkly/constructs";

new ApiCheck("api-orders-list", {
  name: "Orders API - List Orders",
  request: {
    method: "GET",
    url: "https://api.example.com/v1/orders?limit=5",
    headers: [
      // Static token from a Checkly environment variable.
      // The setup script below overwrites this header before the request is sent.
      { key: "Authorization", value: "Bearer {{CHECKLY_API_TOKEN}}" },
      { key: "Content-Type", value: "application/json" },
    ],
    assertions: [
      AssertionBuilder.statusCode().equals(200),
      AssertionBuilder.jsonBody("$.data").notEmpty(),
      AssertionBuilder.jsonBody("$.data[0].id").notNull(),
      AssertionBuilder.responseTime().lessThan(3000),
    ],
  },
  // Runs before the request: fetch a fresh OAuth token and inject it
  setupScript: {
    content: `
      const axios = require("axios").default;
      const res = await axios.post("https://auth.example.com/oauth/token", {
        client_id: process.env.AUTH_CLIENT_ID,
        client_secret: process.env.AUTH_CLIENT_SECRET,
        grant_type: "client_credentials",
      });
      request.headers["Authorization"] = "Bearer " + res.data.access_token;
    `,
  },
  tags: ["api", "orders"],
});
```
### Browser Check (Playwright)
```typescript
// __checks__/browser/login-flow.spec.ts
import { test, expect } from "@playwright/test";

test("user can log in and see dashboard", async ({ page }) => {
  await page.goto("https://app.example.com/login");

  // Credentials come from environment variables, never from the spec file
  await page.fill('[data-testid="email"]', process.env.TEST_USER_EMAIL!);
  await page.fill('[data-testid="password"]', process.env.TEST_USER_PASSWORD!);
  await page.click('[data-testid="login-button"]');

  await expect(page).toHaveURL(/\/dashboard/);
  await expect(page.locator("h1")).toContainText("Dashboard");
  await expect(page.locator('[data-testid="user-menu"]')).toBeVisible();
});
```
### Alert Channels and Escalation
```typescript
// __checks__/alert-channels.ts
import { SlackAlertChannel, EmailAlertChannel } from "checkly/constructs";

export const slackChannel = new SlackAlertChannel("slack-engineering", {
  // `url` is the Slack incoming-webhook URL; this value is a placeholder
  url: "https://hooks.slack.com/services/T00/B00/xxx",
  channel: "#engineering-alerts",
  sendRecovery: true,
  sendFailure: true,
  sendDegraded: false,
});

export const emailChannel = new EmailAlertChannel("email-oncall", {
  address: "oncall@example.com",
  sendRecovery: true,
  sendFailure: true,
});
```
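The channels above only send alerts once a check references them. A minimal sketch of that wiring follows, assuming the channel file sits at `__checks__/alert-channels.ts` as shown and the check file lives one directory below it; the check ID `api-status-check` and its file name are hypothetical.

```typescript
// __checks__/api/status.check.ts (hypothetical file name)
import { ApiCheck, AssertionBuilder } from "checkly/constructs";
// Relative path assumes the layout used in this document
import { slackChannel, emailChannel } from "../alert-channels";

new ApiCheck("api-status-check", {
  name: "API Status Check",
  // Failures and recoveries on this check are routed to both channels
  alertChannels: [slackChannel, emailChannel],
  request: {
    method: "GET",
    url: "https://api.example.com/health",
    assertions: [AssertionBuilder.statusCode().equals(200)],
  },
});
```

Exporting the channels from one module and importing them per check keeps a single definition of each channel in the project.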
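### CI/CD Integration
```yaml
# .github/workflows/deploy.yml
- name: Run Checkly checks
  # A failing check makes the CLI exit non-zero, which fails this step and blocks the deployment
  run: npx checkly test --record
  env:
    CHECKLY_API_KEY: ${{ secrets.CHECKLY_API_KEY }}
    CHECKLY_ACCOUNT_ID: ${{ secrets.CHECKLY_ACCOUNT_ID }}

# Deploy checks to Checkly after merge
- name: Deploy Checkly checks
  run: npx checkly deploy --force
  env:
    CHECKLY_API_KEY: ${{ secrets.CHECKLY_API_KEY }}
    CHECKLY_ACCOUNT_ID: ${{ secrets.CHECKLY_ACCOUNT_ID }}
```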
## Best Practices
- **Define checks as code next to the feature they monitor**: colocating check files with application routes ensures checks are updated when endpoints change, preventing stale monitors.
- **Run checks in CI before deploying**: use `checkly test` in your pipeline to catch broken endpoints or degraded pages before the deploy reaches production.
- **Use multiple locations for availability**: running checks from at least three geographically distributed regions distinguishes between genuine outages and regional network issues.
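The multi-location point can be made concrete with a small sketch. The `LocationResult` shape and the `classifyRun` function below are hypothetical illustrations of the reasoning, not part of the Checkly API: a failure everywhere suggests an outage, while a failure in only some regions suggests a regional network issue.

```typescript
// Hypothetical result shape for one check run in one location
interface LocationResult {
  location: string;
  passed: boolean;
}

type Verdict = "healthy" | "regional-issue" | "outage";

// All locations pass -> healthy; all fail -> outage; mixed -> regional issue
function classifyRun(results: LocationResult[]): Verdict {
  if (results.length === 0) {
    throw new Error("at least one location result is required");
  }
  const failures = results.filter((r) => !r.passed).length;
  if (failures === 0) return "healthy";
  if (failures === results.length) return "outage";
  return "regional-issue";
}

const verdict = classifyRun([
  { location: "us-east-1", passed: true },
  { location: "eu-west-1", passed: false },
  { location: "ap-southeast-1", passed: true },
]);
console.log(verdict); // regional-issue
```

With a single location, every failure is classified as an outage, which is why the distinction needs several regions to be meaningful.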
## Common Pitfalls
- **Hardcoding secrets in check files**: use Checkly environment variables (`{{VAR_NAME}}`) instead of inline credentials; hardcoded secrets end up in version control and the Checkly dashboard.
- **Setting response time thresholds too tight**: synthetic checks run from external locations and include network latency; setting `maxResponseTime` to 500 ms causes false positives from distant regions. Start with 2-3 seconds and tighten based on observed baselines.
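One way to tighten thresholds from an observed baseline is sketched below. The helper names, the use of the 95th percentile, and the 2x headroom factor are all assumptions chosen for illustration; the output keys merely mirror the `degradedResponseTime` and `maxResponseTime` options used in the API check examples.

```typescript
// Nearest-rank percentile of a non-empty sample, with p in (0, 100]
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Degraded threshold at p95, hard limit at p95 times a headroom factor (assumed 2x)
function suggestThresholds(samplesMs: number[], headroom = 2) {
  const p95 = percentile(samplesMs, 95);
  return {
    degradedResponseTime: Math.round(p95),
    maxResponseTime: Math.round(p95 * headroom),
  };
}

// Response times in milliseconds collected from past runs (made-up sample data)
const observed = [320, 410, 380, 950, 450, 400, 370, 1200, 430, 390];
console.log(suggestThresholds(observed));
// { degradedResponseTime: 1200, maxResponseTime: 2400 }
```

Collecting samples per location before computing the percentile avoids a nearby region masking latency from a distant one.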
## Anti-Patterns
- **Using the service without understanding its pricing model.** Cloud services bill differently: per request, per GB, per seat. Deploying without modeling expected costs leads to surprise invoices.
- **Hardcoding configuration instead of using environment variables.** API keys, endpoints, and feature flags change between environments. Hardcoded values break deployments and leak secrets.
- **Ignoring the service's rate limits and quotas.** Every external API has throughput limits. Failing to implement backoff, queuing, or caching results in dropped requests under load.
- **Treating the service as always available.** External services go down. Without circuit breakers, fallbacks, or graceful degradation, a third-party outage becomes your outage.
- **Coupling your architecture to a single provider's API.** Building directly against provider-specific interfaces makes migration painful. Wrap external services in thin adapter layers.
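The backoff mentioned under rate limits can be sketched as follows. This is a generic illustration, not Checkly-specific code: the function names, the 500 ms base delay, the doubling factor, and the 8-second cap are all assumed values.

```typescript
// Delay schedule: base * factor^i, capped at capMs
function backoffDelays(attempts: number, baseMs = 500, factor = 2, capMs = 8000): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(baseMs * Math.pow(factor, i), capMs));
  }
  return delays;
}

// Retry an async operation, waiting longer after each failure.
// `sleep` is injectable so the wait can be stubbed out in tests.
async function retryWithBackoff<T>(
  op: () => Promise<T>,
  attempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  const delays = backoffDelays(attempts - 1);
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      // No wait after the final attempt
      if (i < delays.length) await sleep(delays[i]);
    }
  }
  throw lastError;
}

console.log(backoffDelays(6)); // [ 500, 1000, 2000, 4000, 8000, 8000 ]
```

Production code would usually add random jitter to each delay so that many clients do not retry in lockstep.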