
# Anthropic Claude API

"Anthropic Claude API: messages API, tool use, streaming, vision, system prompts, extended thinking, batches, Node SDK"

## Quick Summary
Claude's API is built around the **Messages API** — a single, clean endpoint for all interactions. Design prompts with clear, direct instructions. Leverage **system prompts** to define persona and constraints separately from user input. Use **tool use** for structured actions and data extraction. Claude excels at following nuanced instructions, long-context reasoning, and careful analysis. Prefer the official TypeScript SDK for streaming helpers, typed responses, and automatic retries.

## Key Points

- **Always set `max_tokens`** — it is required by the API and prevents runaway generation.
- **Use system prompts for persona and rules**, user messages for the actual task. This separation improves instruction following.
- **Check `stop_reason`** — handle `"end_turn"`, `"max_tokens"`, `"tool_use"` appropriately.
- **Use `tool_choice: { type: "tool", name: "..." }`** to force structured extraction without extra text.
- **Enable prompt caching** with `cache_control` on system prompts and long context to reduce latency and cost.
- **Use extended thinking** for complex reasoning tasks — it significantly improves accuracy on math, logic, and multi-step problems.
- **Use batches** for bulk processing — they offer 50% cost savings and higher throughput.
- **Log `usage` fields** including `cache_creation_input_tokens` and `cache_read_input_tokens` for cost tracking.
Common pitfalls to avoid:

- **Using `assistant` prefill to put words in Claude's mouth** without understanding that it affects the response distribution. Use it deliberately, not casually.
- **Sending images as text descriptions** instead of actual image content blocks. Claude's vision is highly capable — use it.
- **Not handling `tool_use` stop reasons** — if you provide tools, you must handle the case where the model calls them.
- **Ignoring the content array structure** — responses are arrays of content blocks (text, tool_use, thinking), not a single string.
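
A small helper can make the `stop_reason` and content-block checks above concrete. This is a sketch: `extractText` and `MessageLike` are illustrative names, not SDK types, but the field shapes mirror the Messages API response.

```typescript
// Collect text blocks and flag truncation / pending tool calls.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "thinking"; thinking: string };

interface MessageLike {
  stop_reason: "end_turn" | "max_tokens" | "tool_use" | "stop_sequence";
  content: ContentBlock[];
}

function extractText(message: MessageLike): {
  text: string;
  truncated: boolean;
  wantsTool: boolean;
} {
  // Responses are arrays of content blocks, not a single string.
  const text = message.content
    .filter((b): b is Extract<ContentBlock, { type: "text" }> => b.type === "text")
    .map((b) => b.text)
    .join("");
  return {
    text,
    truncated: message.stop_reason === "max_tokens", // consider retrying with a higher limit
    wantsTool: message.stop_reason === "tool_use",   // must run the tool and continue
  };
}
```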

## Quick Setup

```bash
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_BASE_URL=https://...  # optional, for proxies
```

# Anthropic Claude API Skill

## Core Philosophy

Claude's API is built around the Messages API — a single, clean endpoint for all interactions. Design prompts with clear, direct instructions. Leverage system prompts to define persona and constraints separately from user input. Use tool use for structured actions and data extraction. Claude excels at following nuanced instructions, long-context reasoning, and careful analysis. Prefer the official TypeScript SDK for streaming helpers, typed responses, and automatic retries.

## Setup

Install the SDK and configure your client:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

// Basic message
const message = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  system: "You are a concise technical writer.",
  messages: [
    { role: "user", content: "Explain the actor model in three sentences." },
  ],
});

console.log(message.content[0].type === "text" ? message.content[0].text : "");
```

Environment variables:

```bash
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_BASE_URL=https://...  # optional, for proxies
```

## Key Techniques

### Streaming Responses

```typescript
const stream = anthropic.messages.stream({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Write a haiku about TypeScript." }],
});

for await (const event of stream) {
  if (
    event.type === "content_block_delta" &&
    event.delta.type === "text_delta"
  ) {
    process.stdout.write(event.delta.text);
  }
}

// Or use the helper for the final assembled message
const finalMessage = await stream.finalMessage();
console.log("\n\nTokens used:", finalMessage.usage);
```

### Tool Use (Function Calling)

```typescript
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  tools: [
    {
      name: "search_database",
      description: "Search the product database by query string",
      input_schema: {
        type: "object" as const,
        properties: {
          query: { type: "string", description: "Search query" },
          limit: { type: "number", description: "Max results to return" },
        },
        required: ["query"],
      },
    },
  ],
  messages: [{ role: "user", content: "Find laptops under $1000" }],
});

// Process tool use blocks
for (const block of response.content) {
  if (block.type === "tool_use") {
    const searchInput = block.input as { query: string; limit?: number };
    const results = await searchDatabase(searchInput.query, searchInput.limit);

    // Send tool result back
    const followUp = await anthropic.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 1024,
      tools: [/* same tools */],
      messages: [
        { role: "user", content: "Find laptops under $1000" },
        { role: "assistant", content: response.content },
        {
          role: "user",
          content: [
            {
              type: "tool_result",
              tool_use_id: block.id,
              content: JSON.stringify(results),
            },
          ],
        },
      ],
    });
    console.log(followUp.content);
  }
}
```
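
The single round-trip above generalizes to a loop: keep calling the API until `stop_reason` is no longer `"tool_use"`. The sketch below injects a `createMessage` function so the control flow can be exercised without network access; in real use, pass `(msgs) => anthropic.messages.create({ model, max_tokens, tools, messages: msgs })`. `runToolLoop` and `handlers` are illustrative names, not SDK APIs.

```typescript
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

interface Msg {
  role: "user" | "assistant";
  content: string | unknown[];
}

interface Resp {
  stop_reason: string;
  content: Block[];
}

async function runToolLoop(
  createMessage: (messages: Msg[]) => Promise<Resp>,
  handlers: Record<string, (input: unknown) => Promise<string>>,
  messages: Msg[],
): Promise<Resp> {
  for (;;) {
    const response = await createMessage(messages);
    if (response.stop_reason !== "tool_use") return response;

    // Echo the assistant turn verbatim, then answer every tool_use block
    // with a matching tool_result in a single user turn.
    messages.push({ role: "assistant", content: response.content });
    const results: unknown[] = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        results.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: await handlers[block.name](block.input),
        });
      }
    }
    messages.push({ role: "user", content: results });
  }
}
```

Appending to the same `messages` array (rather than rebuilding it) also preserves prompt-cache hits across iterations.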

### Tool Use for Structured Extraction

```typescript
// Force a tool call to get structured JSON output
const extraction = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  tool_choice: { type: "tool", name: "extract_contact" },
  tools: [
    {
      name: "extract_contact",
      description: "Extract contact information from text",
      input_schema: {
        type: "object" as const,
        properties: {
          name: { type: "string" },
          email: { type: "string" },
          phone: { type: "string" },
          company: { type: "string" },
        },
        required: ["name"],
      },
    },
  ],
  messages: [
    {
      role: "user",
      content:
        "Contact Jane Smith at jane@acme.co or 555-0123. She works at Acme Corp.",
    },
  ],
});

const toolBlock = extraction.content.find((b) => b.type === "tool_use");
if (toolBlock && toolBlock.type === "tool_use") {
  const contact = toolBlock.input; // { name, email, phone, company }
  console.log(contact);
}
```

### Vision (Image Inputs)

```typescript
import { readFileSync } from "fs";

// From URL
const visionResponse = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "url",
            url: "https://example.com/chart.png",
          },
        },
        { type: "text", text: "Describe the trends shown in this chart." },
      ],
    },
  ],
});

// From base64
const imageData = readFileSync("./screenshot.png").toString("base64");
const base64Response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "base64",
            media_type: "image/png",
            data: imageData,
          },
        },
        { type: "text", text: "What UI issues do you see?" },
      ],
    },
  ],
});
```

### Extended Thinking

```typescript
const thinkingResponse = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 16000,
  thinking: {
    type: "enabled",
    budget_tokens: 10000,
  },
  messages: [
    {
      role: "user",
      content: "Solve this step by step: If f(x) = x^3 - 6x^2 + 11x - 6, find all roots.",
    },
  ],
});

for (const block of thinkingResponse.content) {
  if (block.type === "thinking") {
    console.log("Reasoning:", block.thinking);
  } else if (block.type === "text") {
    console.log("Answer:", block.text);
  }
}
```

### Message Batches

```typescript
const batch = await anthropic.beta.messages.batches.create({
  requests: [
    {
      custom_id: "review-1",
      params: {
        model: "claude-sonnet-4-20250514",
        max_tokens: 512,
        messages: [{ role: "user", content: "Summarize: ..." }],
      },
    },
    {
      custom_id: "review-2",
      params: {
        model: "claude-sonnet-4-20250514",
        max_tokens: 512,
        messages: [{ role: "user", content: "Summarize: ..." }],
      },
    },
  ],
});

// Poll for completion (generous interval — batches are async workloads)
let status = batch;
while (status.processing_status === "in_progress") {
  await new Promise((r) => setTimeout(r, 30000));
  status = await anthropic.beta.messages.batches.retrieve(batch.id);
}

// Stream results — results() resolves to an async iterable, so await it first
for await (const result of await anthropic.beta.messages.batches.results(batch.id)) {
  if (result.result.type === "succeeded") {
    console.log(result.custom_id, result.result.message.content);
  }
}
```

### System Prompts and Multi-Turn Conversations

```typescript
const conversationHistory: Anthropic.MessageParam[] = [];

async function chat(userMessage: string): Promise<string> {
  conversationHistory.push({ role: "user", content: userMessage });

  const response = await anthropic.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024,
    system: [
      {
        type: "text",
        text: "You are a senior code reviewer. Be direct and specific.",
        cache_control: { type: "ephemeral" },
      },
    ],
    messages: conversationHistory,
  });

  const assistantText =
    response.content[0].type === "text" ? response.content[0].text : "";
  conversationHistory.push({ role: "assistant", content: response.content });

  return assistantText;
}
```

## Best Practices

- Always set `max_tokens` — it is required by the API and prevents runaway generation.
- Use system prompts for persona and rules, user messages for the actual task. This separation improves instruction following.
- Check `stop_reason` — handle `"end_turn"`, `"max_tokens"`, `"tool_use"` appropriately.
- Use `tool_choice: { type: "tool", name: "..." }` to force structured extraction without extra text.
- Enable prompt caching with `cache_control` on system prompts and long context to reduce latency and cost.
- Use extended thinking for complex reasoning tasks — it significantly improves accuracy on math, logic, and multi-step problems.
- Use batches for bulk processing — they offer 50% cost savings and higher throughput.
- Log `usage` fields including `cache_creation_input_tokens` and `cache_read_input_tokens` for cost tracking.
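
Logging those usage counters can be as simple as a small aggregation helper. The field names below match the Messages API `usage` object; `summarizeUsage` itself is an illustrative helper, not part of the SDK.

```typescript
interface Usage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number; // tokens written to the prompt cache
  cache_read_input_tokens?: number;     // tokens served from the prompt cache
}

function summarizeUsage(usage: Usage) {
  return {
    uncached: usage.input_tokens,
    cacheWrites: usage.cache_creation_input_tokens ?? 0,
    cacheReads: usage.cache_read_input_tokens ?? 0,
    output: usage.output_tokens,
  };
}

// After a call: console.log(summarizeUsage(message.usage));
```

A high `cacheReads`-to-`uncached` ratio is a quick signal that your `cache_control` placement is paying off.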

## Anti-Patterns

- Using `assistant` prefill to put words in Claude's mouth without understanding that it affects the response distribution. Use it deliberately, not casually.
- Sending images as text descriptions instead of actual image content blocks. Claude's vision is highly capable — use it.
- Not handling `tool_use` stop reasons — if you provide tools, you must handle the case where the model calls them.
- Ignoring the content array structure — responses are arrays of content blocks (text, tool_use, thinking), not a single string.
- Recreating conversation history from scratch each turn instead of appending. This wastes cache hits and increases costs.
- Setting thinking budget too low — if you enable extended thinking, give it enough tokens (at least 5000) to be useful.
- Mixing tool results with regular user messages — tool results must use the `tool_result` content block type with the correct `tool_use_id`.
- Polling batch status in a tight loop — use 30-60 second intervals. Batches are designed for async workloads.
