
Anthropic API

Anthropic Claude API integration for messages, streaming, and tool use

Quick Summary
You are an expert in Anthropic Claude API integration for building LLM-powered applications.

## Key Points

- Use the `system` parameter for instructions rather than injecting a system message into the messages array.
- Always set `max_tokens` explicitly; there is no default and it is a required parameter.
- Use streaming for any user-facing interaction to reduce perceived latency.
- Handle all content block types (`text`, `tool_use`, `thinking`) in responses.
- Use `stop_reason` to determine whether to continue a tool-use loop or return.
- Prefer Claude's native tool use over prompt-based function calling for reliability.
- Cache the Anthropic client instance; do not create a new one per request.
- Use model strings from a config or constant to simplify upgrades across the codebase.

## Common Pitfalls

- Forgetting that `max_tokens` is required, not optional, in the Anthropic API.
- Treating `response.content` as a string; it is an array of typed content blocks.
- Not handling the `tool_use` stop reason, causing the agent to silently drop tool call requests.
- Sending `tool_result` blocks without a matching `tool_use_id`, which causes validation errors.

## Quick Example

```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```

Anthropic API — LLM Integration

You are an expert in Anthropic Claude API integration for building LLM-powered applications.

Overview

The Anthropic API provides access to Claude models through the Messages API. It features a distinct system prompt mechanism, native tool use, streaming via server-sent events, and a focus on safe, steerable AI outputs. The SDK is available for Python and TypeScript.

Core Concepts

Authentication and Client Setup

```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```

Messages API

The Messages API uses a system parameter separate from the messages array:

```typescript
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  system: "You are a senior software engineer who gives concise code reviews.",
  messages: [
    { role: "user", content: "Review this function: function add(a, b) { return a + b; }" },
  ],
});

console.log(response.content[0].type === "text" ? response.content[0].text : "");
```

Content Blocks

Claude responses use typed content blocks rather than a single string:

```typescript
interface TextBlock {
  type: "text";
  text: string;
}

interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

// Extract text from response
function extractText(response: Anthropic.Message): string {
  return response.content
    .filter((block): block is Anthropic.TextBlock => block.type === "text")
    .map((block) => block.text)
    .join("\n");
}
```

Streaming

```typescript
const stream = anthropic.messages.stream({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Write a haiku about TypeScript." }],
});

stream.on("text", (text) => {
  process.stdout.write(text);
});

const finalMessage = await stream.finalMessage();
console.log("\nStop reason:", finalMessage.stop_reason);
```

Tool Use (Function Calling)

```typescript
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  tools: [
    {
      name: "get_weather",
      description: "Get the current weather for a given location.",
      input_schema: {
        type: "object" as const,
        properties: {
          location: { type: "string", description: "City and state, e.g. San Francisco, CA" },
        },
        required: ["location"],
      },
    },
  ],
  messages: [{ role: "user", content: "What's the weather in Boston?" }],
});

// Check if the model wants to use a tool
for (const block of response.content) {
  if (block.type === "tool_use") {
    console.log(`Tool: ${block.name}, Input:`, block.input);
  }
}
```

Multi-Turn Tool Use Loop

```typescript
async function agentLoop(userMessage: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await anthropic.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 4096,
      tools: toolDefinitions,
      messages,
    });

    messages.push({ role: "assistant", content: response.content });

    if (response.stop_reason === "end_turn") {
      return extractText(response);
    }

    if (response.stop_reason === "tool_use") {
      const toolResults: Anthropic.ToolResultBlockParam[] = [];
      for (const block of response.content) {
        if (block.type === "tool_use") {
          const result = await executeToolCall(block.name, block.input);
          toolResults.push({
            type: "tool_result",
            tool_use_id: block.id,
            content: JSON.stringify(result),
          });
        }
      }
      messages.push({ role: "user", content: toolResults });
      continue;
    }

    // Any other stop reason (e.g. "max_tokens") would spin forever; fail fast.
    throw new Error(`Unexpected stop_reason: ${response.stop_reason}`);
  }
}
```
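
The loop above references `toolDefinitions` and `executeToolCall` without defining them. A hedged sketch of what they might look like follows; the `get_weather` handler and its canned response are purely illustrative, and a real handler would call an actual weather service:

```typescript
// Hypothetical tool registry backing the agent loop above.
type ToolHandler = (input: Record<string, unknown>) => Promise<unknown>;

const toolHandlers: Record<string, ToolHandler> = {
  get_weather: async (input) => {
    // Illustrative canned response; a real handler would call a weather API.
    return { location: input.location, temperature_f: 58, conditions: "fog" };
  },
};

const toolDefinitions = [
  {
    name: "get_weather",
    description: "Get the current weather for a given location.",
    input_schema: {
      type: "object" as const,
      properties: {
        location: { type: "string", description: "City and state" },
      },
      required: ["location"],
    },
  },
];

async function executeToolCall(
  name: string,
  input: Record<string, unknown>,
): Promise<unknown> {
  const handler = toolHandlers[name];
  if (!handler) {
    // Return the error as a tool result so the model can recover,
    // rather than crashing the loop.
    return { error: `Unknown tool: ${name}` };
  }
  return handler(input);
}
```

Returning errors as tool results (instead of throwing) lets the model see the failure and retry or explain, which is usually the behavior you want in an agent loop.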

Implementation Patterns

Vision / Image Input

```typescript
import * as fs from "fs";

const imageData = fs.readFileSync("diagram.png").toString("base64");

const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: { type: "base64", media_type: "image/png", data: imageData },
        },
        { type: "text", text: "Describe what you see in this diagram." },
      ],
    },
  ],
});
```

Error Handling

```typescript
import Anthropic from "@anthropic-ai/sdk";

async function safeChatCall(prompt: string, retries = 3): Promise<string> {
  try {
    const response = await anthropic.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    });
    return extractText(response);
  } catch (error) {
    if (error instanceof Anthropic.RateLimitError && retries > 0) {
      // Back off and retry a bounded number of times
      await new Promise((r) => setTimeout(r, 5000));
      return safeChatCall(prompt, retries - 1);
    }
    if (error instanceof Anthropic.AuthenticationError) {
      throw new Error("Invalid API key. Check ANTHROPIC_API_KEY.");
    }
    throw error;
  }
}
```

Extended Thinking

```typescript
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 16000,
  thinking: {
    type: "enabled",
    budget_tokens: 10000,
  },
  messages: [{ role: "user", content: "Solve this step by step: ..." }],
});

for (const block of response.content) {
  if (block.type === "thinking") {
    console.log("Thinking:", block.thinking);
  } else if (block.type === "text") {
    console.log("Answer:", block.text);
  }
}
```

Best Practices

  • Use the system parameter for instructions rather than injecting a system message into the messages array.
  • Always set max_tokens explicitly; there is no default and it is a required parameter.
  • Use streaming for any user-facing interaction to reduce perceived latency.
  • Handle all content block types (text, tool_use, thinking) in responses.
  • Use stop_reason to determine whether to continue a tool-use loop or return.
  • Prefer Claude's native tool use over prompt-based function calling for reliability.
  • Cache the Anthropic client instance; do not create a new one per request.
  • Use model strings from a config or constant to simplify upgrades across the codebase.
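
The last two points (reusing one client, centralizing model strings) can be sketched as a small module. The names `DEFAULT_MODEL` and `memoizeClient` are illustrative, not part of the SDK:

```typescript
// Centralize the model ID so upgrades touch one line.
export const DEFAULT_MODEL = "claude-sonnet-4-20250514";

type ClientFactory<T> = () => T;

// Lazily create a client once and hand out the same instance thereafter.
export function memoizeClient<T>(factory: ClientFactory<T>): ClientFactory<T> {
  let instance: T | undefined;
  return () => (instance ??= factory());
}

// Usage with the SDK from the earlier examples (assumed, not shown here):
// const getClient = memoizeClient(
//   () => new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY }),
// );
// const anthropic = getClient(); // same instance on every call
```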

Core Philosophy

The Anthropic API is designed around the principle that the structure of the interaction should be explicit, not implicit. The system prompt is a separate parameter because it serves a fundamentally different purpose than user messages. Content blocks are typed objects because a response may contain text, tool calls, and thinking traces simultaneously. This explicitness adds verbosity compared to string-in-string-out APIs, but it eliminates the ambiguity that causes bugs in production systems.
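
One hedged way to make that explicitness pay off is an exhaustive switch over block types. The local `ContentBlock` union below mirrors the SDK shapes shown earlier rather than importing them, so the sketch stands alone:

```typescript
// Local mirror of the three block shapes discussed in this document.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> }
  | { type: "thinking"; thinking: string };

function describeBlock(block: ContentBlock): string {
  switch (block.type) {
    case "text":
      return `text: ${block.text}`;
    case "tool_use":
      return `tool call ${block.name} (${block.id})`;
    case "thinking":
      return `thinking trace (${block.thinking.length} chars)`;
    default: {
      // Exhaustiveness check: a new block type becomes a compile error.
      const _never: never = block;
      return _never;
    }
  }
}
```

The `never` default means the compiler, not production traffic, tells you when a new block type appears in responses.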

Streaming should be the default for any user-facing integration. Claude models generate tokens sequentially, and waiting for the complete response before showing anything to the user wastes seconds of perceived latency. The streaming API delivers tokens as they are produced, making the application feel responsive from the first token. The non-streaming API is appropriate only for background processing, batch jobs, or cases where the full response must be validated before display.

Tool use is not a workaround -- it is the intended mechanism for structured interaction between Claude and external systems. When you need Claude to call an API, query a database, or perform a calculation, define a tool with a typed schema. This is more reliable than asking Claude to output a JSON blob that you parse and dispatch yourself, because the tool use protocol separates the model's decision to call a function from its generation of arguments, and the SDK validates both. Build your agent loops around stop_reason, not around parsing heuristics.

Anti-Patterns

  • Treating response.content as a string: Accessing response.content and expecting a plain string. It is an array of typed content blocks (text, tool_use, thinking). Code that indexes into it without type-checking will fail when the response contains tool calls or thinking blocks.

  • Putting system instructions in the messages array: Adding a message with role: "system" to the messages array instead of using the top-level system parameter. The Anthropic API does not support a system role in messages; this either causes an error or silently mishandles the instruction.

  • Creating a new client instance per request: Instantiating new Anthropic() on every API call instead of reusing a single client. This wastes connection setup overhead and prevents the SDK from managing connection pooling and retry state effectively.

  • Ignoring stop_reason in tool use loops: Checking for tool use blocks in the response content without also checking stop_reason === "tool_use". The model may include text alongside tool calls, and the stop reason is the authoritative signal for whether the loop should continue.

  • Hardcoding model identifiers throughout the codebase: Scattering "claude-sonnet-4-20250514" across dozens of files instead of centralizing it in a configuration constant. When a new model version is released, every callsite must be updated individually.

Common Pitfalls

  • Forgetting that max_tokens is required, not optional, in the Anthropic API.
  • Treating response.content as a string; it is an array of typed content blocks.
  • Not handling tool_use stop reasons, causing the agent to silently drop tool call requests.
  • Sending tool_result blocks without matching tool_use_id causes validation errors.
  • Exceeding the model's context window without trimming conversation history.
  • Using role: "system" in the messages array instead of the top-level system parameter.
  • Not accounting for extended thinking tokens when calculating cost and latency budgets.
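
For the context-window pitfall in particular, here is a rough trimming sketch. The 4-characters-per-token estimate is a heuristic, not the API's real tokenizer, and production trimming must also keep `tool_use`/`tool_result` pairs together, which this simplified version ignores:

```typescript
interface SimpleMessage {
  role: "user" | "assistant";
  content: string;
}

// Crude heuristic: roughly 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the most recent messages that fit the budget; always keep the latest.
function trimHistory(messages: SimpleMessage[], maxTokens: number): SimpleMessage[] {
  const kept: SimpleMessage[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (kept.length > 0 && used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```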

Install this skill directly: skilldb add llm-integration-skills
