# Anthropic API

Anthropic Claude API integration for messages, streaming, and tool use.
You are an expert in Anthropic Claude API integration for building LLM-powered applications.
## Key Points
- Use the `system` parameter for instructions rather than injecting a system message into the messages array.
- Always set `max_tokens` explicitly; there is no default and it is a required parameter.
- Use streaming for any user-facing interaction to reduce perceived latency.
- Handle all content block types (`text`, `tool_use`, `thinking`) in responses.
- Use `stop_reason` to determine whether to continue a tool-use loop or return.
- Prefer Claude's native tool use over prompt-based function calling for reliability.
- Cache the Anthropic client instance; do not create a new one per request.
- Use model strings from a config or constant to simplify upgrades across the codebase.
- Remember that `max_tokens` is required, not optional, in the Anthropic API.
- Do not treat `response.content` as a string; it is an array of typed content blocks.
- Handle the `tool_use` stop reason; otherwise the agent silently drops tool call requests.
- Send `tool_result` blocks with a matching `tool_use_id`, or the API returns validation errors.
## Quick Example
```typescript
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```
## Overview
The Anthropic API provides access to Claude models through the Messages API. It features a distinct system prompt mechanism, native tool use, streaming via server-sent events, and a focus on safe, steerable AI outputs. The SDK is available for Python and TypeScript.
## Core Concepts

### Authentication and Client Setup

```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```
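Construct the client once and share it across the codebase rather than creating one per request. A minimal sketch, assuming a project-level module (the file path and export name are illustrative, not part of the SDK):

```typescript
// lib/anthropic.ts (hypothetical module) -- a single shared client instance
import Anthropic from "@anthropic-ai/sdk";

export const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
```

Importing this shared instance everywhere keeps connection pooling and retry state in one place.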
### Messages API

The Messages API uses a `system` parameter separate from the `messages` array:

```typescript
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  system: "You are a senior software engineer who gives concise code reviews.",
  messages: [
    { role: "user", content: "Review this function: function add(a, b) { return a + b; }" },
  ],
});

console.log(response.content[0].type === "text" ? response.content[0].text : "");
```
### Content Blocks

Claude responses use typed content blocks rather than a single string:

```typescript
interface TextBlock {
  type: "text";
  text: string;
}

interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

// Extract text from a response, ignoring non-text blocks
function extractText(response: Anthropic.Message): string {
  return response.content
    .filter((block): block is Anthropic.TextBlock => block.type === "text")
    .map((block) => block.text)
    .join("\n");
}
```
### Streaming

```typescript
const stream = anthropic.messages.stream({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Write a haiku about TypeScript." }],
});

stream.on("text", (text) => {
  process.stdout.write(text);
});

const finalMessage = await stream.finalMessage();
console.log("\nStop reason:", finalMessage.stop_reason);
```
### Tool Use (Function Calling)

```typescript
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  tools: [
    {
      name: "get_weather",
      description: "Get the current weather for a given location.",
      input_schema: {
        type: "object" as const,
        properties: {
          location: { type: "string", description: "City and state, e.g. San Francisco, CA" },
        },
        required: ["location"],
      },
    },
  ],
  messages: [{ role: "user", content: "What's the weather in Boston?" }],
});

// Check if the model wants to use a tool
for (const block of response.content) {
  if (block.type === "tool_use") {
    console.log(`Tool: ${block.name}, Input:`, block.input);
  }
}
```
### Multi-Turn Tool Use Loop

```typescript
async function agentLoop(userMessage: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await anthropic.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 4096,
      tools: toolDefinitions,
      messages,
    });

    messages.push({ role: "assistant", content: response.content });

    if (response.stop_reason === "end_turn") {
      return extractText(response);
    }

    if (response.stop_reason === "tool_use") {
      const toolResults: Anthropic.ToolResultBlockParam[] = [];
      for (const block of response.content) {
        if (block.type === "tool_use") {
          const result = await executeToolCall(block.name, block.input);
          toolResults.push({
            type: "tool_result",
            tool_use_id: block.id,
            content: JSON.stringify(result),
          });
        }
      }
      messages.push({ role: "user", content: toolResults });
    }
  }
}
```
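The loop above assumes a `toolDefinitions` array and an `executeToolCall` dispatcher that are not shown. One way to sketch them, using the SDK's exported `Tool` type (the weather handler is a placeholder, not a real service call):

```typescript
// Hypothetical supporting pieces for agentLoop; names must match the tool definitions sent to the API
const toolDefinitions: Anthropic.Tool[] = [
  {
    name: "get_weather",
    description: "Get the current weather for a given location.",
    input_schema: {
      type: "object" as const,
      properties: {
        location: { type: "string", description: "City and state, e.g. San Francisco, CA" },
      },
      required: ["location"],
    },
  },
];

async function executeToolCall(name: string, input: unknown): Promise<unknown> {
  switch (name) {
    case "get_weather":
      // Placeholder: call a real weather service here
      return { location: (input as { location: string }).location, forecast: "unavailable" };
    default:
      return { error: `Unknown tool: ${name}` };
  }
}
```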
## Implementation Patterns

### Vision / Image Input

```typescript
import * as fs from "fs";

const imageData = fs.readFileSync("diagram.png").toString("base64");

const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: { type: "base64", media_type: "image/png", data: imageData },
        },
        { type: "text", text: "Describe what you see in this diagram." },
      ],
    },
  ],
});
```
### Error Handling

```typescript
import Anthropic from "@anthropic-ai/sdk";

async function safeChatCall(prompt: string, retriesLeft = 3): Promise<string> {
  try {
    const response = await anthropic.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    });
    return extractText(response);
  } catch (error) {
    if (error instanceof Anthropic.RateLimitError && retriesLeft > 0) {
      // Back off and retry, giving up after a bounded number of attempts
      await new Promise((r) => setTimeout(r, 5000));
      return safeChatCall(prompt, retriesLeft - 1);
    }
    if (error instanceof Anthropic.AuthenticationError) {
      throw new Error("Invalid API key. Check ANTHROPIC_API_KEY.");
    }
    throw error;
  }
}
```
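The TypeScript SDK also retries some transient failures on its own; assuming the client options `maxRetries` and `timeout` (values below are illustrative), a custom wrapper is only needed for behaviour beyond what they cover:

```typescript
// Sketch: lean on the SDK's built-in retry and timeout handling where possible
const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
  maxRetries: 3,   // automatic retries for retryable errors such as rate limits
  timeout: 60_000, // per-request timeout in milliseconds
});
```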
### Extended Thinking

```typescript
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 16000,
  thinking: {
    type: "enabled",
    budget_tokens: 10000,
  },
  messages: [{ role: "user", content: "Solve this step by step: ..." }],
});

for (const block of response.content) {
  if (block.type === "thinking") {
    console.log("Thinking:", block.thinking);
  } else if (block.type === "text") {
    console.log("Answer:", block.text);
  }
}
```
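Thinking tokens count toward output tokens, so include them when budgeting cost and latency. A small sketch that reads the usage totals reported on the response:

```typescript
// Output token usage includes thinking tokens when extended thinking is enabled
console.log(
  `input_tokens=${response.usage.input_tokens}, output_tokens=${response.usage.output_tokens}`
);
```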
## Best Practices

- Use the `system` parameter for instructions rather than injecting a system message into the messages array.
- Always set `max_tokens` explicitly; there is no default and it is a required parameter.
- Use streaming for any user-facing interaction to reduce perceived latency.
- Handle all content block types (`text`, `tool_use`, `thinking`) in responses.
- Use `stop_reason` to determine whether to continue a tool-use loop or return.
- Prefer Claude's native tool use over prompt-based function calling for reliability.
- Cache the Anthropic client instance; do not create a new one per request.
- Use model strings from a config or constant to simplify upgrades across the codebase (see the sketch after this list).
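A minimal sketch of centralizing the model identifier (the module and constant names are illustrative):

```typescript
// config.ts (hypothetical): one place to bump the model version for the whole codebase
export const CLAUDE_MODEL = "claude-sonnet-4-20250514";

// Callsites import the constant instead of hardcoding the string
const response = await anthropic.messages.create({
  model: CLAUDE_MODEL,
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello" }],
});
```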
## Core Philosophy
The Anthropic API is designed around the principle that the structure of the interaction should be explicit, not implicit. The system prompt is a separate parameter because it serves a fundamentally different purpose than user messages. Content blocks are typed objects because a response may contain text, tool calls, and thinking traces simultaneously. This explicitness adds verbosity compared to string-in-string-out APIs, but it eliminates the ambiguity that causes bugs in production systems.
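Because a single response can mix text, tool calls, and thinking blocks, handling code reads naturally as a switch over `block.type`. A sketch that simply logs each block kind (replace the logging with real handlers in an application):

```typescript
// Handle every content block type explicitly instead of assuming a text-only response
for (const block of response.content) {
  switch (block.type) {
    case "text":
      console.log("text:", block.text);
      break;
    case "tool_use":
      console.log("tool_use:", block.id, block.name, block.input);
      break;
    case "thinking":
      console.log("thinking:", block.thinking);
      break;
  }
}
```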
Streaming should be the default for any user-facing integration. Claude models generate tokens sequentially, and waiting for the complete response before showing anything to the user wastes seconds of perceived latency. The streaming API delivers tokens as they are produced, making the application feel responsive from the first token. The non-streaming API is appropriate only for background processing, batch jobs, or cases where the full response must be validated before display.
Tool use is not a workaround -- it is the intended mechanism for structured interaction between Claude and external systems. When you need Claude to call an API, query a database, or perform a calculation, define a tool with a typed schema. This is more reliable than asking Claude to output a JSON blob that you parse and dispatch yourself, because the tool use protocol separates the model's decision to call a function from its generation of arguments, and the SDK validates both. Build your agent loops around stop_reason, not around parsing heuristics.
## Anti-Patterns

- Treating `response.content` as a string: Accessing `response.content` and expecting a plain string. It is an array of typed content blocks (`text`, `tool_use`, `thinking`). Code that indexes into it without type-checking will fail when the response contains tool calls or thinking blocks.
- Putting system instructions in the messages array: Adding a message with `role: "system"` to the messages array instead of using the top-level `system` parameter. The Anthropic API does not support a system role in messages; this either causes an error or silently mishandles the instruction.
- Creating a new client instance per request: Instantiating `new Anthropic()` on every API call instead of reusing a single client. This wastes connection setup overhead and prevents the SDK from managing connection pooling and retry state effectively.
- Ignoring `stop_reason` in tool use loops: Checking for tool use blocks in the response content without also checking `stop_reason === "tool_use"`. The model may include text alongside tool calls, and the stop reason is the authoritative signal for whether the loop should continue.
- Hardcoding model identifiers throughout the codebase: Scattering `"claude-sonnet-4-20250514"` across dozens of files instead of centralizing it in a configuration constant. When a new model version is released, every callsite must be updated individually.
## Common Pitfalls

- Forgetting that `max_tokens` is required, not optional, in the Anthropic API.
- Treating `response.content` as a string; it is an array of typed content blocks.
- Not handling `tool_use` stop reasons, causing the agent to silently drop tool call requests.
- Sending `tool_result` blocks without a matching `tool_use_id`, which causes validation errors.
- Exceeding the model's context window without trimming conversation history (a naive trimming sketch follows this list).
- Using `role: "system"` in the messages array instead of the top-level `system` parameter.
- Not accounting for extended thinking tokens when calculating cost and latency budgets.
## Related Skills

- Embeddings: Text embeddings and semantic search with vector databases for LLM applications
- Function Calling: Function/tool calling patterns for connecting LLMs to external APIs and data sources
- LangChain: LangChain orchestration for chains, agents, memory, and retrieval workflows
- OpenAI API: OpenAI API integration patterns for chat completions, embeddings, and assistants
- RAG Pipeline: Building retrieval-augmented generation pipelines with document ingestion, retrieval, and synthesis
- Streaming: Streaming LLM responses with SSE, WebSockets, and backpressure handling