
agent-with-claude

Building agents specifically with the Claude API: extended thinking for complex reasoning, tool use patterns, computer use for browser/desktop automation, multi-turn conversation management, crafting system prompts for agents, and streaming agent responses. Covers Claude-specific features and best practices for building reliable autonomous agents.


Building Agents with Claude

Leverage Claude-specific API features to build powerful agents: extended thinking, tool use, computer use, and streaming.


Basic Claude Agent Setup

import anthropic

client = anthropic.Anthropic()


def claude_agent(task: str, tools: list[dict], system: str,
                 model: str = "claude-sonnet-4-20250514",
                 max_steps: int = 20) -> str:
    """Standard Claude agent loop."""
    messages = [{"role": "user", "content": task}]

    for step in range(max_steps):
        response = client.messages.create(
            model=model,
            max_tokens=4096,
            system=system,
            tools=tools,
            messages=messages,
        )

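        # Done when the model stops for any reason other than a tool call
        # ("end_turn", "max_tokens", ...). Continuing without a tool call
        # would send an empty tool_result message on the next turn.
        if response.stop_reason != "tool_use":
            return next(
                (b.text for b in response.content if b.type == "text"), ""
            )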

        # Execute tools and continue
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })
        messages.append({"role": "user", "content": tool_results})

    return "Reached max steps."
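
The loop above calls execute_tool, which this skill does not define. Below is a minimal sketch of one way to write it, assuming tools are plain Python callables kept in a dictionary. The names TOOL_REGISTRY and register_tool, the sample "add" tool, and the error-string convention are all illustrative assumptions and are not part of the Anthropic SDK.

```python
from typing import Any, Callable

# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {}


def register_tool(name: str):
    """Decorator that stores a function in the registry under `name`."""
    def decorator(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator


def execute_tool(name: str, tool_input: dict) -> str:
    """Run a registered tool and return its result as text.

    Failures are returned as text rather than raised, so the agent loop
    keeps running and the model can see what went wrong.
    """
    fn = TOOL_REGISTRY.get(name)
    if fn is None:
        return f"Error: unknown tool '{name}'"
    try:
        return str(fn(**tool_input))
    except Exception as exc:
        return f"Error: {type(exc).__name__}: {exc}"


@register_tool("add")
def add(a: float, b: float) -> float:
    """Sample tool used only to demonstrate the registry."""
    return a + b
```

Returning errors as strings is a design choice: the model receives the failure as a normal tool result and can retry or pick another tool.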

Extended Thinking for Complex Reasoning

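Extended thinking lets Claude reason through a problem before it responds, which helps most when an agent tackles complex, multi-step problems. The thinking budget counts toward the response, so budget_tokens must be smaller than max_tokens.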

def agent_with_thinking(task: str, tools: list[dict],
                        budget_tokens: int = 10000) -> str:
    """Agent that uses extended thinking for better reasoning."""
    messages = [{"role": "user", "content": task}]

    for _ in range(15):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=16000,
            thinking={
                "type": "enabled",
                "budget_tokens": budget_tokens,
            },
            tools=tools,
            messages=messages,
        )

        if response.stop_reason != "tool_use":
            # Thinking blocks are also present; return only the final text
            return next(
                (b.text for b in response.content if b.type == "text"), ""
            )

        messages.append({"role": "assistant", "content": response.content})

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })

        messages.append({"role": "user", "content": tool_results})

    return "Max iterations reached."

When to use extended thinking in agents:

  • Complex tool selection: When the agent has many tools and must choose carefully.
  • Multi-step reasoning: Math, logic, code analysis, planning.
  • Error recovery: After a tool failure, extra thinking helps the agent reason about alternatives.
  • Final synthesis: The last turn where the agent produces its final answer.

Dynamic Thinking Budget

def adaptive_thinking_agent(task: str, tools: list[dict]) -> str:
    """Adjust thinking budget based on task phase."""
    messages = [{"role": "user", "content": task}]

    for step in range(1, 21):

        # More thinking budget for first step (planning) and every 5th step (checkpoint)
        if step == 1 or step % 5 == 0:
            budget = 15000
        else:
            budget = 5000

        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=16000,
            thinking={"type": "enabled", "budget_tokens": budget},
            tools=tools,
            messages=messages,
        )

        if response.stop_reason != "tool_use":
            return extract_text(response)

        messages.append({"role": "assistant", "content": response.content})
        tool_results = execute_all_tools(response)
        messages.append({"role": "user", "content": tool_results})

    return "Max steps reached."
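
The adaptive agent above relies on two helpers, extract_text and execute_all_tools, that are not defined in this skill. The sketch below shows one plausible shape for them. It assumes response content blocks expose .type, .text, .id, .name, and .input as in the earlier examples. The run_tool parameter and the placeholder execute_tool are illustrative assumptions, and this extract_text joins every text block rather than returning only the first.

```python
from types import SimpleNamespace
from typing import Callable, Optional


def execute_tool(name: str, tool_input: dict) -> str:
    """Placeholder dispatcher (assumption); replace with real tool code."""
    return f"(no implementation for {name})"


def extract_text(response) -> str:
    """Concatenate all text blocks, skipping thinking and tool_use blocks."""
    return "".join(b.text for b in response.content if b.type == "text")


def execute_all_tools(
    response,
    run_tool: Optional[Callable[[str, dict], object]] = None,
) -> list[dict]:
    """Run every tool_use block and build the matching tool_result blocks."""
    run = run_tool or execute_tool
    return [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": str(run(block.name, block.input)),
        }
        for block in response.content
        if block.type == "tool_use"
    ]


# Usage with a stand-in response object, so no API call is needed.
fake = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="Checking the file."),
    SimpleNamespace(type="tool_use", id="toolu_1",
                    name="read_file", input={"path": "a.py"}),
])
print(extract_text(fake))
print(execute_all_tools(fake))
```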

System Prompts for Agents

System prompts define agent behavior. Be specific about capabilities, constraints, and expected patterns.

CODING_AGENT_SYSTEM = """You are an expert coding agent. You solve programming tasks by reading, writing, and running code.

## Capabilities
You have these tools: read_file, write_file, run_command, search_files.

## Workflow
1. Understand the task by reading relevant files first.
2. Plan your changes before writing any code.
3. Make changes incrementally — small edits, then test.
4. After writing code, ALWAYS run it to verify it works.
5. If tests fail, read the error, fix the issue, and re-run.

## Rules
- Never modify files outside the project directory.
- Always run existing tests after making changes.
- If you are unsure about something, search the codebase first.
- When you are done, provide a summary of what you changed and why.

## Error Handling
- If a command fails, read the error message carefully before retrying.
- If you are stuck after 3 attempts, explain what is going wrong and ask for help.
- Never run destructive commands (rm -rf, drop database, etc.) without confirmation."""


RESEARCH_AGENT_SYSTEM = """You are a research agent. You find, verify, and synthesize information.

## Workflow
1. Break the research question into specific sub-questions.
2. Search for information using available tools.
3. Cross-reference findings across multiple sources.
4. Note any contradictions or uncertainty.
5. Synthesize findings into a clear, cited answer.

## Rules
- Always cite your sources with URLs.
- Distinguish between facts, estimates, and opinions.
- If information is outdated or uncertain, say so explicitly.
- Never fabricate sources or statistics."""
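
Both prompts above share one layout: a role sentence followed by titled sections. As a rough sketch, that layout can be generated from data so that several agents stay consistent. The build_system_prompt helper and its parameters are hypothetical and only illustrate the idea.

```python
def build_system_prompt(role: str, sections: dict[str, list[str]],
                        numbered: tuple[str, ...] = ("Workflow",)) -> str:
    """Assemble a system prompt from a role line and titled sections.

    Sections named in `numbered` become numbered steps; all others
    become bullet lists.
    """
    parts = [role.strip()]
    for title, items in sections.items():
        if title in numbered:
            lines = [f"{i}. {item}" for i, item in enumerate(items, 1)]
        else:
            lines = [f"- {item}" for item in items]
        parts.append(f"## {title}\n" + "\n".join(lines))
    return "\n\n".join(parts)


prompt = build_system_prompt(
    "You are a research agent.",
    {
        "Workflow": ["Search for information.", "Cite your sources."],
        "Rules": ["Never fabricate sources or statistics."],
    },
)
print(prompt)
```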

Multi-Turn Conversation Management

For agents that run many turns, manage the message history to stay within context limits.

class ClaudeConversation:
    """Manage multi-turn conversations with context window awareness."""

    def __init__(self, model: str = "claude-sonnet-4-20250514",
                 max_context_tokens: int = 180000):
        self.model = model
        self.max_context = max_context_tokens
        self.messages: list[dict] = []
        self.system: str = ""

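    def _estimate_tokens(self) -> int:
        """Rough estimate: about 4 characters per token."""
        import json
        # default=str keeps this working when content holds SDK block objects
        return len(json.dumps(self.messages, default=str)) // 4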

    def add_user(self, content):
        self.messages.append({"role": "user", "content": content})

    def add_assistant(self, content):
        self.messages.append({"role": "assistant", "content": content})

    def trim_if_needed(self):
        """Remove old messages if approaching context limit."""
        # Drop the oldest assistant/user pair after the first user message.
        # Removing whole pairs keeps each tool_result next to its tool_use
        # (assumes the history alternates user, assistant, user, ...).
        while (self._estimate_tokens() > self.max_context * 0.8
               and len(self.messages) > 3):
            del self.messages[1:3]

    def send(self, tools: list[dict] = None, **kwargs) -> anthropic.types.Message:
        self.trim_if_needed()
        params = {
            "model": self.model,
            "max_tokens": 4096,
            "messages": self.messages,
        }
        if self.system:
            params["system"] = self.system
        if tools:
            params["tools"] = tools
        # Caller-supplied options (max_tokens, thinking, ...) override defaults
        params.update(kwargs)
        return client.messages.create(**params)
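
Trimming history can leave a tool_result block whose tool_use was removed, and the API rejects such a request. The sketch below is one way to check a history before sending it. It assumes messages are stored as plain dicts rather than SDK objects, and the function name unmatched_tool_results is hypothetical.

```python
def unmatched_tool_results(messages: list[dict]) -> list[str]:
    """Return the tool_use_id of every tool_result block that has no
    matching tool_use block in the assistant message just before it."""
    orphans = []
    for i, msg in enumerate(messages):
        if msg["role"] != "user" or not isinstance(msg["content"], list):
            continue
        # Collect the tool_use ids issued by the preceding assistant message.
        issued = set()
        prev = messages[i - 1] if i > 0 else None
        if prev and prev["role"] == "assistant" and isinstance(prev["content"], list):
            issued = {
                b["id"] for b in prev["content"]
                if isinstance(b, dict) and b.get("type") == "tool_use"
            }
        for block in msg["content"]:
            if (isinstance(block, dict)
                    and block.get("type") == "tool_result"
                    and block["tool_use_id"] not in issued):
                orphans.append(block["tool_use_id"])
    return orphans


history = [
    {"role": "user", "content": "Summarize main.py"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "t1", "name": "read_file", "input": {}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "t1", "content": "ok"},
    ]},
]
print(unmatched_tool_results(history))                    # intact history
print(unmatched_tool_results([history[0], history[2]]))   # tool_use trimmed away
```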

Computer Use

Claude can interact with computers through screenshots and mouse/keyboard actions.

def computer_use_agent(task: str) -> str:
    """Agent that uses Claude's computer use capability."""
    computer_tool = {
        "type": "computer_20250124",
        "name": "computer",
        "display_width_px": 1920,
        "display_height_px": 1080,
        "display_number": 1,
    }

    messages = [{"role": "user", "content": task}]

    for _ in range(30):
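        # Computer use is a beta feature: call the beta endpoint and pass the
        # beta flag that matches the tool version.
        response = client.beta.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=[computer_tool],
            messages=messages,
            betas=["computer-use-2025-01-24"],
        )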

        if response.stop_reason != "tool_use":
            return extract_text(response)

        messages.append({"role": "assistant", "content": response.content})

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Execute the computer action and take a screenshot
                screenshot_b64 = execute_computer_action(block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": [
                        {
                            "type": "image",
                            "source": {
                                "type": "base64",
                                "media_type": "image/png",
                                "data": screenshot_b64,
                            },
                        }
                    ],
                })

        messages.append({"role": "user", "content": tool_results})

    return "Computer use agent reached step limit."


def execute_computer_action(action: dict) -> str:
    """Execute a computer action and return a screenshot as base64."""
    import subprocess
    import base64

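    action_type = action.get("action")

    # Action and field names follow the computer_20250124 tool schema.
    # Actions not handled here fall through and only return a screenshot.
    if action_type == "screenshot":
        pass  # Just take screenshot
    elif action_type == "left_click":
        x, y = action["coordinate"]
        subprocess.run(["xdotool", "mousemove", str(x), str(y), "click", "1"])
    elif action_type == "type":
        text = action["text"]
        # "--" stops option parsing so text starting with "-" is typed as-is
        subprocess.run(["xdotool", "type", "--clearmodifiers", "--", text])
    elif action_type == "key":
        key = action["text"]  # the key combination arrives in the "text" field
        subprocess.run(["xdotool", "key", key])
    elif action_type == "scroll":
        x, y = action["coordinate"]
        direction = action["scroll_direction"]
        amount = action.get("scroll_amount", 3)
        # X11 maps wheel up/down to buttons 4/5; sideways scroll is not handled
        button = "4" if direction == "up" else "5"
        subprocess.run(["xdotool", "mousemove", str(x), str(y),
                        "click", "--repeat", str(amount), button])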

    # Take screenshot
    subprocess.run(["scrot", "/tmp/screenshot.png", "-o"])
    with open("/tmp/screenshot.png", "rb") as f:
        return base64.b64encode(f.read()).decode()

Streaming Agent Responses

Stream agent responses for real-time feedback during long-running tasks.

def streaming_agent(task: str, tools: list[dict], system: str):
    """Agent that streams responses for real-time output."""
    messages = [{"role": "user", "content": task}]

    for _ in range(20):
        collected_content = []
        current_tool_use = None

        with client.messages.stream(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=system,
            tools=tools,
            messages=messages,
        ) as stream:
            for event in stream:
                if event.type == "content_block_start":
                    if event.content_block.type == "text":
                        pass  # Text block starting
                    elif event.content_block.type == "tool_use":
                        current_tool_use = {
                            "id": event.content_block.id,
                            "name": event.content_block.name,
                            "input_json": "",
                        }

                elif event.type == "content_block_delta":
                    if event.delta.type == "text_delta":
                        print(event.delta.text, end="", flush=True)
                    elif event.delta.type == "input_json_delta":
                        if current_tool_use:
                            current_tool_use["input_json"] += event.delta.partial_json

            # Get the full response
            response = stream.get_final_message()

        if response.stop_reason != "tool_use":
            return extract_text(response)

        messages.append({"role": "assistant", "content": response.content})

        # Execute tools
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                print(f"\n> Calling {block.name}...")
                result = execute_tool(block.name, block.input)
                print(f"> Result: {str(result)[:100]}")
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result),
                })

        messages.append({"role": "user", "content": tool_results})

    return "Streaming agent reached max steps."
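
In the streaming loop above, stream.get_final_message() already returns tool inputs fully parsed, so the input_json fragments are only needed if you want to act on a tool call before the stream ends. The fragments are pieces of one JSON document and can only be parsed once the block is complete. Here is a minimal sketch of that accumulation, keyed by the content block index carried on each delta event. The ToolInputAccumulator class is a hypothetical helper and not part of the SDK.

```python
import json


class ToolInputAccumulator:
    """Collect input_json_delta fragments per content block index."""

    def __init__(self):
        self._fragments: dict[int, list[str]] = {}

    def add(self, index: int, partial_json: str) -> None:
        """Store one fragment; call this for every input_json_delta event."""
        self._fragments.setdefault(index, []).append(partial_json)

    def finish(self, index: int) -> dict:
        """Parse the joined fragments; call this on content_block_stop."""
        raw = "".join(self._fragments.pop(index, []))
        # A tool called with no arguments may produce no JSON text at all.
        return json.loads(raw) if raw else {}


acc = ToolInputAccumulator()
for fragment in ['{"path": "sr', 'c/main.py", ', '"limit": 10}']:
    acc.add(0, fragment)
print(acc.finish(0))
```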

Claude Agent Best Practices

  1. Use claude-sonnet-4-20250514 for most agents: it offers the best balance of speed, cost, and capability. Reserve claude-opus-4-20250514 for the most complex reasoning tasks.

  2. Keep system prompts under 1500 tokens: longer system prompts eat into context and can dilute instructions.

  3. Return structured tool results: Claude parses structured text better than raw JSON blobs.

  4. Use stop_reason to control the loop: "end_turn" means the agent is done, "tool_use" means it wants to call a tool, and "max_tokens" means the response was cut off.

  5. Set reasonable max_tokens: 4096 is enough for most agent turns. Higher values waste time on long responses when the agent should be acting, not writing. With extended thinking enabled, max_tokens must be larger than budget_tokens.

  6. Truncate tool outputs: if a tool returns 50KB of text, the agent will struggle. Cap outputs at 5-10KB and summarize if needed.
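
The truncation advice in the last item can be sketched as a small helper. This version keeps the start and the end of the output, on the assumption that errors and summaries tend to appear at the end, and counts characters as a rough stand-in for bytes. The function name and its default limits are illustrative assumptions.

```python
def truncate_tool_output(text: str, max_chars: int = 8000,
                         tail_chars: int = 1000) -> str:
    """Cap tool output at roughly max_chars, keeping the head and the tail."""
    if not 0 <= tail_chars < max_chars:
        raise ValueError("tail_chars must be in the range [0, max_chars)")
    if len(text) <= max_chars:
        return text
    head_chars = max_chars - tail_chars
    omitted = len(text) - max_chars
    # Guard the zero case: text[-0:] would return the whole string.
    tail = text[-tail_chars:] if tail_chars else ""
    return (text[:head_chars]
            + f"\n[... {omitted} characters omitted ...]\n"
            + tail)


sample = "a" * 50 + "b" * 50
print(truncate_tool_output(sample, max_chars=20, tail_chars=5))
```

The omission marker tells the model that content is missing, so it can ask for a narrower query instead of assuming it saw everything.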
