
agent-architecture

Core patterns for building AI agent systems: the observe-think-act loop, ReAct pattern implementation, tool-use cycles, memory systems (short-term and long-term), and planning strategies. Covers how to structure an agent's main loop, manage state between iterations, and wire together perception, reasoning, and action into a reliable autonomous system.

## Quick Summary
Build reliable AI agents using proven architectural patterns: agent loops, ReAct, tool-use cycles, and memory systems.

### Key Points

- **Termination**: The agent stops when it responds without tool calls, or when `max_iterations` is hit.
- **State accumulation**: The full `messages` list acts as short-term memory.
- **Tool execution**: Happens outside the model — the model requests, your code executes.
- **Return strings**: Models consume text. Convert all results to strings.
- **Truncate output**: Large tool results waste context. Cap at a reasonable length.
- **Catch exceptions**: Never let a tool crash the loop — return the error as text so the agent can adapt.
- **Descriptive names**: `search_database` beats `query` — the model uses names to decide what to call.

# Agent Architecture

Build reliable AI agents using proven architectural patterns: agent loops, ReAct, tool-use cycles, and memory systems.


## The Agent Loop: Observe-Think-Act

Every agent follows the same fundamental cycle. The loop runs until the agent decides it has completed the task or hits a termination condition.

```python
import anthropic

client = anthropic.Anthropic()

def agent_loop(task: str, tools: list[dict], max_iterations: int = 10):
    """Core agent loop: observe, think, act, repeat."""
    messages = [{"role": "user", "content": task}]
    iteration = 0

    while iteration < max_iterations:
        iteration += 1

        # THINK: Send current state to the model
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system="You are a helpful agent. Use tools to accomplish the task. "
                   "When the task is complete, respond without tool calls.",
            tools=tools,
            messages=messages,
        )

        # CHECK: If no tool use, the agent is done
        if response.stop_reason == "end_turn":
            final_text = next(
                (b.text for b in response.content if b.type == "text"), ""
            )
            return final_text

        # ACT: Execute each tool call
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })

        # OBSERVE: Feed results back into the loop
        messages.append({"role": "user", "content": tool_results})

    return "Max iterations reached without completion."
```

Key design decisions in this loop:

- **Termination**: The agent stops when it responds without tool calls, or when `max_iterations` is hit.
- **State accumulation**: The full `messages` list acts as short-term memory.
- **Tool execution**: Happens outside the model — the model requests, your code executes.
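The termination behavior can be exercised without a live API by scripting the model's responses. A minimal sketch, where `FakeResponse` and `FakeBlock` are illustrative stand-ins (not SDK types) and the same control flow as `agent_loop` is driven by a canned script:

```python
from dataclasses import dataclass, field

@dataclass
class FakeBlock:
    """Stand-in for a response content block."""
    type: str
    text: str = ""
    name: str = ""
    input: dict = field(default_factory=dict)
    id: str = "toolu_01"

@dataclass
class FakeResponse:
    """Stand-in for an API response: a stop_reason plus content blocks."""
    stop_reason: str
    content: list

def fake_agent_loop(script: list[FakeResponse], max_iterations: int = 10) -> str:
    """Same control flow as agent_loop, with scripted responses instead of an API call."""
    messages = [{"role": "user", "content": "task"}]
    for _ in range(max_iterations):
        response = script.pop(0)  # stands in for client.messages.create(...)
        if response.stop_reason == "end_turn":
            return next((b.text for b in response.content if b.type == "text"), "")
        messages.append({"role": "assistant", "content": response.content})
        results = [{"type": "tool_result", "tool_use_id": b.id, "content": "ok"}
                   for b in response.content if b.type == "tool_use"]
        messages.append({"role": "user", "content": results})
    return "Max iterations reached without completion."
```

A script of one tool call followed by an `end_turn` response exits via the first branch; a script of nothing but tool calls exhausts `max_iterations` and exits via the fallback.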

## The ReAct Pattern

ReAct (Reasoning + Acting) makes the agent explicitly reason before each action. The model produces a thought, then a tool call, then observes the result.

```python
REACT_SYSTEM = """You are an agent that solves tasks step by step.

For each step:
1. THINK: Reason about what you know and what you need to do next.
2. ACT: Call exactly one tool to make progress.
3. OBSERVE: You will receive the tool result.

Continue until the task is complete, then give your final answer without tool calls.
Always explain your reasoning before acting."""

def extract_text(response) -> str:
    """Pull the final answer text out of a response's content blocks."""
    return next((b.text for b in response.content if b.type == "text"), "")

def react_agent(task: str, tools: list[dict]):
    messages = [{"role": "user", "content": task}]

    for _ in range(15):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=REACT_SYSTEM,
            tools=tools,
            messages=messages,
        )

        if response.stop_reason == "end_turn":
            return extract_text(response)

        messages.append({"role": "assistant", "content": response.content})

        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })

        messages.append({"role": "user", "content": tool_results})

    return "Agent did not finish within iteration limit."
```

The key difference from a plain loop: the system prompt forces the model to emit reasoning text before each tool call. This improves reliability because the model "shows its work."
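One practical wrinkle: the prompt asks for exactly one tool call per step, but nothing stops the model from emitting several in one turn. A hedged sketch of enforcing the contract in the executor (the helper name and `_B` stand-in class are illustrative): execute the first call, return an error result for the rest, so the model learns to comply on the next step.

```python
def execute_one_tool_per_turn(blocks: list, execute_tool) -> list[dict]:
    """Execute only the first tool_use block; refuse extras with an error result."""
    results = []
    seen_one = False
    for block in blocks:
        if block.type != "tool_use":
            continue
        if not seen_one:
            content = execute_tool(block.name, block.input)
            seen_one = True
        else:
            content = ("Error: only one tool call is allowed per step. "
                       "Re-issue this call on the next step.")
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": content,
        })
    return results

class _B:
    """Minimal stand-in for a content block, for demonstration only."""
    def __init__(self, type, name="", input=None, id="t1"):
        self.type, self.name, self.input, self.id = type, name, input or {}, id

demo = execute_one_tool_per_turn(
    [_B("tool_use", "a", id="t1"), _B("tool_use", "b", id="t2")],
    lambda name, inputs: f"ran {name}",
)
```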


## Tool-Use Cycle Design

Structure your tool definitions so the agent can discover capabilities, handle errors, and compose actions.

```python
import subprocess

tools = [
    {
        "name": "read_file",
        "description": "Read the contents of a file at the given path.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "Absolute path to the file."
                }
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Write content to a file. Creates the file if it does not exist.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Absolute path."},
                "content": {"type": "string", "description": "File content."},
            },
            "required": ["path", "content"],
        },
    },
    {
        "name": "run_command",
        "description": "Execute a shell command and return stdout/stderr.",
        "input_schema": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "The shell command."},
            },
            "required": ["command"],
        },
    },
]

def execute_tool(name: str, inputs: dict) -> str:
    """Route tool calls to implementations. Always return strings."""
    try:
        if name == "read_file":
            with open(inputs["path"]) as f:
                return f.read()
        elif name == "write_file":
            with open(inputs["path"], "w") as f:
                f.write(inputs["content"])
            return f"Wrote {len(inputs['content'])} chars to {inputs['path']}"
        elif name == "run_command":
            result = subprocess.run(
                inputs["command"], shell=True,
                capture_output=True, text=True, timeout=30
            )
            output = result.stdout + result.stderr
            return output[:5000]  # Truncate large outputs
        else:
            return f"Unknown tool: {name}"
    except Exception as e:
        return f"Error: {type(e).__name__}: {e}"
```

Design principles for tools:

- **Return strings**: Models consume text. Convert all results to strings.
- **Truncate output**: Large tool results waste context. Cap at a reasonable length.
- **Catch exceptions**: Never let a tool crash the loop — return the error as text so the agent can adapt.
- **Descriptive names**: `search_database` beats `query` — the model uses names to decide what to call.
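These principles can be enforced centrally so individual tools stay simple. A minimal sketch of a decorator that stringifies, truncates, and catches for any tool it wraps (`safe_tool` and the example `search_database` are illustrative, not part of any library):

```python
import functools

def safe_tool(max_chars: int = 5000):
    """Wrap a tool so it always returns a bounded string and never raises."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs) -> str:
            try:
                result = fn(*args, **kwargs)
            except Exception as e:
                # Catch exceptions: surface the error as text the agent can read
                return f"Error: {type(e).__name__}: {e}"
            # Return strings: coerce any result to text
            text = result if isinstance(result, str) else repr(result)
            # Truncate output: cap what goes back into context
            if len(text) > max_chars:
                text = text[:max_chars] + f"\n[truncated {len(text) - max_chars} chars]"
            return text
        return wrapper
    return decorate

@safe_tool(max_chars=100)
def search_database(query: str) -> list[str]:
    """Hypothetical tool: returns a list, which the wrapper stringifies."""
    if not query:
        raise ValueError("empty query")
    return [f"row matching {query!r}"]
```

The tool body can now raise freely and return native Python values; the wrapper guarantees the loop always receives a bounded string.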

## Memory Systems

### Short-Term Memory (Conversation History)

The simplest memory: the messages array. Problems arise when conversations grow too long.

```python
import json

def manage_context_window(messages: list, max_tokens: int = 80000) -> list:
    """Trim old messages when approaching context limits."""
    estimated_tokens = len(json.dumps(messages)) // 4  # rough estimate: ~4 chars/token

    if estimated_tokens < max_tokens:
        return messages

    # Keep system-critical messages: first user message + recent messages
    first_msg = messages[0]
    recent = messages[-20:]  # keep last 20 turns

    # Summarize the middle (summarize_messages is left to the implementer,
    # typically another model call that condenses the elided turns)
    middle = messages[1:-20]
    summary = summarize_messages(middle)

    return [first_msg, {"role": "user", "content": f"[Previous conversation summary: {summary}]"}] + recent
```
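The trimming policy itself can be checked without a real summarizer by injecting a stub. A sketch (the `_demo` function mirrors the policy above; the lambda stands in for a model-backed `summarize_messages`):

```python
import json

def manage_context_window_demo(messages: list, max_tokens: int, summarize) -> list:
    """Same trimming policy, with the summarizer passed in for testing."""
    if len(json.dumps(messages)) // 4 < max_tokens:
        return messages
    first, middle, recent = messages[0], messages[1:-20], messages[-20:]
    summary = summarize(middle)
    return [first, {"role": "user",
                    "content": f"[Previous conversation summary: {summary}]"}] + recent

# 100 padded turns comfortably exceed a 1000-token budget, so trimming kicks in:
history = [{"role": "user", "content": f"turn {i}: " + "x" * 200} for i in range(100)]
trimmed = manage_context_window_demo(
    history, max_tokens=1000,
    summarize=lambda msgs: f"{len(msgs)} messages elided",
)
```

The result keeps the first message, one summary message covering the 79 elided turns, and the last 20 turns: 22 messages total.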

### Long-Term Memory (Persistent Store)

For agents that run across sessions, persist important information externally.

```python
import json
import time
from pathlib import Path

class AgentMemory:
    """Simple file-backed long-term memory for an agent."""

    def __init__(self, memory_dir: str = ".agent_memory"):
        self.dir = Path(memory_dir)
        self.dir.mkdir(exist_ok=True)
        self.facts_file = self.dir / "facts.json"
        self.facts: list[dict] = self._load()

    def _load(self) -> list[dict]:
        if self.facts_file.exists():
            return json.loads(self.facts_file.read_text())
        return []

    def store(self, fact: str, category: str = "general"):
        entry = {
            "fact": fact,
            "category": category,
            "timestamp": time.time(),
        }
        self.facts.append(entry)
        self.facts_file.write_text(json.dumps(self.facts, indent=2))

    def recall(self, query: str, top_k: int = 5) -> list[str]:
        """Simple keyword-based recall. Replace with vector search for production."""
        query_words = set(query.lower().split())
        scored = []
        for entry in self.facts:
            fact_words = set(entry["fact"].lower().split())
            overlap = len(query_words & fact_words)
            if overlap > 0:
                scored.append((overlap, entry["fact"]))
        scored.sort(reverse=True)
        return [fact for _, fact in scored[:top_k]]

    def inject_into_prompt(self, query: str) -> str:
        relevant = self.recall(query)
        if not relevant:
            return ""
        return "Relevant memories:\n" + "\n".join(f"- {f}" for f in relevant)
```
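The word-overlap ranking inside `recall` can be seen in isolation. A self-contained sketch of the same scoring over plain fact strings (no file I/O), with illustrative example facts:

```python
def keyword_recall(query: str, facts: list[str], top_k: int = 5) -> list[str]:
    """Rank facts by word overlap with the query, as AgentMemory.recall does."""
    query_words = set(query.lower().split())
    scored = []
    for fact in facts:
        overlap = len(query_words & set(fact.lower().split()))
        if overlap > 0:
            scored.append((overlap, fact))
    scored.sort(reverse=True)  # highest overlap first
    return [fact for _, fact in scored[:top_k]]

facts = [
    "the project database is postgres",
    "deploys run on fridays",
    "the staging database uses sqlite",
]
```

For the query "which database does the project use", the first fact scores 3 ("the", "project", "database"), the third scores 2, and the second scores 0 and is dropped.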

## Planning Strategies

Agents benefit from planning before acting. A simple approach: ask the model to make a plan, then execute it step by step.

```python
PLANNING_SYSTEM = """You are a task-planning agent.

When given a task:
1. First, call the `make_plan` tool to outline your steps.
2. Then execute each step using available tools.
3. After completing all steps, provide your final answer.

If a step fails, revise the plan and continue."""

plan_tool = {
    "name": "make_plan",
    "description": "Create a numbered plan for completing the task. "
                   "Call this before taking any other actions.",
    "input_schema": {
        "type": "object",
        "properties": {
            "steps": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Ordered list of steps to complete the task.",
            }
        },
        "required": ["steps"],
    },
}

def handle_plan(steps: list[str]) -> str:
    """Store the plan and return it for the agent to follow."""
    plan_text = "\n".join(f"{i+1}. {s}" for i, s in enumerate(steps))
    return f"Plan created:\n{plan_text}\n\nNow execute step 1."
```
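The plan tool routes through the same dispatch as any other tool. A self-contained sketch of that routing (the dispatcher name is illustrative, and `handle_plan`'s formatting is inlined so the block runs on its own):

```python
def execute_planning_tool(name: str, inputs: dict) -> str:
    """Route make_plan to the plan handler; other names fall through as unknown."""
    if name == "make_plan":
        steps = inputs["steps"]
        plan_text = "\n".join(f"{i+1}. {s}" for i, s in enumerate(steps))
        return f"Plan created:\n{plan_text}\n\nNow execute step 1."
    return f"Unknown tool: {name}"

out = execute_planning_tool("make_plan", {"steps": ["read the file", "edit it"]})
```

The returned string is fed back as a tool result, so the numbered plan lands in the conversation history where the model can follow it step by step.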

## Putting It Together: Full Agent Skeleton

```python
class Agent:
    def __init__(self, tools, system_prompt, model="claude-sonnet-4-20250514"):
        self.client = anthropic.Anthropic()
        self.tools = tools
        self.system = system_prompt
        self.model = model
        self.memory = AgentMemory()

    def run(self, task: str, max_steps: int = 20) -> str:
        # Inject relevant memories
        context = self.memory.inject_into_prompt(task)
        full_task = f"{context}\n\nTask: {task}" if context else task
        messages = [{"role": "user", "content": full_task}]

        for step in range(max_steps):
            response = self.client.messages.create(
                model=self.model,
                max_tokens=4096,
                system=self.system,
                tools=self.tools,
                messages=messages,
            )

            if response.stop_reason == "end_turn":
                result = extract_text(response)
                self.memory.store(f"Completed task: {task[:100]}")
                return result

            messages.append({"role": "assistant", "content": response.content})
            tool_results = self._execute_tools(response)
            messages.append({"role": "user", "content": tool_results})

        return "Agent reached step limit."

    def _execute_tools(self, response) -> list[dict]:
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = execute_tool(block.name, block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": output,
                })
        return results
```

This skeleton provides: a loop with termination, tool execution with error handling, long-term memory injection, and a clean interface. Extend it by adding planning tools, guardrails, or multi-agent coordination as needed.

Install this skill directly: skilldb add ai-agent-orchestration-skills


## Related Skills

**agent-error-recovery**

Handling failures in AI agent systems: retry strategies with backoff, fallback tools, graceful degradation, human-in-the-loop escalation, stuck-loop detection, and context recovery after crashes. Covers practical patterns for making agents robust against tool failures, API errors, and reasoning dead-ends.

**agent-evaluation**

Testing and evaluating AI agents: trajectory evaluation, task completion metrics, tool-use accuracy measurement, regression testing, benchmark suites, and A/B testing agent configurations. Covers practical approaches to measuring whether agents are working correctly and improving over time.

**agent-frameworks**

Comparison of major AI agent frameworks: LangGraph, CrewAI, AutoGen, Semantic Kernel, and Claude Agent SDK. Covers when to use each framework, their trade-offs, core patterns, practical setup examples, and migration strategies between frameworks.

**agent-guardrails**

Safety and control systems for AI agents: input and output validation, action authorization, rate limiting, cost controls, content filtering, scope restriction, and audit logging. Covers practical implementations for keeping agents within bounds while maintaining their usefulness.

**agent-memory**

Memory systems for AI agents: conversation history management, summarization strategies, vector-based long-term memory, entity memory, episodic memory, and memory retrieval patterns. Covers practical implementations for giving agents persistent, searchable memory across sessions and within long-running tasks.

**agent-planning**

Planning strategies for AI agents: chain-of-thought prompting, tree-of-thought exploration, plan-and-execute patterns, iterative refinement, task decomposition, and goal tracking. Covers practical implementations that make agents more reliable at complex, multi-step tasks by thinking before acting.