
agent-planning

Planning strategies for AI agents: chain-of-thought prompting, tree-of-thought exploration, plan-and-execute patterns, iterative refinement, task decomposition, and goal tracking. Covers practical implementations that make agents more reliable at complex, multi-step tasks by thinking before acting.


Agent Planning

Make agents more reliable at complex tasks by planning before acting: decompose tasks, track goals, and adapt plans when things change.


Chain-of-Thought for Agents

The simplest planning approach: instruct the agent to think step-by-step before each action.

COT_SYSTEM = """You are a methodical agent that solves tasks step by step.

Before EVERY action, write your reasoning:
1. What do I know so far?
2. What do I need to find out or do next?
3. Which tool should I use and why?

Then call exactly one tool. After observing the result, reason again before the next action.

When the task is complete, give your final answer without tool calls."""

This adds no extra API calls (only a modest number of additional reasoning tokens) and measurably improves task completion on multi-step problems. The model naturally produces better tool call sequences when forced to reason explicitly.
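The system prompt above plugs into an ordinary tool-use loop. As a sketch of just the control flow, with the model call and tool execution injected as plain functions so the loop can be exercised without an API key (`call_model` and `run_tool` are placeholders, not part of the original skill):

```python
def cot_loop(task: str, call_model, run_tool, max_steps: int = 10) -> str:
    """Reason-then-act loop. call_model(messages) returns either
    ("final", text) or ("tool", name, args); run_tool(name, args)
    executes one tool and returns its result."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        step = call_model(messages)
        if step[0] == "final":
            return step[1]
        _, name, args = step
        # Record the action and its observation, then reason again next turn
        messages.append({"role": "assistant", "content": f"Using tool {name}"})
        messages.append({"role": "user", "content": str(run_tool(name, args))})
    return "step limit reached"
```

With a real client, `call_model` would wrap `client.messages.create(...)` using `COT_SYSTEM` as the system prompt.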


Plan-and-Execute Pattern

Separate planning from execution. First, create a complete plan. Then execute it step by step, revising if needed.

import anthropic

client = anthropic.Anthropic()


def create_plan(task: str) -> list[str]:
    """Ask the model to create a plan before executing anything."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": f"""Create a step-by-step plan to accomplish this task.
Return ONLY a JSON array of strings, each string being one step.
Keep steps concrete and actionable.

Task: {task}

Example output:
["Search for the current price of AAPL stock", "Search for AAPL earnings data", "Calculate the P/E ratio", "Write a summary with the findings"]""",
        }],
    )

    import json
    text = response.content[0].text
    # Extract JSON array from response
    start = text.index("[")
    end = text.rindex("]") + 1
    return json.loads(text[start:end])
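The index/rindex extraction above recurs in several functions below; it can be factored into one small helper (a refactoring sketch, not part of the original skill):

```python
import json


def extract_json_array(text: str) -> list:
    """Pull the first [...] span out of a model response and parse it.
    Raises ValueError if the text contains no bracketed array."""
    start = text.index("[")
    end = text.rindex("]") + 1
    return json.loads(text[start:end])
```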


def plan_and_execute(task: str, tools: list[dict]) -> str:
    """Create a plan, then execute it step by step, threading results forward."""
    plan = create_plan(task)
    print(f"Plan ({len(plan)} steps):")
    for i, step in enumerate(plan):
        print(f"  {i+1}. {step}")

    results = []
    for step_num, step in enumerate(plan):
        print(f"\nExecuting step {step_num + 1}: {step}")

        step_prompt = (
            f"You are executing step {step_num + 1} of a plan.\n"
            f"Overall task: {task}\n\n"
            f"Previous results:\n"
        )
        for prev in results:
            step_prompt += f"- Step {prev['step']}: {prev['result'][:200]}\n"
        step_prompt += f"\nCurrent step: {step}\n\nComplete this step using tools."

        messages = [{"role": "user", "content": step_prompt}]

        for _ in range(5):
            response = client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=4096,
                tools=tools,
                messages=messages,
            )

            if response.stop_reason == "end_turn":
                result = extract_text(response)
                results.append({"step": step_num + 1, "description": step, "result": result})
                break

            messages.append({"role": "assistant", "content": response.content})
            tool_results = execute_all_tools(response)
            messages.append({"role": "user", "content": tool_results})
        else:
            # Tool-call budget exhausted without a final answer: record as incomplete
            results.append({"step": step_num + 1, "description": step, "result": "(incomplete)"})

    # Final synthesis
    synthesis_prompt = (
        f"Task: {task}\n\n"
        f"All step results:\n"
    )
    for r in results:
        synthesis_prompt += f"\nStep {r['step']} ({r['description']}):\n{r['result']}\n"
    synthesis_prompt += "\nSynthesize these results into a final answer."

    final = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{"role": "user", "content": synthesis_prompt}],
    )
    return final.content[0].text
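`extract_text` and `execute_all_tools` are assumed throughout this skill but never defined here (they presumably live in the agent-architecture skill). Minimal sketches of what they need to do, where `TOOL_FUNCTIONS` is a hypothetical name-to-function registry:

```python
TOOL_FUNCTIONS = {}  # hypothetical registry: tool name -> Python callable


def extract_text(response) -> str:
    """Concatenate the text blocks of an API response."""
    return "".join(b.text for b in response.content if b.type == "text")


def execute_all_tools(response) -> list[dict]:
    """Run every tool_use block and return tool_result content blocks."""
    results = []
    for block in response.content:
        if block.type == "tool_use":
            output = TOOL_FUNCTIONS[block.name](**block.input)
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(output),
            })
    return results
```

A production version would also catch exceptions from the tool call and return them as error tool results rather than crashing the loop.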

Tree-of-Thought

Explore multiple approaches and pick the best one. Useful when the agent might go down a wrong path.

def tree_of_thought(task: str, num_branches: int = 3) -> str:
    """Generate multiple approaches, evaluate them, pursue the best one."""

    # Step 1: Generate candidate approaches
    branch_prompt = f"""Given this task, propose {num_branches} different approaches.
For each approach, describe:
- The strategy in 1-2 sentences
- Key steps involved
- Potential risks or failure modes

Task: {task}

Return as a JSON array of objects with keys: strategy, steps (array), risks."""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2048,
        messages=[{"role": "user", "content": branch_prompt}],
    )

    import json
    text = response.content[0].text
    start = text.index("[")
    end = text.rindex("]") + 1
    approaches = json.loads(text[start:end])

    # Step 2: Evaluate and rank approaches
    eval_prompt = f"""Task: {task}

Here are {len(approaches)} approaches. Evaluate each and pick the best one.
Consider: likelihood of success, efficiency, robustness.

Approaches:
"""
    for i, a in enumerate(approaches):
        eval_prompt += f"\n{i+1}. {a['strategy']}\n   Steps: {', '.join(a['steps'])}\n   Risks: {a['risks']}\n"

    eval_prompt += f"\nReturn ONLY the number (1-{len(approaches)}) of the best approach."

    eval_response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=100,
        messages=[{"role": "user", "content": eval_prompt}],
    )

    # Extract the chosen approach number, clamped to a valid index
    choice_text = eval_response.content[0].text.strip()
    digits = "".join(c for c in choice_text if c.isdigit())
    choice = int(digits[0]) - 1 if digits else 0
    chosen = approaches[max(0, min(choice, len(approaches) - 1))]

    print(f"Chosen approach: {chosen['strategy']}")

    # Step 3: Execute the chosen approach
    return plan_and_execute(
        f"{task}\n\nApproach: {chosen['strategy']}\nSteps: {', '.join(chosen['steps'])}",
        tools=available_tools,
    )

Task Decomposition

Break complex tasks into a tree of subtasks that can be tackled independently.

from dataclasses import dataclass, field


@dataclass
class TaskNode:
    id: str
    description: str
    status: str = "pending"  # pending, in_progress, completed, failed
    result: str = ""
    subtasks: list["TaskNode"] = field(default_factory=list)
    depends_on: list[str] = field(default_factory=list)


class TaskTree:
    """Hierarchical task decomposition with dependency tracking."""

    def __init__(self):
        self.nodes: dict[str, TaskNode] = {}
        self.root_id: str = ""

    def decompose(self, task: str) -> TaskNode:
        """Ask the model to decompose a task into subtasks."""
        prompt = f"""Decompose this task into 3-6 subtasks.
For each subtask, specify if it depends on another subtask completing first.

Task: {task}

Return JSON:
{{
  "subtasks": [
    {{"id": "1", "description": "...", "depends_on": []}},
    {{"id": "2", "description": "...", "depends_on": ["1"]}},
    ...
  ]
}}"""

        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )

        import json
        text = response.content[0].text
        start = text.index("{")
        end = text.rindex("}") + 1
        data = json.loads(text[start:end])

        root = TaskNode(id="root", description=task)
        self.root_id = "root"
        self.nodes["root"] = root

        for st in data["subtasks"]:
            node = TaskNode(
                id=st["id"],
                description=st["description"],
                depends_on=st.get("depends_on", []),
            )
            self.nodes[node.id] = node
            root.subtasks.append(node)

        return root

    def get_ready_tasks(self) -> list[TaskNode]:
        """Get tasks whose dependencies are all completed."""
        ready = []
        for node in self.nodes.values():
            if node.status != "pending":
                continue
            if node.id == self.root_id:
                continue
            deps_met = all(
                self.nodes.get(dep, TaskNode(id="", description="")).status == "completed"
                for dep in node.depends_on
            )
            if deps_met:
                ready.append(node)
        return ready

    def complete_task(self, task_id: str, result: str):
        if task_id in self.nodes:
            self.nodes[task_id].status = "completed"
            self.nodes[task_id].result = result

    def all_complete(self) -> bool:
        return all(
            n.status == "completed"
            for n in self.nodes.values()
            if n.id != self.root_id
        )

    def format_status(self) -> str:
        lines = ["Task Status:"]
        for node in self.nodes.values():
            if node.id == self.root_id:
                continue
            deps = f" (depends on: {', '.join(node.depends_on)})" if node.depends_on else ""
            lines.append(f"  [{node.status.upper():12s}] {node.id}. {node.description}{deps}")
        return "\n".join(lines)
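The ready-task computation is easiest to see in isolation. Here is the same dependency logic stripped down to plain dicts (a sketch; the task IDs and statuses are illustrative):

```python
def ready_tasks(tasks: dict[str, dict]) -> list[str]:
    """IDs of pending tasks whose dependencies are all completed."""
    return [
        tid for tid, t in tasks.items()
        if t["status"] == "pending"
        and all(tasks[d]["status"] == "completed" for d in t["depends_on"])
    ]


tasks = {
    "1": {"status": "pending", "depends_on": []},
    "2": {"status": "pending", "depends_on": ["1"]},
    "3": {"status": "pending", "depends_on": ["1", "2"]},
}
```

Repeatedly executing whatever `ready_tasks` returns and marking it completed drains the set in dependency order, which is exactly what the `TaskTree` loop above relies on.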

Goal Tracking

Give agents explicit goal awareness so they can monitor progress and course-correct.

class GoalTracker:
    """Track progress toward goals and subgoals."""

    def __init__(self, main_goal: str):
        self.main_goal = main_goal
        self.subgoals: list[dict] = []
        self.completed: list[dict] = []
        self.blocked: list[dict] = []

    def add_subgoal(self, description: str, priority: int = 1):
        self.subgoals.append({
            "description": description,
            "priority": priority,
            "status": "active",
        })
        self.subgoals.sort(key=lambda x: x["priority"], reverse=True)

    def complete_subgoal(self, description: str, result: str):
        for sg in self.subgoals:
            if sg["description"] == description:
                sg["status"] = "completed"
                sg["result"] = result
                self.completed.append(sg)
                self.subgoals.remove(sg)
                break

    def block_subgoal(self, description: str, reason: str):
        for sg in self.subgoals:
            if sg["description"] == description:
                sg["status"] = "blocked"
                sg["reason"] = reason
                self.blocked.append(sg)
                self.subgoals.remove(sg)
                break

    def format_for_prompt(self) -> str:
        lines = [f"## Goal: {self.main_goal}\n"]

        if self.completed:
            lines.append("Completed:")
            for c in self.completed:
                lines.append(f"  [DONE] {c['description']}: {c.get('result', '')[:80]}")

        if self.subgoals:
            lines.append("Active subgoals:")
            for sg in self.subgoals:
                lines.append(f"  [TODO] {sg['description']} (priority: {sg['priority']})")

        if self.blocked:
            lines.append("Blocked:")
            for b in self.blocked:
                lines.append(f"  [BLOCKED] {b['description']}: {b.get('reason', '')}")

        progress = len(self.completed) / max(len(self.completed) + len(self.subgoals) + len(self.blocked), 1)
        lines.append(f"\nProgress: {progress:.0%}")

        return "\n".join(lines)

Injecting Goal State into Agent Loop

def goal_aware_agent(task: str, tools: list[dict]) -> str:
    tracker = GoalTracker(task)

    # Initial decomposition
    plan = create_plan(task)
    for step in plan:
        tracker.add_subgoal(step)

    messages = [{"role": "user", "content": task}]

    for _ in range(20):
        # Inject goal status into the conversation
        goal_status = tracker.format_for_prompt()
        augmented_messages = messages + [
            {"role": "user", "content": f"Current goal status:\n{goal_status}\n\nContinue working."}
        ]

        # Note: for the model to actually mark subgoals complete, expose a tool
        # that calls tracker.complete_subgoal; as written, the tracker is
        # read-only from the model's perspective.
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system="You are a goal-oriented agent. Check the goal status each turn. "
                   "Work on the highest-priority active subgoal. Mark subgoals as "
                   "complete when done.",
            tools=tools,
            messages=augmented_messages,
        )

        if response.stop_reason == "end_turn":
            return extract_text(response)

        messages.append({"role": "assistant", "content": response.content})
        tool_results = execute_all_tools(response)
        messages.append({"role": "user", "content": tool_results})

    return "Agent reached iteration limit."

Iterative Refinement

Some tasks benefit from multiple passes: draft, evaluate, improve, repeat.

def iterative_refine(task: str, max_rounds: int = 3) -> str:
    """Generate, critique, and improve through multiple rounds."""

    # Initial draft
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{"role": "user", "content": f"Complete this task:\n{task}"}],
    )
    current = response.content[0].text

    for round_num in range(max_rounds):
        # Critique
        critique_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": f"Task: {task}\n\nCurrent output:\n{current}\n\n"
                           f"List specific problems, gaps, or improvements needed. "
                           f"If the output is already excellent, respond with ONLY 'APPROVED'.",
            }],
        )
        critique = critique_response.content[0].text

        if "APPROVED" in critique:
            print(f"Approved after {round_num + 1} rounds.")
            break

        # Improve based on critique
        improve_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=[{
                "role": "user",
                "content": f"Task: {task}\n\nCurrent output:\n{current}\n\n"
                           f"Critique:\n{critique}\n\n"
                           f"Produce an improved version addressing ALL critique points.",
            }],
        )
        current = improve_response.content[0].text

    return current
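The draft/critique/improve cycle can be tested without any API calls by injecting the three model interactions as plain functions (a sketch; `draft`, `critique`, and `improve` are placeholders for the API-backed calls above):

```python
def refine_loop(task: str, draft, critique, improve, max_rounds: int = 3) -> str:
    """Generic refinement loop: stop early once the critic approves."""
    current = draft(task)
    for _ in range(max_rounds):
        feedback = critique(task, current)
        if "APPROVED" in feedback:
            break
        current = improve(task, current, feedback)
    return current
```

One caveat carried over from the original: substring-matching "APPROVED" can false-positive if a critique merely mentions the word, so a stricter check (e.g. `feedback.strip() == "APPROVED"`) may be safer.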

Choose the right planning strategy based on task complexity. Chain-of-thought adds no extra API calls and works for most tasks. Plan-and-execute helps with tasks that have clear sequential stages. Tree-of-thought handles ambiguous problems where the first approach might fail. Iterative refinement suits quality-sensitive outputs like writing or code review.

