
Multi-Agent Handoff Patterns

Coordinate multiple specialized agents on a single task: when to hand off, and how to do it well.


A single agent with all the tools and all the prompts is sometimes the right choice. Sometimes the task is large or specialized enough that splitting into multiple agents — each with focused tools, prompts, and roles — produces better results.

The challenge isn't running multiple agents; that's straightforward. The challenge is coordinating them. Handoffs lose context. Agents argue. Loops form. The orchestration layer matters more than the individual agents' capabilities.

When to Use Multiple Agents

Single agents are simpler. Use multiple agents only when:

  • The task has distinct phases that benefit from different prompts. Research vs. writing vs. review.
  • The task uses different tool sets that don't combine well. A code generator with browse tools vs. a code reviewer with execution tools.
  • The task is large and a single agent can't fit it in context. Split among agents that each handle a subset.
  • The task benefits from review. A draft agent generates; a review agent critiques. Iterate.
  • The task requires specialization that fine-tuning or system prompts can't fully capture in one agent.

If none of these apply, use a single agent. Multi-agent systems add complexity; they should be justified.

The Supervisor / Worker Pattern

Most common pattern. A supervisor agent (also called a router or planner) receives the user's task and decides which worker agent to invoke. Workers complete their subtasks and return results to the supervisor. The supervisor synthesizes and returns the final result.

```
User → Supervisor → Worker A → Supervisor → Worker B → Supervisor → User
```

The supervisor's responsibilities:

  • Decompose the task.
  • Select the right worker.
  • Pass enough context for the worker to do its job.
  • Synthesize results from multiple workers.
  • Decide when the task is complete.

The workers are simpler. Each has a focused prompt and tool set. They do one thing well.

This pattern scales: add more worker types as new specializations emerge. The supervisor is the only piece that knows the full picture.
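The supervisor loop can be sketched in a few lines. This is a minimal sketch: the `WORKERS` table, the fixed two-phase plan, and the worker functions are hypothetical stand-ins for real LLM calls and a real routing decision.

```python
# Minimal supervisor/worker sketch. The worker functions and the
# routing table are hypothetical stand-ins for real LLM calls.

def research_worker(subtask: str) -> str:
    return f"research notes on {subtask}"

def writing_worker(subtask: str) -> str:
    return f"draft about {subtask}"

WORKERS = {"research": research_worker, "writing": writing_worker}

def supervisor(task: str) -> str:
    # 1. Decompose the task (here: a fixed two-phase plan).
    plan = [("research", task), ("writing", task)]
    results = []
    for worker_name, subtask in plan:
        # 2. Select the worker and pass it enough context to do its job.
        results.append(WORKERS[worker_name](subtask))
    # 3. Synthesize the worker outputs into the final result.
    return " | ".join(results)
```

In a real system, step 1 and step 2 are themselves LLM calls made by the supervisor; the shape of the loop stays the same.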

The Peer Collaboration Pattern

Less hierarchical. Multiple agents work in parallel, each contributing to a shared workspace. There's no single supervisor; the agents coordinate through the workspace.

Useful for:

  • Brainstorming, where multiple agents generate ideas.
  • Adversarial setups, where one agent generates and another critiques.
  • Specialized review, where multiple agents check different aspects of a draft.

Risk: agents argue endlessly. They disagree, restate their positions, and don't converge. You need a stopping condition (e.g., a fixed number of rounds, a final reviewer with authority, or a confidence threshold).
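A fixed-round stopping condition for an adversarial setup can look like this. Both agents here are hypothetical stand-ins for LLM calls; the point is the shape of the loop, not the agents themselves.

```python
# Adversarial generate/critique loop with a fixed-round stopping
# condition. `generate` and `critique` are hypothetical stand-ins.

def generate(draft, feedback):
    return "draft v2" if feedback else "draft v1"

def critique(draft):
    # Return None to approve, or a feedback string to request changes.
    return "tighten the intro" if draft == "draft v1" else None

def debate(max_rounds: int = 3) -> str:
    draft, feedback = None, None
    for _ in range(max_rounds):
        draft = generate(draft, feedback)
        feedback = critique(draft)
        if feedback is None:       # critic approved: converged
            return draft
    return draft                   # round budget exhausted: ship the last draft
```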

The Blackboard Pattern

Agents read from and write to a shared structured state — the blackboard. Each agent watches the blackboard for tasks it can do; when it sees one, it claims it, completes it, and writes the result back.

```
Blackboard:
  task_1: { status: "pending", type: "research", description: "..." }
  task_2: { status: "in_progress", agent: "worker_a", type: "writing" }
  task_3: { status: "complete", result: "..." }
```

Useful for asynchronous, long-running tasks. Each agent operates independently; the blackboard mediates.

Implementation: a database, a Redis hash, a typed JSON structure. Atomic claims matter: when two agents try to claim the same task, exactly one must win.
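A sketch of atomic claiming with an in-process dict and a lock; in production the lock would be replaced by a database row update or a Redis transaction, but the contract is the same: only one claimant sees `pending`.

```python
# Blackboard task claiming. The lock makes the claim atomic, so when
# two agents race for the same task exactly one wins. In production
# this would be a conditional database update or a Redis transaction.
import threading

blackboard = {
    "task_1": {"status": "pending", "type": "research"},
}
_lock = threading.Lock()

def claim(task_id: str, agent: str) -> bool:
    with _lock:
        task = blackboard[task_id]
        if task["status"] != "pending":
            return False           # someone else claimed it first
        task["status"] = "in_progress"
        task["agent"] = agent
        return True
```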

Context Handoff

The hardest problem in multi-agent systems: passing context between agents.

Options:

Full Context Transfer

Pass the entire conversation history to the next agent. Simple but expensive (lots of tokens) and noisy (the next agent has to filter relevant info).

"Worker B, here is the full conversation so far. Continue from where Worker A left off."

Summary Handoff

The supervisor summarizes the relevant context for the worker. Worker doesn't see history; it sees a focused summary.

"Worker B, your task is X. Here is what's been established so far: ... Here are the constraints: ..."

Better for token efficiency. Risk: the summary loses important detail. Test the summarization; iterate on what gets included.

Structured Handoff

The supervisor passes a structured object: task, inputs, constraints, prior outputs. The worker has a clear contract.

```
{
  "task": "write_summary",
  "inputs": { "source_documents": [...] },
  "constraints": { "max_length": 500, "tone": "neutral" },
  "prior_outputs": { "research": "..." }
}
```

Most reliable for production systems. The structure is enforced; nothing is implicit.
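Enforcing that structure can be as simple as validating the contract before the worker runs. A minimal sketch, using the field names from the example above; a production system might use Pydantic or JSON Schema instead.

```python
# Structured handoff contract: required keys are validated before the
# worker ever runs, so nothing is implicit.

REQUIRED_KEYS = {"task", "inputs", "constraints", "prior_outputs"}

def validate_handoff(handoff: dict) -> dict:
    missing = REQUIRED_KEYS - handoff.keys()
    if missing:
        raise ValueError(f"handoff missing keys: {sorted(missing)}")
    return handoff

handoff = validate_handoff({
    "task": "write_summary",
    "inputs": {"source_documents": ["doc_a"]},
    "constraints": {"max_length": 500, "tone": "neutral"},
    "prior_outputs": {"research": "notes"},
})
```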

Shared Memory

Agents read from a shared store. The handoff is a pointer, not a copy.

"Worker B, you're up. The research is in research_id_42. The constraints are at task_constraints_42."

Useful when context is large. The shared store handles caching, versioning, retrieval.
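A pointer-based handoff in miniature. The store here is a plain dict standing in for Redis or a database; the handoff message carries keys, not content, and the worker dereferences them itself.

```python
# Pointer-based handoff: the supervisor writes large context to a
# shared store and hands the worker only the keys. The dict stands in
# for a real store (Redis, a database).

store = {}

def put(key: str, value: str) -> str:
    store[key] = value
    return key                       # the handoff is the pointer, not a copy

def handoff_message(research_key: str, constraints_key: str) -> str:
    return f"Worker B: research at {research_key}, constraints at {constraints_key}"

r = put("research_id_42", "long research text ...")
c = put("task_constraints_42", "max 500 words, neutral tone")
msg = handoff_message(r, c)
research = store[r]                  # the worker dereferences the pointer
```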

Avoiding Loops

Multi-agent systems can form loops:

  • Agent A asks Agent B; B asks A; both refuse to commit.
  • Agent A produces a draft; Agent B reviews and rejects; A produces again; B rejects again, indefinitely.
  • The supervisor delegates to A, then to B, then to A again, then to B...

Loop prevention:

  • Explicit step budget. Each task has a max number of agent turns. Exceed it: fail and report.
  • Progress detection. Each turn must produce new artifacts; if a turn produces nothing new, end.
  • Final-decider role. When agents disagree, a designated agent (or the user) decides.
  • Termination conditions. Each agent commits to a "done" condition; when it is met, the agent returns and doesn't iterate further.
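A step budget and progress detection combine into a small harness. This is a sketch: `next_turn` is a hypothetical stand-in for one agent turn, and "progress" is defined as producing an artifact not seen before.

```python
# Step budget plus progress detection. Each turn must produce a new
# artifact; a repeated artifact or an exhausted budget ends the run.
# `next_turn` is a hypothetical stand-in for one agent turn.

def run(next_turn, max_turns: int = 10):
    artifacts = []
    for _ in range(max_turns):
        artifact = next_turn(artifacts)
        if artifact in artifacts:           # no new artifact: stalled
            return artifacts, "stalled"
        artifacts.append(artifact)
        if artifact == "DONE":              # termination condition met
            return artifacts, "complete"
    return artifacts, "budget_exhausted"    # step budget hit: fail and report
```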

Cost and Latency

Multi-agent systems are expensive. Each agent turn is an LLM call; the cost is multiplicative.

For a 10-step task, single-agent is 10 calls; multi-agent with supervisor + 3 workers might be 25-40 calls.

Track:

  • Total cost per task.
  • Total latency per task.
  • Cost per agent role (which is the bottleneck).
  • Cache hit rates (re-running the same prompt should hit cache).

Optimization:

  • Cache supervisor decisions. Same task structure → same routing. Cache the supervisor's output.
  • Use cheaper models for workers. The supervisor needs reasoning; workers may not.
  • Parallelize when possible. If two workers can run independently, run them in parallel.
  • Short-circuit obvious tasks. Don't route trivial tasks through the supervisor; handle them directly.
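Parallelizing independent workers is the easiest of these wins. LLM calls are I/O-bound, so a thread pool is enough; the workers below are hypothetical stand-ins.

```python
# Running independent workers in parallel. LLM calls are I/O-bound,
# so a thread pool parallelizes them well. Workers are stand-ins.
from concurrent.futures import ThreadPoolExecutor

def worker_a(task: str) -> str:
    return f"A:{task}"

def worker_b(task: str) -> str:
    return f"B:{task}"

def run_parallel(task: str) -> list:
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(worker_a, task), pool.submit(worker_b, task)]
        return [f.result() for f in futures]
```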

Debugging

When a multi-agent task goes wrong, the question is: which agent did the wrong thing?

Logging:

  • Every agent invocation logged with input, output, prompt version.
  • The supervisor's decisions logged: which worker, why.
  • Cross-references: each worker output tied back to the supervisor request that caused it.
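A minimal log record that satisfies all three requirements: each call carries its input, output, and a parent ID tying it back to the supervisor request that caused it, so the trace forms a tree. The field names are illustrative, not a standard schema.

```python
# Logging every agent invocation with input, output, and the parent
# call that caused it. Parent links turn the log into a tree.
import time

LOG = []

def log_call(agent, parent_id, input_text, output_text):
    call_id = f"call_{len(LOG)}"
    LOG.append({
        "id": call_id,
        "parent": parent_id,     # ties a worker call to its supervisor request
        "agent": agent,
        "input": input_text,
        "output": output_text,
        "ts": time.time(),
    })
    return call_id

root = log_call("supervisor", None, "user task", "route to worker_a")
log_call("worker_a", root, "subtask", "result")
```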

Visualization tools (LangSmith, Langfuse, custom dashboards) help. The conversation between agents becomes a tree; the user sees the tree, identifies the bad node, and investigates.

Without logging, debugging multi-agent systems is nearly impossible. Single-agent traces are confusing enough; multi-agent traces are unreadable.

Anti-Patterns

Multi-agent for tasks that fit one agent. Adds cost, latency, complexity for no benefit. Single-agent first.

Loose handoffs. "Continue from where the last agent left off" with full context. Tokens wasted; agents confused. Structure the handoff.

No loop prevention. Agents argue endlessly; user waits. Step budgets and progress detection.

Identical model across agents. Each agent role might benefit from a different model. Calibrate per role.

No final-decider. Agents reach a stalemate; nothing decides. Designate a final arbiter.

No agent-level logging. Debugging is impossible. Log every agent call.
