# Building Agent Workflows with LangGraph
LangGraph and similar frameworks model agent workflows as state machines: nodes that do work, edges that decide what comes next, state that persists between nodes. The model is more constrained than free-form Python orchestration but produces more debuggable, more testable, more durable systems.
This skill covers the patterns for using state-machine frameworks effectively. It's framework-agnostic in spirit, though concrete examples use LangGraph syntax.
## The State Machine Model
A LangGraph workflow has:
- **State.** A typed structure that flows through the graph. Each node reads from it, optionally writes to it.
- **Nodes.** Functions or LLM calls. Each node receives the state, does work, returns updates to apply to the state.
- **Edges.** Define which node runs next. Can be unconditional (always go from A to B) or conditional (run a function on state to decide).
- **Entry and exit points.** Where execution starts and where it ends.
The graph is explicit. You can draw it. You can name every node. You can trace every edge. This is the value: agentic workflows that would otherwise be implicit control flow inside Python become first-class artifacts.
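A minimal wiring sketch, assuming LangGraph's `StateGraph` API and the `State` and node functions defined in the sections below:

```python
from langgraph.graph import StateGraph, START, END

# Sketch only: State, research_node, and write_node are defined later in this skill.
builder = StateGraph(State)
builder.add_node("research", research_node)
builder.add_node("write", write_node)
builder.add_edge(START, "research")    # entry point
builder.add_edge("research", "write")  # unconditional edge
builder.add_edge("write", END)         # exit point
app = builder.compile()
```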
## Designing the State
The state is the most important design decision. Get this right and the workflow is easy; get it wrong and every node fights it.
Principles:
- **Explicit fields.** Each piece of state is a named field with a type, not a `dict[str, Any]`.
- **Immutability of inputs.** Nodes return updates; they don't mutate state in place. The framework merges updates.
- **Append for accumulation.** State that accumulates over time (messages, tool calls) uses an `add` reducer; the framework handles the append.
- **Reset for replacements.** State that should be replaced (current task, last result) uses a `replace` reducer.
```python
from typing import TypedDict, Annotated
from operator import add

# Message is whatever message type your app uses (e.g., a LangChain message class).
class State(TypedDict):
    messages: Annotated[list[Message], add]
    current_task: str
    research_results: list[str]
    final_answer: str | None
```
**Avoid:** a single `context` field that accumulates everything as a blob. The named fields decompose the state into queryable, debuggable pieces.
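The `add` reducer is just `operator.add`, so list-valued updates concatenate rather than overwrite; a quick illustration of the merge:

```python
from operator import add

existing = ["searched the docs"]   # the field's current value in state
update = ["summarized findings"]   # what a node returns for that field
merged = add(existing, update)     # how the framework applies the reducer
assert merged == ["searched the docs", "summarized findings"]
```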
## Designing the Nodes
Each node has a single responsibility. The agent that does research is a node; the agent that writes is another node; the agent that reviews is a third.
```python
def research_node(state: State) -> State:
    query = state["current_task"]
    results = search_tool.run(query)
    return {"research_results": results}

def write_node(state: State) -> State:
    research = state["research_results"]
    response = llm.invoke(f"Write a summary based on: {research}")
    return {"final_answer": response.content}
```
Nodes are pure-ish functions of state. They take state in, return updates. They don't have side effects beyond what's necessary.
## Conditional Edges
Edges decide where to go next. Most edges are unconditional ("research → write → review"). The interesting ones are conditional.
```python
def route_after_review(state: State) -> str:
    if state["review_passed"]:
        return "publish"
    elif state["review_attempt"] >= 3:
        return "human_escalation"
    else:
        return "rewrite"

graph.add_conditional_edges("review", route_after_review)
```
The router is a function of state. It decides the next node deterministically given the state.
Conditional edges are where agentic decisions happen. The supervisor agent's "which worker should run next" is implemented as a conditional edge with a routing function powered by an LLM call.
## The LLM-Powered Router
The supervisor pattern in LangGraph:
```python
def supervisor_router(state: State) -> str:
    response = llm.invoke([
        ("system", "You are a supervisor. Decide which agent should run next."),
        ("user", f"Current state: {state['current_task']}. "
                 f"Available agents: research, write, review, publish."),
    ])
    return response.content.strip()  # returns the agent name
```
Hazards:
- The LLM might return an invalid agent name. Validate; route to a fallback if the response doesn't match.
- The LLM might pick the same agent twice in a row, looping. Track the route history; bail or change strategy if looping.
- The LLM's decisions might be inconsistent. Cache when input state is the same.
Use temperature 0 for routing decisions; routing needs consistency, not creativity. A guard for the first two hazards is sketched below.
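A sketch of those guards, wrapping `supervisor_router` from above; the `route_history` field and the `fallback` and `human_escalation` targets are illustrative names, not part of the earlier `State`:

```python
VALID_AGENTS = {"research", "write", "review", "publish"}

def guarded_router(state: State) -> str:
    choice = supervisor_router(state)
    if choice not in VALID_AGENTS:
        return "fallback"          # hazard 1: invalid agent name
    history = state.get("route_history", [])  # illustrative field
    if history[-2:] == [choice, choice]:
        return "human_escalation"  # hazard 2: stuck picking the same agent
    return choice
```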
## Persistence and Resumption
LangGraph supports checkpointing — saving state at each step. Useful for:
- **Long-running workflows** that exceed a single request lifetime.
- **Human-in-the-loop steps** where the workflow waits for user input.
- **Failure recovery:** if a node errors, restart from the last checkpoint.
```python
from langgraph.checkpoint.postgres import PostgresSaver

# In recent langgraph versions, from_conn_string is a context manager.
with PostgresSaver.from_conn_string("postgres://...") as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables on first run
    app = graph.compile(checkpointer=checkpointer)

    config = {"configurable": {"thread_id": "user-42-task-123"}}
    result = app.invoke(initial_state, config=config)

    # Later, resume from the saved checkpoint:
    state = app.get_state(config)
    result = app.invoke(None, config=config)
```
Checkpointing turns workflows from synchronous functions into durable processes. The thread ID identifies the workflow run; the checkpointer persists state at each node boundary.
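For the human-in-the-loop case, LangGraph can pause before a named node and resume the same thread later. A sketch, assuming a `publish` node that needs approval and reusing the checkpointer and config from above:

```python
# Pause before "publish"; the checkpoint holds state while a human reviews.
app = graph.compile(checkpointer=checkpointer, interrupt_before=["publish"])

app.invoke(initial_state, config=config)  # runs until "publish", then stops
# ... human approves out of band ...
app.invoke(None, config=config)           # resumes from the checkpoint
```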
## Streaming and Observability
LangGraph supports streaming intermediate state and outputs. Useful for UIs that show progress and for debugging.
```python
for chunk in app.stream(initial_state, config=config):
    print(chunk)
```
Each chunk is the state delta from one node. The UI can render progress as it arrives.
Log every node invocation, every edge decision, and every state change. Production systems need this for incident investigation. LangSmith and similar tools handle visualization; without them, you're reading raw logs.
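A minimal logging sketch built on the stream API; with `stream_mode="updates"`, each chunk maps a node name to the delta it produced:

```python
import logging

logger = logging.getLogger("workflow")

for chunk in app.stream(initial_state, config=config, stream_mode="updates"):
    for node_name, delta in chunk.items():
        logger.info("node=%s delta=%s", node_name, delta)
```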
## Subgraphs
For complex workflows, use subgraphs. A "research" subgraph might have its own internal nodes (query generation → search → filter → summarize) but appear as a single node in the parent graph.
Subgraphs:
- Encapsulate complexity.
- Are independently testable.
- Can be reused across workflows.
The parent passes state to the subgraph; the subgraph returns its result; the parent integrates.
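In LangGraph, a compiled graph can itself be added as a node. A sketch, assuming the parent and the subgraph share the relevant `State` keys; the internal nodes are elided:

```python
research_builder = StateGraph(State)
# ... add query_generation, search, filter, summarize nodes and edges ...
research_subgraph = research_builder.compile()

parent = StateGraph(State)
parent.add_node("research", research_subgraph)  # the whole subgraph runs as one node
```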
## Testing
Test the graph and its nodes separately.
**Node tests:** each node is a function. Pass in a fixture state; assert on the returned updates. Mock LLM calls and tool calls.

**Graph tests:** pass in initial state; run the graph; assert on the final state. Mock LLM responses to deterministic values.

**Integration tests:** run the full graph end-to-end against a real LLM with realistic inputs. These are slow and expensive; run in CI but not on every commit.
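A node-test sketch in pytest style, patching the `search_tool` that `research_node` calls; the module name `workflow` is illustrative:

```python
from workflow import research_node, search_tool  # illustrative module

def test_research_node(monkeypatch):
    monkeypatch.setattr(search_tool, "run", lambda query: ["stub result"])
    updates = research_node({"current_task": "checkpointer comparison"})
    assert updates == {"research_results": ["stub result"]}
```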
## Common Patterns
### The Re-Plan Pattern
Plan the work; execute step 1; if step 1's result invalidates the plan, re-plan; otherwise execute step 2.
```
plan → execute_step → check_progress → (re-plan | execute_next | done)
```
Useful for tasks where progress reveals new constraints.
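The branching lives in the `check_progress` router. A sketch with illustrative state fields (`plan_invalidated`, `steps_remaining`):

```python
def route_after_check(state: State) -> str:
    if state["plan_invalidated"]:     # a step result broke a plan assumption
        return "re_plan"
    if state["steps_remaining"] == 0:
        return "done"
    return "execute_next"
```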
### The Self-Correction Pattern
Generate; check; if failed, generate again with feedback; otherwise return.
```
generate → check → (return | generate_with_feedback)
```
The check node is often an LLM call evaluating the generation against criteria.
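A sketch of such a check node; the PASS/FAIL protocol and the `check_passed` field are illustrative, not fixed API:

```python
def check_node(state: State) -> dict:
    verdict = llm.invoke(
        "Answer PASS or FAIL. Does this draft meet the criteria?\n"
        f"{state['final_answer']}"
    )
    return {"check_passed": verdict.content.strip().upper().startswith("PASS")}
```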
### The Hierarchical Pattern
Top-level supervisor invokes mid-level supervisors; each mid-level invokes specialized workers.
```
top_supervisor → (mid_supervisor_a | mid_supervisor_b)
mid_supervisor_a → (worker_1 | worker_2 | worker_3)
```
Useful for complex tasks that decompose into subtasks.
## Anti-Patterns
- **State as a single dict.** Untyped, unstructured, accumulating blob. Decompose into named, typed fields.
- **Mutable state.** Nodes mutating state in place causes ordering bugs. Return updates instead.
- **Looping without budgets.** Conditional edges that can loop forever. Add a step counter; bail at a budget (sketched below).
- **Untested nodes.** Nodes only tested as part of the full graph are slow and hard to debug. Test nodes individually with fixtures.
- **No checkpointing for long workflows.** A failure midway loses all progress. Checkpoint at every node boundary.
- **LLM router with high temperature.** Routing decisions become inconsistent. Set temperature 0.
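A budget sketch for the looping anti-pattern; `step_count` is an illustrative counter that each node increments in its update:

```python
MAX_STEPS = 20  # illustrative budget

def route_with_budget(state: State) -> str:
    if state["step_count"] >= MAX_STEPS:
        return "human_escalation"  # bail instead of looping forever
    return route_after_review(state)
```

LangGraph also enforces a per-run `recursion_limit` in the invoke config as a backstop.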