# Tool Calling
Implementing tool and function calling across Claude, OpenAI, and Gemini APIs. Covers schema design best practices, parallel tool calls, error handling, tool result formatting, dynamic tool registration, and patterns for building composable tool sets that agents can use reliably.
## Claude Tool Calling

Claude uses a `tools` array with JSON Schema for input validation.
```python
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city. Returns temperature, "
        "conditions, and humidity.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'San Francisco'",
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature units. Defaults to celsius.",
                },
            },
            "required": ["city"],
        },
    },
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
)

# Response contains tool_use blocks
for block in response.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}")
        print(f"Input: {block.input}")
        print(f"ID: {block.id}")
```
## Returning Tool Results to Claude
```python
# After executing the tool, send results back
messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {"role": "assistant", "content": response.content},
    {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": '{"temp": 22, "conditions": "partly cloudy", "humidity": 65}',
            }
        ],
    },
]

final = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages,
)
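In practice this request/execute/return cycle runs in a loop until the model stops asking for tools. A sketch of that loop for Claude, assuming the `client` and `tools` defined above and an `execute_tool(name, inputs)` dispatcher of your own (not part of the SDK); `stop_reason == "tool_use"` is how the Messages API signals pending tool calls:

```python
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )
    if response.stop_reason != "tool_use":
        break  # model produced a final answer instead of a tool call

    # Echo the assistant turn, then answer every tool_use block it contains
    messages.append({"role": "assistant", "content": response.content})
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": str(execute_tool(block.name, block.input)),
        }
        for block in response.content
        if block.type == "tool_use"
    ]
    messages.append({"role": "user", "content": tool_results})
```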
## OpenAI Tool Calling

OpenAI wraps each function definition in a `function` object tagged with `"type": "function"`.
```python
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name."},
                    "units": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                    },
                },
                "required": ["city"],
            },
        },
    },
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    tools=tools,
)

# Extract tool calls
message = response.choices[0].message
if message.tool_calls:
    for tc in message.tool_calls:
        print(f"Function: {tc.function.name}")
        print(f"Args: {tc.function.arguments}")  # JSON string
        print(f"ID: {tc.id}")
```
## Returning Results to OpenAI
```python
import json

messages = [
    {"role": "user", "content": "Weather in Tokyo?"},
    message,  # assistant message with tool_calls
    {
        "role": "tool",
        "tool_call_id": tc.id,
        "content": json.dumps({"temp": 22, "conditions": "partly cloudy"}),
    },
]

final = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
)
```
## Gemini Tool Calling

Gemini declares functions with `types.FunctionDeclaration` objects wrapped in a `types.Tool`.
```python
from google import genai
from google.genai import types

client = genai.Client()

weather_tool = types.Tool(
    function_declarations=[
        types.FunctionDeclaration(
            name="get_weather",
            description="Get current weather for a city.",
            parameters=types.Schema(
                type=types.Type.OBJECT,
                properties={
                    "city": types.Schema(type=types.Type.STRING),
                    "units": types.Schema(
                        type=types.Type.STRING,
                        enum=["celsius", "fahrenheit"],
                    ),
                },
                required=["city"],
            ),
        )
    ]
)

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Weather in Tokyo?",
    config=types.GenerateContentConfig(tools=[weather_tool]),
)

# Check for function calls in the response
for part in response.candidates[0].content.parts:
    if part.function_call:
        print(f"Function: {part.function_call.name}")
        print(f"Args: {part.function_call.args}")
```
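To complete the round trip with Gemini, the executed result goes back as a function-response part. A sketch using the google-genai SDK's `types.Part.from_function_response` helper, continuing from the variables above (the weather values are illustrative):

```python
# Build the function result as a Part and replay the conversation
function_response_part = types.Part.from_function_response(
    name="get_weather",
    response={"temp": 22, "conditions": "partly cloudy", "humidity": 65},
)

followup = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        types.Content(role="user", parts=[types.Part(text="Weather in Tokyo?")]),
        response.candidates[0].content,  # the model's function_call turn
        types.Content(role="user", parts=[function_response_part]),
    ],
    config=types.GenerateContentConfig(tools=[weather_tool]),
)
print(followup.text)
```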
## Schema Design Best Practices
Good schemas make agents more reliable. Bad schemas cause hallucinated parameters and wrong tool selection.
### Write Descriptive Tool Descriptions
```python
# BAD: vague, the model will misuse this
{
    "name": "query",
    "description": "Run a query.",
    "input_schema": {
        "type": "object",
        "properties": {
            "q": {"type": "string"},
        },
    },
}

# GOOD: specific, with examples
{
    "name": "search_products",
    "description": "Search the product catalog by keyword. Returns up to 10 "
    "matching products with name, price, and availability. "
    "Use this when the user asks about products, inventory, or pricing.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search keywords, e.g. 'red running shoes size 10'",
            },
            "category": {
                "type": "string",
                "enum": ["shoes", "clothing", "accessories", "electronics"],
                "description": "Optional category filter.",
            },
            "max_results": {
                "type": "integer",
                "description": "Number of results to return (1-50). Default: 10.",
            },
        },
        "required": ["query"],
    },
}
```
### Use Enums to Constrain Values
```python
# Enums prevent the model from inventing invalid values
"status_filter": {
    "type": "string",
    "enum": ["open", "closed", "in_progress", "blocked"],
    "description": "Filter tasks by status.",
}
```
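Even with enums in the schema, it is worth validating inputs server-side before dispatch, since not every provider enforces the schema strictly. A minimal hand-rolled check over JSON-Schema-style dicts like the ones above (the function name is illustrative):

```python
def validate_enum_inputs(schema: dict, inputs: dict) -> list[str]:
    """Return error strings for enum violations (empty list if valid)."""
    errors = []
    for param, spec in schema.get("properties", {}).items():
        allowed = spec.get("enum")
        if allowed and param in inputs and inputs[param] not in allowed:
            errors.append(
                f"Invalid value {inputs[param]!r} for '{param}'. "
                f"Allowed: {allowed}"
            )
    return errors


schema = {
    "type": "object",
    "properties": {
        "status_filter": {
            "type": "string",
            "enum": ["open", "closed", "in_progress", "blocked"],
        },
    },
}

print(validate_enum_inputs(schema, {"status_filter": "done"}))
```

Returning the error string as a tool result lets the model retry with a valid value instead of failing silently.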
### Nest Objects for Complex Inputs
```python
{
    "name": "create_issue",
    "description": "Create a new issue in the project tracker.",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "body": {"type": "string"},
            "assignee": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "team": {"type": "string"},
                },
                "required": ["name"],
            },
            "labels": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Labels like 'bug', 'feature', 'urgent'.",
            },
        },
        "required": ["title"],
    },
}
```
## Parallel Tool Calls
Claude and OpenAI can return multiple tool calls in a single response. Handle all of them before continuing.
```python
import concurrent.futures


def process_response(response) -> list[dict]:
    """Execute all tool calls from a response in parallel."""
    tool_calls = [b for b in response.content if b.type == "tool_use"]
    if not tool_calls:
        return []
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        futures = {
            executor.submit(execute_tool, tc.name, tc.input): tc
            for tc in tool_calls
        }
        for future in concurrent.futures.as_completed(futures):
            tc = futures[future]
            try:
                output = future.result(timeout=30)
            except Exception as e:
                output = f"Error executing {tc.name}: {e}"
            results.append({
                "type": "tool_result",
                "tool_use_id": tc.id,
                "content": str(output),
            })
    return results
```
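Note that `as_completed` yields in completion order, so the result blocks may not match the call order. Providers match results by ID rather than position, but if you want deterministic ordering, `executor.map` preserves it. A self-contained sketch with stubbed tool calls standing in for a real API response (all names here are illustrative):

```python
import concurrent.futures
from types import SimpleNamespace


def execute_tool(name: str, inputs: dict) -> str:
    # Stub handler standing in for real tool execution
    return f"{name} ran with {inputs}"


fake_calls = [  # stand-ins for tool_use blocks on a response
    SimpleNamespace(id="tu_1", name="get_weather", input={"city": "Tokyo"}),
    SimpleNamespace(id="tu_2", name="get_weather", input={"city": "Osaka"}),
]

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # map() preserves input order, unlike as_completed()
    outputs = list(executor.map(
        lambda tc: execute_tool(tc.name, tc.input), fake_calls
    ))

results = [
    {"type": "tool_result", "tool_use_id": tc.id, "content": out}
    for tc, out in zip(fake_calls, outputs)
]
print([r["tool_use_id"] for r in results])  # ['tu_1', 'tu_2']
```

One trade-off: `executor.map` re-raises the first exception when you iterate its results, so per-call error capture needs a wrapper that catches inside the mapped function.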
## Error Handling
Always return errors as tool results — never crash the loop.
```python
import signal


def execute_tool_safe(name: str, inputs: dict) -> str:
    """Execute a tool with comprehensive error handling."""
    # Validate tool exists
    if name not in TOOL_REGISTRY:
        return f"Error: Unknown tool '{name}'. Available: {list(TOOL_REGISTRY.keys())}"
    # Validate required params
    schema = TOOL_REGISTRY[name]["schema"]
    for param in schema.get("required", []):
        if param not in inputs:
            return f"Error: Missing required parameter '{param}' for tool '{name}'."
    # Execute with a timeout (SIGALRM works on Unix, main thread only)
    def timeout_handler(signum, frame):
        raise TimeoutError("Tool execution timed out after 30 seconds")
    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(30)
    try:
        result = TOOL_REGISTRY[name]["handler"](inputs)
        return str(result)[:10000]  # Cap output length
    except TimeoutError as e:
        return f"Error: {e}"
    except PermissionError:
        return f"Error: Permission denied executing '{name}'."
    except Exception as e:
        return f"Error in {name}: {type(e).__name__}: {e}"
    finally:
        signal.alarm(0)  # Cancel the pending alarm
```
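`signal.alarm` is Unix-only and works only in the main thread. A cross-platform alternative is to run the handler in a worker thread and bound the wait; a sketch (the function name is illustrative, and note the worker is abandoned on timeout, not killed):

```python
import concurrent.futures


def run_with_timeout(handler, inputs: dict, timeout: float = 30.0) -> str:
    """Run a tool handler with a wall-clock timeout, cross-platform."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(handler, inputs)
    try:
        return str(future.result(timeout=timeout))
    except concurrent.futures.TimeoutError:
        return f"Error: tool timed out after {timeout} seconds"
    except Exception as e:
        return f"Error: {type(e).__name__}: {e}"
    finally:
        # wait=False so a hung worker doesn't block the agent loop;
        # the thread keeps running in the background until it finishes
        executor.shutdown(wait=False)


print(run_with_timeout(lambda i: i["a"] + i["b"], {"a": 2, "b": 3}))  # 5
```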
## Dynamic Tool Registration
Build tool sets that agents can extend at runtime.
```python
class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, dict] = {}

    def register(self, name: str, description: str, schema: dict, handler):
        """Register a tool with its schema and handler."""
        self._tools[name] = {
            "definition": {
                "name": name,
                "description": description,
                "input_schema": schema,
            },
            "handler": handler,
        }

    def get_definitions(self) -> list[dict]:
        """Return tool definitions for the API call."""
        return [t["definition"] for t in self._tools.values()]

    def execute(self, name: str, inputs: dict) -> str:
        if name not in self._tools:
            return f"Unknown tool: {name}"
        try:
            result = self._tools[name]["handler"](inputs)
            return str(result)
        except Exception as e:
            return f"Error: {e}"


# Usage
registry = ToolRegistry()
registry.register(
    "add_numbers",
    "Add two numbers together.",
    {
        "type": "object",
        "properties": {
            "a": {"type": "number"},
            "b": {"type": "number"},
        },
        "required": ["a", "b"],
    },
    lambda inputs: inputs["a"] + inputs["b"],
)

# Pass to API
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    tools=registry.get_definitions(),
    messages=messages,
    max_tokens=1024,
)
```
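The definitions above use Claude's `input_schema` shape. Since OpenAI expects the same JSON Schema under a `function` wrapper with a `parameters` key instead, a small adapter (illustrative, not part of either SDK) lets one registry serve both APIs:

```python
def to_openai_tools(definitions: list[dict]) -> list[dict]:
    """Convert Claude-style tool definitions to OpenAI's function format."""
    return [
        {
            "type": "function",
            "function": {
                "name": d["name"],
                "description": d["description"],
                "parameters": d["input_schema"],  # same JSON Schema, new key
            },
        }
        for d in definitions
    ]


claude_defs = [{
    "name": "add_numbers",
    "description": "Add two numbers together.",
    "input_schema": {"type": "object", "properties": {}, "required": []},
}]
openai_tools = to_openai_tools(claude_defs)
print(openai_tools[0]["function"]["name"])  # add_numbers
```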
## Tool Result Formatting
Format results so the model can easily parse and reason about them.
```python
def format_search_results(results: list[dict]) -> str:
    """Format search results as structured text the model can parse."""
    if not results:
        return "No results found."
    lines = [f"Found {len(results)} results:\n"]
    for i, r in enumerate(results, 1):
        lines.append(f"{i}. {r['title']}")
        lines.append(f"   URL: {r['url']}")
        lines.append(f"   Snippet: {r['snippet'][:200]}")
        lines.append("")
    return "\n".join(lines)


def format_error_result(error: Exception, tool_name: str) -> str:
    """Format errors with enough context for the agent to recover."""
    return (
        f"Tool '{tool_name}' failed.\n"
        f"Error type: {type(error).__name__}\n"
        f"Message: {str(error)}\n"
        f"Suggestion: Check your inputs and try again, or use an alternative approach."
    )
```
Rules for formatting:

- **Structured text over raw JSON**: models parse structured text more reliably than deeply nested JSON.
- **Include counts**: "Found 3 results" helps the model decide whether it needs to search again.
- **Truncate intelligently**: cut at sentence boundaries, not mid-word.
- **Include actionable error messages**: tell the agent what it can do differently.
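The truncation rule above can be implemented with a small helper that cuts at the last sentence boundary within a limit (a sketch; the punctuation heuristic is naive for abbreviations like "e.g."):

```python
def truncate_at_sentence(text: str, max_chars: int = 500) -> str:
    """Truncate text at the last sentence boundary within max_chars."""
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # Prefer the last sentence end within the window
    idx = max(cut.rfind(sep) for sep in (". ", "! ", "? "))
    if idx != -1:
        return cut[: idx + 1] + " [truncated]"
    # Fall back to the last word boundary
    idx = cut.rfind(" ")
    return (cut[:idx] if idx != -1 else cut) + " [truncated]"


print(truncate_at_sentence("First sentence. Second sentence is long.", 20))
# First sentence. [truncated]
```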
## Related Skills

- **agent-architecture**: Core patterns for building AI agent systems: the observe-think-act loop, ReAct pattern implementation, tool-use cycles, memory systems (short-term and long-term), and planning strategies. Covers how to structure an agent's main loop, manage state between iterations, and wire together perception, reasoning, and action into a reliable autonomous system.
- **agent-error-recovery**: Handling failures in AI agent systems: retry strategies with backoff, fallback tools, graceful degradation, human-in-the-loop escalation, stuck-loop detection, and context recovery after crashes. Covers practical patterns for making agents robust against tool failures, API errors, and reasoning dead-ends.
- **agent-evaluation**: Testing and evaluating AI agents: trajectory evaluation, task completion metrics, tool-use accuracy measurement, regression testing, benchmark suites, and A/B testing agent configurations. Covers practical approaches to measuring whether agents are working correctly and improving over time.
- **agent-frameworks**: Comparison of major AI agent frameworks: LangGraph, CrewAI, AutoGen, Semantic Kernel, and Claude Agent SDK. Covers when to use each framework, their trade-offs, core patterns, practical setup examples, and migration strategies between frameworks.
- **agent-guardrails**: Safety and control systems for AI agents: input and output validation, action authorization, rate limiting, cost controls, content filtering, scope restriction, and audit logging. Covers practical implementations for keeping agents within bounds while maintaining their usefulness.
- **agent-memory**: Memory systems for AI agents: conversation history management, summarization strategies, vector-based long-term memory, entity memory, episodic memory, and memory retrieval patterns. Covers practical implementations for giving agents persistent, searchable memory across sessions and within long-running tasks.