obsidian/wiki/agent-sdk/agent-loop.md
2026-04-17 12:52:04 +01:00


title: How the Agent Loop Works
aliases: agent-loop, sdk-agent-loop, message-lifecycle
tags: agent-sdk, agent-loop, messages, tools, context-window, sessions
sources: raw/How the agent loop works.md
created: 2026-04-17
updated: 2026-04-17

Overview

The Agent SDK embeds Claude Code's autonomous execution loop in your application. When you call query(), Claude evaluates the prompt, calls tools, receives results, and repeats — until the task is done or a limit is hit.

The Loop Cycle

  1. Receive prompt: Claude gets the prompt, system prompt, tool definitions, and conversation history. The SDK yields SystemMessage(subtype="init").
  2. Evaluate and respond: Claude decides whether to reply with text, tool calls, or both. The SDK yields an AssistantMessage.
  3. Execute tools: the SDK runs each requested tool and returns the results to Claude. Hooks can intercept a call before it executes.
  4. Repeat: steps 2 and 3 form one turn. The loop continues until Claude produces a response with no tool calls.
  5. Return result: the SDK yields a final AssistantMessage (text only), then a ResultMessage (text, cost, token usage, session ID).
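
The five steps above can be sketched as a toy loop in plain Python. This is an illustration of the control flow only, not the SDK's implementation: `ToolCall`, `Reply`, `run_loop`, and `scripted_model` are hypothetical names, and the "model" is a scripted function standing in for Claude.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Reply:
    text: str
    tool_calls: list = field(default_factory=list)


def run_loop(model: Callable, tools: dict, prompt: str):
    """Evaluate -> execute tools -> repeat, until a reply has no tool calls."""
    history = [("user", prompt)]                 # step 1: receive prompt
    turns = 0
    while True:
        reply = model(history)                   # step 2: evaluate and respond
        history.append(("assistant", reply.text))
        if not reply.tool_calls:                 # step 5: text-only reply ends the loop
            return reply.text, turns
        for call in reply.tool_calls:            # step 3: execute each requested tool
            result = tools[call.name](**call.args)
            history.append(("tool_result", result))
        turns += 1                               # step 4: one tool-use turn completed


def scripted_model(history):
    """Stand-in for Claude: request one file read, then answer from the result."""
    if not any(role == "tool_result" for role, _ in history):
        return Reply("Let me check.", [ToolCall("read", {"path": "notes.txt"})])
    return Reply("The file says: " + history[-1][1])


final_text, turn_count = run_loop(
    scripted_model,
    {"read": lambda path: f"contents of {path}"},
    "Summarize notes.txt",
)
```

Here the loop runs one tool-use turn and then stops on the text-only reply, which is why the final response does not count as a turn.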

Turns

  • A turn = one round trip: Claude outputs tool calls → SDK executes → results fed back to Claude.
  • Turns happen without yielding control to your code.
  • The loop ends when Claude responds with no tool calls.
  • Cap turns with max_turns / maxTurns (counts tool-use turns only).
  • Cap cost with max_budget_usd / maxBudgetUsd.

Message Types

| Type | When emitted | Key content |
| --- | --- | --- |
| SystemMessage | Session start and after compaction | subtype="init" or "compact_boundary" |
| AssistantMessage | After each Claude response | Text blocks and tool call blocks |
| UserMessage | After each tool execution | Tool result content |
| StreamEvent | Only with partial messages enabled | Raw API streaming deltas |
| ResultMessage | End of loop | Final text, cost, usage, session ID, subtype |

Handling patterns:

  • Final result only → handle ResultMessage
  • Progress updates → handle AssistantMessage
  • Live streaming → enable include_partial_messages / includePartialMessages

Python: isinstance(message, ResultMessage)
TypeScript: message.type === "result". Note that for assistant and user messages the content blocks are at message.message.content, not message.content.
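
A minimal sketch of the handling patterns above, in Python. To keep it runnable without the SDK installed, the two dataclasses are simplified stand-ins that only share their names with the SDK types (the real AssistantMessage carries a list of content blocks rather than a single `text` field); `collect` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


# Simplified stand-ins; in real code these would be imported from the SDK.
@dataclass
class AssistantMessage:
    text: str


@dataclass
class ResultMessage:
    subtype: str
    result: Optional[str] = None


def collect(messages):
    """Return (progress_lines, final_result_message) from a message stream."""
    progress, final = [], None
    for message in messages:                     # iterate to completion
        if isinstance(message, AssistantMessage):
            progress.append(message.text)        # progress update
        elif isinstance(message, ResultMessage):
            final = message                      # arrives last in the stream
    return progress, final


stream = [
    AssistantMessage("Reading files..."),
    AssistantMessage("Done."),
    ResultMessage("success", "Done."),
]
progress, final = collect(stream)
```

With the SDK, the same isinstance dispatch would sit inside the loop that consumes query().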

Tool Execution

Built-in Tools

| Category | Tools |
| --- | --- |
| File ops | Read, Edit, Write |
| Search | Glob, Grep |
| Execution | Bash |
| Web | WebSearch, WebFetch |
| Discovery | ToolSearch (load tools on-demand) |
| Orchestration | Agent, Skill, AskUserQuestion, TodoWrite |

Extend with MCP servers, custom tool handlers, or skill setting sources.

Parallel Execution

  • Read-only tools (Read, Glob, Grep, read-only MCP tools) → run concurrently
  • State-mutating tools (Edit, Write, Bash) → run sequentially
  • Custom tools → sequential by default; opt into parallel via readOnly / readOnlyHint annotation

Tool Permissions

  • allowed_tools / allowedTools — auto-approve listed tools
  • disallowed_tools / disallowedTools — hard block regardless of other settings
  • Scope individual tools: "Bash(npm *)" allows only npm commands

When denied, Claude receives a rejection message and adjusts approach.
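
To make the rule shapes concrete, here is a rough model of how allow and deny lists with scoped entries such as "Bash(npm *)" might be evaluated. The matching logic is an assumption for illustration (shell-style globbing via fnmatch); the SDK's actual matcher may differ, and `rule_matches` and `decide` are hypothetical helpers.

```python
import re
from fnmatch import fnmatchcase

# Rule syntax assumed here: "Tool" or "Tool(glob pattern)".
RULE = re.compile(r"^(?P<tool>\w+)(?:\((?P<pattern>.*)\))?$")


def rule_matches(rule: str, tool: str, command: str = "") -> bool:
    """True if the rule names this tool and, when scoped, its pattern fits the command."""
    match = RULE.match(rule)
    if match is None or match["tool"] != tool:
        return False
    pattern = match["pattern"]
    return pattern is None or fnmatchcase(command, pattern)


def decide(tool: str, command: str, allowed: list, disallowed: list) -> str:
    """Deny rules win; otherwise allow rules auto-approve; otherwise unresolved."""
    if any(rule_matches(r, tool, command) for r in disallowed):
        return "deny"
    if any(rule_matches(r, tool, command) for r in allowed):
        return "allow"
    return "unresolved"          # left to the permission mode to settle


allowed = ["Read", "Bash(npm *)"]
disallowed = ["WebFetch"]
```

For example, `decide("Bash", "npm install", allowed, disallowed)` yields "allow", while a Bash command outside the npm scope stays "unresolved".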

Control Options

Permission Modes

| Mode | Behavior |
| --- | --- |
| "default" | Uncovered tools trigger approval callback; no callback = deny |
| "acceptEdits" | Auto-approves file edits + mkdir, touch, mv, cp; other Bash follows default |
| "plan" | No tool execution; Claude produces a plan only |
| "dontAsk" | Never prompts; pre-approved tools run, rest denied |
| "auto" (TS only) | Model classifier approves/denies each tool call |
| "bypassPermissions" | All allowed tools run without asking; not allowed as root on Unix |

Effort Levels

| Level | Good for |
| --- | --- |
| "low" | File lookups, listing directories |
| "medium" | Routine edits, standard tasks |
| "high" | Refactors, debugging (TypeScript SDK default) |
| "xhigh" | Coding + agentic tasks; recommended on Opus 4.7 |
| "max" | Multi-step deep analysis |

  • Python SDK: unset by default (defers to model default)
  • effort is independent of Extended Thinking; the two can be combined freely

Context Window

Context accumulates across all turns within a session — it does not reset.

| Source | Impact |
| --- | --- |
| System prompt | Small, fixed; always present |
| CLAUDE.md | Full content every request (prompt-cached after first) |
| Tool definitions | Each tool adds its schema |
| Conversation history | Grows with every turn |
| Skill descriptions | Short summaries; full content only on invocation |

Large tool outputs (big files, verbose commands) consume significant context fast.

Automatic Compaction

When context approaches its limit, the SDK summarizes older history. A compact_boundary message is emitted. Instructions from early prompts may be lost — put persistent rules in CLAUDE.md, not in the initial prompt.

Customize compaction:

  • CLAUDE.md section — tell the compactor what to preserve (free-form header, matched by intent)
  • PreCompact hook — archive full transcript before compaction
  • Manual — send /compact as a prompt string
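
The effect of compaction on early instructions can be seen in a toy model. This is not how the SDK compacts (it produces a real summary of the older history); the token estimate of roughly four characters per token, the `keep_recent` parameter, and both function names are assumptions made for the sketch.

```python
def estimate_tokens(text: str) -> int:
    """Crude assumption: about four characters per token."""
    return max(1, len(text) // 4)


def compact(history: list, limit: int, keep_recent: int = 2) -> list:
    """Replace older messages with a placeholder summary once the limit is exceeded."""
    total = sum(estimate_tokens(m) for m in history)
    if total <= limit or len(history) <= keep_recent:
        return history
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    summary = f"[summary of {len(older)} earlier messages]"
    return [summary] + recent


history = ["Always use tabs.", "x" * 400, "y" * 400, "latest question"]
compacted = compact(history, limit=100)
```

After compaction the instruction "Always use tabs." survives only if the summary happens to carry it, which is the reason to keep persistent rules in CLAUDE.md.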

Context Efficiency Tips

  • Use subagents for subtasks — each starts with a fresh context; only final result returns to parent
  • Scope subagent tool lists to the minimum needed
  • Use ToolSearch to load MCP tools on-demand instead of preloading all
  • Set effort: "low" for routine read-only tasks

Sessions and Continuity

  • Capture ResultMessage.session_id to resume a session later
  • Resuming restores the full context: files read, analysis performed, actions taken
  • Sessions can be forked to branch into a different approach
  • Python ClaudeSDKClient manages session IDs automatically across calls
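
A small sketch of capturing a session ID and preparing to resume or fork. The helpers are hypothetical, and the option names `resume` and `fork_session` are an assumption about the Python SDK's options object that this page does not confirm; check them against the SDK reference before relying on them.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def last_session_id(messages: Iterable) -> Optional[str]:
    """Return the session_id of the last message that carries one."""
    session_id = None
    for message in messages:     # iterate to completion; the result arrives last
        session_id = getattr(message, "session_id", None) or session_id
    return session_id


def resume_options(session_id: str, fork: bool = False) -> dict:
    """Keyword arguments for a follow-up call (option names are assumed)."""
    return {"resume": session_id, "fork_session": fork}


# Simulated stream: an intermediate message, then a result carrying the ID.
stream = [
    SimpleNamespace(text="working"),
    SimpleNamespace(subtype="success", session_id="sess-123"),
]
sid = last_session_id(stream)
opts = resume_options(sid, fork=True)
```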

Result Subtypes

| Subtype | Meaning | Has result field? |
| --- | --- | --- |
| success | Task completed normally | Yes |
| error_max_turns | Hit maxTurns limit | No |
| error_max_budget_usd | Hit maxBudgetUsd limit | No |
| error_during_execution | API failure or cancellation | No |
| error_max_structured_output_retries | Structured output validation failed | No |

Always check subtype before reading result. All subtypes carry total_cost_usd, usage, num_turns, session_id.

stop_reason values: end_turn, max_tokens, refusal — check to detect model refusals.
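
Checking subtype before reading result might look like the following. The dataclass is a stand-in that only mirrors the fields named in this section, so the example runs without the SDK; `summarize` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultMessage:             # stand-in mirroring the fields listed above
    subtype: str
    total_cost_usd: float
    num_turns: int
    session_id: str
    result: Optional[str] = None


def summarize(msg: ResultMessage) -> str:
    """Read result only on success; otherwise report the shared fields."""
    if msg.subtype == "success":
        return f"ok: {msg.result}"
    return (
        f"stopped ({msg.subtype}) after {msg.num_turns} turns, "
        f"${msg.total_cost_usd:.2f}; resume with {msg.session_id}"
    )


ok = summarize(ResultMessage("success", 0.12, 3, "s1", "done"))
capped = summarize(ResultMessage("error_max_turns", 0.5, 10, "s1"))
```

Because every subtype carries the cost, turn count, and session ID, the error branch can still report spend and offer a resume point.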

Hooks

Hooks run in your application process (not inside the agent context window — no context cost).

| Hook | When | Common use |
| --- | --- | --- |
| PreToolUse | Before tool executes | Validate inputs, block dangerous commands |
| PostToolUse | After tool returns | Audit outputs, trigger side effects |
| UserPromptSubmit | When prompt is sent | Inject context |
| Stop | When agent finishes | Validate result, save state |
| SubagentStart/Stop | Subagent spawns/completes | Track parallel tasks |
| PreCompact | Before compaction | Archive transcript |

A PreToolUse hook that rejects a call prevents execution; Claude receives the rejection and adjusts.
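
As an illustration, a PreToolUse callback that blocks risky shell commands could be written along these lines. The three-argument signature and the `hookSpecificOutput` return shape reflect one reading of the Python SDK's hook API and are assumptions here; the `BLOCKED` list is hypothetical, and registration with the SDK is omitted. Confirm the details against the hooks guide.

```python
import asyncio

BLOCKED = ("rm -rf", "mkfs", "dd if=")   # hypothetical deny-list for the example


async def block_dangerous_bash(input_data: dict, tool_use_id, context) -> dict:
    """Deny Bash calls whose command contains a blocked fragment."""
    if input_data.get("tool_name") != "Bash":
        return {}                         # empty dict: no decision, let the call proceed
    command = input_data.get("tool_input", {}).get("command", "")
    for fragment in BLOCKED:
        if fragment in command:
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": f"blocked fragment: {fragment!r}",
                }
            }
    return {}


# Exercise the callback directly with simulated hook inputs.
denied = asyncio.run(block_dangerous_bash(
    {"tool_name": "Bash", "tool_input": {"command": "rm -rf /tmp/x"}}, "id-1", None))
passed = asyncio.run(block_dangerous_bash(
    {"tool_name": "Read", "tool_input": {"file_path": "a.txt"}}, "id-2", None))
```

Substring matching is deliberately simple; a real guard would likely parse the command rather than scan for fragments.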

See wiki/agent-sdk/hooks-guide for full event list and callback API.

Key Takeaways

  • The agent loop = evaluate → tool calls → results → repeat, until no tool calls remain
  • A turn is one round trip; max_turns counts tool-use turns only (not the final text-only response)
  • ResultMessage always arrives last — always iterate the stream to completion, never break early
  • Context accumulates across turns; use subagents, ToolSearch, and CLAUDE.md to keep it efficient
  • Automatic compaction preserves recent history but can lose early instructions — put rules in CLAUDE.md
  • permission_mode + allowed_tools + disallowed_tools together determine what actually runs
  • effort controls reasoning depth per turn; lower effort = lower cost for simple tasks
  • Session IDs enable resumption, continuation, and forking of any session

Sources