| title | aliases | tags | sources | created | updated |
|---|---|---|---|---|---|
| How the Agent Loop Works | | | | 2026-04-17 | 2026-04-17 |
Overview
The Agent SDK embeds Claude Code's autonomous execution loop in your application. When you call query(), Claude evaluates the prompt, calls tools, receives results, and repeats — until the task is done or a limit is hit.
The Loop Cycle
1. Receive prompt: Claude gets prompt + system prompt + tool definitions + conversation history. SDK yields `SystemMessage(subtype="init")`.
2. Evaluate and respond: Claude decides: text, tool calls, or both. SDK yields `AssistantMessage`.
3. Execute tools: SDK runs each requested tool and returns results. Hooks can intercept before execution.
4. Repeat: Steps 2–3 form one turn. Loop continues until Claude produces a response with no tool calls.
5. Return result: SDK yields final `AssistantMessage` (text only) + `ResultMessage` (text, cost, token usage, session ID).
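
The cycle above can be sketched as a small generator. This is a toy simulation, not the SDK: `run_loop`, `Response`, `ToolCall`, and the scripted model are hypothetical stand-ins, and events are plain tuples rather than the SDK's message classes.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Response:
    text: str
    tool_calls: list = field(default_factory=list)


def run_loop(model: Callable[[list], Response], tools: dict, prompt: str):
    """Yield (kind, payload) events in the order the loop cycle describes."""
    history = [("user", prompt)]
    yield ("system", "init")                  # step 1: session start
    while True:
        response = model(history)             # step 2: evaluate and respond
        history.append(("assistant", response))
        yield ("assistant", response.text)
        if not response.tool_calls:           # no tool calls: the loop ends
            break
        for call in response.tool_calls:      # step 3: execute tools
            result = tools[call.name](**call.args)
            history.append(("tool_result", result))
            yield ("tool_result", result)
    yield ("result", response.text)           # step 5: final result


# Scripted stand-in for the model: one tool call, then a text-only answer.
def scripted_model(history):
    if not any(kind == "tool_result" for kind, _ in history):
        return Response("Reading the file.", [ToolCall("read", {"path": "a.txt"})])
    return Response("The file has 3 lines.")


tools = {"read": lambda path: f"<contents of {path}>"}
events = list(run_loop(scripted_model, tools, "Count lines"))
```

The second assistant event carries no tool calls, which is what ends the loop and triggers the final result event.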
Turns
- A turn = one round trip: Claude outputs tool calls → SDK executes → results fed back to Claude.
- Turns happen without yielding control to your code.
- The loop ends when Claude responds with no tool calls.
- Cap turns with `max_turns` / `maxTurns` (counts tool-use turns only).
- Cap cost with `max_budget_usd` / `maxBudgetUsd`.
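
To make the counting rule concrete, here is a minimal sketch of how the two caps could interact with a scripted run. `run_with_caps` is hypothetical, the flat per-response cost is invented for illustration, and the order in which the real SDK checks its limits is an assumption.

```python
def run_with_caps(responses, max_turns=None, max_budget_usd=None, cost_per_response=0.01):
    """Replay a scripted run and report how it ends.

    responses: list of bools, True when that response contains tool calls.
    Returns (subtype, tool_use_turns).
    """
    turns = 0
    cost = 0.0
    for has_tool_calls in responses:
        cost += cost_per_response              # assumption: flat cost per response
        if max_budget_usd is not None and cost > max_budget_usd:
            return "error_max_budget_usd", turns
        if not has_tool_calls:                 # text-only response ends the loop
            return "success", turns
        if max_turns is not None and turns >= max_turns:
            return "error_max_turns", turns
        turns += 1                             # only tool-use turns are counted
    raise ValueError("script ended without a final text-only response")
```

With `max_turns=2`, a run of two tool-use turns followed by a text-only answer still succeeds, because the final response does not count against the cap.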
Message Types
| Type | When emitted | Key content |
|---|---|---|
| `SystemMessage` | Session start + after compaction | `subtype="init"` or `"compact_boundary"` |
| `AssistantMessage` | After each Claude response | Text blocks + tool call blocks |
| `UserMessage` | After each tool execution | Tool result content |
| `StreamEvent` | Only with partial messages enabled | Raw API streaming deltas |
| `ResultMessage` | End of loop | Final text, cost, usage, session ID, subtype |
Handling patterns:
- Final result only → handle `ResultMessage`
- Progress updates → handle `AssistantMessage`
- Live streaming → enable `include_partial_messages` / `includePartialMessages`
Python: check the type with `isinstance(message, ResultMessage)`.

TypeScript: check `message.type === "result"`. Note that for assistant and user messages, content is at `message.message.content`, not `message.content`.
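
The handling patterns above amount to dispatching on message type. The sketch below uses local stand-in dataclasses that borrow the SDK's class names so it runs without the SDK installed; their fields are assumptions, not the real definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssistantMessage:      # local stand-in, not the SDK class
    text: str


@dataclass
class ResultMessage:         # local stand-in, not the SDK class
    subtype: str
    result: Optional[str] = None


def summarize(messages):
    """Collect progress text and the final result from a message stream."""
    progress, final = [], None
    for message in messages:
        if isinstance(message, AssistantMessage):
            progress.append(message.text)      # progress updates
        elif isinstance(message, ResultMessage):
            final = message                    # arrives last
    return progress, final


stream = [
    AssistantMessage("Searching the repo."),
    AssistantMessage("Found two matches."),
    ResultMessage("success", "Found two matches."),
]
progress, final = summarize(stream)
```

The loop never breaks early, so the result message is always seen even when only progress updates are of interest.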
Tool Execution
Built-in Tools
| Category | Tools |
|---|---|
| File ops | Read, Edit, Write |
| Search | Glob, Grep |
| Execution | Bash |
| Web | WebSearch, WebFetch |
| Discovery | ToolSearch (load tools on-demand) |
| Orchestration | Agent, Skill, AskUserQuestion, TodoWrite |
Extend with MCP servers, custom tool handlers, or skill setting sources.
Parallel Execution
- Read-only tools (`Read`, `Glob`, `Grep`, read-only MCP tools) → run concurrently
- State-mutating tools (`Edit`, `Write`, `Bash`) → run sequentially
- Custom tools → sequential by default; opt into parallel via the `readOnly` / `readOnlyHint` annotation
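
One way to picture the rule above is to group consecutive read-only calls into a batch that runs concurrently, while each state-mutating call runs alone. The SDK's actual scheduling is not documented here, so `plan_batches` and `run_batches` are a hypothetical sketch of the idea only.

```python
import asyncio

READ_ONLY = {"Read", "Glob", "Grep"}


def plan_batches(calls):
    """Group consecutive read-only calls; every mutating call gets its own batch."""
    batches = []
    for name in calls:
        last_is_read_only = bool(batches) and all(n in READ_ONLY for n in batches[-1])
        if name in READ_ONLY and last_is_read_only:
            batches[-1].append(name)
        else:
            batches.append([name])
    return batches


async def run_batches(batches, execute):
    """Run batches in order; calls inside one batch run concurrently."""
    results = []
    for batch in batches:
        results.extend(await asyncio.gather(*(execute(name) for name in batch)))
    return results


async def fake_execute(name):
    await asyncio.sleep(0)       # stands in for real tool work
    return f"{name} done"


batches = plan_batches(["Read", "Grep", "Edit", "Read", "Glob", "Bash"])
results = asyncio.run(run_batches(batches, fake_execute))
```

Because `asyncio.gather` preserves argument order, results come back in the order the calls were requested even when a batch runs concurrently.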
Tool Permissions
- `allowed_tools` / `allowedTools`: auto-approve listed tools
- `disallowed_tools` / `disallowedTools`: hard block regardless of other settings
- Scope individual tools: `"Bash(npm *)"` allows only npm commands
When denied, Claude receives a rejection message and adjusts approach.
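
A minimal sketch of how the allow and deny lists could be evaluated, including a scoped rule such as `"Bash(npm *)"`. The helper names are hypothetical, and using shell-style globbing via `fnmatch` is an assumption about the pattern semantics, not the SDK's documented matcher.

```python
from fnmatch import fnmatchcase


def parse_rule(rule):
    """Split "Bash(npm *)" into ("Bash", "npm *"); a bare name has no scope."""
    if rule.endswith(")") and "(" in rule:
        name, scope = rule[:-1].split("(", 1)
        return name, scope
    return rule, None


def matches(rule, tool, command=""):
    name, scope = parse_rule(rule)
    return name == tool and (scope is None or fnmatchcase(command, scope))


def check(tool, command, allowed, disallowed):
    """Deny rules win; otherwise allow on a match; otherwise defer to the permission mode."""
    if any(matches(rule, tool, command) for rule in disallowed):
        return "deny"
    if any(matches(rule, tool, command) for rule in allowed):
        return "allow"
    return "ask"
```

For example, `check("Bash", "npm test", ["Bash(npm *)"], [])` is allowed, while a command outside the scope falls through to whatever the permission mode decides.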
Control Options
Permission Modes
| Mode | Behavior |
|---|---|
| `"default"` | Uncovered tools trigger approval callback; no callback = deny |
| `"acceptEdits"` | Auto-approves file edits + `mkdir`, `touch`, `mv`, `cp`; other Bash follows default |
| `"plan"` | No tool execution; Claude produces a plan only |
| `"dontAsk"` | Never prompts; pre-approved tools run, rest denied |
| `"auto"` (TS only) | Model classifier approves/denies each tool call |
| `"bypassPermissions"` | All allowed tools run without asking; not allowed as root on Unix |
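
The table can be read as a decision function. The sketch below is a deliberately simplified model under several assumptions: `decide` is hypothetical, the set of "file edit" tools is guessed as `Edit` and `Write`, the `"auto"` classifier is left out, and disallowed-tool blocks are ignored.

```python
FILE_EDIT_TOOLS = {"Edit", "Write"}                   # assumption: what counts as a file edit
AUTO_APPROVED_COMMANDS = {"mkdir", "touch", "mv", "cp"}


def decide(mode, tool, pre_approved=False, has_callback=False, bash_command=""):
    """Return "allow", "deny", or "ask" for one tool call."""
    if mode == "plan":
        return "deny"                                 # plan only, no tool execution
    if pre_approved or mode == "bypassPermissions":
        return "allow"
    if mode == "dontAsk":
        return "deny"                                 # never prompts
    if mode == "acceptEdits":
        if tool in FILE_EDIT_TOOLS:
            return "allow"
        words = bash_command.split()
        if tool == "Bash" and words and words[0] in AUTO_APPROVED_COMMANDS:
            return "allow"
    # "default" behavior: ask through the callback, or deny when there is none
    return "ask" if has_callback else "deny"
```

Other Bash commands under `"acceptEdits"` fall through to the default branch, matching the "other Bash follows default" note in the table.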
Effort Levels
| Level | Good for |
|---|---|
| `"low"` | File lookups, listing directories |
| `"medium"` | Routine edits, standard tasks |
| `"high"` | Refactors, debugging (TypeScript SDK default) |
| `"xhigh"` | Coding + agentic tasks; recommended on Opus 4.7 |
| `"max"` | Multi-step deep analysis |
- Python SDK: unset by default (defers to model default)
- `effort` is independent of Extended Thinking; they can be combined freely
Context Window
Context accumulates across all turns within a session — it does not reset.
| Source | Impact |
|---|---|
| System prompt | Small, fixed; always present |
| CLAUDE.md | Full content every request (prompt-cached after first) |
| Tool definitions | Each tool adds its schema |
| Conversation history | Grows with every turn |
| Skill descriptions | Short summaries; full content only on invocation |
Large tool outputs (big files, verbose commands) consume significant context fast.
Automatic Compaction
When context approaches its limit, the SDK summarizes older history. A compact_boundary message is emitted. Instructions from early prompts may be lost — put persistent rules in CLAUDE.md, not in the initial prompt.
Customize compaction:
- CLAUDE.md section: tell the compactor what to preserve (free-form header, matched by intent)
- `PreCompact` hook: archive full transcript before compaction
- Manual: send `/compact` as a prompt string
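
The effect of automatic compaction can be illustrated with a toy function. This is a sketch under loud assumptions: `compact` is hypothetical, size is measured in characters as a crude proxy for tokens, and the placeholder summary stands in for the model-written summary the SDK would produce.

```python
def compact(history, limit, keep_recent=2, summarize=None):
    """Replace older messages with one summary once the history exceeds the limit.

    Returns (new_history, compacted_flag).
    """
    if summarize is None:
        summarize = lambda msgs: f"[summary of {len(msgs)} earlier messages]"
    total = sum(len(message) for message in history)
    if total <= limit or len(history) <= keep_recent:
        return history, False
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent, True


history = ["RULE: never touch prod. " + "a" * 30, "b" * 50, "c" * 10, "d" * 10]
compacted, did_compact = compact(history, limit=100)
```

The early rule in the first message survives only if the summary happens to keep it, which is why persistent rules belong in CLAUDE.md rather than the initial prompt.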
Context Efficiency Tips
- Use subagents for subtasks — each starts with a fresh context; only final result returns to parent
- Scope subagent tool lists to the minimum needed
- Use `ToolSearch` to load MCP tools on-demand instead of preloading all
- Set `effort: "low"` for routine read-only tasks
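
The subagent tip is easiest to see as arithmetic on context size. In this illustrative sketch, all function names are hypothetical and characters again stand in for tokens: running a subtask inline adds every tool output to the parent's history, while delegating adds only the final summary.

```python
def context_chars(messages):
    return sum(len(message) for message in messages)


def run_inline(parent_history, tool_outputs, summary):
    # Every tool output lands in the parent's own history.
    return parent_history + tool_outputs + [summary]


def run_delegated(parent_history, tool_outputs, summary):
    # The subagent starts fresh and pays for the tool outputs itself.
    subagent_history = ["subtask prompt", *tool_outputs]
    subagent_cost = context_chars(subagent_history)
    # Only the final result returns to the parent.
    return parent_history + [summary], subagent_cost


parent = ["task"]
outputs = ["x" * 1000, "y" * 2000]
inline_size = context_chars(run_inline(parent, outputs, "done"))
delegated_history, subagent_cost = run_delegated(parent, outputs, "done")
delegated_size = context_chars(delegated_history)
```

The work is not free in the delegated case; it is simply charged to a context that is discarded when the subagent finishes.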
Sessions and Continuity
- Capture `ResultMessage.session_id` to resume a session later
- Resuming restores the full context: files read, analysis performed, actions taken
- Sessions can be forked to branch into a different approach
- Python `ClaudeSDKClient` manages session IDs automatically across calls
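
The difference between resuming and forking can be shown with an in-memory stand-in. `SessionStore` is hypothetical and exists only for illustration; the SDK persists sessions itself, and its actual resume and fork options are not shown here.

```python
class SessionStore:
    """Toy session store: resume appends to a history, fork copies it."""

    def __init__(self):
        self._histories = {}
        self._counter = 0

    def _new_id(self):
        self._counter += 1
        return f"session-{self._counter}"

    def start(self, prompt):
        session_id = self._new_id()
        self._histories[session_id] = [prompt]
        return session_id

    def resume(self, session_id, prompt):
        # Same session: the full earlier context is still there.
        self._histories[session_id].append(prompt)
        return session_id

    def fork(self, session_id, prompt):
        # New session that starts from a copy of the original's context.
        forked_id = self._new_id()
        self._histories[forked_id] = [*self._histories[session_id], prompt]
        return forked_id

    def history(self, session_id):
        return list(self._histories[session_id])


store = SessionStore()
main = store.start("Audit the auth module")
branch = store.fork(main, "Try a full rewrite instead")
store.resume(main, "Now fix the bug you found")
```

After these calls the two sessions share the first prompt but diverge afterwards, and neither sees the other's later turns.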
Result Subtypes
| Subtype | Meaning | `result` field? |
|---|---|---|
| `success` | Task completed normally | Yes |
| `error_max_turns` | Hit `maxTurns` limit | No |
| `error_max_budget_usd` | Hit `maxBudgetUsd` limit | No |
| `error_during_execution` | API failure or cancellation | No |
| `error_max_structured_output_retries` | Structured output validation failed | No |
Always check subtype before reading result. All subtypes carry total_cost_usd, usage, num_turns, session_id.
stop_reason values: end_turn, max_tokens, refusal — check to detect model refusals.
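
A small sketch of checking `subtype` before touching `result`. The `ResultMessage` below is a local stand-in so the example runs without the SDK; treating `stop_reason` as a field on the same object is an assumption, and `describe` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultMessage:                 # local stand-in, not the SDK class
    subtype: str
    total_cost_usd: float
    num_turns: int
    session_id: str
    result: Optional[str] = None     # only present on success
    stop_reason: Optional[str] = None


def describe(message):
    """Turn a result message into a one-line status without assuming `result` exists."""
    if message.subtype == "success":
        if message.stop_reason == "refusal":
            return "model refused the request"
        return message.result
    return (
        f"{message.subtype} after {message.num_turns} turns "
        f"(${message.total_cost_usd:.2f}); resume with {message.session_id}"
    )
```

The error branch relies only on fields that every subtype carries, so it works no matter which limit was hit.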
Hooks
Hooks run in your application process (not inside the agent context window — no context cost).
| Hook | When | Common use |
|---|---|---|
| `PreToolUse` | Before tool executes | Validate inputs, block dangerous commands |
| `PostToolUse` | After tool returns | Audit outputs, trigger side effects |
| `UserPromptSubmit` | When prompt is sent | Inject context |
| `Stop` | When agent finishes | Validate result, save state |
| `SubagentStart` / `SubagentStop` | Subagent spawns/completes | Track parallel tasks |
| `PreCompact` | Before compaction | Archive transcript |
A PreToolUse hook that rejects a call prevents execution; Claude receives the rejection and adjusts.
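
As an illustration of the "block dangerous commands" use, here is a minimal sketch of the logic such a hook might contain. The function signature and the returned dictionary shape are hypothetical, chosen for readability; the real callback API is described in the hooks guide.

```python
import re

# Illustrative patterns only; a real deny list would be project-specific.
DANGEROUS = [
    re.compile(r"\brm\s+-rf\b"),
    re.compile(r"\bgit\s+push\s+--force\b"),
]


def pre_tool_use(tool_name, tool_input):
    """Return an allow/deny decision for one pending tool call (hypothetical shape)."""
    if tool_name == "Bash":
        command = tool_input.get("command", "")
        for pattern in DANGEROUS:
            if pattern.search(command):
                return {"decision": "deny", "reason": f"blocked by pattern {pattern.pattern!r}"}
    return {"decision": "allow"}
```

Returning a reason alongside the denial matters because Claude receives the rejection and uses it to pick a different approach.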
See wiki/agent-sdk/hooks-guide for full event list and callback API.
Key Takeaways
- The agent loop = evaluate → tool calls → results → repeat, until no tool calls remain
- A turn is one round trip; `max_turns` counts tool-use turns only (not the final text-only response)
- `ResultMessage` always arrives last: always iterate the stream to completion, never break early
- Context accumulates across turns; use subagents, ToolSearch, and CLAUDE.md to keep it efficient
- Automatic compaction preserves recent history but can lose early instructions, so put rules in CLAUDE.md
- `permission_mode` + `allowed_tools` + `disallowed_tools` together determine what actually runs
- `effort` controls reasoning depth per turn; lower effort = lower cost for simple tasks
- Session IDs enable resumption, continuation, and forking of any session
Related Articles
- wiki/agent-sdk/overview
- wiki/agent-sdk/hooks-guide
- wiki/agent-sdk/configure-permissions
- wiki/agent-sdk/mcp-integration
- wiki/agent-sdk/hosting-production
- wiki/agent-sdk/python-api-reference
- wiki/agent-sdk/typescript-api-reference
Sources
- `raw/How the agent loop works.md`: sourced from https://code.claude.com/docs/en/agent-sdk/agent-loop