obsidian/wiki/agent-sdk/agent-loop.md
2026-04-17 12:52:04 +01:00

194 lines
8.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: "How the Agent Loop Works"
aliases: [agent-loop, sdk-agent-loop, message-lifecycle]
tags: [agent-sdk, agent-loop, messages, tools, context-window, sessions]
sources: [raw/How the agent loop works.md]
created: 2026-04-17
updated: 2026-04-17
---
## Overview
The Agent SDK embeds Claude Code's autonomous execution loop in your application. When you call `query()`, Claude evaluates the prompt, calls tools, receives results, and repeats — until the task is done or a limit is hit.
## The Loop Cycle
1. **Receive prompt** — Claude gets prompt + system prompt + tool definitions + conversation history. SDK yields `SystemMessage(subtype="init")`.
2. **Evaluate and respond** — Claude decides: text, tool calls, or both. SDK yields `AssistantMessage`.
3. **Execute tools** — SDK runs each requested tool and returns results. Hooks can intercept before execution.
4. **Repeat** — Steps 23 form one *turn*. Loop continues until Claude produces a response with **no tool calls**.
5. **Return result** — SDK yields final `AssistantMessage` (text only) + `ResultMessage` (text, cost, token usage, session ID).
## Turns
- A **turn** = one round trip: Claude outputs tool calls → SDK executes → results fed back to Claude.
- Turns happen without yielding control to your code.
- The loop ends when Claude responds with no tool calls.
- Cap turns with `max_turns` / `maxTurns` (counts tool-use turns only).
- Cap cost with `max_budget_usd` / `maxBudgetUsd`.
## Message Types
| Type | When emitted | Key content |
|------|-------------|-------------|
| `SystemMessage` | Session start + after compaction | `subtype="init"` or `"compact_boundary"` |
| `AssistantMessage` | After each Claude response | Text blocks + tool call blocks |
| `UserMessage` | After each tool execution | Tool result content |
| `StreamEvent` | Only with partial messages enabled | Raw API streaming deltas |
| `ResultMessage` | End of loop | Final text, cost, usage, session ID, `subtype` |
**Handling patterns:**
- Final result only → handle `ResultMessage`
- Progress updates → handle `AssistantMessage`
- Live streaming → enable `include_partial_messages` / `includePartialMessages`
**Python:** `isinstance(message, ResultMessage)`
**TypeScript:** `message.type === "result"` — note content is at `message.message.content`, not `message.content`
## Tool Execution
### Built-in Tools
| Category | Tools |
|----------|-------|
| File ops | `Read`, `Edit`, `Write` |
| Search | `Glob`, `Grep` |
| Execution | `Bash` |
| Web | `WebSearch`, `WebFetch` |
| Discovery | `ToolSearch` (load tools on-demand) |
| Orchestration | `Agent`, `Skill`, `AskUserQuestion`, `TodoWrite` |
Extend with MCP servers, custom tool handlers, or skill setting sources.
### Parallel Execution
- Read-only tools (`Read`, `Glob`, `Grep`, read-only MCP tools) → run **concurrently**
- State-mutating tools (`Edit`, `Write`, `Bash`) → run **sequentially**
- Custom tools → sequential by default; opt into parallel via `readOnly` / `readOnlyHint` annotation
### Tool Permissions
- `allowed_tools` / `allowedTools` — auto-approve listed tools
- `disallowed_tools` / `disallowedTools` — hard block regardless of other settings
- Scope individual tools: `"Bash(npm *)"` allows only npm commands
When denied, Claude receives a rejection message and adjusts approach.
## Control Options
### Permission Modes
| Mode | Behavior |
|------|----------|
| `"default"` | Uncovered tools trigger approval callback; no callback = deny |
| `"acceptEdits"` | Auto-approves file edits + `mkdir`, `touch`, `mv`, `cp`; other Bash follows default |
| `"plan"` | No tool execution; Claude produces a plan only |
| `"dontAsk"` | Never prompts; pre-approved tools run, rest denied |
| `"auto"` (TS only) | Model classifier approves/denies each tool call |
| `"bypassPermissions"` | All allowed tools run without asking; not allowed as root on Unix |
### Effort Levels
| Level | Good for |
|-------|----------|
| `"low"` | File lookups, listing directories |
| `"medium"` | Routine edits, standard tasks |
| `"high"` | Refactors, debugging (TypeScript SDK default) |
| `"xhigh"` | Coding + agentic tasks; recommended on Opus 4.7 |
| `"max"` | Multi-step deep analysis |
- Python SDK: unset by default (defers to model default)
- `effort` is independent of Extended Thinking — they can be combined freely
## Context Window
Context **accumulates across all turns** within a session — it does not reset.
| Source | Impact |
|--------|--------|
| System prompt | Small, fixed; always present |
| CLAUDE.md | Full content every request (prompt-cached after first) |
| Tool definitions | Each tool adds its schema |
| Conversation history | Grows with every turn |
| Skill descriptions | Short summaries; full content only on invocation |
Large tool outputs (big files, verbose commands) consume significant context fast.
### Automatic Compaction
When context approaches its limit, the SDK summarizes older history. A `compact_boundary` message is emitted. Instructions from early prompts may be lost — **put persistent rules in CLAUDE.md**, not in the initial prompt.
Customize compaction:
- **CLAUDE.md section** — tell the compactor what to preserve (free-form header, matched by intent)
- **`PreCompact` hook** — archive full transcript before compaction
- **Manual** — send `/compact` as a prompt string
### Context Efficiency Tips
- Use subagents for subtasks — each starts with a fresh context; only final result returns to parent
- Scope subagent tool lists to the minimum needed
- Use `ToolSearch` to load MCP tools on-demand instead of preloading all
- Set `effort: "low"` for routine read-only tasks
## Sessions and Continuity
- Capture `ResultMessage.session_id` to resume a session later
- Resuming restores the full context: files read, analysis performed, actions taken
- Sessions can be forked to branch into a different approach
- Python `ClaudeSDKClient` manages session IDs automatically across calls
## Result Subtypes
| Subtype | Meaning | `result` field? |
|---------|---------|----------------|
| `success` | Task completed normally | Yes |
| `error_max_turns` | Hit `maxTurns` limit | No |
| `error_max_budget_usd` | Hit `maxBudgetUsd` limit | No |
| `error_during_execution` | API failure or cancellation | No |
| `error_max_structured_output_retries` | Structured output validation failed | No |
Always check `subtype` before reading `result`. All subtypes carry `total_cost_usd`, `usage`, `num_turns`, `session_id`.
`stop_reason` values: `end_turn`, `max_tokens`, `refusal` — check to detect model refusals.
## Hooks
Hooks run in your application process (not inside the agent context window — no context cost).
| Hook | When | Common use |
|------|------|-----------|
| `PreToolUse` | Before tool executes | Validate inputs, block dangerous commands |
| `PostToolUse` | After tool returns | Audit outputs, trigger side effects |
| `UserPromptSubmit` | When prompt is sent | Inject context |
| `Stop` | When agent finishes | Validate result, save state |
| `SubagentStart/Stop` | Subagent spawns/completes | Track parallel tasks |
| `PreCompact` | Before compaction | Archive transcript |
A `PreToolUse` hook that rejects a call prevents execution; Claude receives the rejection and adjusts.
See [[wiki/agent-sdk/hooks-guide|Hooks Guide]] for full event list and callback API.
## Key Takeaways
- The agent loop = evaluate → tool calls → results → repeat, until no tool calls remain
- A **turn** is one round trip; `max_turns` counts tool-use turns only (not the final text-only response)
- `ResultMessage` always arrives last — always iterate the stream to completion, never break early
- Context accumulates across turns; use subagents, ToolSearch, and CLAUDE.md to keep it efficient
- Automatic compaction preserves recent history but can lose early instructions — put rules in CLAUDE.md
- `permission_mode` + `allowed_tools` + `disallowed_tools` together determine what actually runs
- `effort` controls reasoning depth per turn; lower effort = lower cost for simple tasks
- Session IDs enable resumption, continuation, and forking of any session
## Related Articles
- [[wiki/agent-sdk/overview|Agent SDK Overview]]
- [[wiki/agent-sdk/hooks-guide|Hooks Guide]]
- [[wiki/agent-sdk/configure-permissions|Configure Permissions]]
- [[wiki/agent-sdk/mcp-integration|MCP Integration]]
- [[wiki/agent-sdk/hosting-production|Hosting & Production]]
- [[wiki/agent-sdk/python-api-reference|Python API Reference]]
- [[wiki/agent-sdk/typescript-api-reference|TypeScript API Reference]]
## Sources
- `raw/How the agent loop works.md` — sourced from https://code.claude.com/docs/en/agent-sdk/agent-loop