194 lines
8.5 KiB
Markdown
194 lines
8.5 KiB
Markdown
---
|
||
title: "How the Agent Loop Works"
|
||
aliases: [agent-loop, sdk-agent-loop, message-lifecycle]
|
||
tags: [agent-sdk, agent-loop, messages, tools, context-window, sessions]
|
||
sources: [raw/How the agent loop works.md]
|
||
created: 2026-04-17
|
||
updated: 2026-04-17
|
||
---
|
||
|
||
## Overview
|
||
|
||
The Agent SDK embeds Claude Code's autonomous execution loop in your application. When you call `query()`, Claude evaluates the prompt, calls tools, receives results, and repeats — until the task is done or a limit is hit.
|
||
|
||
## The Loop Cycle
|
||
|
||
1. **Receive prompt** — Claude gets prompt + system prompt + tool definitions + conversation history. SDK yields `SystemMessage(subtype="init")`.
|
||
2. **Evaluate and respond** — Claude decides: text, tool calls, or both. SDK yields `AssistantMessage`.
|
||
3. **Execute tools** — SDK runs each requested tool and returns results. Hooks can intercept before execution.
|
||
4. **Repeat** — Steps 2–3 form one *turn*. Loop continues until Claude produces a response with **no tool calls**.
|
||
5. **Return result** — SDK yields final `AssistantMessage` (text only) + `ResultMessage` (text, cost, token usage, session ID).
|
||
|
||
## Turns
|
||
|
||
- A **turn** = one round trip: Claude outputs tool calls → SDK executes → results fed back to Claude.
|
||
- Turns happen without yielding control to your code.
|
||
- The loop ends when Claude responds with no tool calls.
|
||
- Cap turns with `max_turns` / `maxTurns` (counts tool-use turns only).
|
||
- Cap cost with `max_budget_usd` / `maxBudgetUsd`.
|
||
|
||
## Message Types
|
||
|
||
| Type | When emitted | Key content |
|
||
|------|-------------|-------------|
|
||
| `SystemMessage` | Session start + after compaction | `subtype="init"` or `"compact_boundary"` |
|
||
| `AssistantMessage` | After each Claude response | Text blocks + tool call blocks |
|
||
| `UserMessage` | After each tool execution | Tool result content |
|
||
| `StreamEvent` | Only with partial messages enabled | Raw API streaming deltas |
|
||
| `ResultMessage` | End of loop | Final text, cost, usage, session ID, `subtype` |
|
||
|
||
**Handling patterns:**
|
||
- Final result only → handle `ResultMessage`
|
||
- Progress updates → handle `AssistantMessage`
|
||
- Live streaming → enable `include_partial_messages` / `includePartialMessages`
|
||
|
||
**Python:** `isinstance(message, ResultMessage)`
|
||
**TypeScript:** `message.type === "result"` — note content is at `message.message.content`, not `message.content`
|
||
|
||
## Tool Execution
|
||
|
||
### Built-in Tools
|
||
|
||
| Category | Tools |
|
||
|----------|-------|
|
||
| File ops | `Read`, `Edit`, `Write` |
|
||
| Search | `Glob`, `Grep` |
|
||
| Execution | `Bash` |
|
||
| Web | `WebSearch`, `WebFetch` |
|
||
| Discovery | `ToolSearch` (load tools on-demand) |
|
||
| Orchestration | `Agent`, `Skill`, `AskUserQuestion`, `TodoWrite` |
|
||
|
||
Extend with MCP servers, custom tool handlers, or skill setting sources.
|
||
|
||
### Parallel Execution
|
||
|
||
- Read-only tools (`Read`, `Glob`, `Grep`, read-only MCP tools) → run **concurrently**
|
||
- State-mutating tools (`Edit`, `Write`, `Bash`) → run **sequentially**
|
||
- Custom tools → sequential by default; opt into parallel via `readOnly` / `readOnlyHint` annotation
|
||
|
||
### Tool Permissions
|
||
|
||
- `allowed_tools` / `allowedTools` — auto-approve listed tools
|
||
- `disallowed_tools` / `disallowedTools` — hard block regardless of other settings
|
||
- Scope individual tools: `"Bash(npm *)"` allows only npm commands
|
||
|
||
When denied, Claude receives a rejection message and adjusts approach.
|
||
|
||
## Control Options
|
||
|
||
### Permission Modes
|
||
|
||
| Mode | Behavior |
|
||
|------|----------|
|
||
| `"default"` | Uncovered tools trigger approval callback; no callback = deny |
|
||
| `"acceptEdits"` | Auto-approves file edits + `mkdir`, `touch`, `mv`, `cp`; other Bash follows default |
|
||
| `"plan"` | No tool execution; Claude produces a plan only |
|
||
| `"dontAsk"` | Never prompts; pre-approved tools run, rest denied |
|
||
| `"auto"` (TS only) | Model classifier approves/denies each tool call |
|
||
| `"bypassPermissions"` | All allowed tools run without asking; not allowed as root on Unix |
|
||
|
||
### Effort Levels
|
||
|
||
| Level | Good for |
|
||
|-------|----------|
|
||
| `"low"` | File lookups, listing directories |
|
||
| `"medium"` | Routine edits, standard tasks |
|
||
| `"high"` | Refactors, debugging (TypeScript SDK default) |
|
||
| `"xhigh"` | Coding + agentic tasks; recommended on Opus 4.7 |
|
||
| `"max"` | Multi-step deep analysis |
|
||
|
||
- Python SDK: unset by default (defers to model default)
|
||
- `effort` is independent of Extended Thinking — they can be combined freely
|
||
|
||
## Context Window
|
||
|
||
Context **accumulates across all turns** within a session — it does not reset.
|
||
|
||
| Source | Impact |
|
||
|--------|--------|
|
||
| System prompt | Small, fixed; always present |
|
||
| CLAUDE.md | Full content every request (prompt-cached after first) |
|
||
| Tool definitions | Each tool adds its schema |
|
||
| Conversation history | Grows with every turn |
|
||
| Skill descriptions | Short summaries; full content only on invocation |
|
||
|
||
Large tool outputs (big files, verbose commands) consume significant context fast.
|
||
|
||
### Automatic Compaction
|
||
|
||
When context approaches its limit, the SDK summarizes older history. A `compact_boundary` message is emitted. Instructions from early prompts may be lost — **put persistent rules in CLAUDE.md**, not in the initial prompt.
|
||
|
||
Customize compaction:
|
||
- **CLAUDE.md section** — tell the compactor what to preserve (free-form header, matched by intent)
|
||
- **`PreCompact` hook** — archive full transcript before compaction
|
||
- **Manual** — send `/compact` as a prompt string
|
||
|
||
### Context Efficiency Tips
|
||
|
||
- Use subagents for subtasks — each starts with a fresh context; only final result returns to parent
|
||
- Scope subagent tool lists to the minimum needed
|
||
- Use `ToolSearch` to load MCP tools on-demand instead of preloading all
|
||
- Set `effort: "low"` for routine read-only tasks
|
||
|
||
## Sessions and Continuity
|
||
|
||
- Capture `ResultMessage.session_id` to resume a session later
|
||
- Resuming restores the full context: files read, analysis performed, actions taken
|
||
- Sessions can be forked to branch into a different approach
|
||
- Python `ClaudeSDKClient` manages session IDs automatically across calls
|
||
|
||
## Result Subtypes
|
||
|
||
| Subtype | Meaning | `result` field? |
|
||
|---------|---------|----------------|
|
||
| `success` | Task completed normally | Yes |
|
||
| `error_max_turns` | Hit `maxTurns` limit | No |
|
||
| `error_max_budget_usd` | Hit `maxBudgetUsd` limit | No |
|
||
| `error_during_execution` | API failure or cancellation | No |
|
||
| `error_max_structured_output_retries` | Structured output validation failed | No |
|
||
|
||
Always check `subtype` before reading `result`. All subtypes carry `total_cost_usd`, `usage`, `num_turns`, `session_id`.
|
||
|
||
`stop_reason` values: `end_turn`, `max_tokens`, `refusal` — check to detect model refusals.
|
||
|
||
## Hooks
|
||
|
||
Hooks run in your application process (not inside the agent context window — no context cost).
|
||
|
||
| Hook | When | Common use |
|
||
|------|------|-----------|
|
||
| `PreToolUse` | Before tool executes | Validate inputs, block dangerous commands |
|
||
| `PostToolUse` | After tool returns | Audit outputs, trigger side effects |
|
||
| `UserPromptSubmit` | When prompt is sent | Inject context |
|
||
| `Stop` | When agent finishes | Validate result, save state |
|
||
| `SubagentStart/Stop` | Subagent spawns/completes | Track parallel tasks |
|
||
| `PreCompact` | Before compaction | Archive transcript |
|
||
|
||
A `PreToolUse` hook that rejects a call prevents execution; Claude receives the rejection and adjusts.
|
||
|
||
See [[wiki/agent-sdk/hooks-guide|Hooks Guide]] for full event list and callback API.
|
||
|
||
## Key Takeaways
|
||
|
||
- The agent loop = evaluate → tool calls → results → repeat, until no tool calls remain
|
||
- A **turn** is one round trip; `max_turns` counts tool-use turns only (not the final text-only response)
|
||
- `ResultMessage` always arrives last — always iterate the stream to completion, never break early
|
||
- Context accumulates across turns; use subagents, ToolSearch, and CLAUDE.md to keep it efficient
|
||
- Automatic compaction preserves recent history but can lose early instructions — put rules in CLAUDE.md
|
||
- `permission_mode` + `allowed_tools` + `disallowed_tools` together determine what actually runs
|
||
- `effort` controls reasoning depth per turn; lower effort = lower cost for simple tasks
|
||
- Session IDs enable resumption, continuation, and forking of any session
|
||
|
||
## Related Articles
|
||
|
||
- [[wiki/agent-sdk/overview|Agent SDK Overview]]
|
||
- [[wiki/agent-sdk/hooks-guide|Hooks Guide]]
|
||
- [[wiki/agent-sdk/configure-permissions|Configure Permissions]]
|
||
- [[wiki/agent-sdk/mcp-integration|MCP Integration]]
|
||
- [[wiki/agent-sdk/hosting-production|Hosting & Production]]
|
||
- [[wiki/agent-sdk/python-api-reference|Python API Reference]]
|
||
- [[wiki/agent-sdk/typescript-api-reference|TypeScript API Reference]]
|
||
|
||
## Sources
|
||
|
||
- `raw/How the agent loop works.md` — sourced from https://code.claude.com/docs/en/agent-sdk/agent-loop
|