---
title: "Streaming Output in Real-Time"
aliases: [streaming, partial-messages, stream-events]
tags: [agent-sdk, streaming, python, typescript, real-time]
sources: [raw/Stream responses in real-time.md]
created: 2026-04-17
updated: 2026-04-17
---

## Overview

By default, the Agent SDK yields a complete `AssistantMessage` object after each full response. Enable **partial message streaming** to receive incremental tokens and tool-call deltas as they arrive.

- **Python**: set `include_partial_messages=True` in `ClaudeAgentOptions`
- **TypeScript**: set `includePartialMessages: true` in options

## How It Works

When streaming is enabled, the SDK emits `StreamEvent` messages wrapping raw Claude API events **before** the final `AssistantMessage`. Your loop must check the message type first, then inspect the nested event:

```python
import asyncio

from claude_agent_sdk import ClaudeAgentOptions, query
from claude_agent_sdk.types import StreamEvent

async def main() -> None:
    options = ClaudeAgentOptions(include_partial_messages=True)
    async for message in query(prompt="...", options=options):
        if isinstance(message, StreamEvent):
            event = message.event  # raw Claude API stream event (a dict)
            if event.get("type") == "content_block_delta":
                delta = event.get("delta", {})
                if delta.get("type") == "text_delta":
                    # Print each text chunk as soon as it arrives
                    print(delta.get("text", ""), end="", flush=True)

asyncio.run(main())
```

## StreamEvent Structure

| Field | Type | Description |
|-------|------|-------------|
| `uuid` | str | Unique event identifier |
| `session_id` | str | Session identifier |
| `event` | dict | Raw Claude API stream event |
| `parent_tool_use_id` | str \| None | Set when the event comes from a subagent |

**TypeScript name:** `SDKPartialAssistantMessage` with `type: 'stream_event'`

## Common Event Types

| Event Type | Description |
|------------|-------------|
| `message_start` | New message begins |
| `content_block_start` | New text or tool-use block begins |
| `content_block_delta` | Incremental update (`text_delta` or `input_json_delta`) |
| `content_block_stop` | Block complete |
| `message_delta` | Stop reason, usage counts |
| `message_stop` | Message complete |

## Message Flow

```
StreamEvent (message_start)
StreamEvent (content_block_start)   ← text block
StreamEvent (content_block_delta)   ← text chunks ...
StreamEvent (content_block_stop)
StreamEvent (content_block_start)   ← tool_use block
StreamEvent (content_block_delta)   ← input_json_delta chunks ...
StreamEvent (content_block_stop)
StreamEvent (message_delta / message_stop)
AssistantMessage                    ← complete message
... tool executes ...
ResultMessage                       ← final result
```

Without streaming enabled, you receive only complete messages: `SystemMessage`, `AssistantMessage`, `ResultMessage`, and `SDKCompactBoundaryMessage` (TypeScript) / `SystemMessage` with subtype `"compact_boundary"` (Python).

## Streaming Text

Look for `content_block_delta` → `delta.type == "text_delta"` → `delta.text`.

## Streaming Tool Calls

Three events to watch:

| Event | Action |
|-------|--------|
| `content_block_start` + `content_block.type == "tool_use"` | Tool starting: capture `name` |
| `content_block_delta` + `delta.type == "input_json_delta"` | Accumulate `partial_json` |
| `content_block_stop` | Tool call complete: parse the accumulated JSON |

## Building a Streaming UI

Use an `in_tool` flag to switch between rendering text tokens and showing a `[Using ToolName...]` status indicator:

```python
import asyncio

from claude_agent_sdk import ClaudeAgentOptions, query
from claude_agent_sdk.types import StreamEvent

async def main() -> None:
    options = ClaudeAgentOptions(include_partial_messages=True)
    in_tool = False  # True while a tool_use block is streaming
    async for message in query(prompt="...", options=options):
        if isinstance(message, StreamEvent):
            event = message.event
            t = event.get("type")
            if t == "content_block_start":
                cb = event.get("content_block", {})
                if cb.get("type") == "tool_use":
                    # Show a status indicator instead of raw tool input
                    print(f"\n[Using {cb['name']}...]", end="", flush=True)
                    in_tool = True
            elif t == "content_block_delta":
                d = event.get("delta", {})
                if d.get("type") == "text_delta" and not in_tool:
                    # Render text tokens as they arrive
                    print(d.get("text", ""), end="", flush=True)
            elif t == "content_block_stop" and in_tool:
                print(" done", flush=True)
                in_tool = False

asyncio.run(main())
```

## Known Limitations

- **Extended thinking**: when `max_thinking_tokens` / `maxThinkingTokens` is set, `StreamEvent` messages are **not** emitted, and only complete messages arrive. Thinking is off by default, so streaming works unless you explicitly enable thinking.
- **Structured output**: the JSON result appears only in the final `ResultMessage.structured_output`, never as streaming deltas. See [[wiki/agent-sdk/structured-outputs|structured-outputs]].

## Key Takeaways

- Set `include_partial_messages=True` (Python) / `includePartialMessages: true` (TypeScript) to opt in.
- Events arrive as `StreamEvent` wrappers around raw Claude API streaming events; you accumulate text/JSON yourself.
- Text: `content_block_delta` → `text_delta` → `delta.text`
- Tool input: `content_block_delta` → `input_json_delta` → `delta.partial_json`
- The complete `AssistantMessage` still arrives after all deltas, so you don't have to reconstruct it.
- Streaming is incompatible with extended thinking, and structured output appears only in the final `ResultMessage`.

## Related

- [[wiki/agent-sdk/agent-loop|agent-loop]]: full message lifecycle, turn structure, compaction
- [[wiki/agent-sdk/python-api-reference|python-api-reference]]: `ClaudeAgentOptions`, `StreamEvent`, all types
- [[wiki/agent-sdk/typescript-api-reference|typescript-api-reference]]: `SDKPartialAssistantMessage`, `includePartialMessages`
- [[wiki/agent-sdk/structured-outputs|structured-outputs]]: JSON results from agents (not streaming)
- [[wiki/agent-sdk/user-input-approvals|user-input-approvals]]: `canUseTool`, Python streaming workaround

## Sources

- `raw/Stream responses in real-time.md`: official Agent SDK streaming output docs
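
## Example: Accumulating Tool Input

The three-event pattern from Streaming Tool Calls (capture the name, accumulate `partial_json`, parse on stop) can be sketched as a small state machine over raw event dicts. This is a minimal sketch, not part of the SDK: `ToolCallAccumulator` is a hypothetical helper, the sample events are hand-written to mimic the shapes described above, and the tool name `Read` with a `file_path` input is purely illustrative. It also assumes one tool-use block streams at a time.

```python
from __future__ import annotations

import json
from typing import Any, Optional


class ToolCallAccumulator:
    """Hypothetical helper: collects input_json_delta fragments for one tool call."""

    def __init__(self) -> None:
        self.name: Optional[str] = None  # name of the tool currently streaming
        self.chunks: list[str] = []      # partial_json fragments seen so far

    def feed(self, event: dict[str, Any]) -> Optional[tuple[str, dict[str, Any]]]:
        """Consume one raw event; return (name, input) when a tool call completes."""
        etype = event.get("type")
        if etype == "content_block_start":
            block = event.get("content_block", {})
            if block.get("type") == "tool_use":
                self.name = block.get("name")
                self.chunks = []
        elif etype == "content_block_delta" and self.name is not None:
            delta = event.get("delta", {})
            if delta.get("type") == "input_json_delta":
                self.chunks.append(delta.get("partial_json", ""))
        elif etype == "content_block_stop" and self.name is not None:
            raw = "".join(self.chunks)
            name = self.name
            self.name, self.chunks = None, []
            # A tool with no arguments may stream no JSON at all
            tool_input = json.loads(raw) if raw else {}
            return name, tool_input
        return None


# Hand-written sample events (assumed shapes, illustrative tool name and input)
events = [
    {"type": "content_block_start", "content_block": {"type": "tool_use", "name": "Read"}},
    {"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": '{"file_pa'}},
    {"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": 'th": "README.md"}'}},
    {"type": "content_block_stop"},
]

acc = ToolCallAccumulator()
completed = [result for result in map(acc.feed, events) if result is not None]
print(completed)
```

In a real loop, you would call `acc.feed(message.event)` for each `StreamEvent` and act on any non-`None` result. Parsing only at `content_block_stop` matters because individual `partial_json` fragments are generally not valid JSON on their own.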