Observability with OpenTelemetry

Export traces, metrics, and log events from the Agent SDK to any OTLP-compatible backend (Honeycomb, Datadog, Grafana, Langfuse, self-hosted collector).

How Telemetry Flows

The SDK runs the Claude Code CLI as a child process — the CLI emits telemetry, not the SDK itself
Configuration is passed via environment variables inherited by the child process
Two configuration strategies:
- Process environment (recommended for production): set vars in shell/container/orchestrator — all query() calls pick them up automatically
- Per-call options.env: use when different agents need different telemetry settings
  - Python: env merges on top of inherited environment
  - TypeScript: env replaces inherited environment — always include ...process.env
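
The per-call strategy can be sketched in Python as follows. This is a minimal illustration, not the SDK's own code: `agent_env` and `effective_env` are hypothetical helpers, the service names and collector URLs are placeholders, and `effective_env` only models the "merges on top of inherited environment" behaviour described above.

```python
import os

# Settings shared by every agent (placeholder values).
BASE_OTEL = {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://collector.example.com:4318",
}


def agent_env(service_name: str, **overrides: str) -> dict[str, str]:
    """Hypothetical helper: build the per-call env for one agent."""
    return {**BASE_OTEL, "OTEL_SERVICE_NAME": service_name, **overrides}


def effective_env(per_call: dict[str, str], inherited=None) -> dict[str, str]:
    """Illustrative model of the Python merge: per-call values win over inherited ones."""
    inherited = dict(os.environ) if inherited is None else inherited
    return {**inherited, **per_call}


triage = agent_env("support-triage-agent")
billing = agent_env(
    "billing-agent",
    OTEL_EXPORTER_OTLP_ENDPOINT="http://billing-collector.example.com:4318",
)
```

Each dictionary would then be passed as `ClaudeAgentOptions(env=triage)` or `ClaudeAgentOptions(env=billing)`, so the two agents report under different settings while still inheriting everything else from the process.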

Three Signals

Signal	What it contains	Enable with
Metrics	Token/cost counters, sessions, lines of code, tool decisions	`OTEL_METRICS_EXPORTER`
Log events	Structured records per prompt, API request, error, tool result	`OTEL_LOGS_EXPORTER`
Traces (beta)	Spans per interaction, model request, tool call, hook	`OTEL_TRACES_EXPORTER` + `CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1`
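
The table can be turned into a small helper that emits only the variables a chosen set of signals needs. This is a sketch: `signal_env` is a hypothetical name, it assumes the `otlp` exporter for every signal, and it also sets `CLAUDE_CODE_ENABLE_TELEMETRY`, the master switch covered under Enabling Telemetry.

```python
# Exporter variable for each signal, taken from the table above.
SIGNAL_EXPORTERS = {
    "metrics": "OTEL_METRICS_EXPORTER",
    "logs": "OTEL_LOGS_EXPORTER",
    "traces": "OTEL_TRACES_EXPORTER",
}


def signal_env(*signals: str) -> dict[str, str]:
    """Hypothetical helper: env vars that turn on the requested signals."""
    env = {"CLAUDE_CODE_ENABLE_TELEMETRY": "1"}
    for signal in signals:
        if signal not in SIGNAL_EXPORTERS:
            raise ValueError(f"unknown signal: {signal!r}")
        env[SIGNAL_EXPORTERS[signal]] = "otlp"
    if "traces" in signals:
        # Traces are in beta and need the extra flag.
        env["CLAUDE_CODE_ENHANCED_TELEMETRY_BETA"] = "1"
    return env
```

Calling `signal_env("metrics")` leaves the beta flag out, while `signal_env("logs", "traces")` includes it.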

Enabling Telemetry

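Telemetry is off by default. At minimum, set CLAUDE_CODE_ENABLE_TELEMETRY=1 and at least one OTEL_*_EXPORTER. The example below enables all three signals: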

import asyncio

from claude_agent_sdk import ClaudeAgentOptions, query

OTEL_ENV = {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "CLAUDE_CODE_ENHANCED_TELEMETRY_BETA": "1",   # required for traces
    "OTEL_TRACES_EXPORTER": "otlp",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_LOGS_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://collector.example.com:4318",
    "OTEL_EXPORTER_OTLP_HEADERS": "Authorization=Bearer your-token",
}


async def main():
    # The env dict is handed to the CLI child process, which emits the telemetry
    options = ClaudeAgentOptions(env=OTEL_ENV)
    async for message in query(prompt="...", options=options):
        print(message)


asyncio.run(main())

Do not use the console exporter: the SDK uses stdout as its message channel, and exporter output written there would interfere with it. Use a local OTLP collector or Jaeger for local inspection instead.
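
A pre-flight check can catch a console exporter before the env is handed to the SDK. This is a sketch: `console_exporters` is a hypothetical helper, and it assumes exporter values may be comma-separated lists such as `otlp,console`.

```python
EXPORTER_VARS = (
    "OTEL_METRICS_EXPORTER",
    "OTEL_LOGS_EXPORTER",
    "OTEL_TRACES_EXPORTER",
)


def console_exporters(env: dict[str, str]) -> list[str]:
    """Return the exporter variables whose value includes 'console'."""
    offenders = []
    for var in EXPORTER_VARS:
        # Split on commas so "otlp,console" is caught as well as plain "console".
        values = [v.strip().lower() for v in env.get(var, "").split(",")]
        if "console" in values:
            offenders.append(var)
    return offenders
```

An empty result means none of the three variables selects the console exporter. A non-empty result names the variables to fix before calling `query()`.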

Flushing Short-Lived Calls

Default export intervals (metrics: 60s, traces/logs: 5s) are long compared with a short task, so telemetry may still be buffered when the call finishes. For short tasks, lower the intervals:

"OTEL_METRIC_EXPORT_INTERVAL": "1000",   # ms
"OTEL_LOGS_EXPORT_INTERVAL": "1000",
"OTEL_TRACES_EXPORT_INTERVAL": "1000",

The CLI flushes on clean exit but is bounded by a timeout — spans can be dropped if the collector is slow
Spans are lost entirely if the process is killed before CLI shutdown
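
One way to apply the shorter intervals consistently is a helper that layers them over an existing env. This is a sketch: `with_fast_flush` is a hypothetical name, and the 1000 ms default mirrors the values shown above.

```python
def with_fast_flush(env: dict[str, str], interval_ms: int = 1000) -> dict[str, str]:
    """Return a copy of env with all three export intervals set to interval_ms."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    fast = {
        "OTEL_METRIC_EXPORT_INTERVAL": str(interval_ms),
        "OTEL_LOGS_EXPORT_INTERVAL": str(interval_ms),
        "OTEL_TRACES_EXPORT_INTERVAL": str(interval_ms),
    }
    # The interval values win over anything already present in env.
    return {**env, **fast}
```

The input dictionary is left unmodified, so a long-running agent can keep the defaults while a short task uses `with_fast_flush(base_env)`.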

Span Names (Traces)

Span	Wraps
`claude_code.interaction`	One full agent turn (prompt → response)
`claude_code.llm_request`	Single Claude API call; carries model, latency, token counts
`claude_code.tool`	Tool invocation; child spans: `claude_code.tool.blocked_on_user`, `claude_code.tool.execution`
`claude_code.hook`	Hook execution

All spans carry session.id — filter on it to group multi-turn sessions into one timeline
Set OTEL_METRICS_INCLUDE_SESSION_ID=false to omit the attribute
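
Grouping by session.id is normally done in the backend's query UI, but the idea can be sketched offline. The record shape here (dicts with `name`, `start`, and `attributes`) is an assumption for illustration, not an export format defined by the CLI, and the sample spans are made up.

```python
from collections import defaultdict


def timelines(spans: list[dict]) -> dict[str, list[str]]:
    """Group span names by session.id, ordered by start time within each session."""
    by_session = defaultdict(list)
    for span in spans:
        by_session[span["attributes"]["session.id"]].append(span)
    return {
        session: [s["name"] for s in sorted(group, key=lambda s: s["start"])]
        for session, group in by_session.items()
    }


# Made-up sample records for illustration.
spans = [
    {"name": "claude_code.llm_request", "start": 2, "attributes": {"session.id": "a"}},
    {"name": "claude_code.interaction", "start": 1, "attributes": {"session.id": "a"}},
    {"name": "claude_code.interaction", "start": 5, "attributes": {"session.id": "b"}},
    {"name": "claude_code.tool", "start": 3, "attributes": {"session.id": "a"}},
]
```

Here `timelines(spans)` yields one ordered list of span names per session, which is the same view a backend produces when filtering on session.id.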

Tagging Telemetry

Override the default service.name = "claude-code" when running multiple agents:

options = ClaudeAgentOptions(
    env={
        "OTEL_SERVICE_NAME": "support-triage-agent",
        "OTEL_RESOURCE_ATTRIBUTES": "service.version=1.4.0,deployment.environment=production",
    },
)
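
The OTEL_RESOURCE_ATTRIBUTES string can be generated from a dictionary rather than written by hand. This is a sketch: `resource_attributes` is a hypothetical helper, and percent-encoding the values is an assumption based on the OpenTelemetry convention for this variable.

```python
from urllib.parse import quote


def resource_attributes(attrs: dict[str, str]) -> str:
    """Format a dict as a comma-separated key=value string."""
    # Percent-encode values so commas, spaces, or '=' inside them cannot break the list.
    return ",".join(
        f"{key}={quote(str(value), safe='')}" for key, value in attrs.items()
    )


tags = resource_attributes(
    {"service.version": "1.4.0", "deployment.environment": "production"}
)
```

The resulting `tags` string matches the literal used in the example above and can be assigned to `"OTEL_RESOURCE_ATTRIBUTES"` in the env dictionary.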

Sensitive Data Controls

Content is not recorded by default. Opt-in variables:

Variable	Adds
`OTEL_LOG_USER_PROMPTS=1`	Prompt text on events and interaction span
`OTEL_LOG_TOOL_DETAILS=1`	Tool input args (file paths, shell commands) on tool_result events
`OTEL_LOG_TOOL_CONTENT=1`	Full tool input/output bodies as span events (max 60 KB, requires tracing enabled)

Leave unset unless your observability pipeline is approved to store the data your agent handles.
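
To keep the opt-ins explicit, the flags can be set through a helper that defaults everything to off. This is a sketch: `content_env` is a hypothetical name, and its check for "tracing enabled" (a traces exporter plus the beta flag) is an assumption drawn from the requirements listed earlier.

```python
def content_env(
    env: dict[str, str],
    *,
    prompts: bool = False,
    tool_details: bool = False,
    tool_content: bool = False,
) -> dict[str, str]:
    """Return a copy of env with only the explicitly approved content flags set."""
    out = dict(env)
    if prompts:
        out["OTEL_LOG_USER_PROMPTS"] = "1"
    if tool_details:
        out["OTEL_LOG_TOOL_DETAILS"] = "1"
    if tool_content:
        # Tool bodies are recorded as span events, so tracing must already be on.
        tracing_on = bool(out.get("OTEL_TRACES_EXPORTER")) and (
            out.get("CLAUDE_CODE_ENHANCED_TELEMETRY_BETA") == "1"
        )
        if not tracing_on:
            raise ValueError("OTEL_LOG_TOOL_CONTENT requires tracing to be enabled")
        out["OTEL_LOG_TOOL_CONTENT"] = "1"
    return out
```

Because every flag is a keyword argument defaulting to `False`, a call such as `content_env(base_env, prompts=True)` documents at the call site exactly which content was approved.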

Key Takeaways

Telemetry comes from the CLI child process, not the SDK — configure via env vars
Must set CLAUDE_CODE_ENABLE_TELEMETRY=1 plus at least one OTEL_*_EXPORTER
Traces require an additional beta flag: CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1
TypeScript options.env replaces the environment — always spread ...process.env
Lower export intervals for short-lived agent calls to avoid dropped spans
Content (prompts, tool I/O) is redacted by default; three opt-in vars add it back
Use OTEL_SERVICE_NAME to distinguish multiple agents in the same collector

Related

wiki/agent-sdk/hosting-production — set OTEL vars at container/orchestrator level
wiki/agent-sdk/agent-loop — understand what each span represents
wiki/agent-sdk/hooks-guide — claude_code.hook spans wrap hook executions
wiki/claude-code/monitoring-usage — full list of env vars, metric names, event names

Sources

raw/Observability with OpenTelemetry.md
