Hosting the Agent SDK in Production

The Claude Agent SDK is not stateless — it runs a persistent shell process that executes commands and manages files. This makes hosting it fundamentally different from calling a REST API.

System Requirements

Resource	Minimum
Runtime	Python 3.10+ or Node.js 18+
Node.js	Always required (SDK spawns Claude Code CLI internally; bundled, no separate install)
RAM	1 GiB recommended
Disk	5 GiB recommended
CPU	1 core recommended
Network	Outbound HTTPS to `api.anthropic.com`; optional MCP server access

Sandbox Provider Options

Managed container sandboxes purpose-built for AI code execution:

Modal Sandbox — has demo implementation (Slack gif creator)
Cloudflare Sandboxes — open-source sandbox-sdk
Daytona
E2B
Fly Machines
Vercel Sandbox

For self-hosted isolation (Docker, gVisor, Firecracker) see the Secure Deployment guide.

Deployment Patterns

Pattern 1: Ephemeral Sessions

New container per task, destroyed on completion. Best for one-off tasks.

Bug investigation & fix
Invoice/document processing
Translation batches
Image/video transformations

Pattern 2: Long-Running Sessions

Persistent containers, often running multiple Claude Agent processes per container. Best for proactive agents that act without user input.

Email triage agent
Per-user site builder (exposes container ports)
High-frequency chatbots (Slack, etc.)

Pattern 3: Hybrid Sessions

Ephemeral containers hydrated from a database or SDK session resumption. Best for intermittent-interaction workflows.

Personal project manager with context persistence
Deep multi-hour research (save, resume)
Multi-turn customer support tickets

Pattern 4: Single Container, Multiple Agents

Multiple SDK processes share one container. Least common — requires coordination to prevent agents overwriting each other.

Agent simulations (games, multi-agent environments)

Key Operational Notes

Communication: Expose HTTP/WebSocket ports from within the container to reach SDK instances externally.
Cost: Tokens dominate; container overhead starts ~$0.05/hr minimum depending on provider.
Idle shutdown: Tune idle timeout per provider based on expected user response cadence.
CLI versioning: Claude Code CLI uses semver — breaking changes are versioned, minor updates are safe to auto-apply.
Monitoring: Containers are standard servers — use your existing backend logging infrastructure.
Session timeout: No hard timeout, but set maxTurns to prevent infinite loops.

Key Takeaways

SDK runs as a long-running process, not a stateless call — it needs a persistent container environment.
Choose deployment pattern based on task lifetime: ephemeral (one-off) → long-running (proactive) → hybrid (intermittent) → single-container multi-agent (collaboration).
Every instance needs Node.js even for Python SDK (Claude Code CLI is bundled).
Minimum recommended specs: 1 GiB RAM, 5 GiB disk, 1 CPU.
Use maxTurns to guard against infinite agent loops.
Standard backend observability (logs, metrics) works as-is — containers are just servers.

wiki/agent-sdk/configure-permissions — control what tools agents can use inside containers
wiki/agent-sdk/hooks-guide — automate container lifecycle events
wiki/agent-sdk/mcp-integration — extend agents with external tool servers
wiki/agent-sdk/user-input-approvals — handle human-in-the-loop approvals in hosted agents
wiki/architecture/docker-compose — self-hosted container orchestration

Sources

raw/Hosting the Agent SDK.md

4 KiB Raw Blame History