obsidian/wiki/agent-sdk/hosting-production.md
2026-04-17 12:51:19 +01:00

title: Hosting the Agent SDK in Production
aliases: agent-sdk-hosting, sdk-deployment, sdk-production
tags: agent-sdk, hosting, deployment, containers, production
sources: raw/Hosting the Agent SDK.md
created: 2026-04-17
updated: 2026-04-17

Hosting the Agent SDK in Production

The Claude Agent SDK is not stateless — it runs a persistent shell process that executes commands and manages files. This makes hosting it fundamentally different from calling a REST API.

System Requirements

Resource   Minimum
Runtime    Python 3.10+ or Node.js 18+
Node.js    Always required, even with the Python SDK (the SDK spawns the Claude Code CLI internally; the CLI is bundled, so no separate install is needed)
RAM        1 GiB recommended
Disk       5 GiB recommended
CPU        1 core recommended
Network    Outbound HTTPS to api.anthropic.com; optional MCP server access
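
A container image can verify some of these requirements at startup. The following is a minimal sketch, not part of the SDK: the function name `preflight` and the constants are hypothetical, and it assumes Node.js is expected on PATH as `node`, which may not hold if your image ships the runtime differently. RAM and CPU are left out because the Python standard library has no portable way to read them.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)        # Python floor from the requirements table
MIN_FREE_DISK_GIB = 5       # recommended disk from the requirements table


def preflight(path: str = ".") -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required")
    # Assumption: Node.js is discoverable as `node` on PATH.
    if shutil.which("node") is None:
        problems.append("node not found on PATH")
    free_gib = shutil.disk_usage(path).free / 2**30
    if free_gib < MIN_FREE_DISK_GIB:
        problems.append(
            f"only {free_gib:.1f} GiB free disk, {MIN_FREE_DISK_GIB} GiB recommended"
        )
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

Running this as the container entrypoint's first step lets a misconfigured image fail fast instead of failing midway through an agent task.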

Sandbox Provider Options

Managed container sandboxes purpose-built for AI code execution:

  • Modal Sandbox — has demo implementation (Slack gif creator)
  • Cloudflare Sandboxes — open-source sandbox-sdk
  • Daytona
  • E2B
  • Fly Machines
  • Vercel Sandbox

For self-hosted isolation (Docker, gVisor, Firecracker) see the Secure Deployment guide.

Deployment Patterns

Pattern 1: Ephemeral Sessions

New container per task, destroyed on completion. Best for one-off tasks.

  • Bug investigation & fix
  • Invoice/document processing
  • Translation batches
  • Image/video transformations

Pattern 2: Long-Running Sessions

Persistent containers, often running multiple Claude Agent processes per container. Best for proactive agents that act without user input.

  • Email triage agent
  • Per-user site builder (exposes container ports)
  • High-frequency chatbots (Slack, etc.)

Pattern 3: Hybrid Sessions

Ephemeral containers that are hydrated with history and state on startup, either from a database or through the SDK's session resumption. Best for intermittent-interaction workflows.

  • Personal project manager with context persistence
  • Deep multi-hour research (save, resume)
  • Multi-turn customer support tickets
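
One way to hydrate an ephemeral container is to persist a session identifier per conversation and hand it back to the agent on the next message. The sketch below makes several assumptions: `SessionStore`, `handle_message`, and the `AgentRunner` signature are hypothetical, SQLite stands in for the database, and how a session ID is obtained from and passed back to the SDK depends on the SDK version you use.

```python
import sqlite3
from typing import Callable, Optional, Tuple

# Hypothetical runner: (prompt, session id to resume or None) -> (reply, session id).
AgentRunner = Callable[[str, Optional[str]], Tuple[str, str]]


class SessionStore:
    """Maps a ticket (or user, or project) to the last known session id."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "ticket TEXT PRIMARY KEY, session_id TEXT NOT NULL)"
        )

    def get(self, ticket: str) -> Optional[str]:
        row = self.db.execute(
            "SELECT session_id FROM sessions WHERE ticket = ?", (ticket,)
        ).fetchone()
        return row[0] if row else None

    def put(self, ticket: str, session_id: str) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO sessions (ticket, session_id) VALUES (?, ?)",
            (ticket, session_id),
        )
        self.db.commit()


def handle_message(
    store: SessionStore, ticket: str, prompt: str, run_agent: AgentRunner
) -> str:
    """Resume the ticket's previous session if one exists, then record the new id."""
    previous = store.get(ticket)
    reply, session_id = run_agent(prompt, previous)
    store.put(ticket, session_id)
    return reply
```

The container itself holds no durable state here, so it can be destroyed between messages and recreated when the next one arrives.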

Pattern 4: Single Container, Multiple Agents

Multiple SDK processes share one container. This is the least common pattern, and it requires coordination to prevent agents from overwriting each other's work.

  • Agent simulations (games, multi-agent environments)
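
A simple form of coordination is to give each agent its own working directory under a shared root. The sketch below assumes each SDK process is then started with that directory as its working directory; the helper name `agent_workdir` is hypothetical, and this does not protect files that agents reach through absolute paths.

```python
import tempfile
from pathlib import Path


def agent_workdir(root: Path, agent_id: str) -> Path:
    """Return a private working directory for one agent, creating it if needed."""
    # Reject ids that could escape the shared root.
    if not agent_id or agent_id in {".", ".."} or any(c in agent_id for c in "/\\"):
        raise ValueError(f"unsafe agent id: {agent_id!r}")
    workdir = root / agent_id
    workdir.mkdir(parents=True, exist_ok=True)
    return workdir


if __name__ == "__main__":
    shared_root = Path(tempfile.mkdtemp(prefix="agents-"))
    for name in ("player-1", "player-2"):
        print(agent_workdir(shared_root, name))
```

Agents that must exchange data can still do so through an explicitly shared subdirectory or a message queue, which keeps accidental overwrites out of the default path.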

Key Operational Notes

  • Communication: Expose HTTP/WebSocket ports from within the container to reach SDK instances externally.
  • Cost: Token usage dominates; container overhead starts at roughly $0.05/hr, depending on provider.
  • Idle shutdown: Tune idle timeout per provider based on expected user response cadence.
  • CLI versioning: The Claude Code CLI follows semver, so breaking changes are signaled by the version number and minor updates are safe to auto-apply.
  • Monitoring: Containers are standard servers — use your existing backend logging infrastructure.
  • Session timeout: No hard timeout, but set maxTurns to prevent infinite loops.
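
Where a provider has no built-in idle timeout, or its timeout does not match your users' response cadence, the host process can track activity itself. This is a minimal sketch under stated assumptions: `IdleTracker` is a hypothetical helper, the 300-second value in the usage note is only an illustration, and the actual shutdown call depends on the provider.

```python
import time
from typing import Callable


class IdleTracker:
    """Tracks time since the last activity so the host can decide when to stop."""

    def __init__(
        self, timeout_s: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last = clock()

    def touch(self) -> None:
        """Record activity, for example on every inbound HTTP or WebSocket message."""
        self._last = self._clock()

    def is_idle(self) -> bool:
        """True once at least timeout_s seconds have passed since the last touch."""
        return self._clock() - self._last >= self.timeout_s
```

A server would create one tracker per container, for example `IdleTracker(300)`, call `touch()` from its request handler, and poll `is_idle()` from a background task that triggers shutdown.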

Key Takeaways

  • SDK runs as a long-running process, not a stateless call — it needs a persistent container environment.
  • Choose deployment pattern based on task lifetime: ephemeral (one-off) → long-running (proactive) → hybrid (intermittent) → single-container multi-agent (collaboration).
  • Every instance needs Node.js, even when using the Python SDK (the Claude Code CLI is bundled).
  • Minimum recommended specs: 1 GiB RAM, 5 GiB disk, 1 CPU.
  • Use maxTurns to guard against infinite agent loops.
  • Standard backend observability (logs, metrics) works as-is — containers are just servers.

Sources

  • raw/Hosting the Agent SDK.md