---
title: "Securely Deploying AI Agents"
aliases: [agent-security, secure-agent-deployment, agent-hardening]
tags: [security, agent-sdk, deployment, isolation, credentials, docker, proxy]
sources: [raw/Securely deploying AI agents.md]
created: 2026-04-17
updated: 2026-04-17
---

# Securely Deploying AI Agents

Unlike deterministic software, Claude Code and the Agent SDK generate actions dynamically based on context, which makes them susceptible to **prompt injection**: malicious instructions embedded in files, webpages, or user input that redirect agent behavior. The mitigation is defense in depth.

Not every deployment needs maximum hardening. A developer running locally has different needs than a multi-tenant production system processing untrusted content.

## Threat Model

- **Prompt injection**: content processed by the agent (READMEs, web pages, files) may contain adversarial instructions
- **Model error**: the agent may take unexpected actions even without adversarial input
- **Credential exposure**: agents accessing APIs may leak secrets if not isolated
- **Resource abuse**: unbounded memory, CPU, or process spawning in multi-tenant environments

## Built-in Security Features

| Feature | What it does |
|---------|-------------|
| **Permissions system** | Allow/block/prompt per tool or bash command; glob patterns; org-wide policies |
| **Command AST parsing** | Parses bash into an AST before execution; unrecognized constructs and `eval` always require approval |
| **Web search summarization** | Summarizes search results instead of passing raw HTML into context |
| **Sandbox mode** | Optional OS-level filesystem + network restrictions (see [[wiki/agent-sdk/configure-permissions\|configure-permissions]]) |

## Security Principles

### Least Privilege

| Resource | Restriction |
|----------|------------|
| Filesystem | Mount only needed dirs; prefer read-only |
| Network | Restrict to specific endpoints via proxy |
| Credentials | Inject via proxy; never expose directly |
| System capabilities | Drop Linux capabilities in containers |

### Defense in Depth

Layer multiple controls: container isolation → network restrictions → filesystem controls → proxy-level request validation. Each layer limits the blast radius if another fails.

## Isolation Technologies

| Technology | Isolation | Perf overhead | Complexity |
|-----------|-----------|--------------|------------|
| sandbox-runtime | Good | Very low | Low |
| Docker containers | Setup-dependent | Low | Medium |
| gVisor | Excellent | Medium–High | Medium |
| VMs (Firecracker/QEMU) | Excellent | High | Medium–High |

### sandbox-runtime

Lightweight, no Docker needed. Uses OS primitives (`bubblewrap` on Linux, `sandbox-exec` on macOS).

```sh
npm install @anthropic-ai/sandbox-runtime
```

- Filesystem: restricts read/write to configured paths
- Network: routes all traffic through a built-in proxy with domain allowlists
- Limitation: shares the host kernel, so it is not suitable when kernel-level isolation is required

### Hardened Docker Container

```sh
docker run \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --security-opt seccomp=/path/to/seccomp-profile.json \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=100m \
  --network none \
  --memory 2g \
  --pids-limit 100 \
  --user 1000:1000 \
  -v /path/to/code:/workspace:ro \
  -v /var/run/proxy.sock:/var/run/proxy.sock:ro \
  agent-image
```

Key flags:

- `--cap-drop ALL`: removes `NET_ADMIN`, `SYS_ADMIN`, etc.
- `--network none`: no network interfaces; the agent communicates only via the mounted Unix socket to the host proxy
- `--read-only` + `--tmpfs`: immutable root filesystem with ephemeral scratch space
- `-v ...:/workspace:ro`: mounts the code read-only; never mount `~/.ssh`, `~/.aws`, or `~/.config`

### gVisor

Intercepts syscalls in userspace, so the agent never directly touches the host kernel.
Register the runtime in `/etc/docker/daemon.json`:

```json
{
  "runtimes": {
    "runsc": {
      "path": "/usr/local/bin/runsc"
    }
  }
}
```

```sh
docker run --runtime=runsc agent-image
```

Performance: CPU-bound work sees ≈ 0% overhead; file I/O can be 10–200× slower for heavy open/close patterns.

### Firecracker MicroVMs

- Boot time under 125 ms; memory overhead under 5 MiB
- Agent VM has no external network; all traffic is routed via `vsock` to the host proxy
- Suitable for per-request isolation in multi-tenant systems

### Cloud Deployments

1. Private subnet with no internet gateway
2. Cloud firewall (AWS security groups / GCP VPC firewall) blocks all egress except to the proxy
3. Proxy (e.g. Envoy with `credential_injector`) validates requests, enforces allowlists, injects credentials, and logs
4. Minimal IAM permissions on the agent's service account

## Credential Management

**Core pattern:** run a proxy *outside* the agent's security boundary that injects credentials. The agent never sees the actual secret.

Benefits:

- Credentials are stored in one place, not distributed to agents
- The proxy enforces endpoint allowlists
- All requests are logged for audit

### Proxy Configuration

**Option 1 (sampling requests only):**

```sh
export ANTHROPIC_BASE_URL="http://localhost:8080"
```

**Option 2 (system-wide, all HTTP traffic):**

```sh
export HTTP_PROXY="http://localhost:8080"
export HTTPS_PROXY="http://localhost:8080"
```

Note: `HTTP_PROXY`/`HTTPS_PROXY` create opaque TLS tunnels for HTTPS, so the proxy cannot inspect or modify that traffic without TLS termination. Node.js `fetch()` ignores these variables by default; in Node 24+, set `NODE_USE_ENV_PROXY=1` to enable them.

### Proxy Options

| Proxy | Use case |
|-------|---------|
| Envoy | Production; `credential_injector` filter |
| mitmproxy | TLS-terminating; inspect/modify HTTPS |
| Squid | ACL-based caching proxy |
| LiteLLM | LLM gateway with rate limiting |

### Credentials for Other Services

**MCP/custom tools (preferred):** The agent calls a tool; the actual authenticated request happens outside the agent boundary. No TLS interception is needed.
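The tool-boundary pattern can be illustrated with a minimal Python sketch. It is not the Agent SDK or MCP API: `handle_tool_call`, `ALLOWED_HOSTS`, `api.example.com`, and the `SERVICE_API_TOKEN` variable are all hypothetical names chosen for illustration. The assumption is that this handler runs in a separate process outside the agent's sandbox, so the token exists only in that process's environment and the agent supplies nothing but non-secret arguments.

```python
import os
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = {"api.example.com"}


def build_authenticated_request(url: str, token: str) -> Request:
    """Reject non-allowlisted endpoints, then attach the credential."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname not in ALLOWED_HOSTS:
        raise PermissionError(f"endpoint not allowlisted: {url}")
    return Request(url, headers={"Authorization": f"Bearer {token}"})


def handle_tool_call(arguments: dict, send=urlopen) -> str:
    """Tool handler that runs outside the agent boundary.

    The agent provides only the target URL. The token is read from the
    tool process's own environment (hypothetical variable name) and is
    never returned to the caller.
    """
    token = os.environ["SERVICE_API_TOKEN"]
    request = build_authenticated_request(arguments["url"], token)
    with send(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

The `send` parameter defaults to `urllib.request.urlopen` and exists only so the handler can be exercised without network access. Only the response body flows back to the agent, so a prompt-injected agent can at most request allowlisted endpoints and cannot read the credential itself.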
**TLS-terminating proxy:** Install the proxy's CA certificate in the agent's trust store and configure `HTTP_PROXY`/`HTTPS_PROXY`. Use `proxychains` or iptables for programs that bypass the environment variables.

## Filesystem Configuration

### Files to Exclude Before Mounting

| File | Risk |
|------|------|
| `.env`, `.env.local` | API keys, DB passwords |
| `~/.aws/credentials` | AWS access keys |
| `~/.config/gcloud/application_default_credentials.json` | GCP tokens |
| `~/.kube/config` | Kubernetes credentials |
| `*.pem`, `*.key` | Private keys |
| `.npmrc`, `.pypirc` | Registry tokens |
| `*-service-account.json` | GCP service account keys |

### Writable Workspace Options

| Approach | Persistence | Use case |
|----------|-------------|---------|
| `--tmpfs` | Ephemeral (cleared on stop) | CI/CD, stateless agents |
| Overlay filesystem | Inspect, then apply or discard | Review-before-commit workflows |
| Named volume (separate dir) | Persistent | Output collection |

## Key Takeaways

- **Prompt injection is the primary threat**: content the agent processes can redirect its behavior; built-in summarization and permissions help but are not sufficient alone
- **Proxy pattern is the gold standard for credentials**: the agent never sees secrets; a proxy outside the boundary injects them and enforces allowlists
- **`--network none` + Unix socket** is the strongest container network control: the agent can only reach what the host proxy allows
- **gVisor for multi-tenant or untrusted content**: reduces kernel attack surface significantly despite I/O overhead
- **Never mount sensitive credential directories**: `~/.ssh`, `~/.aws`, `~/.config` must stay outside the agent's view
- **`ANTHROPIC_BASE_URL` vs `HTTP_PROXY`**: the former routes only sampling calls, in plaintext the proxy can inspect; the latter routes all traffic but creates opaque HTTPS tunnels
- **Least privilege is layered**: filesystem (read-only mounts) + network (allowlists) + capabilities (`--cap-drop ALL`) + process limits (`--pids-limit`)

## Related

- [[wiki/agent-sdk/configure-permissions\|configure-permissions]]: permission modes, allow/deny rules, evaluation order
- [[wiki/agent-sdk/hosting-production\|hosting-production]]: container requirements, deployment patterns
- [[wiki/agent-sdk/sdk-hooks\|sdk-hooks]]: PreToolUse/PostToolUse callbacks for runtime control
- [[wiki/agent-sdk/mcp-integration\|mcp-integration]]: MCP servers as a credential-safe tool boundary
- [[wiki/architecture/docker-compose\|Docker Compose patterns]]: container orchestration context

## Sources

- `raw/Securely deploying AI agents.md`
- Official docs: `https://code.claude.com/docs/en/agent-sdk/secure-deployment`