---
title: "Securely Deploying AI Agents"
aliases: [agent-security, secure-agent-deployment, agent-hardening]
tags: [security, agent-sdk, deployment, isolation, credentials, docker, proxy]
sources: [raw/Securely deploying AI agents.md]
created: 2026-04-17
updated: 2026-04-17
---
# Securely Deploying AI Agents
Unlike deterministic software, Claude Code and the Agent SDK generate actions dynamically based on context — making them susceptible to **prompt injection**: malicious instructions embedded in files, webpages, or user input that redirect agent behavior. Defense in depth is the answer.
Not every deployment needs maximum hardening. A developer running locally has different needs than a multi-tenant production system processing untrusted content.
## Threat Model
- **Prompt injection** — content processed by the agent (READMEs, web pages, files) may contain adversarial instructions
- **Model error** — unexpected actions even without adversarial input
- **Credential exposure** — agents accessing APIs may leak secrets if not isolated
- **Resource abuse** — unbounded memory/CPU/process spawning in multi-tenant environments
## Built-in Security Features
| Feature | What it does |
|---------|-------------|
| **Permissions system** | Allow/block/prompt per tool or bash command; glob patterns; org-wide policies |
| **Command AST parsing** | Parses bash into AST before execution; unrecognized constructs and `eval` always require approval |
| **Web search summarization** | Summarizes search results instead of passing raw HTML into context |
| **Sandbox mode** | Optional OS-level filesystem + network restrictions (see [[wiki/agent-sdk/configure-permissions\|configure-permissions]]) |
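
The allow/block/prompt behavior in the table can be pictured as a small rule evaluator. This is a minimal sketch, not the SDK's implementation: the rule format, the `decide` function, the block-before-allow ordering, and the `prompt` default are all assumptions made for illustration (see configure-permissions for the real evaluation order).

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (glob pattern, decision). Not the SDK's real format.
RULES = [
    ("rm -rf *", "block"),
    ("git status*", "allow"),
    ("npm test*", "allow"),
]


def decide(command: str, rules=RULES, default: str = "prompt") -> str:
    """Return 'allow', 'block', or 'prompt' for a bash command string.

    In this sketch a matching block rule always wins over an allow rule,
    and anything unmatched falls back to asking the user.
    """
    matches = [decision for pattern, decision in rules if fnmatchcase(command, pattern)]
    if "block" in matches:
        return "block"
    if "allow" in matches:
        return "allow"
    return default
```

Plain string globbing is easy to sidestep: `git status; rm -rf /` matches the `git status*` allow rule here. That is one reason the table also lists AST parsing of the command rather than pattern matching alone.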
## Security Principles
### Least Privilege
| Resource | Restriction |
|----------|------------|
| Filesystem | Mount only needed dirs; prefer read-only |
| Network | Restrict to specific endpoints via proxy |
| Credentials | Inject via proxy — never expose directly |
| System capabilities | Drop Linux capabilities in containers |
### Defense in Depth
Layer multiple controls: container isolation → network restrictions → filesystem controls → proxy-level request validation. Each layer limits blast radius if another fails.
## Isolation Technologies
| Technology | Isolation | Perf overhead | Complexity |
|-----------|-----------|--------------|------------|
| sandbox-runtime | Good | Very low | Low |
| Docker containers | Setup-dependent | Low | Medium |
| gVisor | Excellent | Medium/High | Medium |
| VMs (Firecracker/QEMU) | Excellent | High | Medium/High |
### sandbox-runtime
Lightweight, no Docker needed. Uses OS primitives (`bubblewrap` on Linux, `sandbox-exec` on macOS).
```sh
npm install @anthropic-ai/sandbox-runtime
```
- Filesystem: restricts read/write to configured paths
- Network: routes all traffic through built-in proxy with domain allowlists
- Limitation: shares host kernel — not suitable for kernel-level isolation requirements
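
The domain allowlist idea from the network bullet can be sketched as a host check. The `is_allowed` helper and the `*.example.com` wildcard syntax are assumptions for illustration, not sandbox-runtime's actual configuration format.

```python
def is_allowed(host: str, allowlist: list[str]) -> bool:
    """Return True if host matches an allowlist entry.

    Entries are either exact hostnames or '*.domain' wildcards that match
    any subdomain (but not the bare domain itself).
    """
    host = host.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("*."):
            # Compare against '.domain' so 'evilexample.com' cannot match '*.example.com'.
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False
```

Matching on the full suffix including the leading dot matters: a naive `endswith("npmjs.org")` would also accept `evilnpmjs.org`.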
### Hardened Docker Container
```sh
docker run \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --security-opt seccomp=/path/to/seccomp-profile.json \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=100m \
  --network none \
  --memory 2g \
  --pids-limit 100 \
  --user 1000:1000 \
  -v /path/to/code:/workspace:ro \
  -v /var/run/proxy.sock:/var/run/proxy.sock:ro \
  agent-image
```
Key flags:
- `--cap-drop ALL` — removes `NET_ADMIN`, `SYS_ADMIN`, etc.
- `--network none` — no network interfaces; agent communicates only via mounted Unix socket to host proxy
- `--read-only` + `--tmpfs` — immutable root fs with ephemeral scratch space
- `-v ...:/workspace:ro` — never mount `~/.ssh`, `~/.aws`, `~/.config`
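
When the container is launched from a script, the "never mount" rule can be enforced before `docker run` is invoked. The sketch below is one possible shape, under assumptions: `hardened_docker_args` is a hypothetical helper, the `home` default is a placeholder, and the seccomp profile and proxy socket mount from the command above are left out for brevity.

```python
from pathlib import PurePosixPath

# Home-relative directories the notes above say must never be mounted.
FORBIDDEN_DIRS = (".ssh", ".aws", ".config")


def hardened_docker_args(image: str, code_dir: str, home: str = "/home/dev") -> list[str]:
    """Build a `docker run` argv using hardening flags from the command above.

    Raises ValueError if mounting code_dir would expose a forbidden directory.
    """
    code = PurePosixPath(code_dir)
    for name in FORBIDDEN_DIRS:
        secret = PurePosixPath(home) / name
        # Reject the secret dir itself, anything inside it, and any parent of it.
        if code == secret or secret in code.parents or code in secret.parents:
            raise ValueError(f"refusing to mount {code}: it would expose {secret}")
    return [
        "docker", "run",
        "--cap-drop", "ALL",
        "--security-opt", "no-new-privileges",
        "--read-only",
        "--tmpfs", "/tmp:rw,noexec,nosuid,size=100m",
        "--network", "none",
        "--memory", "2g",
        "--pids-limit", "100",
        "--user", "1000:1000",
        "-v", f"{code}:/workspace:ro",
        image,
    ]
```

The returned list can be passed to `subprocess.run(args, check=True)`. Building an argv list rather than a shell string avoids quoting problems in paths.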
### gVisor
Intercepts syscalls in userspace — the agent never directly touches the host kernel.
In `/etc/docker/daemon.json` (the file must be plain JSON, without comments):
```json
{ "runtimes": { "runsc": { "path": "/usr/local/bin/runsc" } } }
```
```sh
docker run --runtime=runsc agent-image
```
Performance: CPU-bound work sees ≈ 0% overhead; file I/O can be 10 to 200× slower for heavy open/close patterns.
### Firecracker MicroVMs
- Boot time < 125ms, < 5 MiB overhead
- Agent VM has no external network; all traffic is routed via `vsock` to the host proxy
- Suitable for per-request isolation in multi-tenant systems
### Cloud Deployments
1. Private subnet with no internet gateway
2. Cloud firewall (AWS SG / GCP VPC) blocks all egress except to proxy
3. Proxy (e.g. Envoy with `credential_injector`) validates, allowlists, injects creds, logs
4. Minimal IAM permissions on agent's service account
## Credential Management
**Core pattern:** run a proxy *outside* the agent's security boundary that injects credentials. The agent never sees the actual secret.
Benefits:
- Credentials stored in one place, not distributed to agents
- Proxy enforces endpoint allowlists
- All requests logged for audit
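
The core of such a proxy is a small decision step: check the destination, drop whatever credential headers the agent sent, attach the real secret, and record the request. The sketch below shows only that step as a pure function. `inject_credentials` and its parameters are hypothetical, and a real deployment would put this logic in a proxy such as Envoy rather than hand-rolled code.

```python
def inject_credentials(host, headers, secret, allowed_hosts, audit_log,
                       header_name="x-api-key"):
    """Return the headers to forward upstream, with the real secret attached.

    Raises PermissionError for hosts outside the allowlist. The audit log
    records the decision and host, never the secret itself.
    """
    if host not in allowed_hosts:
        audit_log.append(("denied", host))
        raise PermissionError(f"host not allowlisted: {host}")

    # Drop any credential-like headers supplied from inside the agent boundary.
    stripped = {header_name.lower(), "authorization"}
    forwarded = {k: v for k, v in headers.items() if k.lower() not in stripped}

    forwarded[header_name] = secret
    audit_log.append(("forwarded", host))
    return forwarded
```

Because the agent only ever holds a placeholder value, a prompt-injected attempt to print or exfiltrate its environment does not reveal the real key.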
### Proxy Configuration
**Option 1 — sampling requests only:**
```sh
export ANTHROPIC_BASE_URL="http://localhost:8080"
```
**Option 2 — system-wide (all HTTP traffic):**
```sh
export HTTP_PROXY="http://localhost:8080"
export HTTPS_PROXY="http://localhost:8080"
```
Note: `HTTP_PROXY`/`HTTPS_PROXY` creates opaque TLS tunnels for HTTPS, so the proxy can't inspect or modify traffic without TLS termination. Node.js `fetch()` ignores these variables by default; set `NODE_USE_ENV_PROXY=1` in Node 24+.
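
If the agent is started from a launcher script, the two options can be expressed as the environment handed to the child process. A minimal sketch, assuming a hypothetical `agent_env` helper; setting `NODE_USE_ENV_PROXY` unconditionally in the system-wide case is a simplification that is harmless on runtimes that ignore it.

```python
import os
from typing import Mapping, Optional


def agent_env(proxy_url: str, sampling_only: bool,
              base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with proxy variables for one option.

    sampling_only=True  -> Option 1: only sampling requests go via the proxy.
    sampling_only=False -> Option 2: all HTTP(S) traffic goes via the proxy.
    """
    env = dict(os.environ if base is None else base)
    if sampling_only:
        env["ANTHROPIC_BASE_URL"] = proxy_url
    else:
        env["HTTP_PROXY"] = proxy_url
        env["HTTPS_PROXY"] = proxy_url
        # Needed for Node.js fetch() to honor the proxy variables (Node 24+).
        env["NODE_USE_ENV_PROXY"] = "1"
    return env
```

The result can be passed as `env=` to `subprocess.run`, which keeps the proxy settings scoped to the agent process instead of the whole shell session.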
### Proxy Options
| Proxy | Use case |
|-------|---------|
| Envoy | Production; `credential_injector` filter |
| mitmproxy | TLS-terminating; inspect/modify HTTPS |
| Squid | ACL-based caching proxy |
| LiteLLM | LLM gateway with rate limiting |
### Credentials for Other Services
**MCP/custom tools (preferred):** Agent calls a tool; the actual authenticated request happens outside the agent boundary. No TLS interception needed.
**TLS-terminating proxy:** Install proxy's CA cert in agent's trust store + configure `HTTP_PROXY`. Use `proxychains` or iptables for programs that bypass env vars.
## Filesystem Configuration
### Files to Exclude Before Mounting
| File | Risk |
|------|------|
| `.env`, `.env.local` | API keys, DB passwords |
| `~/.aws/credentials` | AWS access keys |
| `~/.config/gcloud/application_default_credentials.json` | GCP tokens |
| `~/.kube/config` | Kubernetes credentials |
| `*.pem`, `*.key` | Private keys |
| `.npmrc`, `.pypirc` | Registry tokens |
| `*-service-account.json` | GCP service account keys |
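
A pre-mount check can catch the filename-based entries in this table. The sketch below walks a directory and reports matches; `find_sensitive` is a hypothetical helper, and the pattern list covers only the project-level names from the table (the home-directory entries are handled by not mounting those directories at all).

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Filename patterns taken from the table above.
SENSITIVE_PATTERNS = (
    ".env", ".env.local",
    "*.pem", "*.key",
    ".npmrc", ".pypirc",
    "*-service-account.json",
)


def find_sensitive(root: str) -> list[str]:
    """Return sorted root-relative paths of files matching a sensitive pattern."""
    root_path = Path(root)
    hits = []
    for path in root_path.rglob("*"):
        if path.is_file() and any(fnmatchcase(path.name, p) for p in SENSITIVE_PATTERNS):
            hits.append(path.relative_to(root_path).as_posix())
    return sorted(hits)
```

A launcher could refuse to start the container when the returned list is non-empty, or copy the workspace to a staging directory that omits those files.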
### Writable Workspace Options
| Approach | Persistence | Use case |
|----------|-------------|---------|
| `--tmpfs` | Ephemeral (cleared on stop) | CI/CD, stateless agents |
| Overlay filesystem | Inspect then apply/discard | Review-before-commit workflows |
| Named volume (separate dir) | Persistent | Output collection |
## Key Takeaways
- **Prompt injection is the primary threat**: content the agent processes can redirect its behavior; built-in summarization and permissions help, but aren't sufficient alone
- **Proxy pattern is the gold standard for credentials**: the agent never sees secrets; a proxy outside the boundary injects them and enforces allowlists
- **`--network none` + Unix socket** is the strongest container network control: the agent can only reach what the host proxy allows
- **gVisor for multi-tenant or untrusted content**: reduces kernel attack surface significantly despite I/O overhead
- **Never mount sensitive credential directories**: `~/.ssh`, `~/.aws`, `~/.config` must stay outside the agent's view
- **`ANTHROPIC_BASE_URL` vs `HTTP_PROXY`**: the former routes only sampling calls in plaintext; the latter routes all traffic but creates opaque HTTPS tunnels
- **Least privilege is layered**: filesystem (read-only mounts) + network (allowlists) + capabilities (`--cap-drop ALL`) + process limits (`--pids-limit`)
## Related
- [[wiki/agent-sdk/configure-permissions\|configure-permissions]]: permission modes, allow/deny rules, evaluation order
- [[wiki/agent-sdk/hosting-production\|hosting-production]]: container requirements, deployment patterns
- [[wiki/agent-sdk/sdk-hooks\|sdk-hooks]]: PreToolUse/PostToolUse callbacks for runtime control
- [[wiki/agent-sdk/mcp-integration\|mcp-integration]]: MCP servers as a credential-safe tool boundary
- [[wiki/architecture/docker-compose\|Docker Compose patterns]]: container orchestration context
## Sources
- `raw/Securely deploying AI agents.md`
- Official docs: `https://code.claude.com/docs/en/agent-sdk/secure-deployment`