---
title: "Securely Deploying AI Agents"
aliases: [agent-security, secure-agent-deployment, agent-hardening]
tags: [security, agent-sdk, deployment, isolation, credentials, docker, proxy]
sources: [raw/Securely deploying AI agents.md]
created: 2026-04-17
updated: 2026-04-17
---

# Securely Deploying AI Agents

Unlike deterministic software, Claude Code and the Agent SDK generate actions dynamically from context. That makes them susceptible to **prompt injection**: malicious instructions embedded in files, webpages, or user input that redirect agent behavior. No single control reliably stops this, so the approach here is defense in depth.

Not every deployment needs maximum hardening. A developer running locally has different needs than a multi-tenant production system processing untrusted content.

## Threat Model

- **Prompt injection** — content processed by the agent (READMEs, web pages, files) may contain adversarial instructions
- **Model error** — unexpected actions even without adversarial input
- **Credential exposure** — agents accessing APIs may leak secrets if not isolated
- **Resource abuse** — unbounded memory/CPU/process spawning in multi-tenant environments

## Built-in Security Features

| Feature | What it does |
|---------|-------------|
| **Permissions system** | Allow/block/prompt per tool or bash command; glob patterns; org-wide policies |
| **Command AST parsing** | Parses bash into AST before execution; unrecognized constructs and `eval` always require approval |
| **Web search summarization** | Summarizes search results instead of passing raw HTML into context |
| **Sandbox mode** | Optional OS-level filesystem + network restrictions (see [[wiki/agent-sdk/configure-permissions\|configure-permissions]]) |
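
To make the allow/block/prompt model concrete, here is a minimal sketch of glob-based rule evaluation. The rule format, the `decide` function, and the block-before-allow ordering are assumptions for illustration; they are not Claude Code's actual rule syntax or evaluation order.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (glob pattern, decision). Not Claude Code's real syntax.
RULES = [
    ("rm *", "block"),
    ("git status*", "allow"),
    ("npm test*", "allow"),
]


def decide(command: str, rules=RULES) -> str:
    """Return 'block', 'allow', or 'prompt' for a shell command string."""
    matched = {decision for pattern, decision in rules if fnmatchcase(command, pattern)}
    if "block" in matched:   # assumed ordering: any matching block rule wins
        return "block"
    if "allow" in matched:
        return "allow"
    return "prompt"          # anything unmatched falls back to asking the user
```

In this sketch, `decide("git status; rm -rf .")` returns `"allow"` because the text matches `git status*`. Plain string globs are easy to bypass with command chaining, which is one reason to parse commands structurally rather than match raw text.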

## Security Principles

### Least Privilege

| Resource | Restriction |
|----------|------------|
| Filesystem | Mount only needed dirs; prefer read-only |
| Network | Restrict to specific endpoints via proxy |
| Credentials | Inject via proxy — never expose directly |
| System capabilities | Drop Linux capabilities in containers |

### Defense in Depth

Layer multiple controls: container isolation → network restrictions → filesystem controls → proxy-level request validation. Each layer limits blast radius if another fails.

## Isolation Technologies

| Technology | Isolation | Perf overhead | Complexity |
|-----------|-----------|--------------|------------|
| sandbox-runtime | Good | Very low | Low |
| Docker containers | Setup-dependent | Low | Medium |
| gVisor | Excellent | Medium–High | Medium |
| VMs (Firecracker/QEMU) | Excellent | High | Medium–High |

### sandbox-runtime

Lightweight, no Docker needed. Uses OS primitives (`bubblewrap` on Linux, `sandbox-exec` on macOS).

```sh
npm install @anthropic-ai/sandbox-runtime
```

- Filesystem: restricts read/write to configured paths
- Network: routes all traffic through built-in proxy with domain allowlists
- Limitation: shares host kernel — not suitable for kernel-level isolation requirements
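
The domain allowlist idea can be sketched as a small hostname check. This is illustrative only: `ALLOWED_DOMAINS`, `is_allowed`, and the `*.` wildcard convention are assumptions, not sandbox-runtime's actual configuration format.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; "*.example.internal" is a made-up wildcard entry.
ALLOWED_DOMAINS = ["api.anthropic.com", "*.example.internal"]


def is_allowed(url: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Return True if the URL's hostname matches an allowlist entry."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    for entry in allowed:
        entry = entry.lower()
        if entry.startswith("*."):
            # Wildcard entries match subdomains only, not the bare parent domain.
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False
```

Matching on the parsed hostname rather than on the raw URL string avoids lookalikes such as `https://api.anthropic.com@evil.example/`, whose real host is `evil.example`.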

### Hardened Docker Container

```sh
docker run \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --security-opt seccomp=/path/to/seccomp-profile.json \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=100m \
  --network none \
  --memory 2g \
  --pids-limit 100 \
  --user 1000:1000 \
  -v /path/to/code:/workspace:ro \
  -v /var/run/proxy.sock:/var/run/proxy.sock:ro \
  agent-image
```

Key flags:
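- `--cap-drop ALL`: drops every Linux capability, including `NET_ADMIN` and `SYS_ADMIN`
- `--security-opt no-new-privileges`: prevents processes from gaining privileges through setuid binaries
- `--network none`: no network interfaces; the agent can communicate only through the mounted Unix socket to the host proxy
- `--read-only` + `--tmpfs`: immutable root filesystem with ephemeral scratch space
- `--memory`, `--pids-limit`, `--user`: cap memory and process count, and run as a non-root user
- `-v ...:/workspace:ro`: mounts the code read-only; never mount `~/.ssh`, `~/.aws`, or `~/.config`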
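
One way to keep these flags from drifting is to check a `docker run` argument list before launching. The sketch below is a hypothetical pre-launch helper (`REQUIRED_OPTIONS`, `REQUIRED_SWITCHES`, `SENSITIVE_DIRS`, and `audit_docker_args` are made-up names), not part of Docker or the Agent SDK, and it checks only a subset of the flags shown above.

```python
# Hypothetical pre-launch check; not part of Docker or the Agent SDK.
REQUIRED_OPTIONS = [
    ("--cap-drop", "ALL"),
    ("--security-opt", "no-new-privileges"),
    ("--network", "none"),
]
REQUIRED_SWITCHES = ["--read-only"]
SENSITIVE_DIRS = {".ssh", ".aws", ".config"}


def audit_docker_args(args: list[str]) -> list[str]:
    """Return a list of problems found in a `docker run` argument list."""
    problems = []
    pairs = list(zip(args, args[1:]))  # adjacent (flag, value) pairs
    for flag, value in REQUIRED_OPTIONS:
        if (flag, value) not in pairs and f"{flag}={value}" not in args:
            problems.append(f"missing {flag} {value}")
    for switch in REQUIRED_SWITCHES:
        if switch not in args:
            problems.append(f"missing {switch}")
    for flag, value in pairs:
        if flag in ("-v", "--volume"):
            host_path = value.split(":", 1)[0]
            # Flag any mount whose host path contains a credential directory.
            if SENSITIVE_DIRS & set(host_path.split("/")):
                problems.append(f"sensitive mount: {host_path}")
    return problems
```

A launcher script could refuse to start the container whenever the returned list is non-empty.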

### gVisor

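gVisor intercepts the agent's system calls in userspace before they reach the host kernel and handles most of them in its own compatibility layer, which sharply reduces the host kernel's attack surface.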

Register the runtime in `/etc/docker/daemon.json`:

```json
{ "runtimes": { "runsc": { "path": "/usr/local/bin/runsc" } } }
```

```sh
docker run --runtime=runsc agent-image
```

Performance: CPU-bound ≈ 0% overhead; file I/O can be 10–200× slower for heavy open/close patterns.
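
As a rough worked example of what those figures imply, the sketch below estimates total runtime when only the file I/O portion slows down. The 50 s / 10 s workload split and the `estimate_runtime` helper are hypothetical; the 10× and 200× factors are the bounds quoted above, and real workloads should be benchmarked.

```python
def estimate_runtime(cpu_seconds: float, io_seconds: float, io_slowdown: float) -> float:
    """CPU-bound time is assumed unchanged; file I/O time scales by io_slowdown."""
    return cpu_seconds + io_seconds * io_slowdown


# Hypothetical job: 50 s of computation plus 10 s of open/close-heavy file I/O.
native = estimate_runtime(50.0, 10.0, 1.0)     # 60.0 s without gVisor
low = estimate_runtime(50.0, 10.0, 10.0)       # 150.0 s at a 10x I/O slowdown
high = estimate_runtime(50.0, 10.0, 200.0)     # 2050.0 s at a 200x I/O slowdown
```

Even a small I/O-heavy fraction can dominate total runtime at the upper bound, so the mix of file operations matters more than raw CPU time when deciding whether gVisor fits a workload.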

### Firecracker MicroVMs

- Boot time under 125 ms, with less than 5 MiB of memory overhead per microVM
- Agent VM has no external network — all traffic routed via `vsock` to host proxy
- Suitable for per-request isolation in multi-tenant systems

### Cloud Deployments

1. Private subnet with no internet gateway
2. Cloud firewall rules (AWS security groups, GCP VPC firewall) block all egress except to the proxy
3. Proxy (e.g. Envoy with `credential_injector`) validates requests, enforces domain allowlists, injects credentials, and logs all access
4. Minimal IAM permissions on agent's service account

## Credential Management

**Core pattern:** run a proxy *outside* the agent's security boundary that injects credentials. The agent never sees the actual secret.

Benefits:
- Credentials stored in one place, not distributed to agents
- Proxy enforces endpoint allowlists
- All requests logged for audit
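
A minimal sketch of the injection step, as it might run inside the proxy process. `build_upstream_headers`, `UPSTREAM_HOST`, and the list of stripped headers are assumptions for illustration; `x-api-key` is the header the Anthropic API uses for authentication.

```python
UPSTREAM_HOST = "api.anthropic.com"
# Headers the agent must not control; an assumed list for this sketch.
STRIPPED = {"x-api-key", "authorization", "proxy-authorization", "host"}


def build_upstream_headers(agent_headers: dict[str, str], api_key: str) -> dict[str, str]:
    """Drop agent-supplied auth headers and attach the real credential."""
    headers = {
        name.lower(): value
        for name, value in agent_headers.items()
        if name.lower() not in STRIPPED
    }
    headers["host"] = UPSTREAM_HOST
    headers["x-api-key"] = api_key  # the secret exists only on the proxy side
    return headers
```

Here `api_key` would be loaded by the proxy from its own secret store. Whatever placeholder the agent sends is discarded, so a compromised agent has nothing to leak.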

### Proxy Configuration

**Option 1 — sampling requests only:**
```sh
export ANTHROPIC_BASE_URL="http://localhost:8080"
```

**Option 2 — system-wide (all HTTP traffic):**
```sh
export HTTP_PROXY="http://localhost:8080"
export HTTPS_PROXY="http://localhost:8080"
```

Note: for HTTPS, `HTTP_PROXY`/`HTTPS_PROXY` create an opaque TLS tunnel, so the proxy cannot inspect or modify requests (and therefore cannot inject credentials) without TLS termination. Node.js `fetch()` ignores these variables by default; in Node 24+, set `NODE_USE_ENV_PROXY=1` to enable them.
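
To see why the tunnel in the note above is opaque: for an HTTPS request, a forward proxy receives only a `CONNECT host:port` request line and then relays encrypted bytes. The sketch below parses that line to show everything the proxy can act on without TLS termination. `parse_connect` is a hypothetical helper, not taken from any particular proxy.

```python
def parse_connect(request_line: str) -> tuple[str, int]:
    """Parse 'CONNECT host:port HTTP/1.1' and return (host, port)."""
    parts = request_line.split()
    if len(parts) != 3 or parts[0].upper() != "CONNECT":
        raise ValueError(f"not a CONNECT request: {request_line!r}")
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"bad CONNECT target: {parts[1]!r}")
    # Brackets are stripped so IPv6 literals such as [::1] compare cleanly.
    return host.strip("[]").lower(), int(port)
```

The proxy can still allow or deny the connection by host and port, but the path, headers, and body stay encrypted, so there is nowhere to insert a credential.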

### Proxy Options

| Proxy | Use case |
|-------|---------|
| Envoy | Production; `credential_injector` filter |
| mitmproxy | TLS-terminating; inspect/modify HTTPS |
| Squid | ACL-based caching proxy |
| LiteLLM | LLM gateway with rate limiting |

### Credentials for Other Services

**MCP/custom tools (preferred):** Agent calls a tool; the actual authenticated request happens outside the agent boundary. No TLS interception needed.

**TLS-terminating proxy:** Install proxy's CA cert in agent's trust store + configure `HTTP_PROXY`. Use `proxychains` or iptables for programs that bypass env vars.

## Filesystem Configuration

### Files to Exclude Before Mounting

| File | Risk |
|------|------|
| `.env`, `.env.local` | API keys, DB passwords |
| `~/.aws/credentials` | AWS access keys |
| `~/.config/gcloud/application_default_credentials.json` | GCP tokens |
| `~/.kube/config` | Kubernetes credentials |
| `*.pem`, `*.key` | Private keys |
| `.npmrc`, `.pypirc` | Registry tokens |
| `*-service-account.json` | GCP service account keys |
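
A workspace can be scanned for the file-name patterns in the table above before it is mounted. This is a sketch under assumptions: `find_sensitive_files` and `SENSITIVE_PATTERNS` are hypothetical names, and the home-directory entries (`~/.aws`, `~/.kube`, `~/.config/gcloud`) are left out because they are handled by not mounting those directories at all.

```python
import os
from fnmatch import fnmatchcase

# File-name patterns taken from the table above (home-directory paths omitted).
SENSITIVE_PATTERNS = [
    ".env", ".env.local", "*.pem", "*.key",
    ".npmrc", ".pypirc", "*-service-account.json",
]


def find_sensitive_files(root: str) -> list[str]:
    """Return sorted paths (relative to root) whose file name matches a pattern."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                hits.append(rel.replace(os.sep, "/"))
    return sorted(hits)
```

A name-based scan misses secrets stored under unexpected file names, so it complements rather than replaces mounting only the directories the agent needs.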

### Writable Workspace Options

| Approach | Persistence | Use case |
|----------|-------------|---------|
| `--tmpfs` | Ephemeral (cleared on stop) | CI/CD, stateless agents |
| Overlay filesystem | Inspect then apply/discard | Review-before-commit workflows |
| Named volume (separate dir) | Persistent | Output collection |

## Key Takeaways

- **Prompt injection is the primary threat** — content the agent processes can redirect its behavior; built-in summarization and permissions help, but aren't sufficient alone
- **Proxy pattern is the gold standard for credentials** — agent never sees secrets; proxy outside the boundary injects them and enforces allowlists
- **`--network none` + Unix socket** is the strongest container network control — agent can only reach what the host proxy allows
- **gVisor for multi-tenant or untrusted content** — reduces kernel attack surface significantly despite I/O overhead
- **Never mount sensitive credential directories** — `~/.ssh`, `~/.aws`, `~/.config` must stay outside the agent's view
- **`ANTHROPIC_BASE_URL` vs `HTTP_PROXY`**: the former routes only sampling calls, which reach the proxy as plaintext HTTP that it can inspect and modify; the latter routes all traffic but creates opaque HTTPS tunnels
- **Least privilege is layered** — filesystem (read-only mounts) + network (allowlists) + capabilities (`--cap-drop ALL`) + process limits (`--pids-limit`)

## Related

- [[wiki/agent-sdk/configure-permissions\|configure-permissions]] — permission modes, allow/deny rules, evaluation order
- [[wiki/agent-sdk/hosting-production\|hosting-production]] — container requirements, deployment patterns
- [[wiki/agent-sdk/sdk-hooks\|sdk-hooks]] — PreToolUse/PostToolUse callbacks for runtime control
- [[wiki/agent-sdk/mcp-integration\|mcp-integration]] — MCP servers as a credential-safe tool boundary
- [[wiki/architecture/docker-compose\|Docker Compose patterns]] — container orchestration context

## Sources

- `raw/Securely deploying AI agents.md`
- Official docs: `https://code.claude.com/docs/en/agent-sdk/secure-deployment`