obsidian/wiki/claude-code/lmstudio-rest-api.md
2026-04-30 14:42:43 +01:00

---
title: "LM Studio REST API (v1)"
aliases: [lmstudio-api, lm-studio-rest, lmstudio-v1]
tags: [lmstudio, rest-api, local-inference, openai-compat, anthropic-compat, mcp]
sources: [raw/LM Studio API.md]
created: 2026-04-30
updated: 2026-04-30
---
# LM Studio REST API (v1)
LM Studio 0.4.0 introduced the native **v1 REST API** at `/api/v1/*`. It sits alongside the OpenAI-compatible and Anthropic-compatible endpoints and adds LM Studio-specific features such as stateful chats, MCP access, and model management.
## v1 vs v0
The old v0 API (`/api/v0/*`) is superseded. Migrate to `/api/v1/*` for:
- **Stateful chats** — server keeps conversation context across turns
- **MCP via API** — use MCPs configured in LM Studio directly from requests
- **Authentication** — API token support
- **Model management** — download, load, unload via API
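As a rough illustration of the stateful-chat and authentication points above, the following sketch builds (but does not send) a request to the native chat endpoint. The base URL, the `input` and `previous_response_id` field names, the bearer-token header, and the `build_chat_request` helper are assumptions for illustration and should be checked against the LM Studio API reference.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # assumption: LM Studio's default local server address


def build_chat_request(model, user_input, api_token=None, previous_response_id=None):
    """Build (but do not send) a POST request for the native /api/v1/chat endpoint."""
    # Field names here are assumptions, not confirmed by this note.
    payload = {"model": model, "input": user_input}
    if previous_response_id is not None:
        # Stateful chat: reference the earlier turn instead of resending history.
        payload["previous_response_id"] = previous_response_id

    headers = {"Content-Type": "application/json"}
    if api_token:
        # Assumed bearer-token scheme for the API token.
        headers["Authorization"] = f"Bearer {api_token}"

    return urllib.request.Request(
        f"{BASE_URL}/api/v1/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a running server. Keeping construction separate from sending makes the payload easy to inspect before anything goes over the wire.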
## Supported Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| `/api/v1/chat` | POST | Inference (native) |
| `/api/v1/models` | GET | List models available on the system |
| `/api/v1/models/load` | POST | Load a model into memory |
| `/api/v1/models/unload` | POST | Unload a model |
| `/api/v1/models/download` | POST | Download a model |
| `/api/v1/models/download/status` | GET | Poll download progress |
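The endpoints in the table can be chained into a load → infer → unload cycle. The sketch below is one way to do that with only the standard library. The request and response shapes (`model`, `input`, `instance_id`) and the helper names `http_post` and `run_once` are assumptions for illustration, not confirmed by this note.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # assumption: default local server address


def http_post(path, payload):
    """POST JSON to the server and return the decoded JSON response."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_once(model, prompt, post=http_post):
    """Load a model, run one inference, and always unload afterwards."""
    loaded = post("/api/v1/models/load", {"model": model})  # assumed payload shape
    # Assumption: the load response carries an instance identifier;
    # fall back to the model name if it does not.
    instance_id = loaded.get("instance_id", model)
    try:
        return post("/api/v1/chat", {"model": model, "input": prompt})
    finally:
        # Unload even if inference fails, so memory is released.
        post("/api/v1/models/unload", {"instance_id": instance_id})
```

The `post` parameter is injectable so the call sequence can be exercised with a stub before pointing it at a live server.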
## Inference Endpoint Comparison
Four endpoints can run inference. Pick based on which features you need:
| Feature | `/api/v1/chat` | `/v1/responses` (OAI) | `/v1/chat/completions` (OAI) | `/v1/messages` (Anthropic) |
|---|:---:|:---:|:---:|:---:|
| Streaming | ✅ | ✅ | ✅ | ✅ |
| Stateful chat | ✅ | ✅ | ❌ | ❌ |
| Remote MCPs | ✅ | ✅ | ❌ | ❌ |
| LM Studio MCPs | ✅ | ✅ | ❌ | ❌ |
| Custom tools | ❌ | ✅ | ✅ | ✅ |
| Assistant messages in request | ❌ | ✅ | ✅ | ✅ |
| Model load streaming events | ✅ | ❌ | ❌ | ❌ |
| Prompt processing events | ✅ | ❌ | ❌ | ❌ |
| Specify context length | ✅ | ❌ | ❌ | ❌ |
**Decision guide:**
- Need MCP tools + stateful chat → `/api/v1/chat` or `/v1/responses`
- Need custom tool definitions → `/v1/responses`, `/v1/chat/completions`, or `/v1/messages`
- Dropping in existing OpenAI SDK code → `/v1/chat/completions`
- Dropping in existing Anthropic SDK code → `/v1/messages`
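The decision guide can also be expressed as a lookup over the comparison table. The sketch below transcribes the table into sets; the feature keys (`stateful_chat`, `custom_tools`, and so on) and the `endpoints_supporting` helper are hypothetical names chosen for this example.

```python
# Feature support per endpoint, transcribed from the comparison table.
# The feature keys are hypothetical labels, not API identifiers.
SUPPORT = {
    "/api/v1/chat": {
        "streaming", "stateful_chat", "remote_mcp", "lmstudio_mcp",
        "model_load_events", "prompt_processing_events", "context_length",
    },
    "/v1/responses": {
        "streaming", "stateful_chat", "remote_mcp", "lmstudio_mcp",
        "custom_tools", "assistant_messages",
    },
    "/v1/chat/completions": {"streaming", "custom_tools", "assistant_messages"},
    "/v1/messages": {"streaming", "custom_tools", "assistant_messages"},
}


def endpoints_supporting(*features):
    """Return the endpoints that support every requested feature, in table order."""
    required = set(features)
    unknown = required - set().union(*SUPPORT.values())
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    return [endpoint for endpoint, supported in SUPPORT.items() if required <= supported]


print(endpoints_supporting("stateful_chat", "custom_tools"))
```

An empty result means no single endpoint covers the combination. For example, per-request context length and custom tools cannot be had together.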
## Key Takeaways
- The **native `/api/v1/chat`** endpoint has three exclusive features: model-load events, prompt-processing events, and per-request context length. It shares stateful chat and MCP support with `/v1/responses`, but does not accept custom tools or assistant messages in the request.
- **`/v1/responses`** (OpenAI Responses API compat) is the only endpoint that combines stateful chat, MCP, and custom tools.
- **`/v1/chat/completions`** is the broadest drop-in for existing OpenAI code but loses statefulness and MCP.
- **`/v1/messages`** lets you redirect the Anthropic SDK to a local model with minimal code change (see [[wiki/claude-code/lmstudio-anthropic-compat|lmstudio-anthropic-compat]]).
- Model management endpoints let you fully automate the model lifecycle — download → load → infer → unload — without touching the GUI.
- API token auth is available for securing the local server (useful when exposed on a LAN).
## Related Articles
- [[wiki/claude-code/lmstudio-anthropic-compat|lmstudio-anthropic-compat]] — redirect Claude Code / Anthropic SDK to LM Studio via env vars
- [[wiki/claude-code/lmstudio-chat-completions|lmstudio-chat-completions]] — OpenAI `/v1/chat/completions` usage, params, debugging
- [[wiki/claude-code/lmstudio-embeddings|lmstudio-embeddings]] — `/v1/embeddings` for local RAG pipelines
- [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|lmstudio-idle-ttl-auto-evict]] — memory management: TTL and auto-evict
- [[wiki/agent-sdk/overview|agent-sdk/overview]] — build multi-agent systems that call local models
## Sources
- `raw/LM Studio API.md` — clipped from lmstudio.ai/docs/developer/rest