| title | aliases | tags | sources | created | updated |
|---|---|---|---|---|---|
| LM Studio REST API (v1) | lmstudio-api, lm-studio-rest, lmstudio-v1 | lmstudio, rest-api, local-inference, openai-compat, anthropic-compat, mcp | | 2026-04-30 | 2026-04-30 |
LM Studio REST API (v1)
LM Studio 0.4.0 introduced the native v1 REST API at /api/v1/*. It sits alongside OpenAI-compatible and Anthropic-compatible endpoints and offers the richest feature set for local inference.
v1 vs v0
The old v0 API (/api/v0/*) is superseded. Migrate to /api/v1/* for:
- Stateful chats — server keeps conversation context across turns
- MCP via API — use MCPs configured in LM Studio directly from requests
- Authentication — API token support
- Model management — download, load, unload via API
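As a rough illustration of the stateful-chat and token features listed above, the sketch below builds (but does not send) two /api/v1/chat requests with Python's standard library. Everything beyond the endpoint path is an assumption not confirmed by this article: the base URL `http://localhost:1234`, the body fields `model`, `input`, and `previous_response_id`, the Bearer header scheme, and the placeholder model name and response ID. Check the LM Studio docs before relying on any of them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # assumed address of the local server


def build_chat_request(model, text, previous_response_id=None, api_token=None):
    """Build (but do not send) a POST request for the native chat endpoint."""
    body = {"model": model, "input": text}  # field names are assumptions
    if previous_response_id is not None:
        # Assumed mechanism for continuing a conversation held by the server.
        body["previous_response_id"] = previous_response_id
    headers = {"Content-Type": "application/json"}
    if api_token:
        headers["Authorization"] = f"Bearer {api_token}"  # assumed auth scheme
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/chat",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# First turn, then a follow-up that points at a placeholder response ID.
first = build_chat_request("example-model", "Hello")
follow_up = build_chat_request(
    "example-model", "And then?", previous_response_id="resp_123"
)
```

With a server running, `urllib.request.urlopen(first)` would send the request. In a stateful flow the follow-up would reuse whatever identifier the server actually returns rather than the placeholder used here.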
Supported Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| /api/v1/chat | POST | Inference (native) |
| /api/v1/models | GET | List loaded models |
| /api/v1/models/load | POST | Load a model into VRAM |
| /api/v1/models/unload | POST | Unload a model |
| /api/v1/models/download | POST | Download a model |
| /api/v1/models/download/status | GET | Poll download progress |
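The endpoint table above can be captured in a small lookup so that scripts do not hard-code paths. This is a minimal sketch: the methods and paths come from the table, while the base URL, the operation names, and the JSON payload shape are hypothetical and should be checked against the LM Studio docs.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # assumed address of the local server

# Method and path for each operation, copied from the endpoint table.
# The dictionary keys are illustrative names, not part of the API.
ENDPOINTS = {
    "chat": ("POST", "/api/v1/chat"),
    "list_models": ("GET", "/api/v1/models"),
    "load": ("POST", "/api/v1/models/load"),
    "unload": ("POST", "/api/v1/models/unload"),
    "download": ("POST", "/api/v1/models/download"),
    "download_status": ("GET", "/api/v1/models/download/status"),
}


def build_request(operation, payload=None):
    """Return an unsent urllib Request for one of the operations above."""
    method, path = ENDPOINTS[operation]
    data = None
    headers = {}
    if method == "POST":
        # The body shape is hypothetical; this article does not specify it.
        data = json.dumps(payload or {}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Requests in lifecycle order: download, load, infer, unload.
lifecycle = [build_request(op) for op in ("download", "load", "chat", "unload")]
```

Nothing is sent until a request is passed to `urllib.request.urlopen`, so the lookup can be inspected or logged without a running server.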
Inference Endpoint Comparison
Four endpoints can run inference. Pick based on which features you need:
| Feature | /api/v1/chat | /v1/responses (OAI) | /v1/chat/completions (OAI) | /v1/messages (Anthropic) |
|---|---|---|---|---|
| Streaming | ✅ | ✅ | ✅ | ✅ |
| Stateful chat | ✅ | ✅ | ❌ | ❌ |
| Remote MCPs | ✅ | ✅ | ❌ | ❌ |
| LM Studio MCPs | ✅ | ✅ | ❌ | ❌ |
| Custom tools | ❌ | ✅ | ✅ | ✅ |
| Assistant messages in request | ❌ | ✅ | ✅ | ✅ |
| Model load streaming events | ✅ | ❌ | ❌ | ❌ |
| Prompt processing events | ✅ | ❌ | ❌ | ❌ |
| Specify context length | ✅ | ❌ | ❌ | ❌ |
Decision guide:
- Need MCP tools + stateful chat → /api/v1/chat or /v1/responses
- Need custom tool definitions → /v1/responses, /v1/chat/completions, or /v1/messages
- Dropping in existing OpenAI SDK code → /v1/chat/completions
- Dropping in existing Anthropic SDK code → /v1/messages
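The decision guide can also be expressed as a lookup over the comparison table. The sketch below transcribes the table into sets and filters endpoints by required features. The feature keys (`stateful_chat`, `custom_tools`, and so on) are illustrative names chosen for this example, not identifiers from the API.

```python
# Feature support per endpoint, transcribed from the comparison table.
# The feature keys are illustrative names, not part of the API.
FEATURES = {
    "/api/v1/chat": {
        "streaming", "stateful_chat", "remote_mcp", "lmstudio_mcp",
        "model_load_events", "prompt_processing_events", "context_length",
    },
    "/v1/responses": {
        "streaming", "stateful_chat", "remote_mcp", "lmstudio_mcp",
        "custom_tools", "assistant_messages",
    },
    "/v1/chat/completions": {
        "streaming", "custom_tools", "assistant_messages",
    },
    "/v1/messages": {
        "streaming", "custom_tools", "assistant_messages",
    },
}


def endpoints_supporting(*required):
    """Return the endpoints that support every required feature."""
    return [
        endpoint
        for endpoint, supported in FEATURES.items()
        if all(feature in supported for feature in required)
    ]


print(endpoints_supporting("stateful_chat", "lmstudio_mcp"))
print(endpoints_supporting("custom_tools", "stateful_chat"))
```

The first call returns /api/v1/chat and /v1/responses, and the second returns only /v1/responses, matching the table. A combination that no endpoint covers, such as custom tools plus per-request context length, yields an empty list.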
Key Takeaways
- The native /api/v1/chat endpoint is the only one that offers model-load events, prompt-processing events, and per-request context length. It shares stateful chat and MCP support (remote and LM Studio) with /v1/responses, but it does not accept custom tool definitions or assistant messages in the request.
- /v1/responses (OpenAI Responses API compat) is the only endpoint that combines stateful chat, MCP, and custom tools.
- /v1/chat/completions is the broadest drop-in for existing OpenAI code but loses statefulness and MCP.
- /v1/messages lets you redirect the Anthropic SDK to a local model with minimal code change (see wiki/claude-code/lmstudio-anthropic-compat).
- Model management endpoints let you fully automate the model lifecycle — download → load → infer → unload — without touching the GUI.
- API token auth is available for securing the local server (useful when exposed on a LAN).
Related Articles
Sources
raw/LM Studio API.md — clipped from lmstudio.ai/docs/developer/rest