LM Studio REST API (v1)

LM Studio 0.4.0 introduced the native v1 REST API at /api/v1/*. It sits alongside OpenAI-compatible and Anthropic-compatible endpoints and offers the richest feature set for local inference.

v1 vs v0

The old v0 API (/api/v0/*) is superseded. Migrate to /api/v1/* for:

Stateful chats — server keeps conversation context across turns
MCP via API — use MCPs configured in LM Studio directly from requests
Authentication — API token support
Model management — download, load, unload via API
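
The stateful-chat and authentication items can be sketched together as a request that carries an API token and continues a server-side conversation. This is a minimal sketch, not the documented schema: the address http://localhost:1234, the Bearer scheme, and the body fields `model`, `input`, and `previous_response_id` are assumptions to check against the LM Studio REST reference, and `build_chat_request` is a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # assumed default address of the local server


def build_chat_request(model, text, token=None, previous_response_id=None):
    """Build (but do not send) a POST request for the native chat endpoint."""
    # The field names below are assumptions, not confirmed by this page.
    body = {"model": model, "input": text}
    if previous_response_id is not None:
        # Stateful chat: point at the earlier turn instead of resending history.
        body["previous_response_id"] = previous_response_id

    headers = {"Content-Type": "application/json"}
    if token:
        # Assumed Bearer scheme for the API token.
        headers["Authorization"] = f"Bearer {token}"

    return urllib.request.Request(
        BASE_URL + "/api/v1/chat",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("example-model", "Hello", token="secret")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a running server.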

Supported Endpoints

Endpoint	Method	Purpose
`/api/v1/chat`	POST	Inference (native)
`/api/v1/models`	GET	List loaded models
`/api/v1/models/load`	POST	Load a model into VRAM
`/api/v1/models/unload`	POST	Unload a model
`/api/v1/models/download`	POST	Download a model
`/api/v1/models/download/status`	GET	Poll download progress
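
The table above maps naturally onto a small table-driven client. The sketch below copies the methods and paths from the table; everything else is an assumption, including the base address, the `build_request` and `send` helper names, and any payload keys passed in (this page does not document request bodies).

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # assumed default address of the local server

# Operation name -> (HTTP method, path), copied from the endpoint table.
ROUTES = {
    "chat": ("POST", "/api/v1/chat"),
    "list": ("GET", "/api/v1/models"),
    "load": ("POST", "/api/v1/models/load"),
    "unload": ("POST", "/api/v1/models/unload"),
    "download": ("POST", "/api/v1/models/download"),
    "download_status": ("GET", "/api/v1/models/download/status"),
}


def build_request(operation, payload=None):
    """Prepare a request for one of the operations in ROUTES."""
    method, path = ROUTES[operation]
    data = None
    headers = {}
    if method == "POST":
        # POST endpoints take a JSON body; its keys are not specified here.
        data = json.dumps(payload or {}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a server running, a load call might look like `send(build_request("load", {"model": "example-model"}))`, where the `model` key is a guess. The sketch does not model how a download job is identified when polling its status.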

Inference Endpoint Comparison

Four endpoints can run inference. Pick based on which features you need:

Feature	`/api/v1/chat`	`/v1/responses` (OAI)	`/v1/chat/completions` (OAI)	`/v1/messages` (Anthropic)
Streaming	✅	✅	✅	✅
Stateful chat	✅	✅	❌	❌
Remote MCPs	✅	✅	❌	❌
LM Studio MCPs	✅	✅	❌	❌
Custom tools	❌	✅	✅	✅
Assistant messages in request	❌	✅	✅	✅
Model load streaming events	✅	❌	❌	❌
Prompt processing events	✅	❌	❌	❌
Specify context length	✅	❌	❌	❌
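
The "Stateful chat" and "Assistant messages in request" rows are two sides of one trade-off: on a stateless endpoint the client keeps the transcript and resends it each turn, prior assistant replies included. A minimal sketch for /v1/chat/completions, using the standard OpenAI chat message shape (`role` and `content`); the `Transcript` class and the model name are hypothetical.

```python
import json


class Transcript:
    """Client-side conversation history for a stateless chat endpoint."""

    def __init__(self, model):
        self.model = model
        self.messages = []

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})

    def next_payload(self, user_text):
        """Record the user turn and return the full request body as JSON."""
        self.add("user", user_text)
        return json.dumps({"model": self.model, "messages": self.messages})


if __name__ == "__main__":
    chat = Transcript("example-model")
    chat.next_payload("Name a prime number.")
    chat.add("assistant", "7")  # reply copied back from the server's response
    # The second request carries all three turns, not just the new question.
    print(chat.next_payload("And the next one?"))
```

On a stateful endpoint the same follow-up would carry only the new user text plus a reference to the earlier exchange, which is why assistant messages are not accepted there.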

Decision guide:

Need MCP tools + stateful chat → /api/v1/chat or /v1/responses
Need custom tool definitions → /v1/responses, /v1/chat/completions, or /v1/messages
Dropping in existing OpenAI SDK code → /v1/chat/completions
Dropping in existing Anthropic SDK code → /v1/messages
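
The decision guide can also be expressed as a lookup over the comparison table. In the sketch below the support sets are transcribed from that table; the feature keys and the `endpoints_supporting` function are hypothetical names chosen for illustration.

```python
# Feature support per endpoint, transcribed from the comparison table.
SUPPORT = {
    "/api/v1/chat": {
        "streaming", "stateful_chat", "remote_mcp", "lmstudio_mcp",
        "model_load_events", "prompt_processing_events", "context_length",
    },
    "/v1/responses": {
        "streaming", "stateful_chat", "remote_mcp", "lmstudio_mcp",
        "custom_tools", "assistant_messages",
    },
    "/v1/chat/completions": {"streaming", "custom_tools", "assistant_messages"},
    "/v1/messages": {"streaming", "custom_tools", "assistant_messages"},
}


def endpoints_supporting(*features):
    """Return every endpoint that offers all of the requested features."""
    needed = set(features)
    known = set().union(*SUPPORT.values())
    unknown = needed - known
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    return [path for path, have in SUPPORT.items() if needed <= have]


if __name__ == "__main__":
    print(endpoints_supporting("stateful_chat", "custom_tools"))  # ['/v1/responses']
    print(endpoints_supporting("context_length"))                 # ['/api/v1/chat']
    print(endpoints_supporting("custom_tools", "context_length"))  # []
```

The empty result in the last call reflects a real gap in the table: no single endpoint offers both custom tools and a per-request context length.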

Key Takeaways

Three features are exclusive to the native /api/v1/chat endpoint: model-load streaming events, prompt-processing events, and per-request context length. It shares stateful chat and MCP support (remote and LM Studio-configured) with /v1/responses.
/v1/responses (OpenAI Responses API compatibility) is the only endpoint that combines stateful chat and MCP support with custom tools.
/v1/chat/completions is the broadest drop-in for existing OpenAI code but loses statefulness and MCP.
/v1/messages lets you redirect the Anthropic SDK to a local model with minimal code change (see wiki/claude-code/lmstudio-anthropic-compat).
Model management endpoints let you fully automate the model lifecycle — download → load → infer → unload — without touching the GUI.
API token auth is available for securing the local server (useful when exposed on a LAN).

Related

wiki/claude-code/lmstudio-anthropic-compat: redirect Claude Code / Anthropic SDK to LM Studio via env vars
wiki/claude-code/lmstudio-chat-completions: OpenAI /v1/chat/completions usage, params, debugging
wiki/claude-code/lmstudio-embeddings: /v1/embeddings for local RAG pipelines
wiki/claude-code/lmstudio-idle-ttl-auto-evict: memory management (TTL and auto-evict)
wiki/agent-sdk/overview: build multi-agent systems that call local models

Sources

raw/LM Studio API.md — clipped from lmstudio.ai/docs/developer/rest
