| title | aliases | tags | sources | created | updated |
|---|---|---|---|---|---|
| LM Studio Responses API | | | | 2026-04-30 | 2026-04-30 |
LM Studio Responses API
LM Studio exposes /v1/responses — an OpenAI Responses API-compatible endpoint with support for streaming, reasoning effort, stateful multi-turn via previous_response_id, and Remote MCP tools.
Base URL: http://localhost:1234/v1/responses
Basic Request (non-streaming)
curl http://localhost:1234/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-oss-20b",
"input": "Provide a prime number less than 50",
"reasoning": { "effort": "low" }
}'
- input: plain string prompt (no messages array required)
- reasoning.effort: "low" | "medium" | "high" (model-dependent)
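The same request can be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names build_request and send are hypothetical, and the response body is assumed to be JSON.

```python
import json
import urllib.request

# Base URL as given on this page; adjust if LM Studio listens elsewhere.
BASE_URL = "http://localhost:1234/v1/responses"


def build_request(model, prompt, effort=None):
    """Build (but do not send) a POST request for /v1/responses.

    build_request is a hypothetical helper, not part of any LM Studio SDK.
    """
    payload = {"model": model, "input": prompt}
    if effort is not None:
        # reasoning.effort is model-dependent, so it is only added on request.
        payload["reasoning"] = {"effort": effort}
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

With a model loaded in LM Studio, send(build_request("openai/gpt-oss-20b", "Provide a prime number less than 50", effort="low")) should correspond to the curl call above.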
Stateful Follow-up
Carry conversation state across calls using previous_response_id:
curl http://localhost:1234/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-oss-20b",
"input": "Multiply it by 2",
"previous_response_id": "resp_123"
}'
- The id field from any prior response becomes the previous_response_id of the next request
- No need to replay the full message history client-side
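The chaining described above can be sketched as a small loop that feeds each response id into the next call. This is an illustration under assumptions: post_json and run_conversation are hypothetical helpers, and the response is assumed to be a JSON object with a top-level id field, as the text describes.

```python
import json
import urllib.request

URL = "http://localhost:1234/v1/responses"


def post_json(payload):
    """POST a JSON payload to /v1/responses and return the parsed body."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def run_conversation(model, prompts, post=post_json):
    """Send prompts in order, linking each call to the previous response id.

    `post` is injectable so the chaining logic can be exercised offline.
    """
    previous_id = None
    responses = []
    for prompt in prompts:
        payload = {"model": model, "input": prompt}
        if previous_id is not None:
            # Only follow-up turns carry previous_response_id.
            payload["previous_response_id"] = previous_id
        response = post(payload)
        previous_id = response["id"]  # assumed top-level "id" field
        responses.append(response)
    return responses
```

Calling run_conversation("openai/gpt-oss-20b", ["Provide a prime number less than 50", "Multiply it by 2"]) would reproduce the two curl calls without the client storing any message history.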
Streaming
curl http://localhost:1234/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-oss-20b",
"input": "Hello",
"stream": true
}'
SSE events emitted:
| Event | Description |
|---|---|
| response.created | Response object initialised |
| response.output_text.delta | Incremental text chunk |
| response.completed | Final event, full response included |
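To consume these events from a script, the stream has to be split into individual SSE events. The sketch below does that for an iterable of text lines. It assumes the usual SSE framing (event: and data: lines, with a blank line ending each event), that each data payload is a JSON object, and that delta events carry their text in a delta field as in the OpenAI Responses API; none of these details is confirmed by this page. The function names are hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            # A blank line terminates the current event.
            payload = json.loads("\n".join(data))
            # Fall back to the payload's "type" if no event: line was sent.
            yield event or payload.get("type"), payload
            event, data = None, []
    if data:
        # Stream ended without a trailing blank line; flush what is buffered.
        payload = json.loads("\n".join(data))
        yield event or payload.get("type"), payload


def collect_text(lines):
    """Concatenate text deltas until response.completed is seen."""
    parts = []
    for name, payload in iter_sse_events(lines):
        if name == "response.output_text.delta":
            parts.append(payload.get("delta", ""))  # assumed field name
        elif name == "response.completed":
            break
    return "".join(parts)
```

When reading from urllib.request.urlopen, the response yields bytes, so each line would need decoding first, for example (line.decode("utf-8") for line in resp).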
Remote MCP Tools (opt-in)
Enable in LM Studio: Developer → Settings → Remote MCP.
curl http://localhost:1234/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "ibm/granite-4-micro",
"input": "What is the top trending model on hugging face?",
"tools": [
{
"type": "mcp",
"server_label": "huggingface",
"server_url": "https://huggingface.co/mcp",
"allowed_tools": ["model_search"]
}
]
}'
- server_label: arbitrary identifier for this MCP server
- server_url: remote MCP server URL
- allowed_tools: allowlist of tool names the model may call
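When several MCP servers are attached from code, building the tools array with a small helper keeps the three fields consistent. This is a sketch only: mcp_tool and build_payload are hypothetical names, and the payload shape simply mirrors the curl example above.

```python
import json


def mcp_tool(server_label, server_url, allowed_tools):
    """Return one entry for the "tools" array of a /v1/responses request."""
    return {
        "type": "mcp",
        "server_label": server_label,          # arbitrary identifier
        "server_url": server_url,              # remote MCP server URL
        "allowed_tools": list(allowed_tools),  # tool names the model may call
    }


def build_payload(model, prompt, tools):
    """Assemble the request body; sending it is left to the caller."""
    return {"model": model, "input": prompt, "tools": list(tools)}


payload = build_payload(
    "ibm/granite-4-micro",
    "What is the top trending model on hugging face?",
    [mcp_tool("huggingface", "https://huggingface.co/mcp", ["model_search"])],
)
print(json.dumps(payload, indent=2))
```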
Key Takeaways
- /v1/responses is an OpenAI Responses API drop-in; swap base URL only
- previous_response_id enables multi-turn without replaying history, which is simpler than maintaining a messages array
- Streaming uses standard SSE; listen for response.output_text.delta for incremental chunks
- Remote MCP tools are per-request and opt-in; the feature must be enabled in LM Studio settings first
- reasoning.effort controls thinking depth; not all models support it
Related
- wiki/claude-code/lmstudio-openai-compat-endpoints: overview of all 5 OAI-compatible endpoints
- wiki/claude-code/lmstudio-chat-completions: /v1/chat/completions with full param reference
- wiki/claude-code/lmstudio-messages-api: /v1/messages Anthropic-compat with streaming + tool-use
- wiki/claude-code/lmstudio-rest-api: native endpoint feature comparison table
- wiki/claude-code/mcp-integration: Claude Code MCP setup and server patterns
Sources
- raw/Responses.md: LM Studio developer docs for the /v1/responses endpoint