LM Studio — MCP via API

Requires LM Studio 0.4.0+. MCP servers provide tools that models can call during chat requests via /api/v1/chat.

Two Server Modes

Feature	Ephemeral	mcp.json
Specified via	`integrations` → `"type": "ephemeral_mcp"`	`integrations` → `"type": "plugin"`
Config	Per-request only	Pre-configured in `mcp.json`
Use case	One-off / remote tools	Frequent use, tools needing `command` (local processes)
Server ID	`server_label` in integration	`id` (e.g. `mcp/playwright`)
Custom headers	`headers` field	Configured in `mcp.json`

Ephemeral MCP Servers

Defined inline per-request — no pre-configuration needed.

curl http://localhost:1234/api/v1/chat \
  -H "Authorization: Bearer $LM_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ibm/granite-4-micro",
    "input": "What is the top trending model on hugging face?",
    "integrations": [
      {
        "type": "ephemeral_mcp",
        "server_label": "huggingface",
        "server_url": "https://huggingface.co/mcp",
        "allowed_tools": ["model_search"]
      }
    ],
    "context_length": 8000
  }'

Response output contains typed entries: reasoning, message, and tool_call objects. Each tool_call includes the tool name, arguments, output, and provider_info identifying the server.
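As a rough illustration of reading that response, the sketch below filters the tool_call entries out of a decoded reply. The key names used here (a top-level "output" list, a "type" discriminator, "tool" for the tool name, "content" for text) and the sample values are assumptions based on the description above, not captured server output, so check them against a real response before relying on them.

```python
from typing import Any


def extract_tool_calls(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the tool_call entries of a chat response, in their original order."""
    return [
        entry
        for entry in response.get("output", [])
        if entry.get("type") == "tool_call"
    ]


# Hypothetical response shaped after the prose description (illustrative values only).
sample_response = {
    "output": [
        {"type": "reasoning", "content": "I should search for models."},
        {
            "type": "tool_call",
            "tool": "model_search",           # assumed key for the tool name
            "arguments": {"limit": 1},        # hypothetical arguments
            "output": "...",
            "provider_info": {"type": "ephemeral_mcp", "server_label": "huggingface"},
        },
        {"type": "message", "content": "..."},
    ]
}

for call in extract_tool_calls(sample_response):
    # provider_info identifies which server handled the call
    print(call["tool"], call["provider_info"]["type"])
```

Keeping the filter separate from the HTTP call makes it easy to reuse on responses from either server mode.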

mcp.json Pre-configured Servers

Recommended for servers that run local commands (e.g. microsoft/playwright-mcp) or are used frequently.

curl http://localhost:1234/api/v1/chat \
  -H "Authorization: Bearer $LM_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ibm/granite-4-micro",
    "input": "Open lmstudio.ai",
    "integrations": ["mcp/playwright"],
    "context_length": 8000,
    "temperature": 0
  }'

When referencing pre-configured servers, integrations can be a plain array of server ID strings.
In the response, provider_info.type is "plugin" for these servers (versus "ephemeral_mcp" for inline ones).
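The same request can be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names build_chat_payload and post_chat are hypothetical, and the base URL and LM_API_TOKEN variable simply mirror the curl example above.

```python
import json
import os
import urllib.request
from typing import Any


def build_chat_payload(model: str, prompt: str, integrations: list, **options: Any) -> dict:
    """Assemble the JSON body for POST /api/v1/chat."""
    return {"model": model, "input": prompt, "integrations": integrations, **options}


def post_chat(payload: dict, base_url: str = "http://localhost:1234") -> dict:
    """Send the payload to a running LM Studio server and decode the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/api/v1/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Token is read at call time, as in the curl example
            "Authorization": f"Bearer {os.environ['LM_API_TOKEN']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Pre-configured servers are referenced by ID in a plain string array.
payload = build_chat_payload(
    "ibm/granite-4-micro",
    "Open lmstudio.ai",
    ["mcp/playwright"],
    context_length=8000,
    temperature=0,
)
print(json.dumps(payload, indent=2))
# With a server running and LM_API_TOKEN set: result = post_chat(payload)
```

Only the payload is built when the script runs; the network call is left commented out so the sketch can be inspected without a server.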

Restricting Tool Access

Use allowed_tools on either integration type:

"allowed_tools": ["model_search"]

Limits which tools the model can call from that server.
Speeds up prompt processing, because fewer tool definitions are placed in context.
If omitted, all of the server's tools are available.
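One way to apply the restriction uniformly is a small helper that accepts either integration form. This is a sketch, not part of LM Studio: restrict_tools is a hypothetical name, and it assumes a string ID has to be expanded into the object form ("type": "plugin" plus "id", as listed in the comparison table) before allowed_tools can be attached. The tool name browser_navigate in the usage lines is likewise only an example.

```python
from typing import Union


def restrict_tools(integration: Union[str, dict], tools: list[str]) -> dict:
    """Return a copy of an integration limited to the named tools."""
    if isinstance(integration, str):
        # The string shorthand cannot carry options, so expand it to object form.
        restricted = {"type": "plugin", "id": integration}
    else:
        # Copy so the caller's dict is not mutated.
        restricted = dict(integration)
    restricted["allowed_tools"] = list(tools)
    return restricted


# Pre-configured server referenced by ID (tool name is an example).
plugin = restrict_tools("mcp/playwright", ["browser_navigate"])

# Inline ephemeral server.
ephemeral = restrict_tools(
    {
        "type": "ephemeral_mcp",
        "server_label": "huggingface",
        "server_url": "https://huggingface.co/mcp",
    },
    ["model_search"],
)
print(plugin)
print(ephemeral)
```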

Custom Headers (Ephemeral)

For authenticated remote MCP endpoints:

{
  "type": "ephemeral_mcp",
  "server_label": "huggingface",
  "server_url": "https://huggingface.co/mcp",
  "allowed_tools": ["model_search"],
  "headers": {
    "Authorization": "Bearer <YOUR_HF_TOKEN>"
  }
}
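To keep the token out of source files, the integration object can be built at request time. The sketch below assumes the token lives in an environment variable named HF_TOKEN (an illustrative choice) and uses a hypothetical helper, ephemeral_integration; the headers field is omitted entirely when no token is available.

```python
import os
from typing import Optional


def ephemeral_integration(
    label: str,
    url: str,
    token: Optional[str] = None,
    allowed_tools: Optional[list[str]] = None,
) -> dict:
    """Build an ephemeral_mcp integration, adding auth headers only when a token is given."""
    integration: dict = {
        "type": "ephemeral_mcp",
        "server_label": label,
        "server_url": url,
    }
    if allowed_tools is not None:
        integration["allowed_tools"] = list(allowed_tools)
    if token:
        integration["headers"] = {"Authorization": f"Bearer {token}"}
    return integration


hf = ephemeral_integration(
    "huggingface",
    "https://huggingface.co/mcp",
    token=os.environ.get("HF_TOKEN"),  # assumed variable name; None if unset
    allowed_tools=["model_search"],
)
print(sorted(hf))
```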

Key Takeaways

LM Studio exposes MCP tool calling through its native /api/v1/chat endpoint (not the OpenAI-compatible route)
Two modes: ephemeral (inline, per-request) vs mcp.json (pre-configured, recommended for local/frequent servers)
allowed_tools works on both modes — use it to reduce context size and restrict scope
Tool call results appear inline in the output array alongside reasoning and message entries
Auth headers for remote MCP servers go in the headers field on ephemeral integrations
The /v1/responses endpoint (see wiki/claude-code/lmstudio-responses-api) also supports Remote MCP via tools: a different endpoint, but the same concept

Related

wiki/claude-code/lmstudio-responses-api — /v1/responses endpoint also supports Remote MCP tools
wiki/claude-code/lmstudio-tool-use — function calling (non-MCP) patterns
wiki/claude-code/lmstudio-server-settings — toggle per-request MCPs and mcp.json access in the UI
wiki/claude-code/mcp-integration — MCP concepts: transports, scopes, OAuth

Sources

raw/Using MCP via API.md — LM Studio docs, 2026-04-30
