obsidian/wiki/claude-code/lmstudio-mcp-via-api.md
2026-04-30 14:42:43 +01:00

title: LM Studio — MCP via API
aliases: lmstudio-mcp-api, mcp-lmstudio, lm-studio-mcp
tags: lmstudio, mcp, api, tool-use, integration
sources: raw/Using MCP via API.md
created: 2026-04-30
updated: 2026-04-30

LM Studio — MCP via API

Requires LM Studio 0.4.0+. MCP servers provide tools that models can call during chat requests via /api/v1/chat.

Two Server Modes

| Feature | Ephemeral | mcp.json |
| --- | --- | --- |
| Specified via | integrations entry with "type": "ephemeral_mcp" | integrations entry with "type": "plugin" |
| Config | Per-request only | Pre-configured in mcp.json |
| Use case | One-off / remote tools | Frequent use, tools needing command (local processes) |
| Server ID | server_label in integration | id (e.g. mcp/playwright) |
| Custom headers | headers field | Configured in mcp.json |

Ephemeral MCP Servers

Defined inline per-request — no pre-configuration needed.

curl http://localhost:1234/api/v1/chat \
  -H "Authorization: Bearer $LM_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ibm/granite-4-micro",
    "input": "What is the top trending model on hugging face?",
    "integrations": [
      {
        "type": "ephemeral_mcp",
        "server_label": "huggingface",
        "server_url": "https://huggingface.co/mcp",
        "allowed_tools": ["model_search"]
      }
    ],
    "context_length": 8000
  }'

The response's output array contains typed entries: reasoning, message, and tool_call objects. Each tool_call includes the tool name, arguments, output, and a provider_info object identifying the server.
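To make the shape of the output array concrete, here is a minimal Python sketch that groups entries by their type field and prints each tool call. The sample response is hypothetical and trimmed: only type and provider_info are named in this note, so the other key names (tool, arguments, output, content, server_label inside provider_info) are assumptions that should be checked against a real response.

```python
from collections import defaultdict


def group_output_by_type(response: dict) -> dict:
    """Group the entries of response["output"] by their "type" field."""
    grouped = defaultdict(list)
    for entry in response.get("output", []):
        grouped[entry.get("type", "unknown")].append(entry)
    return dict(grouped)


# Hypothetical, trimmed response. Key names other than "type" and
# "provider_info" are assumptions, not confirmed by this note.
sample_response = {
    "output": [
        {"type": "reasoning", "content": "(model reasoning text)"},
        {
            "type": "tool_call",
            "tool": "model_search",
            "arguments": {"query": "(search arguments)"},
            "output": "(tool result text)",
            "provider_info": {"type": "ephemeral_mcp", "server_label": "huggingface"},
        },
        {"type": "message", "content": "(final answer text)"},
    ]
}

grouped = group_output_by_type(sample_response)
for call in grouped.get("tool_call", []):
    # provider_info.type tells you which integration mode served the call.
    print(call["tool"], call["arguments"], call["provider_info"]["type"])
```

Grouping by type keeps the client code indifferent to the order in which reasoning, tool calls, and messages are interleaved.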

mcp.json Pre-configured Servers

Recommended for servers that run local commands (e.g. microsoft/playwright-mcp) or are used frequently.

curl http://localhost:1234/api/v1/chat \
  -H "Authorization: Bearer $LM_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ibm/granite-4-micro",
    "input": "Open lmstudio.ai",
    "integrations": ["mcp/playwright"],
    "context_length": 8000,
    "temperature": 0
  }'
  • integrations can be a plain string array when referencing pre-configured servers
  • provider_info.type will be "plugin" (vs "ephemeral_mcp" for inline)
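The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above; the helper name build_chat_request is hypothetical, and the default base URL and the LM_API_TOKEN environment variable are taken from the curl example. It builds the request without sending it, since sending requires a running LM Studio server.

```python
import json
import os
import urllib.request


def build_chat_request(payload, base_url="http://localhost:1234", token=None):
    """Build (but do not send) a POST request for /api/v1/chat."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{base_url}/api/v1/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


payload = {
    "model": "ibm/granite-4-micro",
    "input": "Open lmstudio.ai",
    # Plain strings reference servers pre-configured in mcp.json.
    "integrations": ["mcp/playwright"],
    "context_length": 8000,
    "temperature": 0,
}
request = build_chat_request(payload, token=os.environ.get("LM_API_TOKEN"))

# Sending needs a live server, so it is left commented out:
# with urllib.request.urlopen(request) as resp:
#     body = json.load(resp)
```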

Restricting Tool Access

Use allowed_tools on either integration type:

"allowed_tools": ["model_search"]
  • Limits which tools the model can call from that server
  • Speeds up prompt processing — fewer tool definitions in context
  • If omitted, all server tools are available
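A plain string such as "mcp/playwright" has nowhere to carry allowed_tools, so restricting a pre-configured server presumably needs an object form. The sketch below assumes that form is {"type": "plugin", "id": ..., "allowed_tools": [...]}, inferred from the type and id fields listed in the Two Server Modes comparison; this note does not show it directly. The helper name plugin_integration and the tool name browser_navigate are illustrative assumptions.

```python
def plugin_integration(plugin_id, allowed_tools=None):
    """Return a plain string when unrestricted, an object when tools are limited.

    The object shape is an assumption inferred from the mode comparison
    (type "plugin", id), not a confirmed schema.
    """
    if allowed_tools is None:
        return plugin_id
    return {
        "type": "plugin",
        "id": plugin_id,
        "allowed_tools": list(allowed_tools),
    }


# "browser_navigate" is an assumed tool name used only for illustration.
integrations = [plugin_integration("mcp/playwright", ["browser_navigate"])]
print(integrations)
```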

Custom Headers (Ephemeral)

For authenticated remote MCP endpoints:

{
  "type": "ephemeral_mcp",
  "server_label": "huggingface",
  "server_url": "https://huggingface.co/mcp",
  "allowed_tools": ["model_search"],
  "headers": {
    "Authorization": "Bearer <YOUR_HF_TOKEN>"
  }
}
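To avoid hard-coding the token, the integration object can be assembled in code. This is a minimal sketch: the helper name ephemeral_integration is hypothetical, and reading the token from an HF_TOKEN environment variable is an assumption made for illustration. The field names match the JSON above.

```python
import os


def ephemeral_integration(server_label, server_url, allowed_tools=None, bearer_token=None):
    """Build an ephemeral_mcp integration, adding optional fields only when given."""
    integration = {
        "type": "ephemeral_mcp",
        "server_label": server_label,
        "server_url": server_url,
    }
    if allowed_tools is not None:
        integration["allowed_tools"] = list(allowed_tools)
    if bearer_token:
        # Sent to the remote MCP server, not to LM Studio itself.
        integration["headers"] = {"Authorization": f"Bearer {bearer_token}"}
    return integration


# HF_TOKEN is an assumed variable name; if it is unset, no headers are added.
hf = ephemeral_integration(
    "huggingface",
    "https://huggingface.co/mcp",
    allowed_tools=["model_search"],
    bearer_token=os.environ.get("HF_TOKEN"),
)
```

The resulting dictionary can be placed directly in the integrations array of a /api/v1/chat request body.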

Key Takeaways

  • LM Studio exposes MCP tool calling through its native /api/v1/chat endpoint (not the OpenAI-compat route)
  • Two modes: ephemeral (inline, per-request) vs mcp.json (pre-configured, recommended for local/frequent servers)
  • allowed_tools works on both modes — use it to reduce context size and restrict scope
  • Tool call results appear inline in the output array alongside reasoning and message entries
  • Auth headers for remote MCP servers go in the headers field on ephemeral integrations
  • The Responses API (see wiki/claude-code/lmstudio-responses-api) also supports remote MCP servers via its tools field: a different endpoint, but the same concept

Sources

  • raw/Using MCP via API.md — LM Studio docs, 2026-04-30