obsidian/wiki/claude-code/lmstudio-chat-completions.md
2026-04-30 14:27:25 +01:00

title: LM Studio — OpenAI Chat Completions Endpoint
aliases: lmstudio-openai-chat, lm-studio-chat-completions
tags: lmstudio, openai-compat, local-llm, api, chat
sources: raw/Chat Completions.md
created: 2026-04-30
updated: 2026-04-30

LM Studio — OpenAI Chat Completions Endpoint

LM Studio exposes an OpenAI-compatible POST /v1/chat/completions endpoint. Any client built for OpenAI can point at LM Studio with two changes: base_url and api_key.

Endpoint

Field    Value
Method   POST
URL      http://localhost:1234/v1/chat/completions
Auth     any non-empty string (e.g. "lm-studio")

  • Prompt template is applied automatically for chat-tuned models
  • Stream with stream: true for token-by-token output
  • Inspect actual model input with lms log stream in a second terminal
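
The endpoint fields above can also be exercised without the OpenAI client. The following is a minimal sketch using only the Python standard library. It assumes the key is sent as an Authorization: Bearer header, which is the convention the OpenAI client follows; build_request and send are hypothetical helper names, not part of LM Studio.

```python
import json
import urllib.request

# Values taken from the endpoint table; adjust if your server differs.
URL = "http://localhost:1234/v1/chat/completions"
API_KEY = "lm-studio"  # placeholder string


def build_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def send(payload: dict) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(payload)) as response:
        return json.load(response)
```

With LM Studio's server running, send({"model": "model-identifier", "messages": [{"role": "user", "content": "Hi"}]}) should return the completion as a dict; nothing is sent until send is called.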

Python Example

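from openai import OpenAI

# Point the standard OpenAI client at the local LM Studio server.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
    model="model-identifier",  # placeholder, see the note below
    messages=[
        {"role": "system", "content": "Always answer in rhymes."},
        {"role": "user", "content": "Introduce yourself."},
    ],
    temperature=0.7,
)

# Prints the full message object; use .message.content for the text only.
print(completion.choices[0].message)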

Replace "model-identifier" with the exact model name shown in LM Studio's UI.
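
For streamed output (stream: true), the client yields chunks instead of one completion. The sketch below shows one way to join the text deltas. collect_stream and fake_chunk are hypothetical names, and the stand-in chunks only imitate the choices[0].delta.content shape that the OpenAI client uses for streamed chunks.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Concatenate the text deltas from a streamed chat completion."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta
        if delta.content:  # content can be None, e.g. on the final chunk
            parts.append(delta.content)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in for a streamed chunk, for illustration only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hel"), fake_chunk("lo"), fake_chunk(None)]
print(collect_stream(demo))  # prints "Hello"
```

Against a running server, the same function should accept the object returned by client.chat.completions.create(..., stream=True). To show tokens as they arrive, print each delta.content inside the loop instead of collecting it.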

Supported Payload Parameters

model              top_p          top_k
messages           temperature    max_tokens
stream             stop           presence_penalty
frequency_penalty  logit_bias     repeat_penalty
seed

top_k and repeat_penalty are LM Studio extensions not in the OpenAI spec — they work here but not against the real OpenAI API.
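
Because top_k and repeat_penalty are not OpenAI parameters, the OpenAI Python client does not accept them as regular keyword arguments to create. One option, assuming the v1.x client, is to pass them through its extra_body argument. The helper below is a hypothetical sketch that separates the two groups using the parameter list above.

```python
# Parameter names from the supported list, split by whether they are in the OpenAI spec.
LMSTUDIO_EXTENSIONS = {"top_k", "repeat_penalty"}
OPENAI_PARAMS = {
    "model", "messages", "temperature", "top_p", "max_tokens", "stream",
    "stop", "presence_penalty", "frequency_penalty", "logit_bias", "seed",
}


def split_params(params: dict) -> tuple[dict, dict]:
    """Return (standard kwargs, extra_body) for chat.completions.create."""
    unknown = set(params) - OPENAI_PARAMS - LMSTUDIO_EXTENSIONS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    standard = {k: v for k, v in params.items() if k in OPENAI_PARAMS}
    extra = {k: v for k, v in params.items() if k in LMSTUDIO_EXTENSIONS}
    return standard, extra
```

A call would then look like: standard, extra = split_params({...}) followed by client.chat.completions.create(**standard, extra_body=extra). The values 40 for top_k or 1.1 for repeat_penalty would be typical illustrative settings, not recommendations from the source.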

Debugging

lms log stream   # live view of what the model actually receives

Key Takeaways

  • Drop-in replacement for openai.chat.completions.create: change base_url and pass a placeholder api_key
  • api_key value is ignored; pass any non-empty string
  • Chat-tuned models get their prompt template applied automatically
  • top_k and repeat_penalty are bonus params unavailable on real OpenAI
  • Use lms log stream to verify the exact prompt being sent to the model

Sources