| title | aliases | tags | sources | created | updated |
| --- | --- | --- | --- | --- | --- |
| LM Studio Structured Output | lmstudio-json-schema, structured-output-lmstudio | lmstudio, structured-output, json-schema, openai-compat, local-llm | | 2026-04-30 | 2026-04-30 |
LM Studio Structured Output
Enforce a specific JSON shape on LLM responses by passing a JSON schema to /v1/chat/completions. Compatible with OpenAI's Structured Output API format.
How It Works
- Add a response_format field to the chat completions request
- Provide a json_schema with a name, optional strict, and a schema object
- The model is constrained to return valid JSON matching that schema
- Response arrives as a string in choices[0].message.content; parse it with json.loads()
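The steps above can be sketched end to end without a running server. The helper names (`make_response_format`, `parse_structured`) are hypothetical, and the canned response is a hand-written stand-in that mimics only the fields used here.

```python
import json
from typing import Any, Dict, Optional


def make_response_format(name: str, schema: Dict[str, Any],
                         strict: Optional[bool] = None) -> Dict[str, Any]:
    """Build the response_format field for a chat completions request."""
    json_schema: Dict[str, Any] = {"name": name, "schema": schema}
    if strict is not None:  # strict is optional, so include it only when set
        json_schema["strict"] = strict
    return {"type": "json_schema", "json_schema": json_schema}


def parse_structured(response: Dict[str, Any]) -> Any:
    """Decode the JSON string found in choices[0].message.content."""
    return json.loads(response["choices"][0]["message"]["content"])


response_format = make_response_format(
    "joke_response",
    {
        "type": "object",
        "properties": {"joke": {"type": "string"}},
        "required": ["joke"],
    },
    strict=True,
)

# Canned stand-in for a server reply (hypothetical content, not real output).
fake_response = {
    "choices": [
        {"message": {"role": "assistant", "content": '{"joke": "placeholder"}'}}
    ]
}
parsed = parse_structured(fake_response)
print(parsed["joke"])
```

Keeping the builder separate from the parser makes it easy to reuse the same schema across requests.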
Server Setup
lms server start
# or enable from Developer tab in LM Studio UI
Install the CLI first if needed:
npx lmstudio install-cli
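Once the server is running, one way to confirm it is reachable is to list models through the OpenAI-compatible GET /v1/models endpoint. A minimal sketch, assuming the default address http://localhost:1234 used elsewhere in this note; `models_url`, `model_ids`, and `list_models` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any, Dict, List


def models_url(base_url: str) -> str:
    """Join the base URL (with or without a trailing slash) and /v1/models."""
    return base_url.rstrip("/") + "/v1/models"


def model_ids(payload: Dict[str, Any]) -> List[str]:
    """Pull the model identifiers out of an OpenAI-style model list."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url: str = "http://localhost:1234",
                timeout: float = 5.0) -> List[str]:
    """Query the server; raises urllib.error.URLError if it is unreachable."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        return model_ids(json.load(resp))


# Requires a running server, so the call is left commented out:
# print(list_models())
```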
response_format Shape
"response_format": {
  "type": "json_schema",
  "json_schema": {
    "name": "my_schema",
    "strict": true,
    "schema": {
      "type": "object",
      "properties": {
        "field": { "type": "string" }
      },
      "required": ["field"]
    }
  }
}
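Even with schema enforcement, it can be useful to sanity-check the parsed result on the client side. Below is a deliberately small sketch that checks only `type`, `required`, `properties`, and `items` for a handful of types; it is not a full JSON Schema validator, and `check` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Only the JSON types this sketch knows how to check.
_TYPES = {"object": dict, "array": list, "string": str, "boolean": bool}


def check(value: Any, schema: Dict[str, Any], path: str = "$") -> List[str]:
    """Return a list of problems; an empty list means none were found."""
    expected = _TYPES.get(schema.get("type"))
    if expected is not None and not isinstance(value, expected):
        return [f"{path}: expected {schema['type']}"]
    problems: List[str] = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                problems.append(f"{path}.{key}: missing required field")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                problems.extend(check(value[key], sub, f"{path}.{key}"))
    elif isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            problems.extend(check(item, schema["items"], f"{path}[{i}]"))
    return problems


schema = {
    "type": "object",
    "properties": {"field": {"type": "string"}},
    "required": ["field"],
}
print(check({"field": "hello"}, schema))
print(check({"field": 3}, schema))
print(check({}, schema))
```

For anything beyond a quick check, a dedicated JSON Schema validation library is the better choice.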
cURL Example
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "{{model}}",
    "messages": [
      {"role": "system", "content": "You are a helpful jokester."},
      {"role": "user", "content": "Tell me a joke."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "joke_response",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": { "joke": {"type": "string"} },
          "required": ["joke"]
        }
      }
    },
    "temperature": 0.7,
    "max_tokens": 50,
    "stream": false
  }'
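The same request can be built with Python's standard library alone. This sketch mirrors the cURL call above; `build_request` is a hypothetical helper, the model name is a placeholder, and the network call is left commented out because it needs a running server.

```python
import json
import urllib.request
from typing import Any, Dict


def build_request(model: str,
                  base_url: str = "http://localhost:1234/v1") -> urllib.request.Request:
    """Assemble a POST request equivalent to the cURL example."""
    payload: Dict[str, Any] = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful jokester."},
            {"role": "user", "content": "Tell me a joke."},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "joke_response",
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {"joke": {"type": "string"}},
                    "required": ["joke"],
                },
            },
        },
        "temperature": 0.7,
        "max_tokens": 50,
        "stream": False,
    }
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request("your-model")  # placeholder model name
print(req.get_method(), req.full_url)

# Sending it requires a running server:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(json.loads(body["choices"][0]["message"]["content"])["joke"])
```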
Python Example
from openai import OpenAI
import json

# Point the OpenAI client at the local LM Studio server.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# The value passed as response_format: a schema for a list of characters.
character_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "characters",
        "schema": {
            "type": "object",
            "properties": {
                "characters": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "occupation": {"type": "string"},
                            "personality": {"type": "string"},
                            "background": {"type": "string"}
                        },
                        "required": ["name", "occupation", "personality", "background"]
                    },
                    "minItems": 1
                }
            },
            "required": ["characters"]
        }
    }
}

response = client.chat.completions.create(
    model="your-model",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Create 1-3 fictional characters"}
    ],
    response_format=character_schema,
)

# The content is a JSON string, so decode it before use.
results = json.loads(response.choices[0].message.content)
print(json.dumps(results, indent=2))
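If generation stops early, for example because max_tokens is reached, the content may be cut off mid-object and json.loads() will raise. A defensive sketch, assuming the server reports `finish_reason` the way OpenAI-compatible APIs generally do (`"length"` on truncation); `parse_content` is a hypothetical helper, and the inputs are canned strings rather than real model output.

```python
import json
from typing import Any, Optional


def parse_content(content: Optional[str], finish_reason: Optional[str]) -> Any:
    """Decode structured output, raising ValueError with a clearer message."""
    if finish_reason == "length":
        raise ValueError("output truncated: raise max_tokens and retry")
    if not content:
        raise ValueError("empty content")
    try:
        return json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"content is not valid JSON: {exc.msg}") from exc


# With the OpenAI SDK this would be called roughly as (assumption):
#   choice = response.choices[0]
#   parse_content(choice.message.content, choice.finish_reason)
print(parse_content('{"characters": [{"name": "Ada"}]}', "stop"))

try:
    parse_content('{"characters": [{"na', "length")
except ValueError as err:
    print(err)
```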
Structured Output Engines
- GGUF models: llama.cpp's grammar-based sampling
- MLX models: Outlines
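To give a feel for what constraining tokens at generation time means, here is a toy illustration. It is not how the GGUF or MLX engines work internally; the vocabulary, the scores, and the set of valid outputs are all invented for the example. At each step, tokens that cannot lead to a valid output are masked out before the best-scoring remaining token is picked.

```python
import json
from typing import Dict, List, Set

# Invented toy vocabulary and fixed "model" scores; the model prefers chatter.
VOCAB: List[str] = ["Sure!", "{", '"ok"', ":", "true", "false", "}"]
SCORES: Dict[str, float] = {
    "Sure!": 0.90, "{": 0.05, '"ok"': 0.01, ":": 0.01,
    "true": 0.02, "false": 0.01, "}": 0.00,
}
# Stand-in for a grammar: the complete set of strings we accept.
VALID: Set[str] = {'{"ok":true}', '{"ok":false}'}


def constrained_greedy(max_steps: int = 16) -> str:
    """Greedy decoding where only tokens that keep a valid prefix are allowed."""
    text = ""
    for _ in range(max_steps):
        if text in VALID:
            return text
        allowed = [t for t in VOCAB
                   if any(v.startswith(text + t) for v in VALID)]
        if not allowed:
            raise ValueError("no token keeps the output valid")
        text += max(allowed, key=lambda t: SCORES[t])
    raise ValueError("did not reach a complete output")


result = constrained_greedy()
print(result)
print(json.loads(result))
```

Unconstrained greedy decoding would start with "Sure!" here; the mask is what forces the first token to be "{".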
Key Takeaways
- Use response_format.type = "json_schema", the same shape as OpenAI's Structured Outputs API
- Works with any OpenAI-compatible client SDK (Python, TS, etc.) just by pointing base_url at localhost
- The response is always a string in choices[0].message.content, so always call json.loads() on it
- Not all models support this: models below 7B parameters often cannot do structured output, so check the model card
- GGUF uses grammar sampling; MLX uses Outlines. Both constrain tokens at generation time, not post hoc
- All standard /v1/chat/completions params (temperature, max_tokens, stream, etc.) still apply
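With stream enabled, the JSON string arrives in fragments and can only be parsed once it is complete. A sketch of accumulating the pieces, assuming chunks follow the OpenAI streaming shape (`choices[0].delta.content`); the chunks below are hand-written dicts standing in for a real stream, and `collect_stream` is a hypothetical helper.

```python
import json
from typing import Any, Dict, Iterable, List


def collect_stream(chunks: Iterable[Dict[str, Any]]) -> Any:
    """Concatenate streamed content deltas, then decode the full JSON string."""
    parts: List[str] = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # skip chunks that carry no choices
        piece = choices[0].get("delta", {}).get("content")
        if piece:
            parts.append(piece)
    return json.loads("".join(parts))


# Hand-written stand-ins for streamed chunks (not real server output).
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant", "content": ""}}]},
    {"choices": [{"delta": {"content": '{"joke": "place'}}]},
    {"choices": [{"delta": {"content": 'holder"}'}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
print(collect_stream(fake_chunks))
```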
Related