obsidian/wiki/claude-code/lmstudio-structured-output.md
2026-04-30 14:42:43 +01:00

---
title: "LM Studio Structured Output"
aliases: [lmstudio-json-schema, structured-output-lmstudio]
tags: [lmstudio, structured-output, json-schema, openai-compat, local-llm]
sources: [raw/Structured Output.md]
created: 2026-04-30
updated: 2026-04-30
---
# LM Studio Structured Output
Enforce a specific JSON shape on LLM responses by passing a JSON schema to `/v1/chat/completions`. The request format is compatible with OpenAI's Structured Outputs API.
## How It Works
- Add a `response_format` field with `type` set to `json_schema` to the chat completions request
- Provide a `json_schema` object with a `name`, an optional `strict` flag, and a `schema` object
- The model is constrained to return valid JSON matching that schema
- The response arrives as a string in `choices[0].message.content`; parse it, for example with `json.loads()` in Python
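The last step above can be sketched without a running server. The helper name `parse_structured_reply` and the sample body are hypothetical, and the `finish_reason` check is an assumption carried over from OpenAI's response format: a reply cut off by `max_tokens` may not be complete JSON.

```python
import json


def parse_structured_reply(response: dict) -> dict:
    """Extract and decode the JSON string in choices[0].message.content."""
    choice = response["choices"][0]
    # Assumption: as in OpenAI's format, "length" means max_tokens cut the reply short.
    if choice.get("finish_reason") == "length":
        raise ValueError("reply was truncated; raise max_tokens and retry")
    return json.loads(choice["message"]["content"])


# A hand-written response body shaped like a chat completion (illustrative data).
sample = {
    "choices": [
        {
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": '{"field": "hello"}'},
        }
    ]
}
print(parse_structured_reply(sample))  # {'field': 'hello'}
```

Checking for truncation before decoding gives a clearer error than a bare `json.JSONDecodeError` from a half-finished string.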
## Server Setup
```bash
lms server start
# or enable the server from the Developer tab in the LM Studio UI
```
Install the CLI first if needed:
```bash
npx lmstudio install-cli
```
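Before sending schema requests it can help to confirm the server is listening. The sketch below assumes the server exposes the OpenAI-style `GET /v1/models` endpoint on the default port used in this note, and that the body has the usual `{"data": [{"id": ...}]}` list shape. The function names are hypothetical.

```python
import json
import urllib.request


def model_ids(models_payload: dict) -> list[str]:
    """Pull model identifiers out of a /v1/models response body."""
    return [entry["id"] for entry in models_payload.get("data", [])]


def list_models(base_url: str = "http://localhost:1234/v1") -> list[str]:
    """Query the running server; raises URLError if nothing is listening."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=5) as resp:
        return model_ids(json.load(resp))
```

Calling `list_models()` with the server running should return the identifiers that can be passed as `model` in the requests below.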
## response_format Shape
```json
"response_format": {
  "type": "json_schema",
  "json_schema": {
    "name": "my_schema",
    "strict": true,
    "schema": {
      "type": "object",
      "properties": {
        "field": { "type": "string" }
      },
      "required": ["field"]
    }
  }
}
```
## cURL Example
```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "{{model}}",
    "messages": [
      {"role": "system", "content": "You are a helpful jokester."},
      {"role": "user", "content": "Tell me a joke."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "joke_response",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": { "joke": {"type": "string"} },
          "required": ["joke"]
        }
      }
    },
    "temperature": 0.7,
    "max_tokens": 50,
    "stream": false
  }'
```
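The same request can be built with the Python standard library alone, which is sometimes handy when no SDK is installed. This is a minimal sketch: `build_request` is a hypothetical helper, `"your-model"` is a placeholder, and the request is only constructed here, not sent.

```python
import json
import urllib.request

payload = {
    "model": "your-model",  # placeholder; use an identifier your server reports
    "messages": [
        {"role": "system", "content": "You are a helpful jokester."},
        {"role": "user", "content": "Tell me a joke."},
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "joke_response",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"joke": {"type": "string"}},
                "required": ["joke"],
            },
        },
    },
    "temperature": 0.7,
    "max_tokens": 50,
    "stream": False,
}


def build_request(body: dict, base_url: str = "http://localhost:1234/v1") -> urllib.request.Request:
    """Wrap the payload in a POST request for the chat completions endpoint."""
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(payload)
```

To send it, pass `request` to `urllib.request.urlopen()` and decode the body with `json.load()`.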
## Python Example
```python
from openai import OpenAI
import json

# Point the OpenAI client at the local LM Studio server; the api_key is a placeholder.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# response_format payload: an object holding an array of 1+ character records.
character_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "characters",
        "schema": {
            "type": "object",
            "properties": {
                "characters": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "occupation": {"type": "string"},
                            "personality": {"type": "string"},
                            "background": {"type": "string"}
                        },
                        "required": ["name", "occupation", "personality", "background"]
                    },
                    "minItems": 1
                }
            },
            "required": ["characters"]
        }
    }
}

response = client.chat.completions.create(
    model="your-model",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Create 1-3 fictional characters"}
    ],
    response_format=character_schema,
)

# The content is a JSON string; decode it before use.
results = json.loads(response.choices[0].message.content)
print(json.dumps(results, indent=2))
```
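Once the content is decoded, the records can be mapped onto typed objects. The sketch below is one possible follow-up, not part of the example above: the `Character` dataclass and `to_characters` helper are hypothetical names, and the content string is illustrative rather than real model output.

```python
import json
from dataclasses import dataclass


@dataclass
class Character:
    name: str
    occupation: str
    personality: str
    background: str


def to_characters(content: str) -> list[Character]:
    """Decode message.content and build one Character per array item."""
    data = json.loads(content)
    return [Character(**item) for item in data["characters"]]


# Illustrative content string, not real model output.
example_content = json.dumps({
    "characters": [
        {
            "name": "Ada",
            "occupation": "Engineer",
            "personality": "Curious",
            "background": "Grew up near a shipyard.",
        }
    ]
})
cast = to_characters(example_content)
print(cast[0].name)  # Ada
```

Because the schema lists all four fields as `required`, each item should carry every constructor argument. An item with an unexpected extra key would raise `TypeError` here.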
## Structured Output Engines
| Model Format | Engine |
|---|---|
| GGUF | `llama.cpp` grammar-based sampling |
| MLX | [Outlines](https://github.com/dottxt-ai/outlines) via [lmstudio-ai/mlx-engine](https://github.com/lmstudio-ai/mlx-engine) |
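Both engines restrict which tokens may be sampled at each step. The toy below illustrates that idea only and is not how `llama.cpp` or Outlines are implemented: the "grammar" is a finite set of allowed strings, the "model" is a fixed score table that ignores context, and all names and numbers are made up.

```python
def allowed_tokens(prefix: str, vocab: list[str], valid_outputs: set[str]) -> list[str]:
    """Keep only tokens that leave the text a prefix of some valid output."""
    return [
        tok for tok in vocab
        if any(out.startswith(prefix + tok) for out in valid_outputs)
    ]


def constrained_decode(scores: dict[str, float], valid_outputs: set[str]) -> str:
    """Greedy decoding where disallowed tokens are masked out at every step."""
    text = ""
    while text not in valid_outputs:
        candidates = allowed_tokens(text, list(scores), valid_outputs)
        # Pick the best-scoring token among those the toy grammar permits.
        text += max(candidates, key=scores.__getitem__)
    return text


# Hypothetical token scores: the unconstrained favourite is "Sure", which is not JSON.
scores = {"Sure": 0.9, '{"ok":': 0.5, " true}": 0.3, " false}": 0.2, "!": 0.1}
valid = {'{"ok": true}', '{"ok": false}'}
print(constrained_decode(scores, valid))  # {"ok": true}
```

The highest-scoring token is never emitted because it cannot start any valid output, which is the difference between constraining generation and validating afterwards.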
## Key Takeaways
- Use `response_format.type = "json_schema"`, the same shape as OpenAI's Structured Outputs API
- Works with any OpenAI-compatible client SDK (Python, TS, etc.) by pointing `base_url` at the local server
- The response is a **string** in `choices[0].message.content`, so decode it (e.g. `json.loads()`) before use
- Not all models support this: **models below 7B parameters often cannot do structured output**. Check the model card if unsure
- GGUF uses grammar sampling and MLX uses Outlines; both constrain tokens at generation time, not post hoc
- All standard `/v1/chat/completions` params (temperature, max_tokens, stream, etc.) still apply
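With `stream` enabled the content arrives in fragments, and a single fragment is generally not valid JSON, so the pieces need to be joined before decoding. The sketch below assumes the OpenAI streaming chunk shape (`choices[0].delta.content`) and uses plain dicts with hand-written data. `collect_streamed_json` is a hypothetical helper.

```python
import json
from typing import Iterable


def collect_streamed_json(chunks: Iterable[dict]) -> dict:
    """Join the delta fragments of a streamed reply, then decode once."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        # Role-only and final chunks carry no content.
        parts.append(delta.get("content") or "")
    return json.loads("".join(parts))


# Hand-written chunks mimicking a streamed reply (illustrative data).
fake_stream = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": '{"joke": "Why did'}}]},
    {"choices": [{"delta": {"content": ' the chicken cross?"}'}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
]
print(collect_streamed_json(fake_stream))  # {'joke': 'Why did the chicken cross?'}
```

With the `openai` Python SDK the chunks are objects rather than dicts, so the equivalent access would be `chunk.choices[0].delta.content`.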
## Related
- [[wiki/claude-code/lmstudio-chat-completions|lmstudio-chat-completions]] — full parameter reference for the completions endpoint
- [[wiki/claude-code/lmstudio-openai-compat-endpoints|lmstudio-openai-compat-endpoints]] — overview of all OpenAI-compat endpoints
- [[wiki/claude-code/lmstudio-responses-api|lmstudio-responses-api]] — stateful responses with streaming and Remote MCP tools
- [[wiki/claude-code/lmstudio-rest-api|lmstudio-rest-api]] — native LM Studio API and endpoint feature comparison