---
title: "LM Studio Structured Output"
aliases: [lmstudio-json-schema, structured-output-lmstudio]
tags: [lmstudio, structured-output, json-schema, openai-compat, local-llm]
sources: [raw/Structured Output.md]
created: 2026-04-30
updated: 2026-04-30
---

# LM Studio Structured Output

Enforce a specific JSON shape on LLM responses by passing a JSON schema to `/v1/chat/completions`. The request format is compatible with OpenAI's Structured Outputs API.

## How It Works

- Add a `response_format` field to the chat completions request
- Provide a `json_schema` with a `name`, an optional `strict` flag, and a `schema` object
- The model is constrained to return valid JSON matching that schema
- The response arrives as a string in `choices[0].message.content`; parse it with `json.loads()`

## Server Setup

```bash
lms server start
# or enable the server from the Developer tab in the LM Studio UI
```

Install the CLI first if needed:
```bash
npx lmstudio install-cli
```

## response_format Shape

```json
"response_format": {
  "type": "json_schema",
  "json_schema": {
    "name": "my_schema",
    "strict": true,
    "schema": {
      "type": "object",
      "properties": {
        "field": { "type": "string" }
      },
      "required": ["field"]
    }
  }
}
```

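The fragment above can also be assembled in Python before it is handed to a client. This is a minimal sketch: the helper name `make_response_format` is hypothetical and is not part of LM Studio or the OpenAI SDK.

```python
import json


def make_response_format(name: str, schema: dict, strict: bool = True) -> dict:
    """Wrap a JSON Schema in the response_format envelope.

    Hypothetical helper for illustration; not part of any SDK.
    """
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": strict, "schema": schema},
    }


# Same single-field schema as the JSON fragment
field_schema = {
    "type": "object",
    "properties": {"field": {"type": "string"}},
    "required": ["field"],
}

response_format = make_response_format("my_schema", field_schema)
print(json.dumps(response_format, indent=2))
```

Keeping the envelope in one function means only the `schema` dict changes between requests.
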
## cURL Example

```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "{{model}}",
    "messages": [
      {"role": "system", "content": "You are a helpful jokester."},
      {"role": "user", "content": "Tell me a joke."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "joke_response",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": { "joke": {"type": "string"} },
          "required": ["joke"]
        }
      }
    },
    "temperature": 0.7,
    "max_tokens": 50,
    "stream": false
  }'
```

## Python Example

```python
from openai import OpenAI
import json

# Point the OpenAI client at the local LM Studio server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# response_format payload: a list of one or more characters
character_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "characters",
        "schema": {
            "type": "object",
            "properties": {
                "characters": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "occupation": {"type": "string"},
                            "personality": {"type": "string"},
                            "background": {"type": "string"}
                        },
                        "required": ["name", "occupation", "personality", "background"]
                    },
                    "minItems": 1
                }
            },
            "required": ["characters"]
        }
    }
}

response = client.chat.completions.create(
    model="your-model",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Create 1-3 fictional characters"}
    ],
    response_format=character_schema,
)

# The content is a JSON string; parse it before use
results = json.loads(response.choices[0].message.content)
print(json.dumps(results, indent=2))
```

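The example above assumes the content always parses. A slightly more defensive variant is sketched below; `parse_structured` is a hypothetical helper, and the strings passed to it are hand-written stand-ins for `message.content`, not real model output. One plausible failure case is generation stopping early (for example at `max_tokens`), which would leave the JSON incomplete.

```python
import json


def parse_structured(content, required_keys):
    """Parse message content as JSON and check top-level required keys.

    Hypothetical helper: raises ValueError with a readable message instead
    of letting a JSONDecodeError or a later KeyError surface.
    """
    if content is None:
        raise ValueError("response has no content")
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        # Incomplete or malformed JSON, e.g. output cut off mid-object
        raise ValueError(f"content is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data


# Well-formed content parses normally
ok = parse_structured('{"characters": [{"name": "Ada"}]}', ["characters"])
print(ok["characters"][0]["name"])

# Truncated content is rejected with a ValueError
try:
    parse_structured('{"characters": [{"name": "Ad', ["characters"])
except ValueError as err:
    print("rejected:", err)
```
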
## Structured Output Engines

| Model Format | Engine |
|---|---|
| GGUF | `llama.cpp` grammar-based sampling |
| MLX | [Outlines](https://github.com/dottxt-ai/outlines) via [lmstudio-ai/mlx-engine](https://github.com/lmstudio-ai/mlx-engine) |

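To give a feel for what constraining tokens during generation means, here is a toy illustration. It is not how `llama.cpp` or Outlines is implemented: the vocabulary, the scores, and the set of valid outputs are all made up, and real engines work on model logits with a grammar or state machine derived from the schema. The sketch only shows the general idea of masking disallowed tokens before one is picked.

```python
def allowed_tokens(prefix, vocab, valid_outputs):
    """Tokens that keep prefix + token a prefix of at least one valid output."""
    return [
        tok for tok in vocab
        if any(out.startswith(prefix + tok) for out in valid_outputs)
    ]


def constrained_greedy(scores, vocab, valid_outputs):
    """Greedy decoding where disallowed tokens are masked before selection."""
    text = ""
    while text not in valid_outputs:
        candidates = allowed_tokens(text, vocab, valid_outputs)
        # Pick the best-scoring token among those the toy "grammar" permits
        text += max(candidates, key=lambda tok: scores[tok])
    return text


# Made-up vocabulary and scores; the unconstrained favourite is "Sure!"
vocab = ['{"ok":', "true", "false", "}", "Sure!", " "]
scores = {'{"ok":': 0.1, "true": 0.3, "false": 0.2, "}": 0.1, "Sure!": 0.9, " ": 0.4}
valid_outputs = {'{"ok":true}', '{"ok":false}'}

result = constrained_greedy(scores, vocab, valid_outputs)
print(result)  # {"ok":true}
```

Although `"Sure!"` has the highest score, it is never a candidate, so the output cannot drift outside the permitted shape.
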
## Key Takeaways

- Use `response_format.type = "json_schema"`, the same shape as OpenAI's Structured Outputs API
- Works with any OpenAI-compatible client SDK (Python, TypeScript, etc.) by pointing `base_url` at the local server
- The response is always a **string** in `choices[0].message.content`, so always call `json.loads()` on it
- Not all models support this: **models below 7B parameters often cannot do structured output**, so check the model card
- GGUF uses grammar-based sampling and MLX uses Outlines; both constrain tokens at generation time, not post-hoc
- All standard `/v1/chat/completions` parameters (`temperature`, `max_tokens`, `stream`, etc.) still apply

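Since `stream` still applies, the JSON string may arrive in fragments that cannot be parsed one at a time. The sketch below joins the fragments and parses once at the end. It assumes the OpenAI Python SDK streaming interface, where each chunk exposes `choices[0].delta.content`, and assumes LM Studio emits structured output through it in the same way. `collect_streamed_json` is a hypothetical helper, and the fragments are hand-written rather than captured from a server.

```python
import json


def collect_streamed_json(deltas):
    """Join streamed content fragments and parse the result once complete.

    `deltas` is any iterable of strings (or None for chunks without content).
    """
    parts = []
    for delta in deltas:
        if delta:  # skip None and empty fragments
            parts.append(delta)
    return json.loads("".join(parts))


# Hand-written fragments standing in for chunk.choices[0].delta.content
fake_deltas = ['{"jo', 'ke": "Why did', None, ' the chicken cross the road?"}']
joke = collect_streamed_json(fake_deltas)
print(joke["joke"])
```

With a live client, the iterable would be built from the stream returned by `client.chat.completions.create(..., stream=True)`, taking `chunk.choices[0].delta.content` from each chunk that has choices.
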
## Related

- [[wiki/claude-code/lmstudio-chat-completions|lmstudio-chat-completions]]: full parameter reference for the completions endpoint
- [[wiki/claude-code/lmstudio-openai-compat-endpoints|lmstudio-openai-compat-endpoints]]: overview of all OpenAI-compat endpoints
- [[wiki/claude-code/lmstudio-responses-api|lmstudio-responses-api]]: stateful responses with streaming and Remote MCP tools
- [[wiki/claude-code/lmstudio-rest-api|lmstudio-rest-api]]: native LM Studio API and endpoint feature comparison