| title | aliases | tags | sources | created | updated |
|---|---|---|---|---|---|
| LM Studio Tool Use (Function Calling) | | | | 2026-04-30 | 2026-04-30 |
LM Studio Tool Use (Function Calling)
Tool use lets LLMs request calls to external functions/APIs via LM Studio's OpenAI-compatible /v1/chat/completions and /v1/responses endpoints. Your code executes the actual functions and feeds results back.
Key Takeaways
- LLMs cannot execute code — they output structured text requesting a tool call; your code runs it
- Uses the same format as OpenAI's Function Calling API — any OpenAI SDK works
- Tool definitions are injected into the system prompt via the model's chat template
- Two support tiers: Native (model trained for tool use) and Default (fallback prompt injection)
- After tool execution, re-prompt the model without tools to get a plain-text final answer
- Streaming tool calls arrive in chunks; accumulate delta.tool_calls before executing
High-Level Flow
Setup LLM + tool list
→ Get user input
→ LLM prompted with messages
→ Needs tools?
Yes → Tool Response → Execute tools → Add results to messages → re-prompt
No → Normal response → loop back
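The flow above can be sketched as a small loop. This is a minimal sketch, not LM Studio's own implementation: `chat` is a hypothetical callable that wraps the chat-completions request and returns the assistant message as a plain dict, and `registry` is an assumed mapping from tool names to your own Python functions. Tools stay available on every round here so the model can chain calls; the multi-turn pattern later in this note drops them for the final answer instead.

```python
import json
from typing import Callable


def run_tool_loop(chat: Callable[..., dict], messages: list, tools: list,
                  registry: dict, max_rounds: int = 5) -> str:
    """Prompt the model, run any requested tools, and re-prompt until it answers in text.

    `chat` (hypothetical) takes `messages` and `tools` and returns the assistant
    message as a dict; `registry` maps tool names to callables.
    """
    for _ in range(max_rounds):
        reply = chat(messages=messages, tools=tools)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            # "No" branch: a normal text response ends the loop.
            return reply.get("content") or ""
        # "Yes" branch: record the request, execute each tool, add the results.
        messages.append({"role": "assistant", "tool_calls": tool_calls})
        for call in tool_calls:
            fn = registry[call["function"]["name"]]
            args = json.loads(call["function"]["arguments"])
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(fn(**args)),
            })
    raise RuntimeError("model kept requesting tools")
```

The `max_rounds` cap is a design choice to stop a model that never settles on a text answer; pick a limit that suits your tools.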
Tool Definition Format
{
"type": "function",
"function": {
"name": "get_delivery_date",
"description": "Get the delivery date for a customer's order",
"parameters": {
"type": "object",
"properties": {
"order_id": { "type": "string" }
},
"required": ["order_id"]
}
}
}
Pass as the tools array in the request body — identical to OpenAI's spec.
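As an illustration, the same definition can be written as a Python dict and placed under the `tools` key of the request body. The helper `missing_required` is hypothetical (it is not part of any SDK) and the user message is made up; the helper shows one way to check the model's arguments against the schema's `required` list before running the function.

```python
import json

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_delivery_date",
        "description": "Get the delivery date for a customer's order",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]


def missing_required(tool: dict, arguments_json: str) -> list[str]:
    """Return required parameter names absent from the model-supplied arguments."""
    args = json.loads(arguments_json)
    required = tool["function"]["parameters"].get("required", [])
    return [name for name in required if name not in args]


# The request body carries the list under the "tools" key.
request_body = {
    "model": "lmstudio-community/qwen2.5-7b-instruct",
    "messages": [{"role": "user", "content": "When will my order arrive?"}],  # illustrative
    "tools": TOOLS,
}
```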
Response Parsing
- Tool call detected: choices[0].message.tool_calls array is populated; finish_reason = "tool_calls"
- No tool call: response lands in choices[0].message.content as normal text
- If the model outputs a malformed tool call, LM Studio falls back to content; use lms log stream to debug
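The two cases can be told apart with a few lines of code. This sketch assumes the response is the raw JSON body already parsed into a dict (for example from `curl` output), and `classify_response` is a hypothetical helper name. A malformed tool call lands in the "content" branch, since it comes back as ordinary text.

```python
def classify_response(response: dict) -> tuple[str, object]:
    """Return ("tool_calls", [...]) or ("content", "...") for a chat-completions body."""
    message = response["choices"][0]["message"]
    tool_calls = message.get("tool_calls") or []
    if tool_calls:
        return "tool_calls", tool_calls
    # No tool call (or a malformed one): the text is in message.content.
    return "content", message.get("content") or ""
```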
Multi-Turn Pattern (Python)
from openai import OpenAI
import json
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
MODEL = "lmstudio-community/qwen2.5-7b-instruct"
# Assumed to be defined earlier: `messages` (the chat history), `tools`
# (a list of definitions in the format above), `my_function` (your tool).
# 1. First call, with tools
response = client.chat.completions.create(
    model=MODEL,
    messages=messages,
    tools=tools,
)
# 2. Execute the requested tool (this sketch handles only the first call)
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = my_function(**args)
# 3. Append both the assistant's tool-call message and the tool result
messages += [
    {"role": "assistant", "tool_calls": [tool_call.model_dump()]},
    {"role": "tool", "content": json.dumps(result), "tool_call_id": tool_call.id},
]
# 4. Second call, WITHOUT tools, for the final plain-text answer
final = client.chat.completions.create(model=MODEL, messages=messages)
print(final.choices[0].message.content)
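Step 2 above trusts the model to name a known tool and to produce valid JSON arguments. A more defensive variant might look like the following sketch, where `execute_tool_call` and `registry` are hypothetical names and the `{"result": ...}` / `{"error": ...}` envelope is an arbitrary choice rather than anything LM Studio requires. Returning the error as the tool message content gives the model a chance to correct itself on the next turn.

```python
import json


def execute_tool_call(name: str, arguments_json: str, registry: dict) -> str:
    """Run one requested tool and return a JSON string for the tool message."""
    fn = registry.get(name)
    if fn is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments_json)
        return json.dumps({"result": fn(**args)})
    except (json.JSONDecodeError, TypeError) as exc:
        # Bad JSON, wrong argument names, or a result that cannot be serialized.
        return json.dumps({"error": str(exc)})
```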
Native vs Default Support
| Level | What it means | Quality |
|---|---|---|
| Native | Model has a tool-use chat template + LM Studio parses its format | Best |
| Default | LM Studio injects a custom system prompt + converts tool role to user | Variable |
Models with Native Support (as of 2024-11)
- Qwen — Qwen2.5-7B-Instruct (GGUF / MLX)
- Llama: Llama-3.1 / 3.2, e.g. Llama-3.1-8B-Instruct (GGUF / MLX)
- Mistral — Ministral-8B-Instruct-2410 (GGUF / MLX)
Native models show a hammer badge in the LM Studio UI.
Streaming Tool Calls
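# Accumulate chunks: id, name, and arguments arrive in pieces, keyed by index
calls = {}
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.tool_calls:
        for tc in delta.tool_calls:
            call = calls.setdefault(tc.index, {"id": "", "name": "", "arguments": ""})
            call["id"] = tc.id or call["id"]
            if tc.function:
                call["name"] += tc.function.name or ""
                call["arguments"] += tc.function.arguments or ""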
Execute only after the stream ends and tool_calls is fully assembled.
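Once the stream has ended, the assembled argument strings still need to be parsed before anything runs. The sketch below assumes the fragments were collected into a mapping from tool-call index to a dict with `id`, `name`, and `arguments` keys; `finalize_tool_calls` is a hypothetical helper. It refuses to hand back a call whose arguments are not complete JSON, which is the usual symptom of executing too early.

```python
import json


def finalize_tool_calls(fragments: dict[int, dict]) -> list[dict]:
    """Turn accumulated fragments into calls with parsed arguments, in index order."""
    finished = []
    for index in sorted(fragments):
        frag = fragments[index]
        try:
            args = json.loads(frag["arguments"])
        except json.JSONDecodeError as exc:
            raise ValueError(f"tool call {index} has incomplete arguments") from exc
        finished.append({"id": frag["id"], "name": frag["name"], "arguments": args})
    return finished
```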
Quick Start
# Start server
lms server start
# Load a model
lms load
# Debug raw prompts (see how tools are injected)
lms log stream
# curl single-turn example
curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "lmstudio-community/qwen2.5-7b-instruct",
"messages": [{"role": "user", "content": "Search dell products under $50"}],
"tools": [...]}'
Troubleshooting
- No tool_calls in response: model output was malformed; run lms log stream to inspect the raw prompt and output
- Smaller models: may not follow the tool call format reliably; prefer ≥7B models with native support
- Default mode weirdness: check the injected system prompt via lms log stream; the format uses [TOOL_REQUEST]...[END_TOOL_REQUEST] tags
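When reading raw output from `lms log stream` in Default mode, it can help to pull the tagged spans out of the text. The sketch below is a debugging aid only, and it rests on an assumption: that the text between the tags is a JSON object. Verify that against your own logs. `extract_tool_requests` is a hypothetical helper name.

```python
import json
import re

# Non-greedy match so several tagged requests in one output are found separately.
TOOL_REQUEST_RE = re.compile(r"\[TOOL_REQUEST\](.*?)\[END_TOOL_REQUEST\]", re.DOTALL)


def extract_tool_requests(raw_output: str) -> list:
    """Return the parsed payload of each tagged request; None marks a malformed one."""
    payloads = []
    for body in TOOL_REQUEST_RE.findall(raw_output):
        try:
            payloads.append(json.loads(body))  # assumption: the payload is JSON
        except json.JSONDecodeError:
            payloads.append(None)
    return payloads
```

A `None` entry points at the malformed-output case from the first bullet: the tags were present but the body could not be parsed.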
Related
- wiki/claude-code/lmstudio-chat-completions: full /v1/chat/completions param reference
- wiki/claude-code/lmstudio-openai-compat-endpoints: all 5 compatible endpoints
- wiki/claude-code/lmstudio-responses-api: /v1/responses with Remote MCP tools
- wiki/claude-code/lmstudio-structured-output: enforce JSON schema on responses
- wiki/claude-code/lmstudio-messages-api: Anthropic-compat tool use examples
Sources
raw/Tool Use.md: LM Studio official docs (lmstudio.ai/docs/developer/openai-compat/tools), published 2024-11-19