---
title: "LM Studio Tool Use (Function Calling)"
aliases: [lmstudio-function-calling, lmstudio-tools]
tags: [lmstudio, tool-use, function-calling, openai-compat, python, local-llm]
sources: [raw/Tool Use.md]
created: 2026-04-30
updated: 2026-04-30
---

# LM Studio Tool Use (Function Calling)

Tool use lets LLMs *request* calls to external functions/APIs via LM Studio's OpenAI-compatible `/v1/chat/completions` and `/v1/responses` endpoints. Your code executes the actual functions and feeds results back.

## Key Takeaways

- LLMs **cannot execute code**: they output structured text requesting a tool call, and your code runs it
- Tool use follows the same format as OpenAI's Function Calling API, so any OpenAI SDK works
- Tool definitions are injected into the system prompt via the model's chat template
- Two support tiers: **Native** (model trained for tool use) and **Default** (fallback prompt injection)
- After tool execution, re-prompt the model *without* tools to get a plain-text final answer
- Streaming tool calls arrive in chunks; accumulate `delta.tool_calls` before executing

## High-Level Flow

```
Setup LLM + tool list
  → Get user input
  → LLM prompted with messages
  → Needs tools?
      Yes → Tool Response → Execute tools → Add results to messages → re-prompt
      No  → Normal response → loop back
```

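The flow above can be sketched as a small Python loop. This is a minimal sketch, not LM Studio's implementation: `run_turn`, `prompt_llm`, and `tool_impls` are hypothetical names, and `prompt_llm` stands for any callable that sends `messages` to the model and returns an assistant message as a dict in the OpenAI chat format.

```python
import json
from typing import Callable


def run_turn(
    messages: list[dict],
    prompt_llm: Callable[[list[dict]], dict],
    tool_impls: dict[str, Callable],
    max_rounds: int = 5,
) -> str:
    """Prompt the model, run any tools it requests, and return its final text."""
    for _ in range(max_rounds):
        reply = prompt_llm(messages)  # hypothetical wrapper around the API call
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:  # "No" branch: normal response
            return reply.get("content") or ""
        for call in tool_calls:  # "Yes" branch: execute each requested tool
            fn = tool_impls[call["function"]["name"]]
            args = json.loads(call["function"]["arguments"])
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": call["id"],
                    "content": json.dumps(fn(**args)),
                }
            )
    raise RuntimeError("model kept requesting tools; giving up")
```

The `max_rounds` cap is a design choice of this sketch: it guards against a model that requests tools indefinitely. Whether `prompt_llm` passes `tools` on the follow-up call is left to the caller.
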
## Tool Definition Format

```json
{
  "type": "function",
  "function": {
    "name": "get_delivery_date",
    "description": "Get the delivery date for a customer's order",
    "parameters": {
      "type": "object",
      "properties": {
        "order_id": { "type": "string" }
      },
      "required": ["order_id"]
    }
  }
}
```

Pass tool definitions as the `tools` array in the request body; the format is identical to OpenAI's spec.

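A tool definition only describes the function; the implementation lives in your code. The sketch below pairs the definition above with a stand-in implementation and a name-based dispatcher. The body of `get_delivery_date`, the `TOOL_IMPLS` map, and the `dispatch` helper are all hypothetical, and the returned date is a placeholder value.

```python
import json

# Tool definition from above, expressed as a Python dict for the `tools` array
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_delivery_date",
            "description": "Get the delivery date for a customer's order",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }
]


def get_delivery_date(order_id: str) -> dict:
    """Hypothetical stand-in: a real version would query an order system."""
    return {"order_id": order_id, "delivery_date": "2025-01-04"}  # placeholder


# Map tool names to implementations so a requested call can be looked up by name
TOOL_IMPLS = {"get_delivery_date": get_delivery_date}


def dispatch(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments; return a JSON string."""
    if name not in TOOL_IMPLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    return json.dumps(TOOL_IMPLS[name](**json.loads(arguments_json)))
```

Returning an error payload for an unknown name, rather than raising, lets the model see what went wrong when the result is fed back as a `tool` message.
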
## Response Parsing

- Tool call detected: the `choices[0].message.tool_calls` array is populated and `choices[0].finish_reason` is `"tool_calls"`
- No tool call: the response lands in `choices[0].message.content` as normal text
- Malformed tool call: LM Studio cannot parse it, so the raw output lands in `content` instead; use `lms log stream` to debug

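The branching above might look like this when working with the raw JSON response body (for example, the output of a curl request parsed into a dict). `parse_response` is a hypothetical helper name, and the sketch assumes the response follows the OpenAI chat-completions shape described in the bullets.

```python
import json


def parse_response(response: dict) -> tuple[str, list]:
    """Return ("tool_calls", [(id, name, args), ...]) or ("text", [content])."""
    message = response["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    if calls:
        parsed = [
            (c["id"], c["function"]["name"], json.loads(c["function"]["arguments"]))
            for c in calls
        ]
        return "tool_calls", parsed
    # No parsed tool call: plain text, or a malformed call that fell back to content
    return "text", [message.get("content") or ""]
```

Checking the `tool_calls` array rather than `finish_reason` alone keeps the helper usable even if a server reports a different finish reason.
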
## Multi-Turn Pattern (Python)

```python
import json

from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
MODEL = "lmstudio-community/qwen2.5-7b-instruct"

# Assumed to be defined elsewhere: `tools` (list of tool definitions),
# `messages` (chat history so far), and `my_function` (your implementation)

# 1. First call, with tools
response = client.chat.completions.create(
    model=MODEL,
    messages=messages,
    tools=tools,
)

# 2. Execute the requested tool (assumes the model requested exactly one)
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = my_function(**args)

# 3. Append both the assistant's tool-call message and the tool result
messages += [
    {
        "role": "assistant",
        "tool_calls": [
            {
                "id": tool_call.id,
                "type": "function",
                "function": {
                    "name": tool_call.function.name,
                    "arguments": tool_call.function.arguments,
                },
            }
        ],
    },
    {"role": "tool", "content": json.dumps(result), "tool_call_id": tool_call.id},
]

# 4. Second call, WITHOUT tools, for the final plain-text answer
final = client.chat.completions.create(model=MODEL, messages=messages)
print(final.choices[0].message.content)
```

## Native vs Default Support

| Level | What it means | Quality |
|-------|---------------|---------|
| **Native** | Model has a tool-use chat template + LM Studio parses its format | Best |
| **Default** | LM Studio injects a custom system prompt + converts `tool` role to `user` | Variable |

### Models with Native Support (as of 2024-11)

- **Qwen**: Qwen2.5-7B-Instruct (GGUF / MLX)
- **Llama**: Llama-3.1 and Llama-3.2 Instruct models, e.g. Llama-3.1-8B-Instruct (GGUF / MLX)
- **Mistral**: Ministral-8B-Instruct-2410 (GGUF / MLX)

Native models show a hammer badge in the LM Studio UI.

## Streaming Tool Calls

```python
# Accumulate chunks: id, name, and arguments arrive in pieces
tool_calls = {}  # partial calls keyed by tc.index
for chunk in stream:  # `stream` comes from create(..., stream=True)
    delta = chunk.choices[0].delta
    for tc in delta.tool_calls or []:
        call = tool_calls.setdefault(tc.index, {"id": "", "name": "", "arguments": ""})
        call["id"] += tc.id or ""
        if tc.function:
            call["name"] += tc.function.name or ""
            call["arguments"] += tc.function.arguments or ""
```

Execute only after the stream ends and `tool_calls` is fully assembled.

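Accumulation and the final parse can be wrapped in one helper and exercised without a running server. This is a sketch under assumptions: `assemble_tool_calls` and `fake_chunk` are hypothetical names, and `fake_chunk` only mimics the attribute layout of an OpenAI SDK streaming chunk (`choices[0].delta.tool_calls[i]` with `index`, `id`, and `function`).

```python
import json
from types import SimpleNamespace as NS


def assemble_tool_calls(stream) -> list[dict]:
    """Merge streamed tool-call fragments, then parse each arguments string."""
    partial: dict[int, dict] = {}
    for chunk in stream:
        if not chunk.choices:
            continue
        for tc in chunk.choices[0].delta.tool_calls or []:
            call = partial.setdefault(tc.index, {"id": "", "name": "", "arguments": ""})
            call["id"] += tc.id or ""
            if tc.function:
                call["name"] += tc.function.name or ""
                call["arguments"] += tc.function.arguments or ""
    # Only after the stream ends is each arguments string complete JSON
    return [
        {"id": c["id"], "name": c["name"], "arguments": json.loads(c["arguments"])}
        for _, c in sorted(partial.items())
    ]


def fake_chunk(index, id=None, name=None, arguments=None):
    """Hypothetical test double shaped like an SDK streaming chunk."""
    fn = NS(name=name, arguments=arguments)
    delta = NS(tool_calls=[NS(index=index, id=id, function=fn)])
    return NS(choices=[NS(delta=delta)])


demo_stream = [
    fake_chunk(0, id="call_1", name="get_delivery_date", arguments=""),
    fake_chunk(0, arguments='{"order_'),
    fake_chunk(0, arguments='id": "123"}'),
]
print(assemble_tool_calls(demo_stream))
```

Parsing `arguments` inside the loop would fail on the second chunk, because `{"order_` is not valid JSON on its own; that is why the parse is deferred until the stream is exhausted.
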
## Quick Start

```bash
# Start server
lms server start

# Load a model
lms load

# Debug raw prompts (see how tools are injected)
lms log stream
```

```bash
# curl single-turn example
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "lmstudio-community/qwen2.5-7b-instruct",
       "messages": [{"role": "user", "content": "Search dell products under $50"}],
       "tools": [...]}'
```

## Troubleshooting

- **No `tool_calls` in response**: either the model chose not to call a tool, or its tool-call output was malformed and landed in `content`; run `lms log stream` to inspect the raw prompt and output
- **Smaller models**: may not follow the tool call format reliably; prefer ≥7B models with native support
- **Default mode weirdness**: check the injected system prompt via `lms log stream`; the format uses `[TOOL_REQUEST]...[END_TOOL_REQUEST]` tags

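When a default-mode tool call ends up in `content`, a quick scan for the tags can show whether the payload was well formed. The sketch below is a debugging aid, not part of LM Studio: `find_tool_requests` is a hypothetical helper, and it assumes each tagged block holds a single JSON object.

```python
import json
import re

# Assumed format: one JSON object between the tags described above
TOOL_REQUEST_RE = re.compile(r"\[TOOL_REQUEST\](.*?)\[END_TOOL_REQUEST\]", re.DOTALL)


def find_tool_requests(content: str) -> list[dict]:
    """Return one record per tagged block, flagging payloads that fail to parse."""
    found = []
    for raw in TOOL_REQUEST_RE.findall(content):
        try:
            found.append({"ok": True, "request": json.loads(raw)})
        except json.JSONDecodeError as exc:
            found.append({"ok": False, "raw": raw.strip(), "error": str(exc)})
    return found
```

An empty result means no complete tag pair was found, which points to the model ignoring the format altogether rather than producing broken JSON inside it.
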
## Related

- [[wiki/claude-code/lmstudio-chat-completions|LM Studio Chat Completions]]: full `/v1/chat/completions` param reference
- [[wiki/claude-code/lmstudio-openai-compat-endpoints|LM Studio OpenAI Compat Endpoints]]: all 5 compatible endpoints
- [[wiki/claude-code/lmstudio-responses-api|LM Studio Responses API]]: `/v1/responses` with Remote MCP tools
- [[wiki/claude-code/lmstudio-structured-output|LM Studio Structured Output]]: enforce JSON schema on responses
- [[wiki/claude-code/lmstudio-messages-api|LM Studio Messages API]]: Anthropic-compat tool use examples

## Sources

- `raw/Tool Use.md`: LM Studio official docs (lmstudio.ai/docs/developer/openai-compat/tools), published 2024-11-19