obsidian/wiki/claude-code/lmstudio-headless-service.md
2026-04-30 14:42:43 +01:00

104 lines
3.5 KiB
Markdown

---
title: "LM Studio Headless / Service Mode"
aliases: [lmstudio-daemon, llmster, lmstudio-background-service]
tags: [lmstudio, local-llm, headless, daemon, jit-loading]
sources: [raw/Run LM Studio as a service (headless).md]
created: 2026-04-30
updated: 2026-04-30
---
# LM Studio Headless / Service Mode
GUI-less operation of LM Studio: run as a background daemon, start on machine login, and load models on demand via JIT.
## Two Approaches
| Approach | Best For | GUI Required? |
|----------|----------|---------------|
| **llmster** (recommended) | Linux servers, cloud, GPU rigs, headless machines | No |
| **Desktop app headless mode** | Machines with a GUI where app is already installed | Yes (hidden to tray) |
---
## Option 1: llmster (Recommended)
`llmster` is the core of the LM Studio desktop app, repackaged as a server-native daemon. No GUI dependency.
### Install
```bash
# Linux / Mac
curl -fsSL https://lmstudio.ai/install.sh | bash
# Windows (PowerShell)
irm https://lmstudio.ai/install.ps1 | iex
```
### Start the daemon
```bash
lms daemon up
```
- To auto-start on Linux boot, configure it as a **Linux Startup Task** (see LM Studio docs).
- Full CLI reference: `lms daemon --help`
---
## Option 2: Desktop App in Headless Mode
Works on Mac, Windows, Linux (with GUI). Useful if the desktop app is already installed.
### Run server on login
1. Open app settings (`Cmd/Ctrl` + `,`)
2. Enable **"Run LLM server on login"**
3. Exiting the app minimizes to tray — server keeps running
### Start server programmatically
```bash
lms server start
```
Last server state is saved and restored automatically on launch.
---
## Just-In-Time (JIT) Model Loading
Applies to **both** options. Useful when using LM Studio as a backend for other tools (Open WebUI, Claude Code, custom apps).
| JIT State | `/v1/models` returns | Inference behavior |
|-----------|---------------------|--------------------|
| **ON** | All downloaded models | Auto-loads model into VRAM on first call |
| **OFF** | Only models in VRAM | Must manually load model first |
### Auto-Unload
JIT-loaded models are **auto-evicted** after a period of inactivity — see [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|Idle TTL & Auto-Evict]] for TTL settings and per-request `ttl` field.
---
## Key Takeaways
- **llmster** is the preferred headless path — works on servers and CI without any GUI
- Desktop headless mode is a quick option for developer machines already running the app
- JIT loading eliminates manual `lms load` calls; models are loaded on first inference request
- JIT-loaded models auto-unload after inactivity (configurable TTL)
- Use `lms server start` to programmatically control the REST server state
- The OpenAI-compatible REST API (`/v1/...`) is available in both modes — see [[wiki/claude-code/lmstudio-openai-compat-endpoints|OpenAI Compat Endpoints]] and [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]]
---
## Related
- [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]] — all endpoints and lifecycle management
- [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|Idle TTL & Auto-Evict]] — memory management for JIT-loaded models
- [[wiki/claude-code/lmstudio-openai-compat-endpoints|OpenAI Compat Endpoints]] — drop-in base_url swap for any OpenAI client
- [[wiki/claude-code/lmstudio-anthropic-compat|Anthropic Compat Endpoints]] — redirect Claude Code / Anthropic SDK to local LM Studio
## Sources
- `raw/Run LM Studio as a service (headless).md`
- LM Studio docs: https://lmstudio.ai/docs/developer/core/headless