obsidian/wiki/claude-code/lmstudio-headless-service.md
2026-04-30 14:42:43 +01:00

3.5 KiB

title aliases tags sources created updated
LM Studio Headless / Service Mode
lmstudio-daemon
llmster
lmstudio-background-service
lmstudio
local-llm
headless
daemon
jit-loading
raw/Run LM Studio as a service (headless).md
2026-04-30 2026-04-30

LM Studio Headless / Service Mode

GUI-less operation of LM Studio: run as a background daemon, start on machine login, and load models on demand via JIT.

Two Approaches

Approach Best For GUI Required?
llmster (recommended) Linux servers, cloud, GPU rigs, headless machines No
Desktop app headless mode Machines with a GUI where app is already installed Yes (hidden to tray)

llmster is the core of the LM Studio desktop app, repackaged as a server-native daemon. No GUI dependency.

Install

# Linux / Mac
curl -fsSL https://lmstudio.ai/install.sh | bash

# Windows (PowerShell)
irm https://lmstudio.ai/install.ps1 | iex

Start the daemon

lms daemon up
  • To auto-start on Linux boot, configure it as a Linux Startup Task (see LM Studio docs).
  • Full CLI reference: lms daemon --help

Option 2: Desktop App in Headless Mode

Works on Mac, Windows, Linux (with GUI). Useful if the desktop app is already installed.

Run server on login

  1. Open app settings (Cmd/Ctrl + ,)
  2. Enable "Run LLM server on login"
  3. Exiting the app minimizes to tray — server keeps running

Start server programmatically

lms server start

Last server state is saved and restored automatically on launch.


Just-In-Time (JIT) Model Loading

Applies to both options. Useful when using LM Studio as a backend for other tools (Open WebUI, Claude Code, custom apps).

JIT State /v1/models returns Inference behavior
ON All downloaded models Auto-loads model into VRAM on first call
OFF Only models in VRAM Must manually load model first

Auto-Unload

JIT-loaded models are auto-evicted after a period of inactivity — see wiki/claude-code/lmstudio-idle-ttl-auto-evict for TTL settings and per-request ttl field.


Key Takeaways

  • llmster is the preferred headless path — works on servers and CI without any GUI
  • Desktop headless mode is a quick option for developer machines already running the app
  • JIT loading eliminates manual lms load calls; models are loaded on first inference request
  • JIT-loaded models auto-unload after inactivity (configurable TTL)
  • Use lms server start to programmatically control the REST server state
  • The OpenAI-compatible REST API (/v1/...) is available in both modes — see wiki/claude-code/lmstudio-openai-compat-endpoints and wiki/claude-code/lmstudio-rest-api

Sources