obsidian/wiki/claude-code/lmstudio-headless-service.md

---
title: "LM Studio Headless / Service Mode"
aliases: [lmstudio-daemon, llmster, lmstudio-background-service]
tags: [lmstudio, local-llm, headless, daemon, jit-loading]
sources: [raw/Run LM Studio as a service (headless).md]
created: 2026-04-30
updated: 2026-04-30
---

# LM Studio Headless / Service Mode

GUI-less operation of LM Studio: run as a background daemon, start on machine login, and load models on demand via JIT.

## Two Approaches

| Approach | Best For | GUI Required? |
|----------|----------|---------------|
| **llmster** (recommended) | Linux servers, cloud, GPU rigs, headless machines | No |
| **Desktop app headless mode** | Machines with a GUI where app is already installed | Yes (hidden to tray) |

---

## Option 1: llmster (Recommended)

`llmster` is the core of the LM Studio desktop app, repackaged as a server-native daemon. No GUI dependency.

### Install

```bash
# Linux / Mac
curl -fsSL https://lmstudio.ai/install.sh | bash

# Windows (PowerShell)
irm https://lmstudio.ai/install.ps1 | iex
```

### Start the daemon

```bash
lms daemon up
```

- To auto-start on Linux boot, configure it as a **Linux Startup Task** (see LM Studio docs).
- Full CLI reference: `lms daemon --help`

---

## Option 2: Desktop App in Headless Mode

Works on Mac, Windows, Linux (with GUI). Useful if the desktop app is already installed.

### Run server on login

1. Open app settings (`Cmd/Ctrl` + `,`)
2. Enable **"Run LLM server on login"**
3. Exiting the app minimizes to tray — server keeps running

### Start server programmatically

```bash
lms server start
```

Last server state is saved and restored automatically on launch.

---

## Just-In-Time (JIT) Model Loading

Applies to **both** options. Useful when using LM Studio as a backend for other tools (Open WebUI, Claude Code, custom apps).

| JIT State | `/v1/models` returns | Inference behavior |
|-----------|---------------------|--------------------|
| **ON** | All downloaded models | Auto-loads model into VRAM on first call |
| **OFF** | Only models in VRAM | Must manually load model first |

### Auto-Unload

JIT-loaded models are **auto-evicted** after a period of inactivity — see [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|Idle TTL & Auto-Evict]] for TTL settings and per-request `ttl` field.

---

## Key Takeaways

- **llmster** is the preferred headless path — works on servers and CI without any GUI
- Desktop headless mode is a quick option for developer machines already running the app
- JIT loading eliminates manual `lms load` calls; models are loaded on first inference request
- JIT-loaded models auto-unload after inactivity (configurable TTL)
- Use `lms server start` to programmatically control the REST server state
- The OpenAI-compatible REST API (`/v1/...`) is available in both modes — see [[wiki/claude-code/lmstudio-openai-compat-endpoints|OpenAI Compat Endpoints]] and [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]]

---

## Related

- [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]] — all endpoints and lifecycle management
- [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|Idle TTL & Auto-Evict]] — memory management for JIT-loaded models
- [[wiki/claude-code/lmstudio-openai-compat-endpoints|OpenAI Compat Endpoints]] — drop-in base_url swap for any OpenAI client
- [[wiki/claude-code/lmstudio-anthropic-compat|Anthropic Compat Endpoints]] — redirect Claude Code / Anthropic SDK to local LM Studio

## Sources

- `raw/Run LM Studio as a service (headless).md`
- LM Studio docs: https://lmstudio.ai/docs/developer/core/headless