lms — LM Studio CLI

lms is LM Studio's built-in CLI utility for managing models, the inference server, and the runtime. Ships with LM Studio — no separate install needed. MIT licensed, open source on GitHub.

Installation & Verification

# Already installed with LM Studio — just verify:
lms --help

Current version: v0.0.47

Command Reference

Command	What it does
`lms chat`	Start interactive chat with a model in the terminal
`lms get`	Search and download models
`lms ls`	List models available on disk
`lms ps`	List models currently loaded in memory
`lms load`	Load a model (with GPU/context options)
`lms unload`	Unload a model
`lms import`	Import a model file into LM Studio
`lms server start/stop`	Control the local API server
`lms log`	Stream incoming/outgoing messages for debugging
`lms runtime`	Manage and update the inference runtime
`lms daemon`	Manage the headless llmster daemon
`lms link`	Manage LM Link
`lms clone`	Clone an artifact from LM Studio Hub
`lms push`	Upload artifact to LM Studio Hub
`lms login`	Authenticate with LM Studio

Common Workflows

Server control

lms server start
lms server stop

List & inspect models

lms ls        # models on disk (reflects My Models directory)
lms ps        # models currently loaded in memory

Load a model

# With GPU offload and context size:
lms load [--gpu=max|auto|0.0-1.0] [--context-length=1-N]

# --gpu=1.0 → 100% GPU offload
# With a stable identifier alias:
lms load openai/gpt-oss-20b --identifier="my-model-name"

Using --identifier keeps the model ID stable across loads — useful when client code hardcodes a model name.

Unload a model

lms unload           # unload specific model
lms unload --all     # unload everything

Debug message flow

lms log stream       # tail all incoming/outgoing API messages live

Pairs with wiki/claude-code/lmstudio-chat-completions for debugging request/response cycles.

Key Takeaways

lms ships with LM Studio — zero extra install steps
lms ps vs lms ls: loaded-in-memory vs on-disk — two different commands
--gpu=1.0 forces full GPU offload; --gpu=auto lets LM Studio decide
--identifier flag on lms load decouples client model names from actual model paths
lms log stream is the fastest way to debug what's hitting the server
lms daemon manages wiki/claude-code/lmstudio-headless-service for headless/service deployments
MIT licensed: safe to embed in scripts and automation

wiki/claude-code/lmstudio-rest-api — all API endpoints
wiki/claude-code/lmstudio-headless-service — daemon mode for servers
wiki/claude-code/lmstudio-server-settings — port, auth, CORS, JIT loading
wiki/claude-code/lmstudio-chat-completions — OpenAI-compat /v1/chat/completions
wiki/claude-code/lmstudio-llmster-systemd — run llmster at boot on Linux
wiki/claude-code/lmstudio-idle-ttl-auto-evict — memory management

Sources

lmstudio.ai/docs/cli

3.5 KiB Raw Permalink Blame History