diff --git a/.obsidian/plugins/hoarder-sync/data.json b/.obsidian/plugins/hoarder-sync/data.json
index 69f1e66..3daae9f 100644
--- a/.obsidian/plugins/hoarder-sync/data.json
+++ b/.obsidian/plugins/hoarder-sync/data.json
@@ -4,7 +4,7 @@
   "syncFolder": "Hoarder",
   "attachmentsFolder": "Hoarder/attachments",
   "syncIntervalMinutes": 60,
-  "lastSyncTimestamp": 1777555641238,
+  "lastSyncTimestamp": 1777556035137,
   "updateExistingFiles": false,
   "excludeArchived": true,
   "onlyFavorites": false,
diff --git a/99 Daily/2026-04-30.md b/99 Daily/2026-04-30.md
index c448c5f..70279da 100644
--- a/99 Daily/2026-04-30.md	
+++ b/99 Daily/2026-04-30.md	
@@ -149,3 +149,48 @@ tags: [daily]
 - 14:27 (<1min) | `memory-compiler`
   - **Asked:** Compile a new article about LM Studio embeddings into the structured wiki knowledge base.
   - **Done:** Filed article as `wiki/claude-code/lmstudio-embeddings.md` and updated master index with wikilinks to related LM Studio topics and RAG pattern.
+- 14:28 (1min) | `memory-compiler`
+  - **Asked:** Compile a new HP EliteDesk 800G3 teardown/upgrade article into the wiki knowledge base.
+  - **Done:** Filed article as `wiki/homelab/hp-elitedesk-800g3-teardown-upgrade.md` with full disassembly procedures, motherboard specs, and upgrade benchmarks.
+- 14:29 | `video-accessibility`
+  - **Asked:** Asked for code review skills checklist from project instructions | Reviewed project completion and committed code changes | No files specified
+  - **Done:** —
+- 14:30 (<1min) | `memory-compiler`
+  - **Asked:** Compile a new article on LM Studio messages API into the wiki knowledge base.
+  - **Done:** Created structured wiki article with cURL examples and updated topic and master indices.
+- 14:31 (<1min) | `memory-compiler`
+  - **Asked:** Compile a new article about LM Studio's OpenAI-compatible endpoints into the wiki knowledge base.
+  - **Done:** Created the article, updated the claude-code index to 21 articles, and bumped the master index count.
+- 14:32 | `video-accessibility`
+  - **Asked:** What skills should be checked for code review according to the instructions?
+  - **Done:** Reviewed project completion and identified environment configuration changes needed for optical-dev deployment.
+- 14:33 (<1min) | `memory-compiler`
+  - **Asked:** Compile a new article about LM Studio headless service into the knowledge base wiki.
+  - **Done:** Filed article as `claude-code/lmstudio-headless-service.md` and updated master index to reflect 23 total claude-code articles.
+- 14:33 (<1min) | `memory-compiler`
+  - **Asked:** Compile a new article about LM Studio network serving into the knowledge base and update the master index.
+  - **Done:** Created new LM Studio article, updated claude-code topic index, and incremented master index article count from 23 to 24.
+- 14:34 | `video-accessibility`
+  - **Asked:** Check the project instructions for code review skills requirements.
+  - **Done:** Identified OOM issue in whisper-worker memory configuration and pushed hotfix to restore original memory limits while keeping Cloud Run URLs.
+- 14:35 (<1min) | `memory-compiler`
+  - **Asked:** Compile a raw article about LM Studio systemd configuration into the structured wiki knowledge base.
+  - **Done:** Filed the article as a systemd unit configuration guide covering service setup, unit file ordering, and PATH requirements.
+- 14:36 (<1min) | `memory-compiler`
+  - **Asked:** File a new article about LM Studio structured output into the knowledge base.
+  - **Done:** Created wiki article and updated both index files to register the new entry.
+- 14:37 (1min) | `memory-compiler`
+  - **Asked:** Compile a new article on tool use into the knowledge base wiki structure.
+  - **Done:** Processed raw article into `claude-code/lmstudio-tool-use.md` and updated both topic and master indexes.
+- 14:38 | `video-accessibility`
+  - **Asked:** Check the instructions for code review skills to verify the completed project.
+  - **Done:** Reviewed deployment fix that restored memory limits and confirmed all 7 containers started successfully with API health checks passing.
+- 14:39 (<1min) | `memory-compiler`
+  - **Asked:** Compile a new LM Studio CLI article into the knowledge base wiki.
+  - **Done:** Created structured wiki article with command reference and cross-links, updated master index from 29 to 30 claude-code articles.
+- 14:41 | `video-accessibility`
+  - **Asked:** Check the project instructions for code review skills that need to be verified.
+  - **Done:** Reviewed deployment status and identified CORS configuration and ffmpeg logging checks needed.
+- 14:41 | `video-accessibility`
+  - **Asked:** Check project completion and review code quality assessment skills from instructions.
+  - **Done:** Identified server authorization limitations and provided gsutil CORS configuration command for local execution.
diff --git a/raw/HP EliteDesk 800 G3 SFF - Teardown, re-assembly and upgrade.md b/raw/_processed/HP EliteDesk 800 G3 SFF - Teardown, re-assembly and upgrade.md
similarity index 100%
rename from raw/HP EliteDesk 800 G3 SFF - Teardown, re-assembly and upgrade.md
rename to raw/_processed/HP EliteDesk 800 G3 SFF - Teardown, re-assembly and upgrade.md
diff --git a/raw/Idle TTL and Auto-Evict.md b/raw/_processed/Idle TTL and Auto-Evict.md
similarity index 100%
rename from raw/Idle TTL and Auto-Evict.md
rename to raw/_processed/Idle TTL and Auto-Evict.md
diff --git a/raw/LM Studio API.md b/raw/_processed/LM Studio API.md
similarity index 100%
rename from raw/LM Studio API.md
rename to raw/_processed/LM Studio API.md
diff --git a/raw/Messages.md b/raw/_processed/Messages.md
similarity index 100%
rename from raw/Messages.md
rename to raw/_processed/Messages.md
diff --git a/raw/OpenAI Compatibility Endpoints.md b/raw/_processed/OpenAI Compatibility Endpoints.md
similarity index 100%
rename from raw/OpenAI Compatibility Endpoints.md
rename to raw/_processed/OpenAI Compatibility Endpoints.md
diff --git a/raw/Responses.md b/raw/_processed/Responses.md
similarity index 100%
rename from raw/Responses.md
rename to raw/_processed/Responses.md
diff --git a/raw/Run LM Studio as a service (headless).md b/raw/_processed/Run LM Studio as a service (headless).md
similarity index 100%
rename from raw/Run LM Studio as a service (headless).md
rename to raw/_processed/Run LM Studio as a service (headless).md
diff --git a/raw/Serve on Local Network.md b/raw/_processed/Serve on Local Network.md
similarity index 100%
rename from raw/Serve on Local Network.md
rename to raw/_processed/Serve on Local Network.md
diff --git a/raw/Server Settings.md b/raw/_processed/Server Settings.md
similarity index 100%
rename from raw/Server Settings.md
rename to raw/_processed/Server Settings.md
diff --git a/raw/Setup llmster as a Startup Task on Linux.md b/raw/_processed/Setup llmster as a Startup Task on Linux.md
similarity index 100%
rename from raw/Setup llmster as a Startup Task on Linux.md
rename to raw/_processed/Setup llmster as a Startup Task on Linux.md
diff --git a/raw/Structured Output.md b/raw/_processed/Structured Output.md
similarity index 100%
rename from raw/Structured Output.md
rename to raw/_processed/Structured Output.md
diff --git a/raw/Tool Use.md b/raw/_processed/Tool Use.md
similarity index 100%
rename from raw/Tool Use.md
rename to raw/_processed/Tool Use.md
diff --git a/raw/Using MCP via API.md b/raw/_processed/Using MCP via API.md
similarity index 100%
rename from raw/Using MCP via API.md
rename to raw/_processed/Using MCP via API.md
diff --git a/raw/lms — LM Studio's CLI.md b/raw/_processed/lms — LM Studio's CLI.md
similarity index 100%
rename from raw/lms — LM Studio's CLI.md
rename to raw/_processed/lms — LM Studio's CLI.md
diff --git a/wiki/_master-index.md b/wiki/_master-index.md
index a014493..b618f0b 100644
--- a/wiki/_master-index.md
+++ b/wiki/_master-index.md
@@ -26,12 +26,12 @@ This 3-hop pattern works for hundreds of articles without vector search.
 | [[wiki/concepts/_index\|concepts/]] | Atomic knowledge extracted from Claude Code sessions | 75 |
 | [[wiki/connections/_index\|connections/]] | Cross-cutting insights linking 2+ concepts: FastAPI+Azure AD+Docker trinity, AI→cost-tracker, Apache+Vite basePath, GCP→REST polling, Box+hotfolder, Docker DNS+AdGuard | 9 |
 | [[wiki/qa/_index\|qa/]] | Filed answers to queries (saved with `--file-back`) | 0 |
-| [[wiki/homelab/_index\|homelab/]] | Self-hosted infra: Proxmox install, IOMMU/PCI passthrough, hypervisor setup, budget builds, HP Elitedesk G3, Homarr API + Apps + Boards + Certificates + Integrations + Settings + Tasks + AdGuard + Clock + Docker Stats + Docker Integration + Download Client + Firewall + Proxmox Integration + Radarr + Readarr + Sonarr + Bookmarks + Calendar + Icons + App Widget + Weather + GitHub + Nextcloud + qBittorrent + RSS Feed + Speedtest Tracker + System Health Monitoring + System Resources + Services Map + Media Stack | 39 |
+| [[wiki/homelab/_index\|homelab/]] | Self-hosted infra: Proxmox install, IOMMU/PCI passthrough, hypervisor setup, budget builds, HP Elitedesk G3, Homarr API + Apps + Boards + Certificates + Integrations + Settings + Tasks + AdGuard + Clock + Docker Stats + Docker Integration + Download Client + Firewall + Proxmox Integration + Radarr + Readarr + Sonarr + Bookmarks + Calendar + Icons + App Widget + Weather + GitHub + Nextcloud + qBittorrent + RSS Feed + Speedtest Tracker + System Health Monitoring + System Resources + Services Map + Media Stack | 40 |
 | [[wiki/web-agency/_index\|web-agency/]] | AI-assisted website building & selling: Claude Code, Nanobanana 2, Kling, LaunchPath MCP | 9 |
 | [[wiki/dotfiles/_index\|dotfiles/]] | Linux terminal ricing: Kitty, Fish, WezTerm CLI, modern Rust CLI tools, LazyVim, unified themes, Tabby | 21 |
 | [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 30 |
 | [[wiki/llm-models/_index\|llm-models/]] | LLM model catalogs — OpenAI and Claude/Anthropic models, IDs, context, pricing | 2 |
-| [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery, LM Studio local | 17 |
+| [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery, LM Studio local | 30 |
 | [[wiki/reports/_index\|reports/]] | Weekly and monthly summaries — generate: `uv run python scripts/report-generator.py --weekly` | 1 |
 | [[wiki/infrastructure/_index\|infrastructure/]] | Server inventory: all 10 SSH hosts — optical, optical-dev, optical-prod, baic, librechat, modocmms, box-cli, aimpress, pve | 10 |
 
diff --git a/wiki/claude-code/_index.md b/wiki/claude-code/_index.md
index 3748b8c..d7a8dec 100644
--- a/wiki/claude-code/_index.md
+++ b/wiki/claude-code/_index.md
@@ -31,3 +31,16 @@ Claude Code is Anthropic's agentic coding assistant. Works across terminal, IDE,
 | [[wiki/claude-code/lmstudio-anthropic-compat\|lmstudio-anthropic-compat]] | Redirect Claude Code and the Anthropic SDK to a local LM Studio server via two env vars; `/v1/messages` drop-in, auth options, cURL + Python examples | raw/Anthropic Compatibility Endpoints.md | 2026-04-30 |
 | [[wiki/claude-code/lmstudio-chat-completions\|lmstudio-chat-completions]] | LM Studio OpenAI-compatible `/v1/chat/completions`: Python example, all supported params (incl. top_k, repeat_penalty), `lms log stream` debugging | raw/Chat Completions.md | 2026-04-30 |
 | [[wiki/claude-code/lmstudio-embeddings\|lmstudio-embeddings]] | LM Studio `/v1/embeddings`: OpenAI-compat drop-in, Python example, newline stripping, batch inputs, use with FAISS/Chroma for local RAG | raw/Embeddings.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-idle-ttl-auto-evict\|lmstudio-idle-ttl-auto-evict]] | Idle TTL (per-request `ttl` field, `lms load --ttl`) and Auto-Evict (1 JIT model at a time) for LM Studio memory management | raw/Idle TTL and Auto-Evict.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-rest-api\|lmstudio-rest-api]] | LM Studio native v1 REST API: all endpoints, endpoint feature comparison (native vs OAI vs Anthropic compat), model lifecycle management | raw/LM Studio API.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-messages-api\|lmstudio-messages-api]] | LM Studio `/v1/messages` drop-in: basic, streaming (SSE events), and tool-use cURL examples; auth options | raw/Messages.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-openai-compat-endpoints\|lmstudio-openai-compat-endpoints]] | LM Studio OpenAI-compat overview: 5 endpoints, base_url swap pattern, Python/TS/cURL examples, Codex support | raw/OpenAI Compatibility Endpoints.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-responses-api\|lmstudio-responses-api]] | LM Studio `/v1/responses`: streaming SSE, stateful follow-up via `previous_response_id`, reasoning effort, Remote MCP tools | raw/Responses.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-headless-service\|lmstudio-headless-service]] | Run LM Studio without GUI: llmster daemon (recommended) or desktop tray mode; JIT model loading and auto-evict | raw/Run LM Studio as a service (headless).md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-serve-on-network\|lmstudio-serve-on-network]] | Bind LM Studio server to LAN IP so other devices (thin clients, IoT, team members) can call the API over the local network | raw/Serve on Local Network.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-server-settings\|lmstudio-server-settings]] | All LM Studio API server toggles: port, auth, CORS, LAN access, per-request MCPs, mcp.json access, JIT loading + auto-evict | raw/Server Settings.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-llmster-systemd\|lmstudio-llmster-systemd]] | systemd unit file for llmster: install daemon, load model at boot, ExecStartPre ordering, oneshot+RemainAfterExit pattern, service management commands | raw/Setup llmster as a Startup Task on Linux.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-structured-output\|lmstudio-structured-output]] | Enforce JSON schema on LLM responses via response_format; GGUF uses llama.cpp grammar, MLX uses Outlines; models <7B often unsupported | raw/Structured Output.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-tool-use\|lmstudio-tool-use]] | LM Studio function calling: tool definition format, multi-turn flow, native vs default support, streaming accumulation, Python examples | raw/Tool Use.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-mcp-via-api\|lmstudio-mcp-via-api]] | MCP servers via LM Studio `/api/v1/chat`: ephemeral (inline) vs mcp.json (pre-configured), allowed_tools, custom auth headers | raw/Using MCP via API.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-lms-cli\|lmstudio-lms-cli]] | `lms` CLI: model download/load/unload/list, server start/stop, log streaming, GPU offload flags, --identifier alias, daemon management | raw/lms — LM Studio's CLI.md | 2026-04-30 |
diff --git a/wiki/claude-code/lmstudio-headless-service.md b/wiki/claude-code/lmstudio-headless-service.md
new file mode 100644
index 0000000..2d94a19
--- /dev/null
+++ b/wiki/claude-code/lmstudio-headless-service.md
@@ -0,0 +1,104 @@
+---
+title: "LM Studio Headless / Service Mode"
+aliases: [lmstudio-daemon, llmster, lmstudio-background-service]
+tags: [lmstudio, local-llm, headless, daemon, jit-loading]
+sources: [raw/Run LM Studio as a service (headless).md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio Headless / Service Mode
+
+GUI-less operation of LM Studio: run as a background daemon, start on machine login, and load models on demand via JIT.
+
+## Two Approaches
+
+| Approach | Best For | GUI Required? |
+|----------|----------|---------------|
+| **llmster** (recommended) | Linux servers, cloud, GPU rigs, headless machines | No |
+| **Desktop app headless mode** | Machines with a GUI where app is already installed | Yes (hidden to tray) |
+
+---
+
+## Option 1: llmster (Recommended)
+
+`llmster` is the core of the LM Studio desktop app, repackaged as a server-native daemon. No GUI dependency.
+
+### Install
+
+```bash
+# Linux / Mac
+curl -fsSL https://lmstudio.ai/install.sh | bash
+
+# Windows (PowerShell)
+irm https://lmstudio.ai/install.ps1 | iex
+```
+
+### Start the daemon
+
+```bash
+lms daemon up
+```
+
+- To auto-start on Linux boot, configure it as a **Linux Startup Task** (see LM Studio docs).
+- Full CLI reference: `lms daemon --help`
+
+---
+
+## Option 2: Desktop App in Headless Mode
+
+Works on Mac, Windows, and Linux (with a GUI). Useful if the desktop app is already installed.
+
+### Run server on login
+
+1. Open app settings (`Cmd/Ctrl` + `,`)
+2. Enable **"Run LLM server on login"**
+3. Exiting the app minimizes to tray — server keeps running
+
+### Start server programmatically
+
+```bash
+lms server start
+```
+
+Last server state is saved and restored automatically on launch.
+
+---
+
+## Just-In-Time (JIT) Model Loading
+
+Applies to **both** options. Useful when using LM Studio as a backend for other tools (Open WebUI, Claude Code, custom apps).
+
+| JIT State | `/v1/models` returns | Inference behavior |
+|-----------|---------------------|--------------------|
+| **ON** | All downloaded models | Auto-loads model into VRAM on first call |
+| **OFF** | Only models in VRAM | Must manually load model first |
+
+### Auto-Unload
+
+JIT-loaded models are **auto-evicted** after a period of inactivity — see [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|Idle TTL & Auto-Evict]] for TTL settings and per-request `ttl` field.
+
+---
+
+## Key Takeaways
+
+- **llmster** is the preferred headless path — works on servers and CI without any GUI
+- Desktop headless mode is a quick option for developer machines already running the app
+- JIT loading eliminates manual `lms load` calls; models are loaded on first inference request
+- JIT-loaded models auto-unload after inactivity (configurable TTL)
+- Use `lms server start` and `lms server stop` to control the REST server programmatically
+- The OpenAI-compatible REST API (`/v1/...`) is available in both modes — see [[wiki/claude-code/lmstudio-openai-compat-endpoints|OpenAI Compat Endpoints]] and [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]]
+
+---
+
+## Related
+
+- [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]] — all endpoints and lifecycle management
+- [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|Idle TTL & Auto-Evict]] — memory management for JIT-loaded models
+- [[wiki/claude-code/lmstudio-openai-compat-endpoints|OpenAI Compat Endpoints]] — drop-in base_url swap for any OpenAI client
+- [[wiki/claude-code/lmstudio-anthropic-compat|Anthropic Compat Endpoints]] — redirect Claude Code / Anthropic SDK to local LM Studio
+
+## Sources
+
+- `raw/Run LM Studio as a service (headless).md`
+- LM Studio docs: https://lmstudio.ai/docs/developer/core/headless
diff --git a/wiki/claude-code/lmstudio-idle-ttl-auto-evict.md b/wiki/claude-code/lmstudio-idle-ttl-auto-evict.md
new file mode 100644
index 0000000..56fbac3
--- /dev/null
+++ b/wiki/claude-code/lmstudio-idle-ttl-auto-evict.md
@@ -0,0 +1,90 @@
+---
+title: "LM Studio — Idle TTL and Auto-Evict"
+aliases: [lmstudio-ttl, lmstudio-auto-evict, idle-ttl]
+tags: [lmstudio, memory-management, jit-loading, ttl, api]
+sources: [raw/Idle TTL and Auto-Evict.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — Idle TTL and Auto-Evict
+
+Memory management features for LM Studio's JIT-loaded models. They keep idle models from occupying VRAM and let external apps switch models seamlessly.
+
+## Background
+
+| Feature | Default | Purpose |
+|---------|---------|---------|
+| **JIT Loading** | enabled | Loads model on first API request — no manual preload needed |
+| **Idle TTL** | 60 min | Unloads a model after it has been idle for N seconds/minutes |
+| **Auto-Evict** | enabled | Unloads previous JIT model before loading a new one |
+
+## Idle TTL
+
+**Problem:** JIT-loaded models stay in VRAM even when idle (e.g. after you stop using Cline, Zed, or Continue.dev).
+
+**Solution:** TTL starts a countdown when the model goes idle. The timer resets on every new request. When it expires, the model unloads automatically.
+
+### Setting TTL
+
+**App-wide default** — configure in Developer tab → Server Settings.
+
+**Per-request (API)** — pass `ttl` in seconds in the request body:
+
+```bash
+curl http://localhost:1234/api/v0/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "deepseek-r1-distill-qwen-7b",
+    "ttl": 300,
+    "messages": [...]
+  }'
+```
+
+Works on both the OpenAI-compat (`/v1/`) and LM Studio REST (`/api/v0/`) endpoints.
+
+**`lms` CLI** — set TTL at load time:
+
+```bash
+lms load <model> --ttl 3600   # 1 hour
+```
+
+Models loaded with `lms load` have **no TTL by default** (persist until manual unload).
+
+**Server tab** — TTL field visible when loading a model through the GUI.
+
+## Auto-Evict
+
+Controls how many JIT-loaded models can coexist in memory.
+
+| State | Behaviour |
+|-------|-----------|
+| **ON** (default) | At most 1 JIT model in memory at a time; old model evicted before new one loads |
+| **OFF** | Models accumulate in memory; only unloaded by TTL expiry or manual action |
+
+- Non-JIT (manually loaded) models are **never** affected by Auto-Evict.
+- Toggle in: Developer tab → Server Settings.
+
+## TTL + Auto-Evict Together
+
+- **Auto-Evict** handles immediate switching — keeps 1 active model.
+- **TTL** handles the "forgot to switch" case — cleans up if you just stop using an app.
+- Both can be active simultaneously for full memory hygiene.
+
+## Key Takeaways
+
+- Set `"ttl": 300` in any API request to cap a model's idle lifetime to 5 minutes.
+- `lms load <model> --ttl 3600` is the CLI equivalent for persistent sessions.
+- Auto-Evict (default ON) ensures only 1 JIT model lives in VRAM at a time — great for low-VRAM machines.
+- `lms load` bypasses TTL defaults; always pass `--ttl` explicitly if you want auto-cleanup.
+- These features are irrelevant for models loaded via the GUI Models tab (non-JIT path).
+
+## Related
+
+- [[wiki/claude-code/lmstudio-anthropic-compat|LM Studio Anthropic Compat]] — redirect Claude Code to local LM Studio
+- [[wiki/claude-code/lmstudio-chat-completions|LM Studio Chat Completions]] — full parameter reference incl. `top_k`, `repeat_penalty`
+- [[wiki/claude-code/lmstudio-embeddings|LM Studio Embeddings]] — local RAG with FAISS/Chroma
+
+## Sources
+
+- [LM Studio Docs — Idle TTL and Auto-Evict](https://lmstudio.ai/docs/developer/core/ttl-and-auto-evict)
diff --git a/wiki/claude-code/lmstudio-llmster-systemd.md b/wiki/claude-code/lmstudio-llmster-systemd.md
new file mode 100644
index 0000000..0e37c19
--- /dev/null
+++ b/wiki/claude-code/lmstudio-llmster-systemd.md
@@ -0,0 +1,96 @@
+---
+title: "LM Studio — llmster Startup Service (systemd)"
+aliases: [llmster-systemd, lmstudio-startup, lmstudio-daemon-linux]
+tags: [lmstudio, llmster, systemd, linux, headless, local-llm]
+sources: [raw/Setup llmster as a Startup Task on Linux.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — llmster Startup Service (systemd)
+
+Configure `llmster` (LM Studio's headless daemon) to launch automatically at boot, load a model, and start the HTTP API server — all via a systemd unit file.
+
+## Install llmster
+
+```bash
+curl -fsSL https://lmstudio.ai/install.sh | bash
+lms --help   # verify
+```
+
+## Download a Model
+
+```bash
+lms get openai/gpt-oss-20b
+# note the model path printed — used in service config
+```
+
+## Manual Test (before systemd)
+
+```bash
+lms load openai/gpt-oss-20b
+lms server start
+curl http://localhost:1234/v1/models   # should return model list
+lms server stop
+```
+
+## systemd Unit File
+
+Create `/etc/systemd/system/lmstudio.service` (replace `YOUR_USERNAME`):
+
+```ini
+[Unit]
+Description=LM Studio Server
+
+[Service]
+Type=oneshot
+RemainAfterExit=yes
+User=YOUR_USERNAME
+Environment="HOME=/home/YOUR_USERNAME"
+ExecStartPre=/home/YOUR_USERNAME/.lmstudio/bin/lms daemon up
+ExecStartPre=/home/YOUR_USERNAME/.lmstudio/bin/lms load openai/gpt-oss-20b --yes
+ExecStart=/home/YOUR_USERNAME/.lmstudio/bin/lms server start
+ExecStop=/home/YOUR_USERNAME/.lmstudio/bin/lms daemon down
+
+[Install]
+WantedBy=multi-user.target
+```
+
+- `Type=oneshot` + `RemainAfterExit=yes` — service is considered "active" after `ExecStart` exits
+- `ExecStartPre` runs sequentially before `ExecStart`
+- Skip the `lms load` line to rely on [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|JIT loading + auto-evict]] instead
+
+## Enable and Start
+
+```bash
+sudo systemctl daemon-reload
+sudo systemctl enable lmstudio.service
+sudo systemctl start lmstudio.service
+```
+
+## Verify
+
+```bash
+systemctl status lmstudio
+curl http://localhost:1234/v1/models
+```
+
+## Service Management
+
+```bash
+sudo systemctl stop lmstudio       # stop
+sudo systemctl restart lmstudio    # restart
+sudo systemctl disable lmstudio    # remove from boot
+```
+
+## Key Takeaways
+
+- Use `lms daemon up` in `ExecStartPre` — the daemon must be running before `lms load` or `lms server start`
+- Binary path is `~/.lmstudio/bin/lms` — use the absolute path in the unit file (systemd has a minimal `$PATH`)
+- `Type=oneshot` + `RemainAfterExit=yes` keeps the service "active" so `ExecStop` runs on shutdown
+- Omit the `lms load` step and use [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|JIT loading]] to avoid pinning a model at boot
+- API is served on `http://localhost:1234` — see [[wiki/claude-code/lmstudio-headless-service|headless service overview]] for non-systemd options and [[wiki/claude-code/lmstudio-serve-on-network|LAN serving]] to expose to other devices
+
+## Sources
+
+- [LM Studio Headless llmster Docs](https://lmstudio.ai/docs/developer/core/headless_llmster)
diff --git a/wiki/claude-code/lmstudio-lms-cli.md b/wiki/claude-code/lmstudio-lms-cli.md
new file mode 100644
index 0000000..9fb4af9
--- /dev/null
+++ b/wiki/claude-code/lmstudio-lms-cli.md
@@ -0,0 +1,108 @@
+---
+title: "lms — LM Studio CLI"
+aliases: [lms-cli, lmstudio-cli]
+tags: [lmstudio, cli, local-llm, inference, server]
+sources: [raw/lms — LM Studio's CLI.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# lms — LM Studio CLI
+
+`lms` is LM Studio's built-in CLI utility for managing models, the inference server, and the runtime. Ships with LM Studio — no separate install needed. MIT licensed, open source on GitHub.
+
+## Installation & Verification
+
+```bash
+# Already installed with LM Studio — just verify:
+lms --help
+```
+
+Current version: `v0.0.47`
+
+## Command Reference
+
+| Command | What it does |
+|---------|-------------|
+| `lms chat` | Start interactive chat with a model in the terminal |
+| `lms get` | Search and download models |
+| `lms ls` | List models available on disk |
+| `lms ps` | List models currently loaded in memory |
+| `lms load` | Load a model (with GPU/context options) |
+| `lms unload` | Unload a model |
+| `lms import` | Import a model file into LM Studio |
+| `lms server start/stop` | Control the local API server |
+| `lms log` | Stream incoming/outgoing messages for debugging |
+| `lms runtime` | Manage and update the inference runtime |
+| `lms daemon` | Manage the headless llmster daemon |
+| `lms link` | Manage LM Link |
+| `lms clone` | Clone an artifact from LM Studio Hub |
+| `lms push` | Upload artifact to LM Studio Hub |
+| `lms login` | Authenticate with LM Studio |
+
+## Common Workflows
+
+### Server control
+
+```bash
+lms server start
+lms server stop
+```
+
+### List & inspect models
+
+```bash
+lms ls        # models on disk (reflects My Models directory)
+lms ps        # models currently loaded in memory
+```
+
+### Load a model
+
+```bash
+# With GPU offload and context size:
+lms load [--gpu=max|auto|0.0-1.0] [--context-length=1-N]
+
+# --gpu=1.0 → 100% GPU offload
+# With a stable identifier alias:
+lms load openai/gpt-oss-20b --identifier="my-model-name"
+```
+
+Using `--identifier` keeps the model ID stable across loads — useful when client code hardcodes a model name.
+
+### Unload a model
+
+```bash
+lms unload <model>   # unload a specific model
+lms unload --all     # unload everything
+```
+
+### Debug message flow
+
+```bash
+lms log stream       # tail all incoming/outgoing API messages live
+```
+
+Pairs with [[wiki/claude-code/lmstudio-chat-completions|lmstudio-chat-completions]] for debugging request/response cycles.
+
+## Key Takeaways
+
+- `lms` ships with LM Studio — zero extra install steps
+- `lms ps` vs `lms ls`: loaded-in-memory vs on-disk — two different commands
+- `--gpu=1.0` forces full GPU offload; `--gpu=auto` lets LM Studio decide
+- `--identifier` flag on `lms load` decouples client model names from actual model paths
+- `lms log stream` is the fastest way to debug what's hitting the server
+- `lms daemon` manages [[wiki/claude-code/lmstudio-headless-service|llmster]] for headless/service deployments
+- MIT licensed: safe to embed in scripts and automation
+
+## Related Articles
+
+- [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]] — all API endpoints
+- [[wiki/claude-code/lmstudio-headless-service|Headless Service (llmster)]] — daemon mode for servers
+- [[wiki/claude-code/lmstudio-server-settings|Server Settings]] — port, auth, CORS, JIT loading
+- [[wiki/claude-code/lmstudio-chat-completions|Chat Completions]] — OpenAI-compat `/v1/chat/completions`
+- [[wiki/claude-code/lmstudio-llmster-systemd|llmster systemd unit]] — run llmster at boot on Linux
+- [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|Idle TTL & Auto-Evict]] — memory management
+
+## Sources
+
+- https://lmstudio.ai/docs/cli
diff --git a/wiki/claude-code/lmstudio-mcp-via-api.md b/wiki/claude-code/lmstudio-mcp-via-api.md
new file mode 100644
index 0000000..9d4d7fd
--- /dev/null
+++ b/wiki/claude-code/lmstudio-mcp-via-api.md
@@ -0,0 +1,115 @@
+---
+title: "LM Studio — MCP via API"
+aliases: [lmstudio-mcp-api, mcp-lmstudio, lm-studio-mcp]
+tags: [lmstudio, mcp, api, tool-use, integration]
+sources: [raw/Using MCP via API.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — MCP via API
+
+Requires LM Studio 0.4.0+. MCP servers provide tools that models can call during chat requests via `/api/v1/chat`.
+
+## Two Server Modes
+
+| Feature | Ephemeral | mcp.json |
+|---------|-----------|----------|
+| Specified via | `integrations` → `"type": "ephemeral_mcp"` | `integrations` → `"type": "plugin"` |
+| Config | Per-request only | Pre-configured in `mcp.json` |
+| Use case | One-off / remote tools | Frequent use, tools needing `command` (local processes) |
+| Server ID | `server_label` in integration | `id` (e.g. `mcp/playwright`) |
+| Custom headers | `headers` field | Configured in `mcp.json` |
+
+## Ephemeral MCP Servers
+
+Defined inline per-request — no pre-configuration needed.
+
+```bash
+curl http://localhost:1234/api/v1/chat \
+  -H "Authorization: Bearer $LM_API_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "ibm/granite-4-micro",
+    "input": "What is the top trending model on hugging face?",
+    "integrations": [
+      {
+        "type": "ephemeral_mcp",
+        "server_label": "huggingface",
+        "server_url": "https://huggingface.co/mcp",
+        "allowed_tools": ["model_search"]
+      }
+    ],
+    "context_length": 8000
+  }'
+```
+
+Response output contains typed entries: `reasoning`, `message`, and `tool_call` objects. Each `tool_call` includes the tool name, arguments, output, and `provider_info` identifying the server.
+
+## mcp.json Pre-configured Servers
+
+Recommended for servers that run local commands (e.g. `microsoft/playwright-mcp`) or are used frequently.
+
+```bash
+curl http://localhost:1234/api/v1/chat \
+  -H "Authorization: Bearer $LM_API_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "ibm/granite-4-micro",
+    "input": "Open lmstudio.ai",
+    "integrations": ["mcp/playwright"],
+    "context_length": 8000,
+    "temperature": 0
+  }'
+```
+
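+- `integrations` can be a plain string array when referencing pre-configured servers; the object form (`"type": "plugin"` with an `id`) is only needed for per-server options such as `allowed_tools`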
+- `provider_info.type` will be `"plugin"` (vs `"ephemeral_mcp"` for inline)
+
+## Restricting Tool Access
+
+Use `allowed_tools` on either integration type:
+
+```json
+"allowed_tools": ["model_search"]
+```
+
+- Limits which tools the model can call from that server
+- Speeds up prompt processing — fewer tool definitions in context
+- If omitted, all server tools are available
+
+## Custom Headers (Ephemeral)
+
+For authenticated remote MCP endpoints:
+
+```json
+{
+  "type": "ephemeral_mcp",
+  "server_label": "huggingface",
+  "server_url": "https://huggingface.co/mcp",
+  "allowed_tools": ["model_search"],
+  "headers": {
+    "Authorization": "Bearer <YOUR_HF_TOKEN>"
+  }
+}
+```
+
+## Key Takeaways
+
+- LM Studio exposes MCP tool calling through its native `/api/v1/chat` endpoint (not the OpenAI-compat route)
+- Two modes: **ephemeral** (inline, per-request) vs **mcp.json** (pre-configured, recommended for local/frequent servers)
+- `allowed_tools` works on both modes — use it to reduce context size and restrict scope
+- Tool call results appear inline in the `output` array alongside `reasoning` and `message` entries
+- Auth headers for remote MCP servers go in the `headers` field on ephemeral integrations
+- The [[wiki/claude-code/lmstudio-responses-api|Responses API]] also supports Remote MCP via `tools` — different endpoint, same concept
+
+## Related
+
+- [[wiki/claude-code/lmstudio-responses-api|LM Studio Responses API]] — `/v1/responses` endpoint also supports Remote MCP tools
+- [[wiki/claude-code/lmstudio-tool-use|LM Studio Tool Use]] — function calling (non-MCP) patterns
+- [[wiki/claude-code/lmstudio-server-settings|LM Studio Server Settings]] — toggle per-request MCPs and mcp.json access in the UI
+- [[wiki/claude-code/mcp-integration|Claude Code MCP Integration]] — MCP concepts: transports, scopes, OAuth
+
+## Sources
+
+- `raw/Using MCP via API.md` — LM Studio docs, 2026-04-30
diff --git a/wiki/claude-code/lmstudio-messages-api.md b/wiki/claude-code/lmstudio-messages-api.md
new file mode 100644
index 0000000..5e8da48
--- /dev/null
+++ b/wiki/claude-code/lmstudio-messages-api.md
@@ -0,0 +1,120 @@
+---
+title: "LM Studio — Anthropic Messages API"
+aliases: [lmstudio-messages, lm-studio-anthropic-messages]
+tags: [lmstudio, anthropic, api, messages, local-llm, streaming, tools]
+sources: [raw/Messages.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — Anthropic Messages API
+
+The `/v1/messages` endpoint in LM Studio follows the Anthropic Messages API: same request shape, same response shape. Use it as a local drop-in for code that already calls Anthropic's cloud API.
+
+## Endpoint
+
+```
+POST http://localhost:1234/v1/messages
+```
+
+Headers:
+- `Content-Type: application/json`
+- `x-api-key: $LM_API_TOKEN` — optional if **Require Authentication** is disabled in LM Studio
+
+## Basic Request
+
+```bash
+curl http://localhost:1234/v1/messages \
+  -H "Content-Type: application/json" \
+  -H "x-api-key: $LM_API_TOKEN" \
+  -d '{
+    "model": "ibm/granite-4-micro",
+    "max_tokens": 256,
+    "messages": [
+      {"role": "user", "content": "Say hello from LM Studio."}
+    ]
+  }'
+```
+
+## Streaming
+
+Add `"stream": true` to receive Server-Sent Events (SSE):
+
+```bash
+curl http://localhost:1234/v1/messages \
+  -H "Content-Type: application/json" \
+  -H "x-api-key: $LM_API_TOKEN" \
+  -d '{
+    "model": "ibm/granite-4-micro",
+    "messages": [{"role": "user", "content": "Hello"}],
+    "max_tokens": 256,
+    "stream": true
+  }'
+```
+
+SSE event sequence:
+1. `message_start`
+2. `content_block_start`
+3. `content_block_delta` (repeating)
+4. `content_block_stop`
+5. `message_delta`
+6. `message_stop`
+
+## Tool Use
+
+Pass a `tools` array with JSON Schema input definitions and a `tool_choice` policy:
+
+```bash
+curl http://localhost:1234/v1/messages \
+  -H "Content-Type: application/json" \
+  -H "x-api-key: $LM_API_TOKEN" \
+  -d '{
+    "model": "ibm/granite-4-micro",
+    "max_tokens": 1024,
+    "tools": [
+      {
+        "name": "get_weather",
+        "description": "Get the current weather in a given location",
+        "input_schema": {
+          "type": "object",
+          "properties": {
+            "location": {
+              "type": "string",
+              "description": "The city and state, e.g. San Francisco, CA"
+            }
+          },
+          "required": ["location"]
+        }
+      }
+    ],
+    "tool_choice": {"type": "any"},
+    "messages": [
+      {"role": "user", "content": "What is the weather like in San Francisco?"}
+    ]
+  }'
+```
+
+`tool_choice` options (Anthropic-compat): `{"type": "auto"}`, `{"type": "any"}`, `{"type": "tool", "name": "…"}`.
+
+## Authentication
+
+| Scenario | Header needed |
+|----------|---------------|
+| Auth disabled in LM Studio | No `x-api-key` required |
+| Auth enabled | `x-api-key: $LM_API_TOKEN` |
+
+## Key Takeaways
+
+- `POST /v1/messages` on `localhost:1234` is a drop-in for `api.anthropic.com/v1/messages`
+- Same request body — swap the base URL and optionally add `x-api-key`
+- Streaming uses standard Anthropic SSE event names — existing stream parsers work unchanged
+- Tool use with `input_schema` / `tool_choice` is supported
+- Auth header is optional when LM Studio's **Require Authentication** is off
+- See [[wiki/claude-code/lmstudio-anthropic-compat|lmstudio-anthropic-compat]] for redirecting the full Anthropic SDK via env vars
+
+## Related
+
+- [[wiki/claude-code/lmstudio-anthropic-compat|LM Studio Anthropic Compat Setup]] — redirect Claude Code / SDK to local server
+- [[wiki/claude-code/lmstudio-chat-completions|LM Studio Chat Completions]] — OpenAI-compatible `/v1/chat/completions`
+- [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]] — native v1 endpoints and feature comparison table
+- [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|Idle TTL & Auto-Evict]] — memory management for loaded models
diff --git a/wiki/claude-code/lmstudio-openai-compat-endpoints.md b/wiki/claude-code/lmstudio-openai-compat-endpoints.md
new file mode 100644
index 0000000..4fcdf53
--- /dev/null
+++ b/wiki/claude-code/lmstudio-openai-compat-endpoints.md
@@ -0,0 +1,86 @@
+---
+title: "LM Studio — OpenAI Compatibility Endpoints"
+aliases: [lmstudio-openai-compat, lmstudio-oai-endpoints]
+tags: [lmstudio, openai, local-llm, api, embeddings, chat-completions]
+sources: [raw/OpenAI Compatibility Endpoints.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — OpenAI Compatibility Endpoints
+
+LM Studio exposes an OpenAI-compatible HTTP server. Any existing OpenAI client (Python, TypeScript, cURL, C#, etc.) works against it by changing only the **base URL**.
+
+Default port: `1234`.
+
+## Supported Endpoints
+
+| Endpoint | Method | Purpose |
+|----------|--------|---------|
+| `/v1/models` | GET | List loaded/available models |
+| `/v1/responses` | POST | Responses API (Codex-compatible) |
+| `/v1/chat/completions` | POST | Chat with text and images |
+| `/v1/embeddings` | POST | Generate text embeddings |
+| `/v1/completions` | POST | Legacy completions |
+
+## Switching Base URL
+
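+Only the client constructor changes: set `base_url` (Python) or `baseURL` (TypeScript). The SDKs also expect an API key, so pass a placeholder if `OPENAI_API_KEY` is unset.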
+
+### Python
+
+```python
+from openai import OpenAI
+
+client = OpenAI(
+    base_url="http://localhost:1234/v1", api_key="lm-studio"  # any placeholder key works when auth is off
+)
+# rest of your code unchanged
+```
+
+### TypeScript
+
+```typescript
+import OpenAI from 'openai';
+
+const client = new OpenAI({
+  baseURL: "http://localhost:1234/v1", apiKey: "lm-studio" // any placeholder key works when auth is off
+});
+```
+
+### cURL
+
+```bash
+curl http://localhost:1234/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "<model-identifier-from-lmstudio>",
+    "messages": [{"role": "user", "content": "Say this is a test!"}],
+    "temperature": 0.7
+  }'
+```
+
+## Codex Support
+
+LM Studio supports OpenAI Codex via the `POST /v1/responses` endpoint — the same one Codex targets.
+
+## Key Takeaways
+
+- **Drop-in replacement** — swap `base_url` to `http://localhost:1234/v1`; no other code changes needed
+- **Five endpoints** — models, responses, chat/completions, embeddings, legacy completions
+- **No API key required** by default (LM Studio runs locally), although the OpenAI SDKs still expect a non-empty placeholder value
+- **Codex works** because LM Studio implements `/v1/responses`
+- **Model IDs differ** — use the model identifier shown in LM Studio, not OpenAI slugs like `gpt-4o`
+- For richer stats (token/s, TTFT, model lifecycle) use the [[wiki/claude-code/lmstudio-rest-api|native LM Studio REST API]] instead
+
+## Related Articles
+
+- [[wiki/claude-code/lmstudio-anthropic-compat|Anthropic Compat Endpoints]] — `/v1/messages` drop-in for Claude SDK
+- [[wiki/claude-code/lmstudio-chat-completions|Chat Completions]] — full param reference for `/v1/chat/completions`
+- [[wiki/claude-code/lmstudio-embeddings|Embeddings]] — `/v1/embeddings` for local RAG pipelines
+- [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]] — native v1 API with extended model metadata
+- [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|Idle TTL & Auto-Evict]] — memory management for loaded models
+
+## Sources
+
+- [LM Studio OpenAI Compat Docs](https://lmstudio.ai/docs/developer/openai-compat) — raw/OpenAI Compatibility Endpoints.md
diff --git a/wiki/claude-code/lmstudio-responses-api.md b/wiki/claude-code/lmstudio-responses-api.md
new file mode 100644
index 0000000..3297328
--- /dev/null
+++ b/wiki/claude-code/lmstudio-responses-api.md
@@ -0,0 +1,124 @@
+---
+title: "LM Studio Responses API"
+aliases: [lmstudio-responses, lm-studio-openai-responses]
+tags: [lmstudio, openai-compat, responses-api, streaming, mcp, reasoning]
+sources: [raw/Responses.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio Responses API
+
+LM Studio exposes `/v1/responses` — an OpenAI Responses API-compatible endpoint with support for streaming, reasoning effort, stateful multi-turn via `previous_response_id`, and Remote MCP tools.
+
+Endpoint: `http://localhost:1234/v1/responses`
+
+---
+
+## Basic Request (non-streaming)
+
+```bash
+curl http://localhost:1234/v1/responses \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "openai/gpt-oss-20b",
+    "input": "Provide a prime number less than 50",
+    "reasoning": { "effort": "low" }
+  }'
+```
+
+- `input` — plain string prompt (no messages array required)
+- `reasoning.effort` — `"low"` | `"medium"` | `"high"` (model-dependent)
+
+---
+
+## Stateful Follow-up
+
+Carry conversation state across calls using `previous_response_id`:
+
+```bash
+curl http://localhost:1234/v1/responses \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "openai/gpt-oss-20b",
+    "input": "Multiply it by 2",
+    "previous_response_id": "resp_123"
+  }'
+```
+
+- The `id` field from any prior response becomes the `previous_response_id` of the next
+- No need to replay the full message history client-side
+
+---
+
+## Streaming
+
+```bash
+curl http://localhost:1234/v1/responses \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "openai/gpt-oss-20b",
+    "input": "Hello",
+    "stream": true
+  }'
+```
+
+SSE events emitted:
+| Event | Description |
+|-------|-------------|
+| `response.created` | Response object initialised |
+| `response.output_text.delta` | Incremental text chunk |
+| `response.completed` | Final event, full response included |
+
+---
+
+## Remote MCP Tools (opt-in)
+
+Enable in LM Studio: **Developer → Settings → Remote MCP**.
+
+```bash
+curl http://localhost:1234/v1/responses \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "ibm/granite-4-micro",
+    "input": "What is the top trending model on hugging face?",
+    "tools": [
+      {
+        "type": "mcp",
+        "server_label": "huggingface",
+        "server_url": "https://huggingface.co/mcp",
+        "allowed_tools": ["model_search"]
+      }
+    ]
+  }'
+```
+
+- `server_label` — arbitrary identifier for this MCP server
+- `server_url` — remote MCP server URL
+- `allowed_tools` — allowlist of tool names the model may call
+
+---
+
+## Key Takeaways
+
+- `/v1/responses` is an OpenAI Responses API drop-in; swap base URL only
+- `previous_response_id` enables multi-turn without replaying history — simpler than maintaining a messages array
+- Streaming uses standard SSE; listen for `response.output_text.delta` for incremental chunks
+- Remote MCP tools are per-request and opt-in — must enable the feature in LM Studio settings first
+- `reasoning.effort` controls thinking depth; not all models support it
+
+---
+
+## Related
+
+- [[wiki/claude-code/lmstudio-openai-compat-endpoints|LM Studio OpenAI Compat Endpoints]] — overview of all 5 OAI-compatible endpoints
+- [[wiki/claude-code/lmstudio-chat-completions|LM Studio Chat Completions]] — `/v1/chat/completions` with full param reference
+- [[wiki/claude-code/lmstudio-messages-api|LM Studio Messages API]] — `/v1/messages` Anthropic-compat with streaming + tool-use
+- [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]] — native endpoint feature comparison table
+- [[wiki/claude-code/mcp-integration|MCP Integration]] — Claude Code MCP setup and server patterns
+
+---
+
+## Sources
+
+- `raw/Responses.md` — LM Studio developer docs: `/v1/responses` endpoint
diff --git a/wiki/claude-code/lmstudio-rest-api.md b/wiki/claude-code/lmstudio-rest-api.md
new file mode 100644
index 0000000..393b956
--- /dev/null
+++ b/wiki/claude-code/lmstudio-rest-api.md
@@ -0,0 +1,75 @@
+---
+title: "LM Studio REST API (v1)"
+aliases: [lmstudio-api, lm-studio-rest, lmstudio-v1]
+tags: [lmstudio, rest-api, local-inference, openai-compat, anthropic-compat, mcp]
+sources: [raw/LM Studio API.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio REST API (v1)
+
+LM Studio 0.4.0 introduced the native **v1 REST API** at `/api/v1/*`. It sits alongside OpenAI-compatible and Anthropic-compatible endpoints and offers the richest feature set for local inference.
+
+## v1 vs v0
+
+The old v0 API (`/api/v0/*`) is superseded. Migrate to `/api/v1/*` for:
+
+- **Stateful chats** — server keeps conversation context across turns
+- **MCP via API** — use MCPs configured in LM Studio directly from requests
+- **Authentication** — API token support
+- **Model management** — download, load, unload via API
+
+## Supported Endpoints
+
+| Endpoint | Method | Purpose |
+|---|---|---|
+| `/api/v1/chat` | POST | Inference (native) |
+| `/api/v1/models` | GET | List loaded models |
+| `/api/v1/models/load` | POST | Load a model into VRAM |
+| `/api/v1/models/unload` | POST | Unload a model |
+| `/api/v1/models/download` | POST | Download a model |
+| `/api/v1/models/download/status` | GET | Poll download progress |
+
+## Inference Endpoint Comparison
+
+Four endpoints can run inference. Pick based on which features you need:
+
+| Feature | `/api/v1/chat` | `/v1/responses` (OAI) | `/v1/chat/completions` (OAI) | `/v1/messages` (Anthropic) |
+|---|:---:|:---:|:---:|:---:|
+| Streaming | ✅ | ✅ | ✅ | ✅ |
+| Stateful chat | ✅ | ✅ | ❌ | ❌ |
+| Remote MCPs | ✅ | ✅ | ❌ | ❌ |
+| LM Studio MCPs | ✅ | ✅ | ❌ | ❌ |
+| Custom tools | ❌ | ✅ | ✅ | ✅ |
+| Assistant messages in request | ❌ | ✅ | ✅ | ✅ |
+| Model load streaming events | ✅ | ❌ | ❌ | ❌ |
+| Prompt processing events | ✅ | ❌ | ❌ | ❌ |
+| Specify context length | ✅ | ❌ | ❌ | ❌ |
+
+**Decision guide:**
+- Need MCP tools + stateful chat → `/api/v1/chat` or `/v1/responses`
+- Need custom tool definitions → `/v1/responses`, `/v1/chat/completions`, or `/v1/messages`
+- Dropping in existing OpenAI SDK code → `/v1/chat/completions`
+- Dropping in existing Anthropic SDK code → `/v1/messages`
+
+## Key Takeaways
+
+- The **native `/api/v1/chat`** endpoint is the only one with model-load events, prompt-processing events, and per-request context length; it shares stateful chat and MCP support with `/v1/responses`.
+- **`/v1/responses`** (OpenAI Responses API compat) is the best of both worlds — stateful + MCP + custom tools.
+- **`/v1/chat/completions`** is the broadest drop-in for existing OpenAI code but loses statefulness and MCP.
+- **`/v1/messages`** lets you redirect the Anthropic SDK to a local model with minimal code change (see [[wiki/claude-code/lmstudio-anthropic-compat|lmstudio-anthropic-compat]]).
+- Model management endpoints let you fully automate the model lifecycle — download → load → infer → unload — without touching the GUI.
+- API token auth is available for securing the local server (useful when exposed on a LAN).
+
+## Related Articles
+
+- [[wiki/claude-code/lmstudio-anthropic-compat|lmstudio-anthropic-compat]] — redirect Claude Code / Anthropic SDK to LM Studio via env vars
+- [[wiki/claude-code/lmstudio-chat-completions|lmstudio-chat-completions]] — OpenAI `/v1/chat/completions` usage, params, debugging
+- [[wiki/claude-code/lmstudio-embeddings|lmstudio-embeddings]] — `/v1/embeddings` for local RAG pipelines
+- [[wiki/claude-code/lmstudio-idle-ttl-auto-evict|lmstudio-idle-ttl-auto-evict]] — memory management: TTL and auto-evict
+- [[wiki/agent-sdk/overview|agent-sdk/overview]] — build multi-agent systems that call local models
+
+## Sources
+
+- `raw/LM Studio API.md` — clipped from lmstudio.ai/docs/developer/rest
diff --git a/wiki/claude-code/lmstudio-serve-on-network.md b/wiki/claude-code/lmstudio-serve-on-network.md
new file mode 100644
index 0000000..3beaa2f
--- /dev/null
+++ b/wiki/claude-code/lmstudio-serve-on-network.md
@@ -0,0 +1,54 @@
+---
+title: "LM Studio — Serve on Local Network"
+aliases: [lmstudio-network, lmstudio-lan-server]
+tags: [lmstudio, networking, api-server, local-llm, lan]
+sources: [raw/Serve on Local Network.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — Serve on Local Network
+
+Enabling **Serve on Local Network** makes the LM Studio API server accessible to other devices on the same LAN — not just `localhost`.
+
+## How It Works
+
+- By default the server binds to `127.0.0.1` (localhost only)
+- With the option enabled it binds to your machine's **local network IP** (e.g. `192.168.x.x`)
+- The API access URL shown in LM Studio updates to reflect the new binding
+- All existing API endpoints stay the same — only the host changes
+
+## Use Cases
+
+| Scenario | Why useful |
+|----------|-----------|
+| Thin-client devices (laptop, tablet, phone) | Offload inference to a powerful desktop on the same network |
+| Shared team access | Multiple people hit one LM Studio instance |
+| IoT / edge devices | Raspberry Pi or similar calls the API over LAN |
+| Local service mesh | Other self-hosted services (Home Assistant, scripts) consume the LLM |
+
+## Setup Steps
+
+1. Open LM Studio → **Local Server** tab
+2. Toggle **Serve on Local Network** → ON
+3. Note the updated **API access URL** displayed (e.g. `http://192.168.1.x:1234`)
+4. On client devices, point `base_url` to that address instead of `http://localhost:1234`
+
+## Key Takeaways
+
+- One toggle in LM Studio. Home routers generally allow LAN-to-LAN traffic, but the host's OS firewall may still need to allow inbound connections on the server port
+- The API surface is identical to localhost; only the bind address differs
+- Useful when pairing a powerful homelab machine with weaker clients — see [[wiki/homelab/_index|homelab]] for server options
+- Combine with [[wiki/claude-code/lmstudio-headless-service|lmstudio-headless-service]] to run the server without the GUI on a headless machine
+- For redirecting Claude Code itself to the local server, see [[wiki/claude-code/lmstudio-anthropic-compat|lmstudio-anthropic-compat]]
+
+## Related
+
+- [[wiki/claude-code/lmstudio-rest-api|LM Studio REST API]] — full endpoint reference
+- [[wiki/claude-code/lmstudio-headless-service|LM Studio Headless Service]] — run without GUI (daemon mode)
+- [[wiki/claude-code/lmstudio-anthropic-compat|Anthropic Compat Endpoints]] — point Claude Code at local server
+- [[wiki/homelab/_index|Homelab]] — self-hosted hardware for running LM Studio
+
+## Sources
+
+- `raw/Serve on Local Network.md` — clipped from lmstudio.ai/docs/developer/core/server/serve-on-network
diff --git a/wiki/claude-code/lmstudio-server-settings.md b/wiki/claude-code/lmstudio-server-settings.md
new file mode 100644
index 0000000..f70984b
--- /dev/null
+++ b/wiki/claude-code/lmstudio-server-settings.md
@@ -0,0 +1,62 @@
+---
+title: "LM Studio Server Settings"
+aliases: [lmstudio-server-config, lm-studio-api-server-settings]
+tags: [lmstudio, api-server, configuration, mcp, jit, cors, auth]
+sources: [raw/Server Settings.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio Server Settings
+
+Configuration options for the LM Studio API server — accessible from the LM Studio UI or `lms` CLI. Controls port, auth, network access, MCP permissions, CORS, and JIT model memory management.
+
+## Network & Access
+
+| Setting | Type | Description |
+|---------|------|-------------|
+| **Server Port** | Integer | Port the API server listens on (default `1234`) |
+| **Serve on Local Network** | Switch | Binds server to LAN IP so other devices can reach it — see [[wiki/claude-code/lmstudio-serve-on-network\|Serve on Network]] |
+| **Enable CORS** | Switch | Allow cross-origin requests (needed for browser-based clients hitting a local server) |
+
+## Authentication
+
+| Setting | Type | Description |
+|---------|------|-------------|
+| **Require Authentication** | Switch | Clients must pass a valid token in `Authorization` header — see [[wiki/claude-code/lmstudio-anthropic-compat\|LM Studio Auth docs]] |
+
+> Authentication is a prerequisite for enabling MCP server access from `mcp.json`.
+
+## MCP (Model Context Protocol)
+
+| Setting | Type | Description |
+|---------|------|-------------|
+| **Allow per-request MCPs** | Switch | Clients may specify ephemeral remote MCP servers in individual requests (not in `mcp.json`). Only remote MCPs supported. |
+| **Allow calling servers from mcp.json** | Switch | Clients may use MCP servers defined in your LM Studio `mcp.json`. **Requires Auth enabled.** Security risk if those servers have filesystem/data access. |
+
+Related: [[wiki/claude-code/mcp-integration\|MCP Integration]]
+
+## JIT (Just-in-Time) Model Loading
+
+Saves RAM by loading models on demand rather than pre-loading them.
+
+| Setting | Type | Description |
+|---------|------|-------------|
+| **Just in Time Model Loading** | Switch | Load a model at request time if not already loaded |
+| **Auto Unload Unused JIT Models** | Switch | Automatically evict JIT models when idle |
+| **Only Keep Last JIT Loaded Model** | Switch | Evict all but the most recently used JIT model — minimizes RAM usage |
+
+> For deeper JIT / TTL / eviction behavior, see [[wiki/claude-code/lmstudio-idle-ttl-auto-evict\|Idle TTL and Auto-Evict]].
+
+## Key Takeaways
+
+- **Port** is the only integer setting; all others are on/off switches.
+- **Auth is a gate** — `mcp.json` server access won't work without it enabled.
+- **Per-request MCPs** are ephemeral and remote-only; they don't persist after the request.
+- **CORS** must be on for any browser app (web UI, local HTML tool) to call the API.
+- **JIT trio** (`JIT Load` → `Auto Unload` → `Only Keep Last`) progressively tightens memory: enable all three on low-RAM machines.
+- LAN access via [[wiki/claude-code/lmstudio-serve-on-network\|Serve on Network]] is a separate setting from CORS — you may need both.
+
+## Sources
+
+- `raw/Server Settings.md` — scraped from [lmstudio.ai/docs/developer/core/server/settings](https://lmstudio.ai/docs/developer/core/server/settings)
diff --git a/wiki/claude-code/lmstudio-structured-output.md b/wiki/claude-code/lmstudio-structured-output.md
new file mode 100644
index 0000000..bef04a1
--- /dev/null
+++ b/wiki/claude-code/lmstudio-structured-output.md
@@ -0,0 +1,150 @@
+---
+title: "LM Studio Structured Output"
+aliases: [lmstudio-json-schema, structured-output-lmstudio]
+tags: [lmstudio, structured-output, json-schema, openai-compat, local-llm]
+sources: [raw/Structured Output.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio Structured Output
+
+Enforce a specific JSON shape on LLM responses by passing a JSON schema to `/v1/chat/completions`. Compatible with OpenAI's Structured Output API format.
+
+## How It Works
+
+- Add a `response_format` field to the chat completions request
+- Provide a `json_schema` with a `name`, optional `strict`, and a `schema` object
+- The model is constrained to return valid JSON matching that schema
+- Response arrives as a string in `choices[0].message.content` — parse it with `json.loads()`
+
+## Server Setup
+
+```bash
+lms server start
+# or enable from Developer tab in LM Studio UI
+```
+
+Install the CLI first if needed:
+```bash
+npx lmstudio install-cli
+```
+
+## response_format Shape
+
+```json
+"response_format": {
+  "type": "json_schema",
+  "json_schema": {
+    "name": "my_schema",
+    "strict": true,
+    "schema": {
+      "type": "object",
+      "properties": {
+        "field": { "type": "string" }
+      },
+      "required": ["field"]
+    }
+  }
+}
+```
+
+## cURL Example
+
+```bash
+curl http://localhost:1234/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "{{model}}",
+    "messages": [
+      {"role": "system", "content": "You are a helpful jokester."},
+      {"role": "user", "content": "Tell me a joke."}
+    ],
+    "response_format": {
+      "type": "json_schema",
+      "json_schema": {
+        "name": "joke_response",
+        "strict": true,
+        "schema": {
+          "type": "object",
+          "properties": { "joke": {"type": "string"} },
+          "required": ["joke"]
+        }
+      }
+    },
+    "temperature": 0.7,
+    "max_tokens": 50,
+    "stream": false
+  }'
+```
+
+## Python Example
+
+```python
+from openai import OpenAI
+import json
+
+client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
+
+character_schema = {
+    "type": "json_schema",
+    "json_schema": {
+        "name": "characters",
+        "schema": {
+            "type": "object",
+            "properties": {
+                "characters": {
+                    "type": "array",
+                    "items": {
+                        "type": "object",
+                        "properties": {
+                            "name": {"type": "string"},
+                            "occupation": {"type": "string"},
+                            "personality": {"type": "string"},
+                            "background": {"type": "string"}
+                        },
+                        "required": ["name", "occupation", "personality", "background"]
+                    },
+                    "minItems": 1
+                }
+            },
+            "required": ["characters"]
+        }
+    }
+}
+
+response = client.chat.completions.create(
+    model="your-model",
+    messages=[
+        {"role": "system", "content": "You are a helpful AI assistant."},
+        {"role": "user", "content": "Create 1-3 fictional characters"}
+    ],
+    response_format=character_schema,
+)
+
+results = json.loads(response.choices[0].message.content)
+print(json.dumps(results, indent=2))
+```
+
+## Structured Output Engines
+
+| Model Format | Engine |
+|---|---|
+| GGUF | `llama.cpp` grammar-based sampling |
+| MLX | [Outlines](https://github.com/dottxt-ai/outlines) via [lmstudio-ai/mlx-engine](https://github.com/lmstudio-ai/mlx-engine) |
+
+## Key Takeaways
+
+- Use `response_format.type = "json_schema"` — same shape as OpenAI's Structured Outputs API
+- Works with any OpenAI-compatible client SDK (Python, TS, etc.) just by pointing `base_url` at localhost
+- Response is always a **string** in `choices[0].message.content` — always call `json.loads()` on it
+- Not all models support this: **models below 7B parameters often cannot do structured output** — check the model card
+- GGUF uses grammar sampling; MLX uses Outlines — both constrain tokens at generation time, not post-hoc
+- All standard `/v1/chat/completions` params (temperature, max_tokens, stream, etc.) still apply
+
+## Related
+
+- [[wiki/claude-code/lmstudio-chat-completions|lmstudio-chat-completions]] — full parameter reference for the completions endpoint
+- [[wiki/claude-code/lmstudio-openai-compat-endpoints|lmstudio-openai-compat-endpoints]] — overview of all OpenAI-compat endpoints
+- [[wiki/claude-code/lmstudio-responses-api|lmstudio-responses-api]] — stateful responses with streaming and Remote MCP tools
+- [[wiki/claude-code/lmstudio-rest-api|lmstudio-rest-api]] — native LM Studio API and endpoint feature comparison
diff --git a/wiki/claude-code/lmstudio-tool-use.md b/wiki/claude-code/lmstudio-tool-use.md
new file mode 100644
index 0000000..a1cd43f
--- /dev/null
+++ b/wiki/claude-code/lmstudio-tool-use.md
@@ -0,0 +1,158 @@
+---
+title: "LM Studio Tool Use (Function Calling)"
+aliases: [lmstudio-function-calling, lmstudio-tools]
+tags: [lmstudio, tool-use, function-calling, openai-compat, python, local-llm]
+sources: [raw/Tool Use.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio Tool Use (Function Calling)
+
+Tool use lets LLMs *request* calls to external functions/APIs via LM Studio's OpenAI-compatible `/v1/chat/completions` and `/v1/responses` endpoints. Your code executes the actual functions and feeds results back.
+
+## Key Takeaways
+
+- LLMs **cannot execute code** — they output structured text requesting a tool call; your code runs it
+- Uses the same format as OpenAI's Function Calling API — any OpenAI SDK works
+- Tool definitions are injected into the system prompt via the model's chat template
+- Two support tiers: **Native** (model trained for tool use) and **Default** (fallback prompt injection)
+- After tool execution, re-prompt the model *without* tools to get a plain-text final answer
+- Streaming tool calls arrive in chunks — accumulate `delta.tool_calls` before executing
+
+## High-Level Flow
+
+```
+Setup LLM + tool list
+  → Get user input
+  → LLM prompted with messages
+  → Needs tools?
+      Yes → Tool Response → Execute tools → Add results to messages → re-prompt
+      No  → Normal response → loop back
+```
+
+## Tool Definition Format
+
+```json
+{
+  "type": "function",
+  "function": {
+    "name": "get_delivery_date",
+    "description": "Get the delivery date for a customer's order",
+    "parameters": {
+      "type": "object",
+      "properties": {
+        "order_id": { "type": "string" }
+      },
+      "required": ["order_id"]
+    }
+  }
+}
+```
+
+Pass as the `tools` array in the request body — identical to OpenAI's spec.
+
+## Response Parsing
+
+- Tool call detected: `choices[0].message.tool_calls` array is populated; `finish_reason = "tool_calls"`
+- No tool call: response lands in `choices[0].message.content` as normal text
+- If the model outputs a malformed tool call, LM Studio falls back to `content` — use `lms log stream` to debug
+
+## Multi-Turn Pattern (Python)
+
+```python
+from openai import OpenAI
+import json
+
+client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
+
+# 1. First call, with tools (`messages`, `tools`, and `my_function` are defined by the caller)
+response = client.chat.completions.create(
+    model="lmstudio-community/qwen2.5-7b-instruct",
+    messages=messages,
+    tools=tools,
+)
+
+# 2. Execute the requested tool
+tool_call = response.choices[0].message.tool_calls[0]
+args = json.loads(tool_call.function.arguments)
+result = my_function(**args)
+
+# 3. Append both the assistant's tool-call message and the tool result
+messages += [
+    {"role": "assistant", "tool_calls": [tool_call]},
+    {"role": "tool", "content": json.dumps(result), "tool_call_id": tool_call.id},
+]
+
+# 4. Second call — WITHOUT tools for final plain-text answer
+final = client.chat.completions.create(model="lmstudio-community/qwen2.5-7b-instruct", messages=messages)
+print(final.choices[0].message.content)
+```
+
+## Native vs Default Support
+
+| Level | What it means | Quality |
+|-------|---------------|---------|
+| **Native** | Model has a tool-use chat template + LM Studio parses its format | Best |
+| **Default** | LM Studio injects a custom system prompt + converts `tool` role to `user` | Variable |
+
+### Models with Native Support (as of 2024-11)
+
+- **Qwen** — Qwen2.5-7B-Instruct (GGUF / MLX)
+- **Llama** — Llama-3.1 / 3.2 family, e.g. Llama-3.1-8B-Instruct (GGUF / MLX)
+- **Mistral** — Ministral-8B-Instruct-2410 (GGUF / MLX)
+
+Native models show a hammer badge in the LM Studio UI.
+
+## Streaming Tool Calls
+
+```python
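+# Accumulate chunks: name and arguments arrive in pieces.
+# Assumes `stream` is the result of chat.completions.create(..., stream=True).
+calls = {}  # tool-call index -> {"id", "name", "arguments"}
+for chunk in stream:
+    delta = chunk.choices[0].delta
+    if delta.tool_calls:
+        for tc in delta.tool_calls:
+            call = calls.setdefault(tc.index, {"id": "", "name": "", "arguments": ""})
+            call["id"] += tc.id or ""
+            if tc.function:
+                call["name"] += tc.function.name or ""
+                call["arguments"] += tc.function.arguments or ""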
+```
+
+Execute only after the stream ends and `tool_calls` is fully assembled.
+
+## Quick Start
+
+```bash
+# Start server
+lms server start
+
+# Load a model
+lms load
+
+# Debug raw prompts (see how tools are injected)
+lms log stream
+```
+
+```bash
+# curl single-turn example
+curl http://localhost:1234/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"model": "lmstudio-community/qwen2.5-7b-instruct",
+       "messages": [{"role": "user", "content": "Search dell products under $50"}],
+       "tools": [...]}'
+```
+
+## Troubleshooting
+
+- **No `tool_calls` in response** — the model either answered directly or emitted a malformed tool call (returned as plain `content`); run `lms log stream` to inspect the raw prompt and output
+- **Smaller models** — may not follow the tool call format reliably; prefer ≥7B models with native support
+- **Default mode weirdness** — check the injected system prompt via `lms log stream`; the format uses `[TOOL_REQUEST]...[END_TOOL_REQUEST]` tags
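+
+When debugging default mode, it can help to pull the tagged spans out of a `content` string. The sketch below assumes the text between the tags is a JSON object (an assumption, verify against `lms log stream` output); `find_tool_requests` is a hypothetical helper, not an LM Studio API.
+
+```python
+import json
+import re
+
+TOOL_REQUEST_RE = re.compile(
+    r"\[TOOL_REQUEST\](.*?)\[END_TOOL_REQUEST\]", re.DOTALL
+)
+
+
+def find_tool_requests(content):
+    """Return (parsed, broken) payloads found between the default-mode tags."""
+    parsed, broken = [], []
+    for raw in TOOL_REQUEST_RE.findall(content or ""):
+        try:
+            parsed.append(json.loads(raw))  # assumed: payload is JSON
+        except json.JSONDecodeError:
+            broken.append(raw.strip())  # keep the raw text for inspection
+    return parsed, broken
+```
+
+A non-empty `broken` list points at a model that produced the tags but not a decodable payload, which is the usual symptom with smaller models.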
+
+## Related
+
+- [[wiki/claude-code/lmstudio-chat-completions|LM Studio Chat Completions]] — full `/v1/chat/completions` param reference
+- [[wiki/claude-code/lmstudio-openai-compat-endpoints|LM Studio OpenAI Compat Endpoints]] — all 5 compatible endpoints
+- [[wiki/claude-code/lmstudio-responses-api|LM Studio Responses API]] — `/v1/responses` with Remote MCP tools
+- [[wiki/claude-code/lmstudio-structured-output|LM Studio Structured Output]] — enforce JSON schema on responses
+- [[wiki/claude-code/lmstudio-messages-api|LM Studio Messages API]] — Anthropic-compat tool use examples
+
+## Sources
+
+- `raw/Tool Use.md` — LM Studio official docs (lmstudio.ai/docs/developer/openai-compat/tools), published 2024-11-19
diff --git a/wiki/homelab/_index.md b/wiki/homelab/_index.md
index c4fbc7a..5c24734 100644
--- a/wiki/homelab/_index.md
+++ b/wiki/homelab/_index.md
@@ -43,3 +43,4 @@ Self-hosted infra: Proxmox install, IOMMU/PCI passthrough, hypervisor setup, bud
 | [[wiki/homelab/glance-dashboard\|Glance — Self-hosted Dashboard]] | Glance setup replacing Homarr: Docker config, 5-page layout, Prometheus RAPL metrics, key patterns ($include caveat, internal IPs only) | session 2026-04-29 | 2026-04-29 |
 | [[wiki/homelab/homelab-media-stack\|Homelab Media Stack — Jellyfin + *arr + qBittorrent Setup]] | CT111 media LXC: unified /data mount pattern, Intel QuickSync GPU passthrough, step-by-step qBittorrent categories + Sonarr/Radarr/Prowlarr wiring | session 2026-04-26 | 2026-04-26 |
 | [[wiki/homelab/hp-elitedesk-800g3-proxmox\|HP Elitedesk 800 G3 — Proxmox Setup Log]] | Real homelab server setup log: i5-7500, 24 GB RAM, 256 GB NVMe + 6 TB HDD, LXC containers, GPU passthrough (AMD/Intel) | session 2026-04-18 | 2026-04-21 |
+| [[wiki/homelab/hp-elitedesk-800g3-teardown-upgrade\|HP EliteDesk 800 G3 SFF — Teardown, Upgrade & Benchmarks]] | Full disassembly/reassembly guide: proprietary connectors caveat, dual-channel RAM, CPU cooler swap, GTX 1050 Ti, thermal benchmarks (GTA V, Flight Sim) | raw/HP EliteDesk 800 G3 SFF - Teardown, re-assembly and upgrade.md | 2026-04-30 |
diff --git a/wiki/homelab/hp-elitedesk-800g3-teardown-upgrade.md b/wiki/homelab/hp-elitedesk-800g3-teardown-upgrade.md
new file mode 100644
index 0000000..0236817
--- /dev/null
+++ b/wiki/homelab/hp-elitedesk-800g3-teardown-upgrade.md
@@ -0,0 +1,153 @@
+---
+title: "HP EliteDesk 800 G3 SFF — Teardown, Upgrade & Benchmarks"
+aliases: [elitedesk-800-g3-teardown, hp-sff-upgrade-guide]
+tags: [homelab, hardware, hp, sff, upgrade, benchmark]
+sources: [raw/HP EliteDesk 800 G3 SFF - Teardown, re-assembly and upgrade.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+## Overview
+
+The HP EliteDesk 800 G3 SFF is a small form factor desktop often available cheaply at auctions. It uses a **proprietary motherboard and PSU connector** — not standard ATX — which limits some upgrade paths but still allows CPU, RAM, SSD, and GPU swaps.
+
+Reference config (video unit): i7-7700 · GTX 1050 Ti (low-profile) · 16 GB DDR4 · 256 GB NVMe SSD
+
+---
+
+## Exterior Ports
+
+**Front**
+- 1× USB-C
+- 2× USB 3.0
+- 2× USB 2.0
+- Audio in/out
+- Power button
+- Slim optical drive bay
+- Optional SD-card reader slot
+
+**Back**
+- DisplayPort
+- Flexible port option (VGA/DP/HDMI via option card)
+- RJ45 (Gigabit)
+- 2× USB 2.0 + 2× USB 3.0
+- Power connector
+- GPU video outputs (from installed card)
+
+---
+
+## Motherboard Layout
+
+Non-standard form factor — not ATX/ITX. Key connectors:
+
+| Component | Detail |
+|-----------|--------|
+| PCIe slots | 1× x16 (GPU), 2× x1, 1× x4 (downshifted) |
+| RAM slots | 4× DDR4 DIMM — DIMM1/2 = Ch. B, DIMM3/4 = Ch. A |
+| Storage | 1× M.2 NVMe SSD, 3× SATA, 1× M.2 Wi-Fi |
+| Power | Proprietary non-standard PSU connector |
+| Option card | VGA / DisplayPort / HDMI output slot |
+| CMOS reset | Physical button on board |
+
+**Proprietary connectors = motherboard and PSU cannot be swapped for generic parts.**
+
+---
+
+## Disassembly Procedure
+
+1. **Open case** — slide latch on top cover, no tools needed
+2. **Open airflow panel** — provides better access to NVMe, SATA, and RAM
+3. **Remove CPU cooler cover** (plastic airflow shroud)
+4. Disconnect and slide out **slim DVD drive** (green latch release)
+5. Remove **front panel**
+6. Remove **GPU** (low-profile PCIe card, 4 GB VRAM)
+7. Disconnect **proprietary power connectors**
+8. Remove **NVMe SSD** (single retention screw)
+9. Remove **RAM sticks**
+10. Unscrew 4 screws → lift **CPU cooler**
+11. Lift lever → remove **CPU** (LGA 1151)
+12. Remove **motherboard** from chassis
+
+---
+
+## Upgrade Notes
+
+### RAM — Dual Channel
+- Use matching DIMMs in **same-colour slots** (one per channel)
+- For 16 GB: 2× 8 GB — one in Ch. A slot, one in Ch. B slot
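+
+The slot-to-channel mapping from the motherboard table can be checked with a few lines of Python. This is only an illustrative sketch; `SLOT_CHANNEL` and `is_dual_channel` are hypothetical names, and slot colours are not modelled.
+
+```python
+# Mapping taken from the motherboard table: DIMM1/2 = Ch. B, DIMM3/4 = Ch. A
+SLOT_CHANNEL = {1: "B", 2: "B", 3: "A", 4: "A"}
+
+
+def channels_used(populated_slots):
+    """Return the set of memory channels that have at least one DIMM."""
+    return {SLOT_CHANNEL[slot] for slot in populated_slots}
+
+
+def is_dual_channel(populated_slots):
+    """True when both channels are populated."""
+    return channels_used(populated_slots) == {"A", "B"}
+```
+
+For example, DIMM1 plus DIMM3 spans both channels, while DIMM1 plus DIMM2 leaves both sticks on Ch. B.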
+
+### CPU Cooler Replacement
+- Stock cooler can develop bearing noise
+- Replacement must be **PWM 4-pin** type
+- Heatsink mounts to chassis (not board) — install after board is seated in case
+- Clean old paste with isopropyl alcohol before applying new thermal paste
+
+### 3.5" HDD Addition
+- Install standoff screws on drive
+- Slide into drive cage
+- Connect SATA data + power cables
+
+### GPU (Low-Profile Required)
+- SFF case requires **low-profile PCIe card**
+- Tested: Gigabyte GTX 1050 Ti (4 GB VRAM) — fits the x16 slot
+
+---
+
+## Reassembly Order
+
+1. CPU into socket (match orientation notch)
+2. NVMe SSD → slot + screw
+3. RAM → correct channel slots
+4. Motherboard into case
+5. CPU cooler + thermal paste → fix to chassis
+6. Connect CPU fan to board
+7. Airflow cover (clips onto CPU fan)
+8. Power cables + speaker
+9. DVD drive + SATA cable
+10. 3.5" HDD → cage + cables
+11. GPU → PCIe slot
+12. SATA data cable for HDD
+13. Front cover → top cover
+
+---
+
+## Benchmark Results (i7-7700 + GTX 1050 Ti)
+
+| Test | Result |
+|------|--------|
+| Geekbench CPU | In the expected range for an i7-7700 (score not captured in these notes) |
+| Geekbench Compute (GPU) | In the expected range for a GTX 1050 Ti (score not captured in these notes) |
+| Microsoft Flight Simulator (Medium, 1080p) | ~30 FPS steady |
+| GTA V (Very High + AA, 1080p) | Consistent 60+ FPS |
+
+### Thermal Observations
+- CPU and GPU approach **~90°C** under sustained load (Flight Simulator)
+- GTA V similarly runs hot
+- SFF chassis limits airflow — **monitor temps if running sustained workloads**
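+
+One way to monitor temperatures is sketched below, assuming the machine runs Linux (for example a Proxmox host) and exposes sensors under `/sys/class/thermal`, where values are reported in millidegrees Celsius. `read_zone_temps` and `over_limit` are hypothetical helpers, the 85 °C threshold is an arbitrary example, and a discrete GPU typically reports through its own driver tools rather than these zones.
+
+```python
+from pathlib import Path
+
+
+def read_zone_temps(base="/sys/class/thermal"):
+    """Return {zone_name: degrees_C} for every readable thermal zone."""
+    temps = {}
+    for temp_file in sorted(Path(base).glob("thermal_zone*/temp")):
+        try:
+            millideg = int(temp_file.read_text().strip())
+        except (OSError, ValueError):
+            continue  # zone unreadable or not reporting a number
+        temps[temp_file.parent.name] = millideg / 1000.0
+    return temps
+
+
+def over_limit(temps, limit_c=85.0):
+    """Return the zones at or above limit_c (example threshold only)."""
+    return {zone: t for zone, t in temps.items() if t >= limit_c}
+```
+
+Calling `over_limit(read_zone_temps())` from a cron job or a metrics exporter would be enough to flag sustained-load hot spots.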
+
+---
+
+## Key Takeaways
+
+- The EliteDesk 800 G3 SFF uses **proprietary PSU and motherboard connectors** — plan upgrades around this constraint
+- Case opens **tool-free** via a single top-cover latch; very serviceable for the form factor
+- CPU cooler mounts to the **chassis** not the board — must be installed after the board is seated
+- Dual-channel RAM requires same-colour DIMM pairing (Ch. A + Ch. B)
+- GTX 1050 Ti (low-profile) is the practical GPU ceiling for this chassis without a riser
+- Thermals are borderline under sustained 3D load — consider improved case airflow or undervolting for gaming or other sustained CPU/GPU workloads
+- For homelab use (Proxmox, LXCs), thermal load is far lighter — see [[wiki/homelab/hp-elitedesk-800g3-proxmox|HP Elitedesk 800 G3 — Proxmox Setup Log]]
+
+---
+
+## Related Articles
+
+- [[wiki/homelab/hp-elitedesk-800g3-proxmox|HP Elitedesk 800 G3 — Proxmox Setup Log]]
+- [[wiki/homelab/homelab-from-scratch-budget-build|Homelab From Scratch — Budget-First Design]]
+- [[wiki/homelab/bigibz1-homelab-hardware|bigibz1 Homelab Hardware Reference]]
+- [[wiki/homelab/homelab-services-map|Homelab — Full Services Map & Network Reference]]
+
+---
+
+## Sources
+
+- [YouTube: HP EliteDesk 800 G3 SFF — Teardown, re-assembly and upgrade (jensd_be, 2021-03-08)](https://www.youtube.com/watch?v=n1ETa3mJ85I)