vault backup: 2026-04-30 14:27:25

2026-04-30 14:27:25 +01:00 · 2026-04-30 14:27:25 +01:00 · 1eb1072c19
commit 1eb1072c19
parent af4dc6aaa7
10 changed files with 254 additions and 2 deletions
--- a/.obsidian/plugins/hoarder-sync/data.json
+++ b/.obsidian/plugins/hoarder-sync/data.json
@@ -4,7 +4,7 @@
  "syncFolder": "Hoarder",
  "attachmentsFolder": "Hoarder/attachments",
  "syncIntervalMinutes": 60,
-  "lastSyncTimestamp": 1777555441627,
+  "lastSyncTimestamp": 1777555641238,
  "updateExistingFiles": false,
  "excludeArchived": true,
  "onlyFavorites": false,
--- a/Daily/2026-04-30.md
+++ b/Daily/2026-04-30.md
@@ -140,3 +140,12 @@ tags: [daily]
 - 14:21 (4min) | `video-accessibility`
  - **Asked:** What skills for code review are specified in the project instructions?
  - **Done:** Reviewed project architecture and identified that va-worker uses Cloud Run Jobs with asyncio.run() entrypoint, eliminating need for Redis broker.
+- 14:25 (<1min) | `memory-compiler`
+  - **Asked:** Compile a new article about LM Studio's Anthropic-compatible endpoint into the wiki knowledge base.
+  - **Done:** Filed article as `wiki/claude-code/lmstudio-anthropic-compat.md` and updated both the topic and master indices.
+- 14:26 (<1min) | `memory-compiler`
+  - **Asked:** Create a structured wiki article for LM Studio's chat completions API endpoint.
+  - **Done:** Created `lmstudio-chat-completions.md` with API documentation, Python examples, and parameter reference; updated topic index to 16 articles.
+- 14:27 (<1min) | `memory-compiler`
+  - **Asked:** Compile a new article about LM Studio embeddings into the structured wiki knowledge base.
+  - **Done:** Filed article as `wiki/claude-code/lmstudio-embeddings.md` and updated master index with wikilinks to related LM Studio topics and RAG pattern.
--- a/raw/_processed/Anthropic
+++ b/raw/_processed/Anthropic
--- a/raw/_processed/Chat
+++ b/raw/_processed/Chat
--- a/raw/_processed/Embeddings.md
+++ b/raw/_processed/Embeddings.md
--- a/wiki/_master-index.md
+++ b/wiki/_master-index.md
@@ -31,7 +31,7 @@ This 3-hop pattern works for hundreds of articles without vector search.
 | [[wiki/dotfiles/_index\|dotfiles/]] | Linux terminal ricing: Kitty, Fish, WezTerm CLI, modern Rust CLI tools, LazyVim, unified themes, Tabby | 21 |
 | [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 30 |
 | [[wiki/llm-models/_index\|llm-models/]] | LLM model catalogs — OpenAI and Claude/Anthropic models, IDs, context, pricing | 2 |
-| [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery | 14 |
+| [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery, LM Studio local | 17 |
 | [[wiki/reports/_index\|reports/]] | Weekly and monthly summaries — generate: `uv run python scripts/report-generator.py --weekly` | 1 |
 | [[wiki/infrastructure/_index\|infrastructure/]] | Server inventory: all 10 SSH hosts — optical, optical-dev, optical-prod, baic, librechat, modocmms, box-cli, aimpress, pve | 10 |

--- a/wiki/claude-code/_index.md
+++ b/wiki/claude-code/_index.md
@@ -28,3 +28,6 @@ Claude Code is Anthropic's agentic coding assistant. Works across terminal, IDE,
 | [[wiki/claude-code/troubleshooting\|troubleshooting]] | Install errors quick-ref, PATH/proxy/TLS fixes, platform issues (Linux/macOS/Windows/WSL), auth problems, performance, IDE integration | raw/Troubleshooting.md | 2026-04-17 |
 | [[wiki/claude-code/dot-claude-folder\|dot-claude-folder]] | Full .claude folder reference: CLAUDE.md, hooks, skills, agents, commands, plugins, rules, .mcp.json — the mental model (advisory vs deterministic vs on-demand) | raw/Claude md folder.md | 2026-04-29 |
 | [[wiki/claude-code/oliver-skills-config\|oliver-skills-config]] | Oliver Agency skills setup: 5 always-active Obsidian skills + 19 contextual skills, selection rationale (40+ project frequency analysis), quick-reference by task type | session 2026-04-29 | 2026-04-29 |
+| [[wiki/claude-code/lmstudio-anthropic-compat\|lmstudio-anthropic-compat]] | Redirect Claude Code and the Anthropic SDK to a local LM Studio server via two env vars; `/v1/messages` drop-in, auth options, cURL + Python examples | raw/Anthropic Compatibility Endpoints.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-chat-completions\|lmstudio-chat-completions]] | LM Studio OpenAI-compatible `/v1/chat/completions`: Python example, all supported params (incl. top_k, repeat_penalty), `lms log stream` debugging | raw/Chat Completions.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-embeddings\|lmstudio-embeddings]] | LM Studio `/v1/embeddings`: OpenAI-compat drop-in, Python example, newline stripping, batch inputs, use with FAISS/Chroma for local RAG | raw/Embeddings.md | 2026-04-30 |
--- a/wiki/claude-code/lmstudio-anthropic-compat.md
+++ b/wiki/claude-code/lmstudio-anthropic-compat.md
@@ -0,0 +1,96 @@
+---
+title: "LM Studio — Anthropic Compatibility Endpoints"
+aliases: [lmstudio-anthropic, lm-studio-local-api, anthropic-compat-lmstudio]
+tags: [claude-code, lm-studio, local-llm, anthropic-sdk, api]
+sources: [raw/Anthropic Compatibility Endpoints.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — Anthropic Compatibility Endpoints
+
+LM Studio exposes a `/v1/messages` endpoint that mirrors the Anthropic Messages API. Any code or tool that talks to Anthropic — including Claude Code and the Anthropic Python/JS SDK — can be redirected to a local LM Studio server with two env vars.
+
+## Supported Endpoints
+
+| Endpoint | Method |
+|----------|--------|
+| `/v1/messages` | POST |
+
+## Quick Start — Claude Code with LM Studio
+
+```bash
+export ANTHROPIC_BASE_URL=http://localhost:1234
+export ANTHROPIC_AUTH_TOKEN=lmstudio
+claude --model openai/gpt-oss-20b
+```
+
+- `ANTHROPIC_BASE_URL` — points Claude Code at the local server instead of `api.anthropic.com`
+- `ANTHROPIC_AUTH_TOKEN` — any non-empty string; LM Studio ignores the value unless auth is enabled
+- `--model` — pass the LM Studio model ID (e.g. `ibm/granite-4-micro`, `openai/gpt-oss-20b`)
+
+## Authentication
+
+LM Studio accepts two auth header formats when **Require Authentication** is enabled:
+
+| Header | Value |
+|--------|-------|
+| `x-api-key` | `$LM_API_TOKEN` |
+| `Authorization` | `Bearer $LM_API_TOKEN` |
+
+When auth is **disabled**, both headers are optional.
+
+## cURL Example
+
+```bash
+curl http://localhost:1234/v1/messages \
+  -H "Content-Type: application/json" \
+  -H "x-api-key: $LM_API_TOKEN" \
+  -d '{
+    "model": "ibm/granite-4-micro",
+    "max_tokens": 256,
+    "messages": [
+      {"role": "user", "content": "Write a haiku about local LLMs."}
+    ]
+  }'
+```
+
+## Python SDK Example
+
+```python
+from anthropic import Anthropic
+
+client = Anthropic(
+    base_url="http://localhost:1234",
+    api_key="lmstudio",          # any string when auth is disabled
+)
+
+message = client.messages.create(
+    max_tokens=1024,
+    messages=[{"role": "user", "content": "Hello from LM Studio"}],
+    model="ibm/granite-4-micro",
+)
+print(message.content)
+```
+
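+- When Require Authentication is off, LM Studio does not validate `api_key`; keeping a placeholder string is the safer choice, since the SDK may refuse to send a request with no key configured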
+- Drop-in replacement: compared with the real Anthropic API, only `base_url`, the key, and the model ID change
+
+## Key Takeaways
+
+- LM Studio's `/v1/messages` is API-compatible with `api.anthropic.com/v1/messages`
+- Two env vars (`ANTHROPIC_BASE_URL` + `ANTHROPIC_AUTH_TOKEN`) are all that's needed to redirect Claude Code to a local model
+- Model IDs come from LM Studio, not Anthropic — use whatever is loaded in the LM Studio server
+- Auth is optional by default; enable it in LM Studio settings if the server is exposed beyond localhost
+- Any Anthropic client (Python SDK, JS SDK, or plain cURL) works without code changes beyond `base_url` and the model ID
+
+## Related
+
+- [[wiki/claude-code/headless-cli|Headless CLI]] — running Claude Code non-interactively with `-p`
+- [[wiki/claude-code/overview|Claude Code Overview]] — full product capabilities
+- [[wiki/llm-models/claude-model-catalog|Claude Model Catalog]] — official Anthropic model IDs for comparison
+- [[wiki/concepts/local-llm-serving|Local LLM Serving]] — broader context on self-hosted model inference
+
+## Sources
+
+- LM Studio Docs: Anthropic Compatibility — `raw/Anthropic Compatibility Endpoints.md`
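
As an illustrative sketch of the `/v1/messages` request described in the article above, the following Python builds the same call as the cURL example using only the standard library. The helper names `build_messages_request` and `send` are hypothetical, and the idea that the response carries Anthropic-style `content` blocks follows from the article's compatibility claim rather than from anything verified here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234"  # default LM Studio address used in the article


def build_messages_request(prompt, model, token=None, max_tokens=256):
    """Build (but do not send) a POST request for LM Studio's /v1/messages."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Only needed when Require Authentication is enabled in LM Studio.
        headers["x-api-key"] = token
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON reply; needs a running LM Studio server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request is safe offline; nothing is sent until send() is called.
request = build_messages_request("Write a haiku about local LLMs.", "ibm/granite-4-micro")
print(request.get_method(), request.full_url)
```

With a server running, `send(request)["content"][0]["text"]` would be expected to hold the reply, assuming the response mirrors the Anthropic Messages format.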
--- a/wiki/claude-code/lmstudio-chat-completions.md
+++ b/wiki/claude-code/lmstudio-chat-completions.md
@@ -0,0 +1,82 @@
+---
+title: "LM Studio — OpenAI Chat Completions Endpoint"
+aliases: [lmstudio-openai-chat, lm-studio-chat-completions]
+tags: [lmstudio, openai-compat, local-llm, api, chat]
+sources: [raw/Chat Completions.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — OpenAI Chat Completions Endpoint
+
+LM Studio exposes an OpenAI-compatible `POST /v1/chat/completions` endpoint. Any client built for OpenAI can point at LM Studio with two changes: `base_url` and `api_key`.
+
+## Endpoint
+
+| Field | Value |
+|-------|-------|
+| Method | `POST` |
+| URL | `http://localhost:1234/v1/chat/completions` |
+| Auth | any string (e.g. `"lm-studio"`) |
+
+- Prompt template is applied automatically for chat-tuned models
+- Stream with `stream: true` for token-by-token output
+- Inspect actual model input with `lms log stream` in a second terminal
+
+## Python Example
+
+```python
+from openai import OpenAI
+
+client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
+
+completion = client.chat.completions.create(
+  model="model-identifier",
+  messages=[
+    {"role": "system", "content": "Always answer in rhymes."},
+    {"role": "user",   "content": "Introduce yourself."}
+  ],
+  temperature=0.7,
+)
+
+print(completion.choices[0].message)
+```
+
+Replace `"model-identifier"` with the exact model name shown in LM Studio's UI.
+
+## Supported Payload Parameters
+
+```
+model              top_p         top_k
+messages           temperature   max_tokens
+stream             stop          presence_penalty
+frequency_penalty  logit_bias    repeat_penalty
+seed
+```
+
+`top_k` and `repeat_penalty` are LM Studio extensions not in the OpenAI spec — they work here but not against the real OpenAI API.
+
+## Debugging
+
+```bash
+lms log stream   # live view of what the model actually receives
+```
+
+## Key Takeaways
+
+- Drop-in replacement for `openai.chat.completions.create`: change `base_url` and `api_key`, and set `model` to a local model ID
+- `api_key` value is ignored unless authentication is enabled; pass any non-empty string
+- Chat-tuned models get their prompt template applied automatically
+- `top_k` and `repeat_penalty` are bonus params unavailable on real OpenAI
+- Use `lms log stream` to verify the exact prompt being sent to the model
+
+## Related
+
+- [[wiki/claude-code/lmstudio-anthropic-compat|LM Studio Anthropic Compat]] — use `/v1/messages` with Claude-style SDK instead
+- [[wiki/claude-code/headless-cli|Headless CLI]] — programmatic Claude Code usage patterns
+- [[wiki/llm-models/_index|LLM Models]] — model IDs for OpenAI and Anthropic
+
+## Sources
+
+- [LM Studio Chat Completions docs](https://lmstudio.ai/docs/developer/openai-compat/chat-completions)
+- [OpenAI Chat Completions reference](https://platform.openai.com/docs/api-reference/chat)
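
The article above notes that `top_k` and `repeat_penalty` are LM Studio extensions outside the OpenAI spec. A minimal sketch of one way to handle that split is shown here: a hypothetical helper, `split_params`, separates the extension fields so they can travel in the `openai` SDK's `extra_body` argument while the standard fields stay ordinary keyword arguments. The assumption is that the SDK does not accept the extension names as direct keyword arguments; the helper itself is plain Python and is not part of LM Studio or the SDK.

```python
# Parameters the article lists as LM Studio extensions (not in the OpenAI spec).
LMSTUDIO_EXTENSIONS = {"top_k", "repeat_penalty"}


def split_params(params):
    """Split a flat parameter dict into (standard kwargs, extra_body dict)."""
    standard = {k: v for k, v in params.items() if k not in LMSTUDIO_EXTENSIONS}
    extra = {k: v for k, v in params.items() if k in LMSTUDIO_EXTENSIONS}
    return standard, extra


standard, extra = split_params(
    {"temperature": 0.7, "max_tokens": 128, "top_k": 40, "repeat_penalty": 1.1}
)
print(standard)
print(extra)

# Assumed usage with the client from the article's Python example
# (requires the openai package and a running LM Studio server):
# client.chat.completions.create(model="model-identifier", messages=[...],
#                                **standard, extra_body=extra)
```

Keeping the extensions in a separate dict also makes it easy to drop them when the same code is pointed back at the real OpenAI API.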
--- a/wiki/claude-code/lmstudio-embeddings.md
+++ b/wiki/claude-code/lmstudio-embeddings.md
@@ -0,0 +1,62 @@
+---
+title: "LM Studio — Embeddings Endpoint"
+aliases: [lmstudio-embeddings, lm-studio-embeddings, local-embeddings]
+tags: [lmstudio, embeddings, openai-compat, local-llm, vectors]
+sources: [raw/Embeddings.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — Embeddings Endpoint
+
+LM Studio exposes an OpenAI-compatible `/v1/embeddings` endpoint for generating dense vector representations of text. Drop-in compatible with the `openai` Python SDK.
+
+## Endpoint
+
+- **Method:** `POST /v1/embeddings`
+- **Base URL:** `http://localhost:1234/v1`
+- **API key:** any non-empty string (e.g. `"lm-studio"`)
+- **Spec:** mirrors [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings)
+
+## Python Example
+
+```python
+from openai import OpenAI
+
+client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
+
+def get_embedding(text, model="model-identifier"):
+    text = text.replace("\n", " ")
+    return client.embeddings.create(input=[text], model=model).data[0].embedding
+
+print(get_embedding("Once upon a time, there was a cat."))
+```
+
+- Replace `"model-identifier"` with the name of an embedding model loaded in LM Studio
+- Newlines are replaced with spaces before embedding, a common preprocessing step
+- Returns a flat Python list of floats (`data[0].embedding`)
+
+## Usage Notes
+
+- Must have an embedding model loaded in LM Studio (not a chat model)
+- Common embedding models: `nomic-embed-text`, `text-embedding-3-small` clones, `all-MiniLM-L6-v2`
+- The `model` param must match the identifier shown in LM Studio's loaded models list
+- Batch inputs: pass multiple strings in the `input` list for efficiency
+
+## Key Takeaways
+
+- LM Studio's `/v1/embeddings` is a drop-in OpenAI replacement — zero code changes beyond `base_url` and `api_key`
+- Use any non-empty string as the API key; auth is not enforced by default
+- Replace newlines with spaces before embedding, as the example does
+- Return value is `response.data[0].embedding` — a list of floats
+- Pair with a vector store (FAISS, Chroma, pgvector) to build a fully local RAG pipeline
+
+## Related Articles
+
+- [[wiki/claude-code/lmstudio-anthropic-compat|LM Studio — Anthropic Compat]] — redirect Claude Code / Anthropic SDK to local LM Studio
+- [[wiki/claude-code/lmstudio-chat-completions|LM Studio — Chat Completions]] — `/v1/chat/completions` with full param list
+- [[wiki/architecture/rag-pattern|RAG Pattern]] — retrieval-augmented generation using embeddings
+
+## Sources
+
+- `raw/Embeddings.md` — clipped from https://lmstudio.ai/docs/developer/openai-compat/embeddings
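
The embeddings article above suggests pairing the vectors with a store such as FAISS or Chroma for local RAG. As a rough illustration of what such a store does at query time, the sketch below ranks documents by cosine similarity in plain Python. The function names and the 3-dimensional toy vectors are hypothetical stand-ins; real embeddings from `/v1/embeddings` would be much longer lists of floats, and a dedicated vector store would replace the linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [0.9, 0.1, 0.0]
print(rank_documents(query, docs))
```

In a real pipeline, `docs` and `query` would come from `get_embedding` in the article's example, with the top-ranked documents passed to a chat model as context.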