From 1eb1072c196d79f22251fe53b5fa444aa8556575 Mon Sep 17 00:00:00 2001
From: Vadym Samoilenko
Date: Thu, 30 Apr 2026 14:27:25 +0100
Subject: [PATCH] vault backup: 2026-04-30 14:27:25

---
 .obsidian/plugins/hoarder-sync/data.json      |  2 +-
 99 Daily/2026-04-30.md                        |  9 ++
 .../Anthropic Compatibility Endpoints.md      |  0
 raw/{ => _processed}/Chat Completions.md      |  0
 raw/{ => _processed}/Embeddings.md            |  0
 wiki/_master-index.md                         |  2 +-
 wiki/claude-code/_index.md                    |  3 +
 wiki/claude-code/lmstudio-anthropic-compat.md | 96 +++++++++++++++++++
 wiki/claude-code/lmstudio-chat-completions.md | 82 ++++++++++++++++
 wiki/claude-code/lmstudio-embeddings.md       | 62 ++++++++++++
 10 files changed, 254 insertions(+), 2 deletions(-)
 rename raw/{ => _processed}/Anthropic Compatibility Endpoints.md (100%)
 rename raw/{ => _processed}/Chat Completions.md (100%)
 rename raw/{ => _processed}/Embeddings.md (100%)
 create mode 100644 wiki/claude-code/lmstudio-anthropic-compat.md
 create mode 100644 wiki/claude-code/lmstudio-chat-completions.md
 create mode 100644 wiki/claude-code/lmstudio-embeddings.md

diff --git a/.obsidian/plugins/hoarder-sync/data.json b/.obsidian/plugins/hoarder-sync/data.json
index 1e7b4db..69f1e66 100644
--- a/.obsidian/plugins/hoarder-sync/data.json
+++ b/.obsidian/plugins/hoarder-sync/data.json
@@ -4,7 +4,7 @@
  "syncFolder": "Hoarder",
  "attachmentsFolder": "Hoarder/attachments",
  "syncIntervalMinutes": 60,
- "lastSyncTimestamp": 1777555441627,
+ "lastSyncTimestamp": 1777555641238,
  "updateExistingFiles": false,
  "excludeArchived": true,
  "onlyFavorites": false,
diff --git a/99 Daily/2026-04-30.md b/99 Daily/2026-04-30.md
index 89652d5..c448c5f 100644
--- a/99 Daily/2026-04-30.md
+++ b/99 Daily/2026-04-30.md
@@ -140,3 +140,12 @@ tags: [daily]
 - 14:21 (4min) | `video-accessibility`
  - **Asked:** What skills for code review are specified in the project instructions?
  - **Done:** Reviewed project architecture and identified that va-worker uses Cloud Run Jobs with asyncio.run() entrypoint, eliminating need for Redis broker.
+- 14:25 (<1min) | `memory-compiler`
+ - **Asked:** Compile a new article about LM Studio's Anthropic-compatible endpoint into the wiki knowledge base.
+ - **Done:** Filed article as `wiki/claude-code/lmstudio-anthropic-compat.md` and updated both the topic and master indices.
+- 14:26 (<1min) | `memory-compiler`
+ - **Asked:** Create a structured wiki article for LM Studio's chat completions API endpoint.
+ - **Done:** Created `lmstudio-chat-completions.md` with API documentation, Python examples, and parameter reference; updated topic index to 16 articles.
+- 14:27 (<1min) | `memory-compiler`
+ - **Asked:** Compile a new article about LM Studio embeddings into the structured wiki knowledge base.
+ - **Done:** Filed article as `wiki/claude-code/lmstudio-embeddings.md` and updated master index with wikilinks to related LM Studio topics and RAG pattern.
diff --git a/raw/Anthropic Compatibility Endpoints.md b/raw/_processed/Anthropic Compatibility Endpoints.md
similarity index 100%
rename from raw/Anthropic Compatibility Endpoints.md
rename to raw/_processed/Anthropic Compatibility Endpoints.md
diff --git a/raw/Chat Completions.md b/raw/_processed/Chat Completions.md
similarity index 100%
rename from raw/Chat Completions.md
rename to raw/_processed/Chat Completions.md
diff --git a/raw/Embeddings.md b/raw/_processed/Embeddings.md
similarity index 100%
rename from raw/Embeddings.md
rename to raw/_processed/Embeddings.md
diff --git a/wiki/_master-index.md b/wiki/_master-index.md
index 43dd837..a014493 100644
--- a/wiki/_master-index.md
+++ b/wiki/_master-index.md
@@ -31,7 +31,7 @@ This 3-hop pattern works for hundreds of articles without vector search.
 | [[wiki/dotfiles/_index\|dotfiles/]] | Linux terminal ricing: Kitty, Fish, WezTerm CLI, modern Rust CLI tools, LazyVim, unified themes, Tabby | 21 |
 | [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 30 |
 | [[wiki/llm-models/_index\|llm-models/]] | LLM model catalogs — OpenAI and Claude/Anthropic models, IDs, context, pricing | 2 |
-| [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery | 14 |
+| [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery, LM Studio local | 17 |
 | [[wiki/reports/_index\|reports/]] | Weekly and monthly summaries — generate: `uv run python scripts/report-generator.py --weekly` | 1 |
 | [[wiki/infrastructure/_index\|infrastructure/]] | Server inventory: all 10 SSH hosts — optical, optical-dev, optical-prod, baic, librechat, modocmms, box-cli, aimpress, pve | 10 |
 
diff --git a/wiki/claude-code/_index.md b/wiki/claude-code/_index.md
index 713af53..3748b8c 100644
--- a/wiki/claude-code/_index.md
+++ b/wiki/claude-code/_index.md
@@ -28,3 +28,6 @@ Claude Code is Anthropic's agentic coding assistant. Works across terminal, IDE,
 | [[wiki/claude-code/troubleshooting\|troubleshooting]] | Install errors quick-ref, PATH/proxy/TLS fixes, platform issues (Linux/macOS/Windows/WSL), auth problems, performance, IDE integration | raw/Troubleshooting.md | 2026-04-17 |
 | [[wiki/claude-code/dot-claude-folder\|dot-claude-folder]] | Full .claude folder reference: CLAUDE.md, hooks, skills, agents, commands, plugins, rules, .mcp.json — the mental model (advisory vs deterministic vs on-demand) | raw/Claude md folder.md | 2026-04-29 |
 | [[wiki/claude-code/oliver-skills-config\|oliver-skills-config]] | Oliver Agency skills setup: 5 always-active Obsidian skills + 19 contextual skills, selection rationale (40+ project frequency analysis), quick-reference by task type | session 2026-04-29 | 2026-04-29 |
+| [[wiki/claude-code/lmstudio-anthropic-compat\|lmstudio-anthropic-compat]] | Redirect Claude Code and the Anthropic SDK to a local LM Studio server via two env vars; `/v1/messages` drop-in, auth options, cURL + Python examples | raw/Anthropic Compatibility Endpoints.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-chat-completions\|lmstudio-chat-completions]] | LM Studio OpenAI-compatible `/v1/chat/completions`: Python example, all supported params (incl. top_k, repeat_penalty), `lms log stream` debugging | raw/Chat Completions.md | 2026-04-30 |
+| [[wiki/claude-code/lmstudio-embeddings\|lmstudio-embeddings]] | LM Studio `/v1/embeddings`: OpenAI-compat drop-in, Python example, newline stripping, batch inputs, use with FAISS/Chroma for local RAG | raw/Embeddings.md | 2026-04-30 |
diff --git a/wiki/claude-code/lmstudio-anthropic-compat.md b/wiki/claude-code/lmstudio-anthropic-compat.md
new file mode 100644
index 0000000..9c5c982
--- /dev/null
+++ b/wiki/claude-code/lmstudio-anthropic-compat.md
@@ -0,0 +1,96 @@
+---
+title: "LM Studio — Anthropic Compatibility Endpoints"
+aliases: [lmstudio-anthropic, lm-studio-local-api, anthropic-compat-lmstudio]
+tags: [claude-code, lm-studio, local-llm, anthropic-sdk, api]
+sources: [raw/Anthropic Compatibility Endpoints.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — Anthropic Compatibility Endpoints
+
+LM Studio exposes a `/v1/messages` endpoint that mirrors the Anthropic Messages API. Any code or tool that talks to Anthropic — including Claude Code and the Anthropic Python/JS SDK — can be redirected to a local LM Studio server with two env vars.
+
+## Supported Endpoints
+
+| Endpoint | Method |
+|----------|--------|
+| `/v1/messages` | POST |
+
+## Quick Start — Claude Code with LM Studio
+
+```bash
+export ANTHROPIC_BASE_URL=http://localhost:1234
+export ANTHROPIC_AUTH_TOKEN=lmstudio
+claude --model openai/gpt-oss-20b
+```
+
+- `ANTHROPIC_BASE_URL` — points Claude Code at the local server instead of `api.anthropic.com`
+- `ANTHROPIC_AUTH_TOKEN` — any non-empty string; LM Studio ignores the value unless auth is enabled
+- `--model` — pass the LM Studio model ID (e.g. `ibm/granite-4-micro`, `openai/gpt-oss-20b`)
+
+## Authentication
+
+LM Studio accepts two auth header formats when **Require Authentication** is enabled:
+
+| Header | Value |
+|--------|-------|
+| `x-api-key` | `$LM_API_TOKEN` |
+| `Authorization` | `Bearer $LM_API_TOKEN` |
+
+When auth is **disabled**, both headers are optional.
+
+## cURL Example
+
+```bash
+curl http://localhost:1234/v1/messages \
+ -H "Content-Type: application/json" \
+ -H "x-api-key: $LM_API_TOKEN" \
+ -d '{
+ "model": "ibm/granite-4-micro",
+ "max_tokens": 256,
+ "messages": [
+ {"role": "user", "content": "Write a haiku about local LLMs."}
+ ]
+ }'
+```
+
+## Python SDK Example
+
+```python
+from anthropic import Anthropic
+
+client = Anthropic(
+ base_url="http://localhost:1234",
+ api_key="lmstudio", # any string when auth is disabled
+)
+
+message = client.messages.create(
+ max_tokens=1024,
+ messages=[{"role": "user", "content": "Hello from LM Studio"}],
+ model="ibm/granite-4-micro",
+)
+print(message.content)
+```
+
+- `api_key` can be omitted entirely when Require Authentication is off
+- Drop-in replacement: only `base_url` changes vs. the real Anthropic API
+
+## Key Takeaways
+
+- LM Studio's `/v1/messages` is API-compatible with `api.anthropic.com/v1/messages`
+- Two env vars (`ANTHROPIC_BASE_URL` + `ANTHROPIC_AUTH_TOKEN`) are all that's needed to redirect Claude Code to a local model
+- Model IDs come from LM Studio, not Anthropic — use whatever is loaded in the LM Studio server
+- Auth is optional by default; enable it in LM Studio settings if the server is exposed beyond localhost
+- Any Anthropic SDK client (Python, JS, cURL) works without code changes beyond `base_url`
+
+## Related
+
+- [[wiki/claude-code/headless-cli|Headless CLI]] — running Claude Code non-interactively with `-p`
+- [[wiki/claude-code/overview|Claude Code Overview]] — full product capabilities
+- [[wiki/llm-models/claude-model-catalog|Claude Model Catalog]] — official Anthropic model IDs for comparison
+- [[wiki/concepts/local-llm-serving|Local LLM Serving]] — broader context on self-hosted model inference
+
+## Sources
+
+- LM Studio Docs: Anthropic Compatibility — `raw/Anthropic Compatibility Endpoints.md`
diff --git a/wiki/claude-code/lmstudio-chat-completions.md b/wiki/claude-code/lmstudio-chat-completions.md
new file mode 100644
index 0000000..57d7755
--- /dev/null
+++ b/wiki/claude-code/lmstudio-chat-completions.md
@@ -0,0 +1,82 @@
+---
+title: "LM Studio — OpenAI Chat Completions Endpoint"
+aliases: [lmstudio-openai-chat, lm-studio-chat-completions]
+tags: [lmstudio, openai-compat, local-llm, api, chat]
+sources: [raw/Chat Completions.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — OpenAI Chat Completions Endpoint
+
+LM Studio exposes an OpenAI-compatible `POST /v1/chat/completions` endpoint. Any client built for OpenAI can point at LM Studio with two changes: `base_url` and `api_key`.
+
+## Endpoint
+
+| Field | Value |
+|-------|-------|
+| Method | `POST` |
+| URL | `http://localhost:1234/v1/chat/completions` |
+| Auth | any string (e.g. `"lm-studio"`) |
+
+- Prompt template is applied automatically for chat-tuned models
+- Stream with `stream: true` for token-by-token output
+- Inspect actual model input with `lms log stream` in a second terminal
+
+## Python Example
+
+```python
+from openai import OpenAI
+
+client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
+
+completion = client.chat.completions.create(
+ model="model-identifier",
+ messages=[
+ {"role": "system", "content": "Always answer in rhymes."},
+ {"role": "user", "content": "Introduce yourself."}
+ ],
+ temperature=0.7,
+)
+
+print(completion.choices[0].message)
+```
+
+Replace `"model-identifier"` with the exact model name shown in LM Studio's UI.
+
+## Supported Payload Parameters
+
+```
+model top_p top_k
+messages temperature max_tokens
+stream stop presence_penalty
+frequency_penalty logit_bias repeat_penalty
+seed
+```
+
+`top_k` and `repeat_penalty` are LM Studio extensions not in the OpenAI spec — they work here but not against the real OpenAI API.
+
+## Debugging
+
+```bash
+lms log stream # live view of what the model actually receives
+```
+
+## Key Takeaways
+
+- Drop-in replacement for `openai.chat.completions.create` — change `base_url` only
+- `api_key` value is ignored; pass any non-empty string
+- Chat-tuned models get their prompt template applied automatically
+- `top_k` and `repeat_penalty` are bonus params unavailable on real OpenAI
+- Use `lms log stream` to verify the exact prompt being sent to the model
+
+## Related
+
+- [[wiki/claude-code/lmstudio-anthropic-compat|LM Studio Anthropic Compat]] — use `/v1/messages` with Claude-style SDK instead
+- [[wiki/claude-code/headless-cli|Headless CLI]] — programmatic Claude Code usage patterns
+- [[wiki/llm-models/_index|LLM Models]] — model IDs for OpenAI and Anthropic
+
+## Sources
+
+- [LM Studio Chat Completions docs](https://lmstudio.ai/docs/developer/openai-compat/chat-completions)
+- [OpenAI Chat Completions reference](https://platform.openai.com/docs/api-reference/chat)
diff --git a/wiki/claude-code/lmstudio-embeddings.md b/wiki/claude-code/lmstudio-embeddings.md
new file mode 100644
index 0000000..0d561cc
--- /dev/null
+++ b/wiki/claude-code/lmstudio-embeddings.md
@@ -0,0 +1,62 @@
+---
+title: "LM Studio — Embeddings Endpoint"
+aliases: [lmstudio-embeddings, lm-studio-embeddings, local-embeddings]
+tags: [lmstudio, embeddings, openai-compat, local-llm, vectors]
+sources: [raw/Embeddings.md]
+created: 2026-04-30
+updated: 2026-04-30
+---
+
+# LM Studio — Embeddings Endpoint
+
+LM Studio exposes an OpenAI-compatible `/v1/embeddings` endpoint for generating dense vector representations of text. Drop-in compatible with the `openai` Python SDK.
+
+## Endpoint
+
+- **Method:** `POST /v1/embeddings`
+- **Base URL:** `http://localhost:1234/v1`
+- **API key:** any non-empty string (e.g. `"lm-studio"`)
+- **Spec:** mirrors [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings)
+
+## Python Example
+
+```python
+from openai import OpenAI
+
+client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
+
+def get_embedding(text, model="model-identifier"):
+ text = text.replace("\n", " ")
+ return client.embeddings.create(input=[text], model=model).data[0].embedding
+
+print(get_embedding("Once upon a time, there was a cat."))
+```
+
+- Replace `"model-identifier"` with the name of an embedding model loaded in LM Studio
+- Newlines are stripped before embedding — standard preprocessing step
+- Returns a flat Python list of floats (`data[0].embedding`)
+
+## Usage Notes
+
+- Must have an embedding model loaded in LM Studio (not a chat model)
+- Common embedding models: `nomic-embed-text`, `text-embedding-3-small` clones, `all-MiniLM-L6-v2`
+- The `model` param must match the identifier shown in LM Studio's loaded models list
+- Batch inputs: pass multiple strings in the `input` list for efficiency
+
+## Key Takeaways
+
+- LM Studio's `/v1/embeddings` is a drop-in OpenAI replacement — zero code changes beyond `base_url` and `api_key`
+- Use any non-empty string as the API key; auth is not enforced locally
+- Strip newlines before embedding for cleaner vectors
+- Return value is `response.data[0].embedding` — a list of floats
+- Pair with a vector store (FAISS, Chroma, pgvector) to build a fully local RAG pipeline
+
+## Related Articles
+
+- [[wiki/claude-code/lmstudio-anthropic-compat|LM Studio — Anthropic Compat]] — redirect Claude Code / Anthropic SDK to local LM Studio
+- [[wiki/claude-code/lmstudio-chat-completions|LM Studio — Chat Completions]] — `/v1/chat/completions` with full param list
+- [[wiki/architecture/rag-pattern|RAG Pattern]] — retrieval-augmented generation using embeddings
+
+## Sources
+
+- `raw/Embeddings.md` — clipped from https://lmstudio.ai/docs/developer/openai-compat/embeddings