vault backup: 2026-04-28 22:17:28

2026-04-28 22:17:28 +01:00 · 2026-04-28 22:17:28 +01:00 · d93c9f4516
commit d93c9f4516
parent b53ec9ce64
6 changed files with 337 additions and 1 deletions
--- a/wiki/_master-index.md
+++ b/wiki/_master-index.md
@ -23,7 +23,7 @@ This 3-hop pattern works for hundreds of articles without vector search.
 | [[wiki/tech-patterns/_index\|tech-patterns/]] | Recurring tech stacks: FastAPI, React/Vite, Next.js, Azure AD, AI, Box, One2Edit, Redis/Celery, cost-tracker | 13 |
 | [[wiki/architecture/_index\|architecture/]] | Cross-cutting architectural patterns: Docker Compose, multi-agent AI, GCP timeout, RAG, hotfolder, optical-dev deploy, cost-tracker, new-project checklist, troubleshooting playbooks, ADR log | 10 |
 | [[wiki/client-knowledge/_index\|client-knowledge/]] | Per-client notes for Ford, H&M, L'Oréal, Barclays, Ferrero, 3M | 6 |
-| [[wiki/concepts/_index\|concepts/]] | Atomic knowledge extracted from Claude Code sessions | 51 |
+| [[wiki/concepts/_index\|concepts/]] | Atomic knowledge extracted from Claude Code sessions | 54 |
 | [[wiki/connections/_index\|connections/]] | Cross-cutting insights linking 2+ concepts: FastAPI+Azure AD+Docker trinity, AI→cost-tracker, Apache+Vite basePath, GCP→REST polling, Box+hotfolder, Docker DNS+AdGuard | 9 |
 | [[wiki/qa/_index\|qa/]] | Filed answers to queries (saved with `--file-back`) | 0 |
 | [[wiki/homelab/_index\|homelab/]] | Self-hosted infra: Proxmox install, IOMMU/PCI passthrough, hypervisor setup, budget builds, HP Elitedesk G3, Homarr API + Apps + Boards + Certificates + Integrations + Settings + Tasks + AdGuard + Clock + Docker Stats + Docker Integration + Download Client + Firewall + Proxmox Integration + Radarr + Readarr + Sonarr + Bookmarks + Calendar + Icons + App Widget + Weather + GitHub + Nextcloud + qBittorrent + RSS Feed + Speedtest Tracker + System Health Monitoring + System Resources + Services Map + Media Stack | 38 |
--- a/wiki/concepts/_index.md
+++ b/wiki/concepts/_index.md
@ -56,6 +56,9 @@
 | [[wiki/concepts/docker-lxc-dns-configuration]] | Docker containers in LXC/Proxmox use router DNS by default — need explicit AdGuard DNS in compose for internal domains | daily/2026-04-28.md | 2026-04-28 |
 | [[wiki/concepts/mac-address-randomization-dhcp]] | Apple MAC randomization creates multiple DHCP leases per device — exhausts pool, causes IP conflicts; disable per-network | daily/2026-04-28.md | 2026-04-28 |
 | [[wiki/concepts/prowlarr-flaresolverr-limitation]] | Prowlarr bypasses FlareSolverr for some indexers (RuTracker); VPN alternatives: SOCKS5, WireGuard in LXC, router VPN | daily/2026-04-28.md | 2026-04-28 |
+| [[wiki/concepts/bash-and-or-short-circuit]] | Bash `A && B \|\| C` is not if/else — if B fails, C runs even when A was true; deploy scripts must use if/fi | daily/2026-04-24.md | 2026-04-24 |
+| [[wiki/concepts/python-iso-z-suffix]] | Python < 3.11 `fromisoformat()` rejects `Z` suffix from JS `toISOString()` — replace `Z` with `+00:00` before parsing | daily/2026-04-24.md | 2026-04-24 |
+| [[wiki/concepts/gemini-conversation-cost-scaling]] | Gemini bills full accumulated conversation history per turn — cost grows quadratically; backfill scripts must account for this | daily/2026-04-24.md | 2026-04-24 |

 <!-- Articles added automatically by compile.py -->
 <!-- Format: | [[concepts/slug]] | One-line summary | daily/YYYY-MM-DD.md | date | -->
--- a/wiki/concepts/bash-and-or-short-circuit.md
+++ b/wiki/concepts/bash-and-or-short-circuit.md
@ -0,0 +1,105 @@
+---
+title: "Bash — A && B || C Is Not If/Else"
+aliases: [bash-short-circuit, bash-and-or-pitfall, deploy-script-bash-bug]
+tags: [bash, shell, deploy, devops, debugging]
+sources:
+  - "daily/2026-04-24.md"
+created: 2026-04-24
+updated: 2026-04-24
+---
+
+# Bash — A && B || C Is Not If/Else
+
+The bash pattern `A && B || C` looks like an if/else construct but is not. If `A` succeeds and `B` fails (exits non-zero), `C` still executes — even though `A` was true. In deploy scripts this silently sets a `FAILED` flag when the deployment actually succeeded, causing false failure reports or missed error handling.
+
+## Key Points
+
+- **`A && B || C` ≠ `if A; then B; else C; fi`** — if `B` exits non-zero, `C` runs even when `A` was true
+- The correct substitute is an explicit `if/fi` block, not `&&/||` chaining
+- Deploy scripts commonly use this pattern as: `[[ condition ]] && run_check || FAILED=1` — if `run_check` exits non-zero (e.g., grep found nothing), `FAILED=1` is set falsely
+- The bug is invisible: the deploy log shows no error, the service works fine, but the script exits with failure status — CI marks the deploy as failed
+- Most dangerous when `B` is a test command (`grep`, `curl`, `diff`) that legitimately returns 1 for "no match found"
+
+## Details
+
+### The Failure Mode
+
+```bash
+# WRONG: reads like "if BACKEND_ONLY is true, check health; otherwise mark failed"
+[[ "$BACKEND_ONLY" == "true" ]] && check_backend_health || FAILED=1
+
+# What actually happens:
+# 1. [[ "$BACKEND_ONLY" == "true" ]] → exits 0 (true)
+# 2. check_backend_health → exits 1 (grep found no match in the log, but the server is fine)
+# 3. FAILED=1 executes, even though the condition was true and we entered the "then" branch
+```
+
+The backend is healthy (HTTP 200 in the logs), but the script concludes the deployment failed because `check_backend_health` returned non-zero for an unrelated reason (e.g., `grep` found no matching lines).
+
+### The Fix: Explicit if/fi
+
+```bash
+# CORRECT — unambiguous if/else
+if [[ "$BACKEND_ONLY" == "true" ]]; then
+    if ! check_backend_health; then
+        FAILED=1
+    fi
+fi
+
+# Or inline:
+[[ "$BACKEND_ONLY" == "true" ]] && { check_backend_health || FAILED=1; }
+```
+
+The grouped `{ }` form keeps B and C together as a single unit: C runs only if B fails within that group, and never merely because A was false.
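
One way to check these semantics is to run each form in a throwaway bash process. The sketch below is illustrative only: it assumes `bash` is on `PATH`, the helper name `run_bash` is hypothetical, and `true`/`false` stand in for the real condition and health check.

```python
import subprocess


def run_bash(snippet: str) -> str:
    """Run a snippet in a fresh bash process and return its trimmed stdout."""
    result = subprocess.run(
        ["bash", "-c", snippet], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


# A true, B fails: the chained form falls through to C.
chained_b_fails = run_bash('FAILED=0; true && false || FAILED=1; echo "$FAILED"')

# A false: B is skipped and the chained form runs C.
chained_a_false = run_bash('FAILED=0; false && true || FAILED=1; echo "$FAILED"')

# A false: the grouped form skips both B and C.
grouped_a_false = run_bash('FAILED=0; false && { true || FAILED=1; }; echo "$FAILED"')

# A true, B fails: the explicit if/fi form still records the failure.
explicit_b_fails = run_bash(
    'FAILED=0; if true; then if ! false; then FAILED=1; fi; fi; echo "$FAILED"'
)

print(chained_b_fails, chained_a_false, grouped_a_false, explicit_b_fails)
```

The grouped and `if/fi` forms differ from the plain chain only when `A` is false. When `B` itself exits non-zero, every form sets `FAILED=1`, so a check that returns 1 for "no match" needs its own fix.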
+
+### Why `&&/||` Chaining Is Tempting
+
+```bash
+# Reads naturally as "do A and then B, or else C"
+mkdir -p /tmp/deploy && rsync -r . /tmp/deploy || echo "deploy failed"
+
+# This is safe only because both failure paths deserve the same message:
+# "if mkdir fails → skip rsync and print the error"
+# "if rsync fails → print the error"
+```
+
+`A && B || C` works correctly only when:
+1. You only care whether the entire `A && B` chain succeeds (not the individual steps)
+2. `B` never legitimately exits non-zero when `A` was true
+
+In deploy scripts, `B` is almost always a health check that can legitimately return 1 — making `&&/||` unsafe.
+
+### Diagnosing a False Failure
+
+```bash
+# Script exits 1 but the deployment worked?
+# Add set -x to see what ran:
+set -x
+bash deploy.sh
+
+# Or check exit codes manually:
+check_backend_health; echo "health check exit: $?"
+```
+
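+If `check_backend_health` exits 1 despite the service being healthy, it is returning 1 for "no match" or "no output" rather than for an actual failure. An `if/fi` wrapper still treats that exit code as a failure, so the check itself must also return 0 when the service is healthy.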
+
+### Real Incident (2026-04-24)
+
+`deploy.sh` for NotebookLlama project had:
+```bash
+[[ condition ]] && run_check || FAILED=1
+```
+
+The backend was running (HTTP 200 in logs), but `run_check` returned 1 because `grep` found no matching log lines (empty output = exit 1 for grep). `FAILED=1` was set, script exited 1. User saw "deploy failed" when deployment was fine.
+
+Fix: replaced all `&&/||` patterns in the deploy script with `if/fi` blocks (commit `8c5e01f`).
+
+## Related Concepts
+
+- [[wiki/concepts/monorepo-deploy-script-pitfall]] — another silent deploy script failure (subdirectory `.git` check)
+- [[wiki/concepts/python-service-deployment-dotenv]] — Python service deploy checklist with explicit error handling
+- [[wiki/architecture/troubleshooting-playbooks]] — symptom → root cause → fix for common failures
+
+## Sources
+
+- [[daily/2026-04-24.md]] — NotebookLlama deploy.sh flagged as failed when backend was healthy; `&&/||` pattern caused `FAILED=1` despite `[[ condition ]]` being true and service returning HTTP 200; fix: `if/fi` blocks (commit `8c5e01f`)
--- a/wiki/concepts/gemini-conversation-cost-scaling.md
+++ b/wiki/concepts/gemini-conversation-cost-scaling.md
@ -0,0 +1,115 @@
+---
+title: "Gemini — Conversation History Causes Quadratic Token Cost Growth"
+aliases: [gemini-conversation-cost, gemini-token-scaling, gemini-accumulated-history, llm-conversation-cost]
+tags: [gemini, cost-tracking, llm, ai, performance, budget]
+sources:
+  - "daily/2026-04-24.md"
+created: 2026-04-24
+updated: 2026-04-24
+---
+
+# Gemini — Conversation History Causes Quadratic Token Cost Growth
+
+Gemini (and most LLM APIs) bill the full accumulated conversation history as prompt tokens on every turn, because each new request includes all previous messages. Total cost therefore grows quadratically with conversation length, not linearly: a 10-turn conversation consumes roughly 5.5× the input tokens that a per-message estimate predicts.
+
+## Key Points
+
+- **Every Gemini API call sends the full history** — turn 10 includes the entire turns 1–9 as prompt context
+- **Cost is quadratic**: total tokens ≈ `sum(messages_so_far_at_each_turn) = n * (n+1) / 2 * avg_turn_tokens`
+- **Backfill/estimation scripts get this wrong** if they estimate each turn's input tokens as just the current message — actual prompt tokens are `base_template + all_prior_messages`
+- **Gemini `usage_metadata.prompt_token_count`** from the live API response is always accurate — only estimation/backfill tools need special handling
+- **200k token threshold** in Gemini pricing creates a tier split — conversations long enough to cross this boundary have different per-token prices mid-conversation
+
+## Details
+
+### Why Cost Grows Quadratically
+
+```
+Turn 1: sends [system_prompt + message_1]         → ~1500 tokens input
+Turn 2: sends [system_prompt + message_1 + message_2] → ~2000 tokens input
+Turn 3: sends [system_prompt + m1 + m2 + message_3]   → ~2500 tokens input
+...
+Turn N: sends [system_prompt + m1...m(N-1) + mN]      → ~(1000 + N*500) tokens input
+```
+
+**Total input tokens** = sum of all turns' input tokens ≈ `N² × avg_message_tokens / 2`
+
+For a 20-turn conversation with ~500 tokens/turn:
+- Naïve estimate: `20 × 500 = 10,000` input tokens
+- Actual total: `~20 × 21 / 2 × 500 = 105,000` input tokens — **10× more than the naïve estimate**
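
The arithmetic above can be reproduced with a short sketch. It uses the same simplification as the worked example (every message costs `avg_turn_tokens`, and the fixed system prompt is ignored); the function names are illustrative.

```python
def naive_input_tokens(num_turns: int, avg_turn_tokens: int) -> int:
    """Per-message estimate: counts each message exactly once."""
    return num_turns * avg_turn_tokens


def cumulative_input_tokens(num_turns: int, avg_turn_tokens: int) -> int:
    """Billing-style estimate: turn t resends all t messages so far."""
    return sum(turn * avg_turn_tokens for turn in range(1, num_turns + 1))


naive = naive_input_tokens(20, 500)
actual = cumulative_input_tokens(20, 500)
ratio = actual / naive
print(naive, actual, ratio)
```

Because the ratio works out to `(n + 1) / 2`, the gap between the two estimates keeps widening as the conversation gets longer.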
+
+### Live API vs Estimation
+
+Gemini's live `usage_metadata` response always returns the real token count:
+
+```python
+response = await client.generate_content_async(messages)
+# Always accurate — includes full history
+input_tokens = response.usage_metadata.prompt_token_count
+output_tokens = response.usage_metadata.candidates_token_count
+```
+
+The problem only arises in:
+1. **Backfill scripts** that reconstruct historical usage from database logs — they often estimate `len(message_text) / 4` as the input token count for each turn
+2. **Pre-call estimation** (preflight) for budget enforcement — must estimate accumulated context, not just the current message
+
+### Correct Estimation Formula for Backfill
+
+When reconstructing historical usage without the original API response:
+
+```python
+def estimate_turn_input_tokens(turn_index: int, avg_turn_tokens: int = 500) -> int:
+    """
+    Estimate input tokens for a conversation turn.
+    Includes base template + all prior turns.
+    """
+    base_template_tokens = 1500  # system prompt + boilerplate
+    return base_template_tokens + (turn_index * avg_turn_tokens)
+
+# Total for a conversation:
+total_input = sum(estimate_turn_input_tokens(i) for i in range(num_turns))
+```
+
+This is still an approximation (it assumes a constant `avg_turn_tokens`), but it is far closer to reality than estimating each turn from the current message alone with `len(text) / 4`.
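
When the logged message texts are available, the constant average can be replaced by a running total. The following is a rough sketch under stated assumptions: each turn is stored as a `(user_text, model_text)` pair, model replies become part of later prompts, and tokens are approximated as `len(text) // 4`. The function names and the 1500-token base are illustrative, not the actual backfill script.

```python
from typing import List, Tuple

CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def approx_tokens(text: str) -> int:
    """Approximate a token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def estimate_input_tokens_per_turn(
    turns: List[Tuple[str, str]], base_template_tokens: int = 1500
) -> List[int]:
    """Estimate prompt tokens for each turn, including all prior history."""
    estimates = []
    history_tokens = 0  # tokens of every earlier user message and model reply
    for user_text, model_text in turns:
        user_tokens = approx_tokens(user_text)
        estimates.append(base_template_tokens + history_tokens + user_tokens)
        # After the call, both sides of this turn join the history.
        history_tokens += user_tokens + approx_tokens(model_text)
    return estimates


sample_turns = [("a" * 400, "b" * 800), ("c" * 400, "d" * 800)]
per_turn = estimate_input_tokens_per_turn(sample_turns)
print(per_turn)
```

Where the live API response was stored, `prompt_token_count` remains the better source; an estimate like this is only a fallback for rows that lack it.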
+
+### Implications for Cost Tracking and Budgeting
+
+1. **Budget enforcement via preflight** must account for accumulated context:
+   - Short early conversations are cheap to estimate
+   - Long conversations (10+ turns) are expensive — budget may be exceeded faster than expected
+
+2. **Per-user cost analytics** appear to show low costs early in the month but spike as conversations accumulate history
+
+3. **RAG-based conversations** (inserting retrieved documents as context) amplify the effect — each retrieval adds thousands of tokens that accumulate across turns
+
+4. **Mitigation strategies**:
+   - Implement conversation summarization after N turns (replace full history with a summary)
+   - Set a maximum conversation history window (e.g., last 10 turns only)
+   - Track `conversation_turn_count` in the database and warn users when conversations grow expensive
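
The effect of a history window can be estimated with the same simplified model used earlier (every message costs `avg_turn_tokens`, system prompt ignored). This is a sketch; `total_input_tokens` and its `window` parameter are hypothetical names.

```python
from typing import Optional


def total_input_tokens(
    num_turns: int, avg_turn_tokens: int, window: Optional[int] = None
) -> int:
    """Sum input tokens over a conversation, optionally capping resent history.

    `window` is the maximum number of most recent messages sent per request;
    None means the full history is resent on every turn.
    """
    total = 0
    for turn in range(1, num_turns + 1):
        messages_sent = turn if window is None else min(turn, window)
        total += messages_sent * avg_turn_tokens
    return total


full = total_input_tokens(20, 500)
windowed = total_input_tokens(20, 500, window=10)
print(full, windowed)
```

Once the window is full, each additional turn adds a fixed number of tokens, so growth becomes linear instead of quadratic.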
+
+### Gemini Pricing Tier (200k tokens)
+
+Gemini has a pricing split at 200k input tokens per request. Conversations that grow beyond this threshold have the tokens above 200k billed at a higher rate. For cost tracking:
+
+```python
+GEMINI_PRICE_TIER_BOUNDARY = 200_000
+
+def compute_cost(model: str, input_tokens: int, output_tokens: int) -> float:
+    if input_tokens <= GEMINI_PRICE_TIER_BOUNDARY:
+        input_cost = input_tokens * PRICE_PER_TOKEN_BELOW_200K[model]
+    else:
+        input_cost = (GEMINI_PRICE_TIER_BOUNDARY * PRICE_PER_TOKEN_BELOW_200K[model] + 
+                      (input_tokens - GEMINI_PRICE_TIER_BOUNDARY) * PRICE_PER_TOKEN_ABOVE_200K[model])
+    return input_cost + output_tokens * OUTPUT_PRICE[model]
+```
+
+## Related Concepts
+
+- [[wiki/concepts/preflight-record-pattern]] — preflight/record pattern for AI cost tracking; preflight must estimate accumulated context
+- [[wiki/concepts/litellm-pricing-source]] — LiteLLM as pricing source for Gemini models
+- [[wiki/tech-patterns/python-ai-agents]] — Python AI agent patterns where conversation history management is relevant
+- [[wiki/concepts/lazy-user-mirror]] — user mirror pattern in cost tracker; per-user cost shows the quadratic growth effect
+
+## Sources
+
+- [[daily/2026-04-24.md]] — Semblance usage backfill script was estimating input tokens as `len(output_text) / 4` for both input AND output — 60× underestimate because it ignored accumulated history; rewritten to use `base_template (~1500 tok) + cumulative conversation history per turn`; Gemini 200k pricing tier handled as a split calculation
--- a/wiki/concepts/python-iso-z-suffix.md
+++ b/wiki/concepts/python-iso-z-suffix.md
@ -0,0 +1,107 @@
+---
+title: "Python — fromisoformat() Cannot Parse Z Suffix (Python < 3.11)"
+aliases: [python-z-suffix, python-fromisoformat-z, js-iso-python-interop, python-utc-z]
+tags: [python, javascript, datetime, interop, api, debugging]
+sources:
+  - "daily/2026-04-24.md"
+created: 2026-04-24
+updated: 2026-04-24
+---
+
+# Python — fromisoformat() Cannot Parse Z Suffix (Python < 3.11)
+
+JavaScript's `Date.toISOString()` always appends `Z` to ISO 8601 strings (e.g., `2026-04-24T11:00:00.000Z`). Python's `datetime.fromisoformat()` cannot parse the `Z` suffix in Python versions before 3.11 — it raises `ValueError`. This causes silent 500 errors whenever a JS frontend sends a date string to a Python backend that uses `fromisoformat()` for parsing.
+
+## Key Points
+
+- **JS `toISOString()` always outputs the `Z` suffix**: the method has no option to suppress it, so every date a frontend serializes this way arrives with `Z`
+- **`datetime.fromisoformat("2026-04-24T11:00:00.000Z")` fails in Python < 3.11** — raises `ValueError: Invalid isoformat string`
+- **Python 3.11+** fixed this: `fromisoformat()` now accepts `Z` as a valid UTC offset
+- The failure is **silent in API endpoints** — the ValueError is caught by the framework as a 500, often with no log output visible in the response
+- The filter/query parameters sent from JS date pickers are the most common trigger: `from=2026-04-01T00:00:00.000Z&to=2026-04-30T23:59:59.999Z`
+
+## Details
+
+### The Problem
+
+```python
+# Python 3.9 / 3.10 — FAILS
+from datetime import datetime
+datetime.fromisoformat("2026-04-24T11:00:00.000Z")
+# ValueError: Invalid isoformat string: '2026-04-24T11:00:00.000Z'
+
+# Python 3.11+ — WORKS
+datetime.fromisoformat("2026-04-24T11:00:00.000Z")
+# datetime(2026, 4, 24, 11, 0, tzinfo=timezone.utc)
+```
+
+The `Z` suffix is valid ISO 8601 for UTC, but Python's stdlib didn't support it in `fromisoformat()` until 3.11.
+
+### The Fix: Normalize Before Parsing
+
+Replace `Z` with `+00:00` before passing to `fromisoformat()`:
+
+```python
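+from datetime import datetime
+
+def _parse_iso(s: str) -> datetime:
+    """Parse ISO string from JS toISOString(); handles the Z suffix across all Python versions."""
+    if s.endswith("Z"):
+        # Swap the trailing Z for an explicit UTC offset that older Pythons accept
+        s = s[:-1] + "+00:00"
+    return datetime.fromisoformat(s)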
+
+Or use a single-line approach:
+
+```python
+from datetime import datetime
+dt = datetime.fromisoformat(s.replace("Z", "+00:00"))
+```
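
A quick check of the normalization, written as a self-contained sketch: `parse_js_iso` mirrors the helper above under a different, illustrative name, and the sample strings are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def parse_js_iso(value: str) -> datetime:
    """Parse an ISO 8601 string, accepting a trailing 'Z' on older Pythons too."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


parsed = parse_js_iso("2026-04-24T11:00:00.000Z")
same_instant = parse_js_iso("2026-04-24T12:00:00.000+01:00")
print(parsed.isoformat(), parsed == same_instant)
```

The result is timezone-aware, so it should not be compared against naive datetimes elsewhere in the code (Python raises `TypeError` for ordering comparisons between the two).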
+
+### Why the Failure Is Silent
+
+In a Quart endpoint (the same applies to Flask or FastAPI handlers that call `fromisoformat()` directly):
+```python
+@app.route("/api/usage")
+async def get_usage():
+    from_str = request.args.get("from")
+    from_dt = datetime.fromisoformat(from_str)  # raises ValueError if Z suffix
+    ...
+```
+
+The `ValueError` bubbles up as an unhandled exception → the framework catches it as a 500 Internal Server Error. The frontend receives a 500 with no body (or a generic error page). The filter appears to "not work" — but the API is actually crashing on date parsing.
+
+This is especially insidious because:
+1. The endpoint works in local dev if dates are typed manually without `Z`
+2. The endpoint works in Python 3.11+ without any code change
+3. The 500 may not appear in application logs if the exception handler swallows it
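
One way to keep the failure visible is to catch the `ValueError` at the API boundary and answer with a 400 that names the offending parameter. The sketch below is framework-agnostic and hypothetical: `parse_date_param` and its `(value, error)` return shape are not from the original code, and turning the error into a Quart or Flask response is left to the handler.

```python
from datetime import datetime
from typing import Optional, Tuple


def parse_date_param(
    name: str, raw: Optional[str]
) -> Tuple[Optional[datetime], Optional[str]]:
    """Return (datetime, None) if valid, (None, None) if absent, (None, message) if invalid."""
    if not raw:
        return None, None
    # Normalize the JS-style trailing Z so older Pythons can parse it.
    normalized = raw[:-1] + "+00:00" if raw.endswith("Z") else raw
    try:
        return datetime.fromisoformat(normalized), None
    except ValueError:
        return None, f"invalid ISO 8601 value for '{name}': {raw!r}"


ok_value, ok_error = parse_date_param("from", "2026-04-01T00:00:00.000Z")
bad_value, bad_error = parse_date_param("to", "not-a-date")
missing_value, missing_error = parse_date_param("from", None)
print(ok_value, bad_error, missing_value)
```

A handler can then return the message with status 400 whenever the error slot is set, which surfaces the bad input in the browser's network tab instead of as a bare 500.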
+
+### Real Incident (2026-04-24)
+
+Semblance admin dashboard period selector (React + TypeScript) sent dates via `new Date(startDate).toISOString()` as query parameters. Every request with a `from` or `to` date filter returned 500. The admin usage summary showed 0 rows. Root cause: Quart backend running Python 3.9, `datetime.fromisoformat()` rejecting `Z` suffix. Fix: `_parse_iso()` helper with `Z → +00:00` substitution.
+
+### Period Filter Fallback Bug
+
+The same endpoint had a related bug: when no `from`/`to` params were sent (meaning "All time"), the code defaulted to `_month_start()` — silently filtering to the current month instead of returning all records. The correct behavior for absent params is no date filter:
+
+```python
+def _period_match(from_str, to_str):
+    if not from_str and not to_str:
+        return {}  # no filter — return all records
+    filters = {}
+    if from_str:
+        filters["created_at__gte"] = _parse_iso(from_str)
+    if to_str:
+        filters["created_at__lte"] = _parse_iso(to_str)
+    return filters
+```
+
+Both bugs together caused the "All time" view to show current-month data and the date filter to always 500.
+
+## Related Concepts
+
+- [[wiki/concepts/preflight-record-pattern]] — AI cost tracking that uses datetime for `effective_from`/`to` fields
+- [[wiki/concepts/openai-max-completion-tokens]] — another silent API parameter mismatch that causes hard errors
+- [[wiki/tech-patterns/fastapi-python-docker]] — FastAPI Python backend where date parsing happens at the API boundary
+
+## Sources
+
+- [[daily/2026-04-24.md]] — Semblance admin dashboard period filter returning 500 on all date-filtered requests; root cause was Python 3.9 `fromisoformat()` rejecting `Z` suffix from JS `toISOString()`; fix: `_parse_iso()` helper with `Z → +00:00` substitution; additional bug: missing params defaulted to current month instead of no filter
--- a/wiki/log.md
+++ b/wiki/log.md
@ -109,3 +109,9 @@
 - Source: daily/2026-04-28.md
 - Articles created: [[wiki/concepts/homarr-sqlite-integration-cleanup]], [[wiki/concepts/docker-lxc-dns-configuration]], [[wiki/concepts/mac-address-randomization-dhcp]], [[wiki/concepts/prowlarr-flaresolverr-limitation]], [[wiki/connections/docker-dns-adguard-split-horizon]]
 - Articles updated: [[wiki/concepts/adguard-blocklist-setup]] (added OOM risk section: exit code 137, RAM recommendations, fix); [[wiki/concepts/_index]] (concepts 47→51); [[wiki/connections/_index]] (connections 8→9); [[wiki/_master-index]] (concepts 47→51, connections 8→9)
+
+## [2026-04-28T22:11:43+01:00] compile | 2026-04-24.md
+- Source: daily/2026-04-24.md
+- Articles created: [[wiki/concepts/bash-and-or-short-circuit]], [[wiki/concepts/python-iso-z-suffix]], [[wiki/concepts/gemini-conversation-cost-scaling]]
+- Articles updated: [[wiki/concepts/_index]] (concepts 51→54); [[wiki/_master-index]] (concepts 51→54)
+- Note: Kling tech-pattern article already up-to-date from prior session; dotfiles Ghostty gotchas deferred (low reuse value vs 3 broadly applicable concepts)