From a3e8bb2fcae7e3f037dd024ae512fd4003a538b4 Mon Sep 17 00:00:00 2001 From: Vadym Samoilenko Date: Mon, 18 May 2026 13:36:14 +0100 Subject: [PATCH] wiki: auto-compile 2026-05-18 (1 log(s), 317 articles) --- wiki/_master-index.md | 6 +- wiki/architecture/_index.md | 42 +++++++++++++ wiki/concepts/_index.md | 3 +- .../gemini-preview-vs-stable-rate-limits.md | 63 +++++++++++++++++++ wiki/log.md | 9 +++ wiki/mistakes/_index.md | 1 + wiki/mistakes/gemini.md | 25 ++++++++ 7 files changed, 145 insertions(+), 4 deletions(-) create mode 100644 wiki/concepts/gemini-preview-vs-stable-rate-limits.md create mode 100644 wiki/mistakes/gemini.md diff --git a/wiki/_master-index.md b/wiki/_master-index.md index 9781898..01b1438 100644 --- a/wiki/_master-index.md +++ b/wiki/_master-index.md @@ -21,9 +21,9 @@ This 3-hop pattern works for hundreds of articles without vector search. | [[wiki/obsidian-rag/_index\|obsidian-rag/]] | Karpathy's LLM wiki method — Obsidian RAG, setup, vs true RAG | 3 | | [[wiki/projects-overview/_index\|projects-overview/]] | All 42 Oliver Agency projects — grouped by server (optical-web-1, optical-dev, baic, box-cli) | 1 | | [[wiki/tech-patterns/_index\|tech-patterns/]] | Recurring tech stacks: FastAPI, React/Vite, Next.js, Azure AD, AI, Box, One2Edit, Redis/Celery, cost-tracker, OMG API, Payload CMS | 39 | -| [[wiki/architecture/_index\|architecture/]] | Cross-cutting architectural patterns: Docker Compose, multi-agent AI, GCP timeout, RAG, hotfolder, optical-dev deploy, cost-tracker, new-project checklist, troubleshooting playbooks, ADR log, Cloud Run Jobs | 11 | +| [[wiki/architecture/_index\|architecture/]] | Cross-cutting architectural patterns: Docker Compose, multi-agent AI, GCP timeout, RAG, hotfolder, optical-dev deploy, cost-tracker, new-project checklist, troubleshooting playbooks, ADR log, Cloud Run Jobs | 53 | | [[wiki/client-knowledge/_index\|client-knowledge/]] | Per-client notes for Ford, H&M, L'Oréal, Barclays, Ferrero, 3M, BAIC | 7 | -| 
[[wiki/concepts/_index\|concepts/]] | Atomic knowledge extracted from Claude Code sessions | 201 | +| [[wiki/concepts/_index\|concepts/]] | Atomic knowledge extracted from Claude Code sessions | 202 | | [[wiki/connections/_index\|connections/]] | Cross-cutting insights linking 2+ concepts: FastAPI+Azure AD+Docker trinity, AI→cost-tracker, Apache+Vite basePath, GCP→REST polling, Box+hotfolder, Docker DNS+AdGuard, Celery prefork×faster_whisper memory stacking | 10 | | [[wiki/qa/_index\|qa/]] | Filed answers to queries (saved with `--file-back`) | 0 | | [[wiki/homelab/_index\|homelab/]] | Self-hosted infra: Proxmox install, IOMMU/PCI passthrough, hypervisor setup, budget builds, HP Elitedesk G3, Homarr API + Apps + Boards + Certificates + Integrations + Settings + Tasks + AdGuard + Clock + Docker Stats + Docker Integration + Download Client + Firewall + Proxmox Integration + Radarr + Readarr + Sonarr + Bookmarks + Calendar + Icons + App Widget + Weather + GitHub + Nextcloud + qBittorrent + RSS Feed + Speedtest Tracker + System Health Monitoring + System Resources + Services Map + Media Stack | 43 | @@ -38,7 +38,7 @@ This 3-hop pattern works for hundreds of articles without vector search. 
| [[wiki/payloadcms/_index\|payloadcms/]] | Full Payload CMS reference — getting started, config, database (Postgres/MongoDB/SQLite), all 22 field types, access control, hooks, authentication (cookies, JWT, API keys, custom strategies, token data), admin UI, custom components, Lexical rich text, live preview, versions/drafts/autosave, Local/REST/GraphQL APIs, queries, plugins, jobs queue, upload, storage adapters (S3/R2/GCS/Azure/Vercel Blob), ecommerce, production deploy, TypeScript, migration guides, i18n, localization, hierarchy, trash/soft-delete, troubleshooting | 147 | | [[wiki/shared-patterns/_index\|shared-patterns/]] | Oliver Agency standard library patterns: httpx, structlog, pydantic-settings, alembic — reuse before writing from scratch | 4 | -| [[wiki/mistakes/_index\|mistakes/]] | Anti-patterns extracted from sessions — per-stack running lists (fastapi, react, docker, postgres, general) — injected at session start | 5 | +| [[wiki/mistakes/_index\|mistakes/]] | Anti-patterns extracted from sessions — per-stack running lists (fastapi, react, docker, postgres, general, gemini) — injected at session start | 6 | diff --git a/wiki/architecture/_index.md b/wiki/architecture/_index.md index 1a7ea78..8ab7fd6 100644 --- a/wiki/architecture/_index.md +++ b/wiki/architecture/_index.md @@ -25,6 +25,48 @@ Cross-cutting architectural decisions that appear in multiple Oliver projects. 
| [[wiki/architecture/troubleshooting-playbooks\|troubleshooting-playbooks]] | Failure → diagnosis → fix for FastAPI, Docker, React/Vite, Azure AD, Apache, PostgreSQL | All Oliver projects | | [[wiki/architecture/adr-log\|adr-log]] | Architecture Decision Records — why HTTP polling, Docker Compose, FastAPI, Azure AD, cost tracker were chosen | All Oliver projects | | [[wiki/architecture/cloud-run-jobs-celery\|cloud-run-jobs-celery]] | Moving heavy Celery tasks (ffmpeg, TTS, Whisper) to Cloud Run Jobs — finite execution, pay-per-use, env-specific compose, chain dispatch pattern | Video Accessibility | +| [[wiki/architecture/3m-portal-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/3m-portal (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/Barclays-banner-builder-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/Barclays-banner-builder (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/DevOps_Click_UP_sync-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/DevOps_Click_UP_sync (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/Oliver-ai-bot_2.0-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/Oliver-ai-bot_2.0 (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/ac-helper-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/ac-helper (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/ac-tool-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/ac-tool (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/amazon-transcreation-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/amazon-transcreation (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| 
[[wiki/architecture/baic_dashboard-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/baic_dashboard (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/barclays-rag-report-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/barclays-rag-report (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/build-a-squad-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/build-a-squad (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/cc-dashboard-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/cc-dashboard (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/cinema-studio-pro-kling-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/cinema-studio-pro-kling (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/cinema-studio-pro-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/cinema-studio-pro (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/enterprise-ai-hub-nexus-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/enterprise-ai-hub-nexus (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/ferrero-ac-creator-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/ferrero-ac-creator (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/ford-gechub-sftp-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/ford-gechub-sftp (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/ford_qc-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/ford_qc (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/gmal-scope-builder-structure\|Graph 
Report - /Users/ai_leed/Documents/Projects/Oliver/gmal-scope-builder (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/hm-o2e-tool-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/hm-o2e-tool (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/hm_ems_report-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/hm_ems_report (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/homepage-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/homepage (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/hp-prod-tracker-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/hp-prod-tracker (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/loreal-global-kickoff-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/loreal-global-kickoff (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/loreal-sla-calculator-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/loreal-sla-calculator (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/loreal-timelog-viewer-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/loreal-timelog-viewer (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/lusa-back-planner-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/lusa-back-planner (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/modcomms-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/modcomms (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/olivas-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/olivas (2026-05-18)]] | - 
cluster-only mode — file stats not available | — | +| [[wiki/architecture/oliver-ai-assistant-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/oliver-ai-assistant (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/oliver-sales-ops-platform-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/oliver-sales-ops-platform (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/pdf-accessibility-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/pdf-accessibility (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/pimco-charts-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/pimco-charts (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/ppt-tool-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/ppt-tool (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/presenton-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/presenton (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/sandbox-notebookllamalm-nextjs-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/sandbox-notebookllamalm-nextjs (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/semblance-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/semblance (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/smartcrop26-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/smartcrop26 (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/social-reporting-tool-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/social-reporting-tool (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| 
[[wiki/architecture/solventum-image-metadata-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/solventum-image-metadata (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/video-accessibility-old-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/video-accessibility-old (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/video-accessibility-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/video-accessibility (2026-05-18)]] | - cluster-only mode — file stats not available | — | +| [[wiki/architecture/wsj-filenaming-structure\|Graph Report - /Users/ai_leed/Documents/Projects/Oliver/wsj-filenaming (2026-05-18)]] | - cluster-only mode — file stats not available | — | ## Key Architectural Decisions diff --git a/wiki/concepts/_index.md b/wiki/concepts/_index.md index 28edbeb..b14d952 100644 --- a/wiki/concepts/_index.md +++ b/wiki/concepts/_index.md @@ -9,7 +9,7 @@ updated: 2026-05-15 Atomic knowledge articles from real Oliver Agency sessions. Each article = one specific gotcha, bug, or non-obvious behaviour discovered in production or testing. -**201 articles** — use Obsidian search or `grep` to find by keyword. +**202 articles** — use Obsidian search or `grep` to find by keyword. | Article | Title | Created | |---------|-------|---------| @@ -86,6 +86,7 @@ Atomic knowledge articles from real Oliver Agency sessions. 
Each article = one s | [[wiki/concepts/gcs-resumable-upload-pattern\|gcs-resumable-upload-pattern]] | GCS Resumable Upload Pattern | 2026-04-30 | | [[wiki/concepts/gemini-conversation-cost-scaling\|gemini-conversation-cost-scaling]] | Gemini — Conversation History Causes Quadratic Token Cost Growth | 2026-04-24 | | [[wiki/concepts/gemini-embedding-api-channel\|gemini-embedding-api-channel]] | Gemini Embedding Models Split by API Channel | 2026-04-29 | +| [[wiki/concepts/gemini-preview-vs-stable-rate-limits\|gemini-preview-vs-stable-rate-limits]] | Gemini API — Preview Models Have Stricter Rate Limits Than Stable | 2026-05-18 | | [[wiki/concepts/git-includeif-per-remote\|git-includeif-per-remote]] | Git — Per-Remote Identity with includeIf | 2026-04-23 | | [[wiki/concepts/git-worktrees-parallel-claude\|git-worktrees-parallel-claude]] | Git Worktrees — Parallel Claude Sessions | — | | [[wiki/concepts/glance-dashboard-config\|glance-dashboard-config]] | Glance Dashboard — Config Patterns and Homelab Setup | 2026-04-29 | diff --git a/wiki/concepts/gemini-preview-vs-stable-rate-limits.md b/wiki/concepts/gemini-preview-vs-stable-rate-limits.md new file mode 100644 index 0000000..fdf0378 --- /dev/null +++ b/wiki/concepts/gemini-preview-vs-stable-rate-limits.md @@ -0,0 +1,63 @@ +--- +title: "Gemini API — Preview Models Have Stricter Rate Limits Than Stable" +aliases: [gemini-preview-rate-limits, gemini-rpm-limits, gemini-429-preview] +tags: [gemini, google-ai, rate-limits, multi-agent, baic, debugging] +sources: + - "daily/2026-05-18.md" +created: 2026-05-18 +updated: 2026-05-18 +--- + +# Gemini API — Preview Models Have Stricter Rate Limits Than Stable + +Models with `-preview` in their name (e.g., `gemini-3.1-pro-preview`, `gemini-3.1-flash-preview`) have **5–10× stricter RPM limits** than their stable counterparts. This limit is not prominently documented and causes 429 / 504 errors that look like transient network issues. 
+ +## The Pattern + +```python +# WRONG — preview model hits rate limits fast in multi-agent setups +model = "gemini-3.1-pro-preview" + +# CORRECT — stable model has much higher RPM ceiling +model = "gemini-3.1-flash-lite" # or "gemini-3.1-pro" (stable) +``` + +## Why It Happens + +- Google uses preview models to test new capabilities under controlled load +- Rate limits on preview models are intentionally low to prevent over-reliance before GA +- In a multi-agent system, N parallel agents each make independent API calls — RPM exposure multiplies by N +- The result is a burst of 429s which the API gateway surfaces as 504 after timeout + +## Diagnostic Signals + +1. `429 Too Many Requests` — immediate rate limit hit +2. `504 Gateway Timeout` — rate limit reached, upstream gave up waiting +3. Errors appear only under parallel load (single-agent tests pass fine) +4. Errors correlate with `-preview` model IDs in logs + +## Fix + +Replace `-preview` model IDs with stable equivalents: + +| Preview (avoid in prod) | Stable replacement | +|------------------------|-------------------| +| `gemini-3.1-pro-preview` | `gemini-3.1-pro` | +| `gemini-3.1-flash-preview` | `gemini-3.1-flash-lite` | +| `gemini-2.5-pro-preview-*` | `gemini-2.5-pro` (when GA) | + +## Multi-Agent Amplification + +In a multi-agent system with N concurrent agents: +- Each agent independently counts toward the shared RPM quota +- A 10 RPM preview limit with 5 parallel agents = effectively 2 RPM per agent +- Use stable models in production; preview models only in single-agent dev/test + +## Related Concepts + +- [[wiki/concepts/gemini-conversation-cost-scaling]] — token cost per conversation turn in Gemini +- [[wiki/llm-models/_index]] — full Gemini model catalog with stable IDs + +## Sources + +- [[daily/2026-05-18.md]] — Discovered during BAIC multi-agent system debugging; 504/429 errors caused by parallel agents all using `gemini-3.1-pro-preview`; fixed by switching to `gemini-3.1-flash-lite` diff --git 
a/wiki/log.md b/wiki/log.md index c90fcee..71935c1 100644 --- a/wiki/log.md +++ b/wiki/log.md @@ -1,6 +1,15 @@ # Build Log +## [2026-05-18T21:00:00+01:00] compile | daily/2026-05-18.md — pass 2 (session 13:18 extraction) +- Source: daily/2026-05-18.md (session 13:18 — BAIC Gemini 504/429 debugging) +- Articles created (1): + - [[wiki/concepts/gemini-preview-vs-stable-rate-limits]] — Gemini preview models have 5-10× stricter RPM limits; parallel agents amplify; fix: switch to stable model IDs +- Mistakes created (1): + - [[wiki/mistakes/gemini]] — new stack file; entry: 2026-05-18 preview-vs-stable-confusion +- Index updates: concepts/_index.md (201→202), mistakes/_index.md (+gemini row), _master-index.md (concepts 201→202, mistakes 5→6) +- Note: pass 1 (09:40) was no-op due to flush error; session 13:18 FLUSH_OK entry was under-evaluated by flush LLM — extracted manually in pass 2 + ## [2026-05-18T09:40:00+01:00] compile | daily/2026-05-18.md — no-op (flush error) - Source: daily/2026-05-18.md - Articles created: (none — log empty, flush hook exited with code 1 at 09:26) diff --git a/wiki/mistakes/_index.md b/wiki/mistakes/_index.md index b4c2532..a0e339e 100644 --- a/wiki/mistakes/_index.md +++ b/wiki/mistakes/_index.md @@ -16,3 +16,4 @@ Anti-patterns and time-wasters extracted from real debugging sessions. | Docker / Infra | [[wiki/mistakes/docker]] | — | | PostgreSQL / Alembic | [[wiki/mistakes/postgres]] | — | | General / Cross-stack | [[wiki/mistakes/general]] | — | +| Gemini / Google AI | [[wiki/mistakes/gemini]] | 2026-05-18 | diff --git a/wiki/mistakes/gemini.md b/wiki/mistakes/gemini.md new file mode 100644 index 0000000..124a617 --- /dev/null +++ b/wiki/mistakes/gemini.md @@ -0,0 +1,25 @@ +--- +title: "Gemini / Google AI Mistakes" +tags: [gemini, google-ai, mistakes] +updated: 2026-05-18 +--- + +# Gemini / Google AI — Mistakes to Avoid + +Running list. Newest first. Auto-populated from session flush. 
+ +--- + + + +## 2026-05-18 — preview-vs-stable-confusion + +**Mistake:** Used `-preview` model IDs (e.g., `gemini-3.1-pro-preview`) in a production multi-agent system. + +**Symptom:** 504/429 errors appearing only under parallel load; single-agent tests passed fine. + +**Root cause:** Preview models have 5–10× stricter RPM limits than stable models. N parallel agents multiply the RPM exposure by N, instantly saturating the quota. + +**Fix:** Replace all `-preview` model IDs with stable counterparts (`gemini-3.1-flash-lite`, `gemini-3.1-pro`). + +**See also:** [[wiki/concepts/gemini-preview-vs-stable-rate-limits]]
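
The root cause above (N parallel agents sharing one RPM quota) can be checked with a little arithmetic before deploying. The following is a minimal sketch, not part of the patch: the helper names `is_preview_model`, `per_agent_rpm`, and `min_spacing_seconds` are hypothetical, and the 10 RPM figure is the illustrative number from the concept article, not a documented quota.

```python
def is_preview_model(model_id: str) -> bool:
    """Return True when the ID carries the `-preview` marker.

    Assumption: the marker is a plain substring, as in the IDs quoted
    in the article (`gemini-3.1-pro-preview`, `gemini-2.5-pro-preview-*`).
    """
    return "-preview" in model_id


def per_agent_rpm(shared_rpm: float, agents: int) -> float:
    """Split a shared requests-per-minute quota evenly across agents."""
    if agents < 1:
        raise ValueError("agents must be >= 1")
    return shared_rpm / agents


def min_spacing_seconds(shared_rpm: float, agents: int) -> float:
    """Seconds each agent must leave between its own calls so that all
    agents together stay at or under the shared quota."""
    return 60.0 / per_agent_rpm(shared_rpm, agents)


if __name__ == "__main__":
    model = "gemini-3.1-pro-preview"
    agents = 5
    shared_rpm = 10  # illustrative value taken from the article's example

    if is_preview_model(model):
        print(f"{model}: preview ID, expect a low shared quota")
    print(f"budget per agent: {per_agent_rpm(shared_rpm, agents)} RPM")
    print(f"spacing per agent: {min_spacing_seconds(shared_rpm, agents)} s")
```

With the article's example numbers, each of the 5 agents gets a budget of 2 requests per minute, that is, one call every 30 seconds. A burst from any single agent is enough to trigger 429s for the others, which is consistent with the errors appearing only under parallel load.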