vault backup: 2026-04-27 12:29:28

Vadym Samoilenko 2026-04-27 12:29:28 +01:00
parent 05fd7519fb
commit 92929d3b41
2 changed files with 49 additions and 22 deletions

@ -3,7 +3,7 @@ title: "Architecture Patterns Index"
description: "Cross-cutting architectural decisions across Oliver Agency projects"
tags: [index, architecture]
created: 2026-04-15
updated: 2026-04-15
updated: 2026-04-27
---
# Architecture Patterns
@ -19,15 +19,14 @@ Cross-cutting architectural decisions that appear in multiple Oliver projects.
| [[wiki/architecture/gcp-deployment-lb-timeout\|gcp-deployment-lb-timeout]] | GCP 30s LB timeout — WebSocket → HTTP polling fix | Mod Comms, Semblance |
| [[wiki/architecture/rag-architecture\|rag-architecture]] | RAG: Firecrawl → AI structuring → Qdrant → LLM synthesis | Enterprise Nexus, Sandbox NotebookLM |
| [[wiki/architecture/hotfolder-daemon\|hotfolder-daemon]] | Box folder monitoring daemon with systemd | Ford QC, Ford SFTP |
| [[wiki/architecture/optical-dev-server-deploy\|optical-dev-server-deploy]] | optical-dev Apache subpath pattern: single vhost, Include conf, port table, deploy script | All Oliver projects |
| [[wiki/architecture/ai-cost-tracker\|ai-cost-tracker]] | Shared AI cost tracker: Docker Compose, Workspace→Team→Project, preflight/record HTTP API, LiteLLM pricing, hard budget limits | All Oliver projects |
## Key Architectural Decisions
1. **Docker Compose** — default deployment for all multi-service projects
1. **Docker Compose** — default deployment for all multi-service projects on optical-dev
2. **HTTP polling over WebSocket** — mandatory on GCP (30s LB timeout)
3. **AI pre-structuring before RAG indexing** — improves retrieval quality
4. **Hotfolder + archive pattern** — prevents reprocessing in Box automations
5. **DEV_AUTH_BYPASS** — skip Azure AD in local dev, always use real auth in production
| [[wiki/architecture/optical-dev-server-deploy\|optical-dev-server-deploy]] | optical-dev GCP server: single-vhost Apache, Include pattern, port table, deploy script cache | Barclays Banner Builder, all Oliver projects |
| [[wiki/architecture/ai-cost-tracker\|ai-cost-tracker]] | Shared AI cost tracking service: Workspace→Team→Project, LiteLLM pricing, preflight+record SDK, hard limits | All Oliver projects |
5. **DEV_AUTH_BYPASS / dev login** — skip Azure AD in local/dev environment, real auth in production
6. **Cost tracking as cross-cutting concern** — every AI call preflight+record via ai-cost-tracker
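
Decision 2 (HTTP polling instead of a long-lived WebSocket) can be illustrated with a minimal sketch. It assumes a hypothetical `fetch_status` callable that performs one short HTTP request and returns a dict with a `state` field; the field name, the terminal states, and the timing defaults are illustrative assumptions, not taken from the Oliver projects. Each request finishes quickly, so no single connection approaches the 30s load balancer timeout.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_s: float = 2.0,
    max_wait_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status repeatedly until it reports a terminal state.

    fetch_status is a hypothetical callable doing one short HTTP request.
    The "state" key and the "done"/"failed" values are assumptions.
    """
    deadline = clock() + max_wait_s
    while True:
        status = fetch_status()
        if status.get("state") in ("done", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not finish within max_wait_s")
        # Wait between requests; every individual request stays short.
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.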

@ -9,6 +9,17 @@ updated: 2026-04-27
Centralised **shared service** that tracks AI API spend across all Oliver projects. Every project that calls Gemini, ElevenLabs, Google Cloud TTS, or other paid AI APIs sends usage events here.
## Live URLs
| Environment | URL |
|---|---|
| Dev (optical-dev) | `https://optical-dev.oliver.solutions/cost-tracker/` |
| API health | `https://optical-dev.oliver.solutions/cost-tracker/v1/health` |
| API docs | `https://optical-dev.oliver.solutions/cost-tracker/v1/docs` |
| Prod (future) | `https://cost.oliver.agency/` |
**Repo:** `git@bitbucket.org:zlalani/ai-cost-tracker.git`
## Why it exists
Oliver runs multiple projects (video-accessibility, One2Edit, Box-pipelines …) all consuming paid AI APIs. Without centralised tracking: no visibility into total spend, no per-client cost attribution, no budget enforcement.
@ -23,9 +34,9 @@ Oliver runs multiple projects (video-accessibility, One2Edit, Box-pipelines …)
│ AI call sites │ preflt │ FastAPI + MongoDB + Redis + React │
│ (Gemini, ElevenLabs, GCP TTS) │────────►│ POST /v1/preflight │
│ │ │ │ POST /v1/usage/record │
│ oliver_cost_tracker SDK │ │ POST /v1/users/upsert │
│ - preflight(estimate) │◄────────│ POST /v1/projects/upsert │
│ - record(actual) │ │ │
│ Direct HTTP calls (httpx) │ │ POST /v1/users/upsert │
│ - POST /v1/preflight │◄────────│ POST /v1/projects/upsert │
│ - POST /v1/usage/record │ │ │
│ - SQLite outbox + retry │ │ Admin UI (Microsoft SSO) │
└──────────────────────────────────┘ │ Workspaces / Teams / Projects │
│ Pricing + LiteLLM auto-sync │
@ -38,14 +49,15 @@ Oliver runs multiple projects (video-accessibility, One2Edit, Box-pipelines …)
| Decision | Choice | Why |
|---|---|---|
| Deployment | Separate repo + separate server | Clean isolation, independent scaling |
| Deployment | Separate repo + Docker Compose on optical-dev | Clean isolation, independent scaling |
| Org hierarchy | Workspace → Team → Project | Matches Oliver agency structure |
| User ownership | Each project owns users; lazy mirror in shared | No SSO migration needed |
| Pricing | LiteLLM auto-sync + YAML (non-LLM) + admin override | Auto-updated for LLMs, manual for chars |
| Transport | Sync HTTP + SQLite outbox fallback | Never breaks the AI pipeline |
| Budget enforcement | Hard limits via preflight check | `allow=false` before call is made |
| Auth (projects) | API key per project | Simple, revocable, auditable |
| Auth (admins) | Microsoft SSO | Consistent with all Oliver projects |
| Auth (projects) | API key per project (`X-API-Key` header) | Simple, revocable, auditable |
| Auth (admins) | Microsoft SSO (+ dev login for testing) | Consistent with all Oliver projects |
| Integration | Direct HTTP calls from client projects | No SDK package to maintain/publish |
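
The "Sync HTTP + SQLite outbox fallback" row can be sketched roughly as follows. This is a minimal illustration, not the projects' actual code: the table layout, the function names, and the injected `send` callable (standing in for a real HTTP POST) are all assumptions. The idea is that a failed send is stored locally and retried later, so a tracker outage never raises into the AI pipeline.

```python
import json
import sqlite3
from typing import Callable

# Hypothetical signature: send(path, payload) performs one HTTP POST
# and raises on any failure.
Sender = Callable[[str, dict], None]


def open_outbox(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local outbox database. Schema is illustrative."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "path TEXT NOT NULL, "
        "body TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def send_or_queue(conn: sqlite3.Connection, send: Sender, path: str, payload: dict) -> bool:
    """Try a synchronous send; on any failure, queue the event instead of raising."""
    try:
        send(path, payload)
        return True
    except Exception:
        conn.execute(
            "INSERT INTO outbox (path, body) VALUES (?, ?)",
            (path, json.dumps(payload)),
        )
        conn.commit()
        return False


def flush_outbox(conn: sqlite3.Connection, send: Sender) -> int:
    """Retry queued events oldest-first; stop at the first failure."""
    rows = conn.execute("SELECT id, path, body FROM outbox ORDER BY id").fetchall()
    delivered = 0
    for row_id, path, body in rows:
        try:
            send(path, json.loads(body))
        except Exception:
            break  # keep this and later events queued for the next retry
        conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
        conn.commit()
        delivered += 1
    return delivered
```

A periodic job or the next successful call could invoke `flush_outbox`; which trigger the real projects use is not stated here.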
## Org hierarchy
@ -60,12 +72,31 @@ Users live in each project (video-accessibility, etc.) and are **lazily mirrored
## Tech stack
Mirrors video-accessibility for team familiarity:
- Backend: **FastAPI + MongoDB Atlas + Redis + Celery**
- Frontend: **React 18 + Vite (TypeScript)**
- Auth admin: **Microsoft AAD (MSAL)**
- Charts: **recharts**
- Tables/pivot: **@tanstack/react-table**
- Backend: **FastAPI + MongoDB 7 + Redis 7 + Celery** (worker + beat)
- Frontend: **React 18 + Vite + TypeScript** — nginx inside Docker
- Auth admin: **Microsoft Azure AD OIDC** + dev login (APP_ENV=dev only)
- Deploy: **Docker Compose** on optical-dev, Apache reverse proxy at `/cost-tracker/`
## Docker Compose services
| Service | Port | Role |
|---|---|---|
| `api` | 8003→8001 | FastAPI + uvicorn (2 workers) |
| `frontend` | 5174→80 | React SPA + nginx |
| `mongodb` | internal | MongoDB 7 |
| `redis` | internal | Redis 7 |
| `celery-worker` | — | Async task processing |
| `celery-beat` | — | Scheduled tasks (pricing sync 02:00 UTC) |
## API endpoints (project-facing, `X-API-Key` auth)
| Method | Path | Purpose |
|---|---|---|
| `POST` | `/v1/preflight` | Check budget, get `request_id` |
| `POST` | `/v1/usage/record` | Record actual AI call cost |
| `POST` | `/v1/users/upsert` | Create/update user mirror |
| `POST` | `/v1/projects/upsert` | Create/update project mirror |
| `GET` | `/v1/health` | Liveness check |
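
A client-side preflight/record flow against these endpoints might look like the sketch below. It uses the standard library's `urllib` rather than httpx so that it has no dependencies. The `allow` and `request_id` response fields follow the tables above, but their exact position in the response body, and every payload field name, are assumptions; the real schema is in the API docs.

```python
import json
import urllib.request
from typing import Any, Callable, Tuple

# Dev base URL from the Live URLs table.
BASE_URL = "https://optical-dev.oliver.solutions/cost-tracker"


def post_json(path: str, payload: dict, api_key: str) -> dict:
    """POST JSON with the per-project X-API-Key header and decode the reply."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


class BudgetExceeded(RuntimeError):
    """Raised when preflight answers allow=false."""


def tracked_call(
    estimate: dict,
    run_ai_call: Callable[[], Tuple[Any, dict]],
    post: Callable[[str, dict], dict],
) -> Any:
    """Preflight, run the AI call, then record actual usage.

    `estimate` and the usage dict returned by run_ai_call use
    hypothetical field names chosen by the caller.
    """
    decision = post("/v1/preflight", estimate)
    if not decision.get("allow", False):
        raise BudgetExceeded("preflight denied the call")
    result, actual_usage = run_ai_call()
    post("/v1/usage/record", {"request_id": decision["request_id"], **actual_usage})
    return result
```

In real use, `post` could be `lambda path, body: post_json(path, body, api_key)`; injecting it keeps the flow testable without network access.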
## Related articles
@ -73,6 +104,3 @@ Mirrors video-accessibility for team familiarity:
- [[wiki/tech-patterns/cost-tracker-pricing-sources|cost-tracker-pricing-sources]] — how pricing is maintained
- [[wiki/tech-patterns/cost-tracker-providers|cost-tracker-providers]] — billing units per AI provider
- [[wiki/concepts/preflight-record-pattern|preflight-record-pattern]] — the core usage-tracking pattern
- [[wiki/concepts/lazy-user-mirror|lazy-user-mirror]] — how user sync works
- [[wiki/concepts/sync-with-outbox|sync-with-outbox]] — resilient HTTP calls with SQLite fallback
- [[wiki/projects-overview/ai-cost-tracker|ai-cost-tracker (project card)]] — project registry card