From aed5c5d7a48819e19b7ebca9547e31b2210580f4 Mon Sep 17 00:00:00 2001
From: Vadym Samoilenko <vadymsamoilenko@oliver.agency>
Date: Sat, 23 May 2026 18:59:28 +0100
Subject: [PATCH] chore: update docs to trigger CI/CD test

---
 CLAUDE.md | 186 +++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 136 insertions(+), 50 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 644f5a80..e0c13334 100755
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -3,88 +3,174 @@
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 
 ## Commands
-- **Dev Server**: `npm run dev` (port 5173, proxies `/api` → `localhost:5137`)
-- **Build**: `npm run build` (use this to verify TypeScript compilation)
-- **Dev Build**: `npm run build:dev` (development mode build)
-- **Lint**: `npm run lint`
-- **Backend**: `cd backend && python run.py` (Hypercorn ASGI on port 5137)
 
-## Backend Testing
-After modifying any Python files:
+### Frontend
+- **Dev server**: `npm run dev` — Vite on port 5173, proxies `/api` → `localhost:5137`
+- **Production build**: `npm run build`
+- **Dev build**: `npm run build:dev`
+- **Lint**: `npm run lint`
+
+### Backend
+- **Start**: `cd backend && source venv/bin/activate && python run.py` — Hypercorn ASGI on port 5137
+- **Both at once**: `./start.sh`
+
+### Backend sanity checks (after modifying Python files)
 ```bash
 source backend/venv/bin/activate
-python -c "import app.services.module_name"        # Test specific module
-python -c "from app import create_app; create_app()"  # Test app creation
+python -c "import app.services.<module_name>"
+python -c "from app import create_app; create_app()"
+```
+
+### Docker (production-style)
+```bash
+# Build frontend and copy to web root
+docker compose --profile build up frontend
+
+# Run MongoDB + backend
+docker compose up mongo backend
 ```
 
 ## Architecture Overview
 
 ### ASGI Stack (critical detail)
-`create_app()` returns a **`socketio.ASGIApp`** wrapping a Quart app — not the Quart app itself. Accessing `app.quart_app` gives the inner Quart instance. This distinction matters whenever you write ASGI middleware or access app config directly.
+`create_app()` returns a **`socketio.ASGIApp`** wrapping the Quart app — not the Quart app itself. Access `asgi_app.quart_app` for the inner Quart instance. This distinction matters in ASGI middleware and anywhere you access `app.config` directly.
 
 ### Real-Time Communication
-Socket.IO via `python-socketio` `AsyncServer` (ASGI mode). The `WebSocketContextNew.tsx` context manages the client connection. `websocket_manager_async.py` handles room-based messaging for focus group sessions. The WebSocket manager must call `ws_mgr.set_main_loop(asyncio.get_running_loop())` at startup so that cross-thread emits from the AI Runner land on the right loop.
+Socket.IO via `python-socketio` `AsyncServer` (ASGI mode). Frontend: `WebSocketContextNew.tsx` context → `websocketServiceNew.ts`. Backend: `websocket_manager_async.py` manages room-based messaging per focus group session.
 
-> `VITE_ENABLE_WEBSOCKET` is hardcoded `true` in dev and `false` in production builds via `vite.config.ts` — it is not controlled by `.env`.
+`VITE_ENABLE_WEBSOCKET` is hardcoded by `vite.config.ts` to `true` in dev and `false` in production — it is **not** controlled by `.env`.
 
-### AI Runner + Threading
-`ai_runner_service.py` is a singleton that owns a **dedicated OS thread** with a single asyncio event loop. All autonomous AI conversations run in this thread. This solves Motor (AsyncIOMotorClient) event-loop affinity: Motor clients in the AI runner are bound to that loop, while regular API routes use synchronous PyMongo. Never share Motor clients between the two contexts.
+At startup `ws_mgr.set_main_loop(asyncio.get_running_loop())` must be called (done in `before_serving`) so cross-thread emits from the AI Runner land on the correct loop.
+
+### AI Runner + Threading (Motor event-loop affinity)
+`ai_runner_service.py` is a singleton owning a **dedicated OS thread** with a single asyncio event loop. All autonomous AI conversations run there.
+
+- The AI runner creates its own `AsyncIOMotorClient` bound to that thread's loop.
+- Regular API routes use synchronous `PyMongo` (from `app/db.py`).
+- **Never share Motor clients between the AI runner thread and the ASGI/Quart thread.**
 
 ### Autonomous Conversation Pipeline
-1. `ai_runner_service.py` — spawns coroutines on the dedicated thread's event loop
-2. `autonomous_conversation_controller.py` — orchestrates the full session
-3. `conversation_decision_service.py` — picks the next speaker
-4. `conversation_context_service.py` — maintains history/state
-5. `conversation_state_manager.py` — in-memory state across turns
+```
+ai_runner_service.py                  — spawns coroutines on the dedicated loop
+autonomous_conversation_controller.py — orchestrates the session
+conversation_decision_service.py      — picks next speaker, wraps up
+conversation_context_service.py       — maintains history/context window
+conversation_state_manager.py         — in-memory state across turns
+```
 
 ### Task Manager
-`task_manager.py` is a singleton tracking cancellable asyncio tasks (persona generation, discussion guides, etc.). Tasks are exposed via `/api/tasks` routes. A background sweeper cleans up completed/expired tasks. Frontend polling is handled by `useTaskPolling.ts`.
+`task_manager.py` singleton tracks cancellable asyncio tasks (persona generation, discussion guides, bulk exports). Exposed via `/api/tasks`. Frontend polls with `useTaskPolling.ts` / `src/lib/taskPolling.ts`. A background sweeper cleans up expired tasks.
+
+Long-running AI operations return a `task_id` immediately (HTTP 202); the caller then polls `/api/tasks/<task_id>` for progress. `aiPersonasApi.generatePersonasFull` is the canonical example: the kick-off call has a 10 s timeout, after which the client polls.
+
+### Persona Generation — Two-Stage Pipeline
+1. **Stage 1** (`/ai-personas/generate-basic-profiles`) — generates lightweight profiles from an audience brief; returns `task_id` immediately.
+2. **Stage 2** (`/ai-personas/complete-and-save-persona`) — runs in parallel per profile to add full psychographic/behavioral detail and persist to MongoDB.
+
+`aiPersonasApi.batchGenerateWithStages` in `src/lib/api.ts` orchestrates this client-side via `Promise.allSettled`; partial success (some personas fail) is handled gracefully.
 
 ### LLM Integration
-`llm_service.py` creates fresh clients per call (avoids event-loop mismatch in ASGI). Default model: **Google Gemini** via `google-genai`. Alternative: **OpenAI** (`AsyncOpenAI`). Both require env vars `GEMINI_API_KEY` and `OPENAI_API_KEY` — startup fails if missing. Prompts are markdown templates in `/backend/prompts/` loaded by `prompt_loader.py`.
+`llm_service.py` creates fresh clients per call — avoids event-loop mismatch in ASGI. Default model: `gpt-5.4` (Azure AI Foundry via OpenAI-compatible endpoint). Mini tasks route to `gpt-5.4-mini`. Prompts are markdown templates in `backend/prompts/` loaded by `prompt_loader.py`.
+
+Azure endpoint: `https://aipmress-ai-n8n.services.ai.azure.com/api/projects/aipmress-ai-n8n-OVH/openai/v1/`  
+Both models are deployed and share the same base URL. `AZURE_AI_API_KEY` is required at startup.
+
+Mini-routed features (via `LLMUsageContext`): `summary`, `conversation_decision`, `key_themes`, basic persona generation.  
+Main-routed features: `persona_response`, `moderator`, detailed persona gen/modification.
+
+### Usage & Quota Tracking
+`llm_usage_context.py` wraps LLM calls to record token usage as `UsageEvent` documents. `app/models/quota.py` defines per-user monthly USD limits (hard-cap safety net). The API returns HTTP **402** when a user's quota or credit balance is exceeded; `src/lib/api.ts` catches this and fires a `quota_exceeded` custom DOM event.
+
+### Credit System
+`credits_balance` on the `User` model holds the balance; the `credit_transactions` collection is the ledger. Deduction is atomic via `findAndModify` with a `$gte` guard. Pricing config lives in the `app_settings` collection (60 s cache). Trial credits are granted on registration. Stripe Checkout handles credit pack purchases; the webhook is at `/api/billing/webhook`.
+
+Costs: persona creation = 2 cr, focus group run = 40 cr. Packs: Starter $49 / 50 cr, Pro $199 / 220 cr, Scale $499 / 600 cr.
+
+### Authentication
+Custom JWT in `app/auth/quart_jwt.py` (not Flask-JWT-Extended, which is incompatible with Quart async). Email + password only (bcrypt); no SSO/Microsoft/MSAL. The JWT is stored in `localStorage` as `auth_token`; `src/lib/api.ts` attaches it as a Bearer token and checks expiry before every request.
 
 ## Code Style
-- TypeScript with `strictNullChecks: false`
-- Functional components with hooks; local state via hooks, shared state via context/props
-- `@/` alias maps to `src/`
-- **URL construction**: always use `${import.meta.env.BASE_URL}asset.png` — production base is `/semblance/`
-- Error handling: try/catch + `sonner` toast for user feedback
 
-## File Organization
+- TypeScript with `strictNullChecks: false`
+- `@/` alias maps to `src/`
+- **Asset URLs**: always `${import.meta.env.BASE_URL}asset.png` — base is `/`
+- Error feedback: `sonner` toast library (`src/lib/toast.ts` wrapper)
+
+## File Organisation
+
 ```
 backend/
   app/
-    routes/          # Blueprints: auth, personas, focus-groups, ai-personas, focus-group-ai, folders, tasks
-    services/        # Business logic: llm_service, ai_runner_service, task_manager, autonomous_*, conversation_*
-    models/          # Data models: User, FocusGroup, Persona, Folder
-    auth/            # Auth utilities (JWT helpers)
-    prompts/         # LLM prompt markdown templates
-    websocket_manager_async.py  # Room-based async WebSocket manager
-    extensions.py    # socketio.AsyncServer singleton
+    routes/       auth, personas, focus_groups, ai_personas, focus_group_ai,
+                  folders, tasks, admin, usage, billing
+    services/     llm_service, ai_runner_service, task_manager,
+                  autonomous_conversation_controller, conversation_*,
+                  focus_group_*, persona_*, image_description_service,
+                  llm_usage_context, customer_data_service, stripe_service
+    models/       User, Persona, FocusGroup, Folder, UsageEvent, Quota,
+                  ModelPricing, AppSettings, CreditTransaction
+    auth/         quart_jwt.py — custom Quart-compatible JWT
+    utils/        prompt_loader.py, discussion_guide_schema.py, rate_limiter.py
+    prompts/      20 markdown LLM prompt templates
+    websocket_manager_async.py   room-based async WebSocket manager
+    extensions.py                socketio.AsyncServer singleton
 
 src/
-  pages/             # Route-level components (Dashboard, FocusGroups, FocusGroupSession, Login, SyntheticUsers)
+  pages/          Dashboard, FocusGroups, FocusGroupSession, Login,
+                  SyntheticUsers, Admin, MyUsage, Billing
   components/
-    focus-group-session/  # Session UI panels (Discussion, Participant, Themes, etc.)
-    persona/         # Persona management components
-    ui/              # shadcn-ui primitives
-  contexts/          # AuthContext, WebSocketContextNew, NavigationContext
-  hooks/             # useTaskPolling, useWebSocket, usePersonaStorage, useDiscussionGuideGeneration, etc.
-  types/             # TypeScript type definitions
+    focus-group-session/  DiscussionPanel, ParticipantPanel, ThemesPanel,
+                          AutonomousDashboard, DiscussionGuideViewer, …
+    persona/              PersonaEditor, PersonaProfile, PersonaModificationModal
+    admin/                UsersTab, UsageTab, PricingTab, AnalyticsTab, CreditSettingsTab
+    ui/                   shadcn-ui primitives + custom: GenerationProgressBar,
+                          BulkExportProgressModal, MentionInput
+  contexts/       AuthContext, WebSocketContextNew, NavigationContext
+  hooks/          useTaskPolling, useWebSocket, usePersonaStorage,
+                  useDiscussionGuideGeneration, useCancellableGeneration, …
+  lib/            api.ts (all API calls), taskPolling.ts, taskCancellation.ts
+  types/          persona.ts, cancellable.ts
+  utils/          avatarUtils, discussionGuideMarkdown, mentionUtils
 ```
 
 ## Environment Configuration
 
 | Setting | Development | Production |
 |---------|-------------|------------|
-| Base path | `/` | `/semblance/` |
-| API base | `/api` (proxied to 5137) | `https://optical-dev.oliver.solutions/semblance_back/api` |
-| WebSocket path | `/socket.io/` | `/semblance_back/socket.io/` |
-| MSAL redirect | `http://localhost:5173/` | `https://optical-dev.oliver.solutions/semblance` |
+| Base path | `/` | `/` |
+| API base | `/api` (proxied to 5137) | `/api` (Traefik routes to backend) |
+| WebSocket path | `/socket.io/` | `/socket.io/` |
 
-Setup: copy `.env.development` or `.env.production` to `.env`. Backend requires `backend/.env` with `SECRET_KEY`, `JWT_SECRET_KEY`, `GEMINI_API_KEY`, `OPENAI_API_KEY` — startup will throw `RuntimeError` if any are missing or use weak defaults.
+**Frontend**: copy `.env.development` or `.env.production` to `.env`.
 
-## Knowledge Wiki
-A cross-project knowledge base is maintained automatically from all Claude Code sessions.
-- **Index:** `/Users/ai_leed/Library/Mobile Documents/iCloud~md~obsidian/Documents/VadymSamoilenko/wiki/index.md`
-- **Query:** `cd ~/.claude/memory-compiler && uv run python scripts/query.py "your question"`
+**Backend** (`backend/.env` — required keys, see `backend/.env.example`):
+```
+MONGO_URI=mongodb://localhost:27017/cohorta_db
+SECRET_KEY=<random 32-byte hex>
+JWT_SECRET_KEY=<random 32-byte hex>
+AZURE_AI_ENDPOINT=https://aipmress-ai-n8n.services.ai.azure.com/api/projects/aipmress-ai-n8n-OVH/openai/v1/
+AZURE_AI_API_KEY=<rotated key from Azure portal>
+AZURE_AI_MODEL_MAIN=gpt-5.4
+AZURE_AI_MODEL_MINI=gpt-5.4-mini
+STRIPE_SECRET_KEY=<from Stripe dashboard>
+STRIPE_WEBHOOK_SECRET=<from Stripe dashboard>
+CORS_ALLOWED_ORIGINS=http://localhost:5173   # comma-separated in production
+```
+Generate secrets: `python3 -c "import secrets; print(secrets.token_hex(32))"`
+
+Startup throws `RuntimeError` for any missing or weak-default secret/API key.
+
+## Deployment
+
+Production target: **`cohorta.ai-impress.com`** on the aimpress (OVH) server, served via Traefik.
+
+```bash
+# Phase 6: Docker Compose + Traefik at /opt/03-business/cohorta/
+docker compose up -d
+```
+
+Manual production backend start:
+```bash
+cd backend && source venv/bin/activate
+hypercorn "app:create_app()" --bind 0.0.0.0:5137
+```