CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Commands
Frontend
- Dev server: npm run dev (Vite on port 5173; proxies /api → localhost:5137)
- Production build: npm run build
- Dev build: npm run build:dev
- Lint: npm run lint
Backend
- Start: cd backend && source venv/bin/activate && python run.py (Hypercorn ASGI on port 5137)
- Both at once: ./start.sh
Backend sanity checks (after modifying Python files)
source backend/venv/bin/activate
python -c "import app.services.<module_name>"
python -c "from app import create_app; create_app()"
Docker (production-style)
# Build frontend and copy to web root
docker compose --profile build up frontend
# Run MongoDB + backend
docker compose up mongo backend
Architecture Overview
ASGI Stack (critical detail)
create_app() returns a socketio.ASGIApp wrapping the Quart app — not the Quart app itself. Access asgi_app.quart_app for the inner Quart instance. This distinction matters in ASGI middleware and anywhere you access app.config directly.
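Because create_app() returns the wrapper, code that needs app.config has to unwrap it first. The following is a minimal sketch of that unwrapping. It uses stand-in classes rather than the real socketio.ASGIApp and Quart types, and the helper name get_quart_app is hypothetical; only the quart_app attribute comes from the description above.

```python
class FakeQuart:
    """Stand-in for the inner Quart app (only the config attribute is modelled)."""

    def __init__(self):
        self.config = {"SECRET_KEY": "dev"}


class FakeASGIWrapper:
    """Stand-in for the socketio.ASGIApp returned by create_app()."""

    def __init__(self, quart_app):
        self.quart_app = quart_app


def get_quart_app(app):
    """Return the inner Quart app whether given the wrapper or the app itself."""
    return getattr(app, "quart_app", app)


asgi_app = FakeASGIWrapper(FakeQuart())
inner = get_quart_app(asgi_app)
print(inner.config["SECRET_KEY"])  # prints dev
```

Accepting either object keeps middleware and tests from caring which of the two they were handed.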
Real-Time Communication
Socket.IO via python-socketio AsyncServer (ASGI mode). Frontend: WebSocketContextNew.tsx context → websocketServiceNew.ts. Backend: websocket_manager_async.py manages room-based messaging per focus group session.
VITE_ENABLE_WEBSOCKET is hardcoded by vite.config.ts to true in dev and false in production — it is not controlled by .env.
At startup ws_mgr.set_main_loop(asyncio.get_running_loop()) must be called (done in before_serving) so cross-thread emits from the AI Runner land on the correct loop.
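The reason the main loop has to be captured is that a coroutine created on another thread must be handed to the loop that owns the sockets. A minimal standard-library sketch of that hand-off follows; the emit function and the thread name are stand-ins, and the actual set_main_loop implementation may differ.

```python
import asyncio
import threading

received = []


async def emit(event, data):
    """Stand-in for an async Socket.IO emit; records which thread ran it."""
    received.append((event, data, threading.current_thread().name))


def worker(main_loop):
    """Runs on another thread and hands the coroutine to the main loop."""
    future = asyncio.run_coroutine_threadsafe(
        emit("message", {"text": "hi"}), main_loop
    )
    future.result(timeout=5)  # block this worker until the main loop has run it


async def main():
    main_loop = asyncio.get_running_loop()  # captured once at startup
    t = threading.Thread(target=worker, args=(main_loop,), name="ai-runner")
    t.start()
    # Wait for the worker without blocking the main loop itself.
    await asyncio.to_thread(t.join)


asyncio.run(main())
print(received)
```

The recorded thread name shows the emit ran on the main loop's thread, not on the worker that requested it.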
AI Runner + Threading (Motor event-loop affinity)
ai_runner_service.py is a singleton owning a dedicated OS thread with a single asyncio event loop. All autonomous AI conversations run there.
- The AI runner creates its own AsyncIOMotorClient bound to that thread's loop.
- Regular API routes use synchronous PyMongo (from app/db.py).
- Never share Motor clients between the AI runner thread and the ASGI/Quart thread.
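The thread-plus-loop arrangement described above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in ai_runner_service.py: the LoopThread class and make_client function are hypothetical names, and a plain dict stands in for the Motor client so the example runs without MongoDB.

```python
import asyncio
import threading


class LoopThread:
    """Owns one OS thread running one asyncio event loop."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._run, name="ai-runner", daemon=True
        )
        self._thread.start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def submit(self, coro):
        """Schedule a coroutine on the dedicated loop from any other thread."""
        return asyncio.run_coroutine_threadsafe(coro, self.loop)

    def stop(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def make_client():
    # A real runner would construct its AsyncIOMotorClient here, inside the
    # runner's loop; this stand-in only records which loop created it.
    return {"bound_loop": asyncio.get_running_loop()}


runner = LoopThread()
client = runner.submit(make_client()).result(timeout=5)
bound_to_runner = client["bound_loop"] is runner.loop
runner.stop()
print(bound_to_runner)  # prints True
```

Creating the client inside a coroutine submitted to the runner is what ties it to the runner's loop instead of the caller's.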
Autonomous Conversation Pipeline
ai_runner_service.py — spawns coroutines on the dedicated loop
autonomous_conversation_controller.py — orchestrates the session
conversation_decision_service.py — picks next speaker, wraps up
conversation_context_service.py — maintains history/context window
conversation_state_manager.py — in-memory state across turns
Task Manager
task_manager.py singleton tracks cancellable asyncio tasks (persona generation, discussion guides, bulk exports). Exposed via /api/tasks. Frontend polls with useTaskPolling.ts / src/lib/taskPolling.ts. A background sweeper cleans up expired tasks.
Long-running AI operations return task_id immediately (HTTP 202); the caller polls /api/tasks/<task_id> for progress. aiPersonasApi.generatePersonasFull is the canonical example — 10 s timeout on the kick-off call, then polling.
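The kick-off-then-poll contract can be sketched as a small loop. The real client is TypeScript (useTaskPolling.ts); this Python version is only an illustration, and the state names, the response shape, and the poll_task and fake_fetch helpers are all assumptions rather than the actual /api/tasks schema.

```python
import time


def poll_task(fetch_status, task_id, interval=0.01, timeout=5.0):
    """Poll until the task reaches a terminal state or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(task_id)
        if status["state"] in ("completed", "failed", "cancelled"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")


# Stand-in for GET /api/tasks/<task_id>: reports progress twice, then completes.
_responses = iter([
    {"state": "running", "progress": 30},
    {"state": "running", "progress": 80},
    {"state": "completed", "progress": 100},
])


def fake_fetch(task_id):
    return next(_responses)


result = poll_task(fake_fetch, "abc123")
print(result["state"])  # prints completed
```

Keeping the short timeout on the kick-off call and a separate, longer budget on the polling loop means a slow generation never holds an HTTP request open.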
Persona Generation — Two-Stage Pipeline
- Stage 1 (/ai-personas/generate-basic-profiles): generates lightweight profiles from an audience brief; returns task_id immediately.
- Stage 2 (/ai-personas/complete-and-save-persona): runs in parallel per profile to add full psychographic/behavioral detail and persist to MongoDB.
aiPersonasApi.batchGenerateWithStages in src/lib/api.ts orchestrates this client-side via Promise.allSettled; partial success (some personas fail) is handled gracefully.
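The partial-success handling is done in TypeScript with Promise.allSettled. For readers working on the Python side, the closest equivalent is asyncio.gather with return_exceptions=True. The sketch below shows that analogue only; complete_persona and run_stage_two are hypothetical names, and the failure condition is invented for the demonstration.

```python
import asyncio


async def complete_persona(profile):
    """Stand-in for one Stage 2 call; fails for profiles flagged as bad."""
    await asyncio.sleep(0)
    if profile.get("bad"):
        raise ValueError(f"could not complete {profile['name']}")
    return {**profile, "saved": True}


async def run_stage_two(profiles):
    # return_exceptions=True keeps one failure from cancelling the others,
    # which is the role Promise.allSettled plays on the client.
    results = await asyncio.gather(
        *(complete_persona(p) for p in profiles), return_exceptions=True
    )
    saved = [r for r in results if not isinstance(r, Exception)]
    failed = [str(r) for r in results if isinstance(r, Exception)]
    return saved, failed


profiles = [{"name": "Ana"}, {"name": "Ben", "bad": True}, {"name": "Cy"}]
saved, failed = asyncio.run(run_stage_two(profiles))
print(len(saved), len(failed))  # prints 2 1
```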
LLM Integration
llm_service.py creates fresh clients per call — avoids event-loop mismatch in ASGI. Default model: gpt-5.4 (Azure AI Foundry via OpenAI-compatible endpoint). Mini tasks route to gpt-5.4-mini. Prompts are markdown templates in backend/prompts/ loaded by prompt_loader.py.
Azure endpoint: https://aipmress-ai-n8n.services.ai.azure.com/api/projects/aipmress-ai-n8n-OVH/openai/v1/
Both models are deployed and share the same base URL. AZURE_AI_API_KEY is required at startup.
Mini-routed features (via LLMUsageContext): summary, conversation_decision, key_themes, basic persona generation.
Main-routed features: persona_response, moderator, detailed persona gen/modification.
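The routing rule above amounts to a lookup with a default. A minimal sketch follows, assuming the feature labels are plain strings; the exact identifiers used by LLMUsageContext are not shown here, so the keys and the model_for helper are illustrative.

```python
MAIN_MODEL = "gpt-5.4"
MINI_MODEL = "gpt-5.4-mini"

# Features listed above as mini-routed; everything else falls through to main.
# The string keys are illustrative labels, not confirmed identifiers.
MINI_FEATURES = {
    "summary",
    "conversation_decision",
    "key_themes",
    "basic_persona_generation",
}


def model_for(feature):
    """Pick the deployment name for a feature."""
    return MINI_MODEL if feature in MINI_FEATURES else MAIN_MODEL


print(model_for("key_themes"))        # prints gpt-5.4-mini
print(model_for("persona_response"))  # prints gpt-5.4
```

Defaulting unknown features to the main model trades cost for quality; defaulting to mini would be the opposite choice.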
Usage & Quota Tracking
llm_usage_context.py wraps LLM calls to record token usage as UsageEvent documents. app/models/quota.py defines per-user monthly USD limits (hard-cap safety net). The API returns HTTP 402 when a user's quota or credit balance is exceeded; src/lib/api.ts catches this and fires a quota_exceeded custom DOM event.
Credit System
credits_balance on the User model, credit_transactions collection as ledger. Atomic deduction via findAndModify with $gte guard. Pricing config in app_settings collection (60s cache). Trial credits granted on registration. Stripe Checkout for credit pack purchases — webhook at /api/billing/webhook.
Costs: persona creation = 2 cr, focus group run = 40 cr. Packs: Starter $49/50cr, Pro $199/220cr, Scale $499/600cr.
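The guarded deduction can be sketched with PyMongo's find_one_and_update, which issues the findAndModify command. The field name credits_balance comes from the text above; the deduct_credits helper and the in-memory FakeUsers class are hypothetical and exist only so the example runs without a database.

```python
def deduct_credits(users, user_id, cost):
    """Deduct cost credits only if the balance covers it; return True on success.

    users is expected to behave like a PyMongo collection. The filter's $gte
    guard and the $inc happen in one server-side operation, so two concurrent
    requests cannot both spend the same credits.
    """
    previous = users.find_one_and_update(
        {"_id": user_id, "credits_balance": {"$gte": cost}},
        {"$inc": {"credits_balance": -cost}},
    )
    return previous is not None


class FakeUsers:
    """In-memory stand-in supporting only the filter/update shapes used above."""

    def __init__(self, docs):
        self.docs = {d["_id"]: d for d in docs}

    def find_one_and_update(self, flt, update):
        doc = self.docs.get(flt["_id"])
        if doc is None or doc["credits_balance"] < flt["credits_balance"]["$gte"]:
            return None
        before = dict(doc)
        doc["credits_balance"] += update["$inc"]["credits_balance"]
        return before


users = FakeUsers([{"_id": "u1", "credits_balance": 3}])
print(deduct_credits(users, "u1", 2))  # prints True  (3 -> 1)
print(deduct_credits(users, "u1", 2))  # prints False (1 < 2, balance untouched)
```

A failed deduction is the natural point to return the HTTP 402 described under Usage & Quota Tracking.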
Authentication
Custom JWT: app/auth/quart_jwt.py (not Flask-JWT-Extended — incompatible with Quart async). Email + password only (bcrypt). No SSO/Microsoft/MSAL. JWT stored in localStorage as auth_token; src/lib/api.ts attaches as Bearer and checks expiry before every request.
Code Style
- TypeScript with strictNullChecks: false
- @/ alias maps to src/
- Asset URLs: always ${import.meta.env.BASE_URL}asset.png (base is /)
- Error feedback: sonner toast library (src/lib/toast.ts wrapper)
File Organisation
backend/
app/
routes/ auth, personas, focus_groups, ai_personas, focus_group_ai,
folders, tasks, admin, usage, billing
services/ llm_service, ai_runner_service, task_manager,
autonomous_conversation_controller, conversation_*,
focus_group_*, persona_*, image_description_service,
llm_usage_context, customer_data_service, stripe_service
models/ User, Persona, FocusGroup, Folder, UsageEvent, Quota,
ModelPricing, AppSettings, CreditTransaction
auth/ quart_jwt.py — custom Quart-compatible JWT
utils/ prompt_loader.py, discussion_guide_schema.py, rate_limiter.py
prompts/ 20 markdown LLM prompt templates
websocket_manager_async.py room-based async WebSocket manager
extensions.py socketio.AsyncServer singleton
src/
pages/ Dashboard, FocusGroups, FocusGroupSession, Login,
SyntheticUsers, Admin, MyUsage, Billing
components/
focus-group-session/ DiscussionPanel, ParticipantPanel, ThemesPanel,
AutonomousDashboard, DiscussionGuideViewer, …
persona/ PersonaEditor, PersonaProfile, PersonaModificationModal
admin/ UsersTab, UsageTab, PricingTab, AnalyticsTab, CreditSettingsTab
ui/ shadcn-ui primitives + custom: GenerationProgressBar,
BulkExportProgressModal, MentionInput
contexts/ AuthContext, WebSocketContextNew, NavigationContext
hooks/ useTaskPolling, useWebSocket, usePersonaStorage,
useDiscussionGuideGeneration, useCancellableGeneration, …
lib/ api.ts (all API calls), taskPolling.ts, taskCancellation.ts
types/ persona.ts, cancellable.ts
utils/ avatarUtils, discussionGuideMarkdown, mentionUtils
Environment Configuration
| Setting | Development | Production |
|---|---|---|
| Base path | / | / |
| API base | /api (proxied to 5137) | /api (Traefik routes to backend) |
| WebSocket path | /socket.io/ | /socket.io/ |
Frontend: copy .env.development or .env.production to .env.
Backend (backend/.env — required keys, see backend/.env.example):
MONGO_URI=mongodb://localhost:27017/cohorta_db
SECRET_KEY=<random 32-byte hex>
JWT_SECRET_KEY=<random 32-byte hex>
AZURE_AI_ENDPOINT=https://aipmress-ai-n8n.services.ai.azure.com/api/projects/aipmress-ai-n8n-OVH/openai/v1/
AZURE_AI_API_KEY=<rotated key from Azure portal>
AZURE_AI_MODEL_MAIN=gpt-5.4
AZURE_AI_MODEL_MINI=gpt-5.4-mini
STRIPE_SECRET_KEY=<from Stripe dashboard>
STRIPE_WEBHOOK_SECRET=<from Stripe dashboard>
CORS_ALLOWED_ORIGINS=http://localhost:5173 # comma-separated in production
Generate secrets: python3 -c "import secrets; print(secrets.token_hex(32))"
Startup throws RuntimeError for any missing or weak-default secret/API key.
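A fail-fast check of this kind might look like the sketch below. It is an assumption-laden illustration, not the project's startup code: the validate_env name, the set of weak values, and the 32-character minimum are all hypothetical, and only the key names are taken from the list above.

```python
# Secrets and API keys from the list above that must be present.
REQUIRED_KEYS = [
    "SECRET_KEY",
    "JWT_SECRET_KEY",
    "AZURE_AI_API_KEY",
    "STRIPE_SECRET_KEY",
    "STRIPE_WEBHOOK_SECRET",
]
# Keys generated locally, where a weak default is plausible.
SECRET_KEYS = {"SECRET_KEY", "JWT_SECRET_KEY"}
# Hypothetical examples of weak defaults; the real list is not given above.
WEAK_VALUES = {"changeme", "secret", "dev", "default"}


def validate_env(env):
    """Raise RuntimeError listing every missing or weak setting."""
    problems = []
    for key in REQUIRED_KEYS:
        value = env.get(key, "").strip()
        if not value:
            problems.append(f"{key} is missing")
        elif key in SECRET_KEYS and (
            value.lower() in WEAK_VALUES or len(value) < 32
        ):
            problems.append(f"{key} looks like a weak default")
    if problems:
        raise RuntimeError("Invalid configuration: " + "; ".join(problems))


try:
    validate_env({"SECRET_KEY": "changeme"})
except RuntimeError as exc:
    print(exc)
```

Collecting every problem before raising lets a fresh checkout be fixed in one pass instead of one restart per missing key.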
Deployment
Production target: cohorta.ai-impress.com on aimpress (OVH) server via Traefik.
# Phase 6: Docker Compose + Traefik at /opt/03-business/cohorta/
docker compose up -d
Manual production backend start:
cd backend && source venv/bin/activate
hypercorn "app:create_app()" --bind 0.0.0.0:5137