cohorta/CLAUDE.md
Vadym Samoilenko aed5c5d7a4
chore: update docs to trigger CI/CD test
2026-05-23 18:59:28 +01:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Commands

Frontend

  • Dev server: npm run dev — Vite on port 5173, proxies /api to localhost:5137
  • Production build: npm run build
  • Dev build: npm run build:dev
  • Lint: npm run lint

Backend

  • Start: cd backend && source venv/bin/activate && python run.py — Hypercorn ASGI on port 5137
  • Both at once: ./start.sh

Backend sanity checks (after modifying Python files)

source backend/venv/bin/activate
python -c "import app.services.<module_name>"
python -c "from app import create_app; create_app()"

Docker (production-style)

# Build frontend and copy to web root
docker compose --profile build up frontend

# Run MongoDB + backend
docker compose up mongo backend

Architecture Overview

ASGI Stack (critical detail)

create_app() returns a socketio.ASGIApp wrapping the Quart app — not the Quart app itself. Access asgi_app.quart_app for the inner Quart instance. This distinction matters in ASGI middleware and anywhere you access app.config directly.
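
A minimal sketch of the unwrapping pattern described above, using stand-in classes rather than the real Quart and python-socketio objects. FakeQuartApp, FakeASGIWrapper and get_config are hypothetical names; only create_app() and the quart_app attribute come from this file.

```python
class FakeQuartApp:
    """Hypothetical stand-in for the inner Quart application."""

    def __init__(self):
        self.config = {"DEBUG": False}


class FakeASGIWrapper:
    """Hypothetical stand-in for the socketio.ASGIApp returned by create_app()."""

    def __init__(self, quart_app):
        self.quart_app = quart_app  # inner app exposed as an attribute


def create_app():
    return FakeASGIWrapper(FakeQuartApp())


def get_config(asgi_app):
    # Unwrap first: the wrapper itself has no .config attribute.
    inner = getattr(asgi_app, "quart_app", asgi_app)
    return inner.config
```

The getattr fallback lets the same helper accept either the wrapper or a bare app, which can be handy in tests.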

Real-Time Communication

Socket.IO via python-socketio AsyncServer (ASGI mode). Frontend: WebSocketContextNew.tsx context → websocketServiceNew.ts. Backend: websocket_manager_async.py manages room-based messaging per focus group session.

VITE_ENABLE_WEBSOCKET is hardcoded by vite.config.ts to true in dev and false in production — it is not controlled by .env.

At startup ws_mgr.set_main_loop(asyncio.get_running_loop()) must be called (done in before_serving) so cross-thread emits from the AI Runner land on the correct loop.
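
The reason for capturing the loop at startup can be sketched with the standard library alone. LoopBoundEmitter and emit_from_thread are hypothetical names, not the actual API of websocket_manager_async.py; the sketch only assumes that cross-thread emits are scheduled with asyncio.run_coroutine_threadsafe.

```python
import asyncio
import threading


class LoopBoundEmitter:
    """Hypothetical stand-in for the WebSocket manager's loop handling."""

    def __init__(self):
        self._main_loop = None
        self.sent = []

    def set_main_loop(self, loop):
        self._main_loop = loop

    async def _emit(self, event, data):
        # Runs on the main loop; a real manager would call the Socket.IO server here.
        self.sent.append((event, data))

    def emit_from_thread(self, event, data, timeout=5.0):
        if self._main_loop is None:
            raise RuntimeError("set_main_loop() was not called at startup")
        future = asyncio.run_coroutine_threadsafe(
            self._emit(event, data), self._main_loop
        )
        return future.result(timeout)  # block the worker thread until delivered


async def demo():
    emitter = LoopBoundEmitter()
    emitter.set_main_loop(asyncio.get_running_loop())
    worker = threading.Thread(
        target=emitter.emit_from_thread, args=("message", {"text": "hi"})
    )
    worker.start()
    # Join in an executor so the main loop stays free to run the scheduled coroutine.
    await asyncio.get_running_loop().run_in_executor(None, worker.join)
    return emitter.sent
```

Without the stored loop reference, the worker thread has no safe way to reach the loop that owns the client connections.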

AI Runner + Threading (Motor event-loop affinity)

ai_runner_service.py is a singleton owning a dedicated OS thread with a single asyncio event loop. All autonomous AI conversations run there.

  • The AI runner creates its own AsyncIOMotorClient bound to that thread's loop.
  • Regular API routes use synchronous PyMongo (from app/db.py).
  • Never share Motor clients between the AI runner thread and the ASGI/Quart thread.
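
The thread-plus-loop ownership described above might look roughly like the following. This is a sketch under assumptions, not the contents of ai_runner_service.py: LoopThread, submit and make_loop_bound_resource are hypothetical, and a plain loop reference stands in for the Motor client so the example runs without MongoDB.

```python
import asyncio
import threading


class LoopThread:
    """Hypothetical sketch: one OS thread that owns one asyncio event loop."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def submit(self, coro):
        # Safe to call from any thread; returns a concurrent.futures.Future.
        return asyncio.run_coroutine_threadsafe(coro, self.loop)

    def stop(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def make_loop_bound_resource():
    # Stand-in for constructing a Motor client: create it inside the loop
    # that will use it, so it is bound to that loop and no other.
    return asyncio.get_running_loop()
```

Creating the resource through submit() rather than in the caller's thread is what keeps it tied to the runner's loop.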

Autonomous Conversation Pipeline

ai_runner_service.py          — spawns coroutines on the dedicated loop
autonomous_conversation_controller.py — orchestrates the session
conversation_decision_service.py      — picks next speaker, wraps up
conversation_context_service.py       — maintains history/context window
conversation_state_manager.py         — in-memory state across turns

Task Manager

task_manager.py singleton tracks cancellable asyncio tasks (persona generation, discussion guides, bulk exports). Exposed via /api/tasks. Frontend polls with useTaskPolling.ts / src/lib/taskPolling.ts. A background sweeper cleans up expired tasks.

Long-running AI operations return task_id immediately (HTTP 202); the caller polls /api/tasks/<task_id> for progress. aiPersonasApi.generatePersonasFull is the canonical example — 10 s timeout on the kick-off call, then polling.
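
The kick-off-then-poll flow can be illustrated with a small in-memory registry. This is a sketch only: TaskRegistry, its state names and poll() are hypothetical and do not reflect the real task_manager.py or the /api/tasks response shape, and the expiry sweeper is left out.

```python
import asyncio
import uuid


class TaskRegistry:
    """Hypothetical in-memory registry; not the project's task_manager.py."""

    def __init__(self):
        self._tasks = {}

    def start(self, coro):
        task_id = uuid.uuid4().hex
        self._tasks[task_id] = asyncio.create_task(coro)
        return task_id  # a route would return this with HTTP 202

    def status(self, task_id):
        task = self._tasks[task_id]
        if not task.done():
            return {"state": "running"}
        if task.cancelled():  # check before exception(), which raises if cancelled
            return {"state": "cancelled"}
        if task.exception() is not None:
            return {"state": "failed", "error": str(task.exception())}
        return {"state": "completed", "result": task.result()}

    def cancel(self, task_id):
        return self._tasks[task_id].cancel()


async def poll(registry, task_id, interval=0.01, max_polls=500):
    # Mirrors what a polling client does against /api/tasks/<task_id>.
    for _ in range(max_polls):
        status = registry.status(task_id)
        if status["state"] != "running":
            return status
        await asyncio.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish")


async def demo():
    registry = TaskRegistry()

    async def slow_job():
        await asyncio.sleep(0.05)
        return {"personas": 3}

    finished = await poll(registry, registry.start(slow_job()))
    cancelled_id = registry.start(asyncio.sleep(10))
    registry.cancel(cancelled_id)
    return finished, await poll(registry, cancelled_id)
```

Returning the id immediately keeps the HTTP request short, while the bounded poll loop gives the caller a clear timeout instead of waiting indefinitely.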

Persona Generation — Two-Stage Pipeline

  1. Stage 1 (/ai-personas/generate-basic-profiles) — generates lightweight profiles from an audience brief; returns task_id immediately.
  2. Stage 2 (/ai-personas/complete-and-save-persona) — runs in parallel per profile to add full psychographic/behavioral detail and persist to MongoDB.

aiPersonasApi.batchGenerateWithStages in src/lib/api.ts orchestrates this client-side via Promise.allSettled; partial success (some personas fail) is handled gracefully.
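
The project does this orchestration in TypeScript with Promise.allSettled. For readers more at home on the backend, a rough Python analogue of the same partial-success handling uses asyncio.gather with return_exceptions=True. complete_persona and complete_all are hypothetical names, and the failure is simulated.

```python
import asyncio


async def complete_persona(profile):
    # Hypothetical stage-2 call; fails for one profile to show partial success.
    await asyncio.sleep(0)
    if profile["name"] == "bad":
        raise ValueError("LLM returned malformed JSON")
    return {**profile, "detail": "full"}


async def complete_all(profiles):
    # return_exceptions=True collects failures as values instead of raising
    # the first one, which is roughly what Promise.allSettled provides.
    results = await asyncio.gather(
        *(complete_persona(p) for p in profiles), return_exceptions=True
    )
    saved = [r for r in results if not isinstance(r, BaseException)]
    failed = [
        {"profile": p, "error": str(r)}
        for p, r in zip(profiles, results)
        if isinstance(r, BaseException)
    ]
    return saved, failed
```

Pairing each failure with its input profile makes it possible to retry only the personas that did not complete.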

LLM Integration

llm_service.py creates fresh clients per call — avoids event-loop mismatch in ASGI. Default model: gpt-5.4 (Azure AI Foundry via OpenAI-compatible endpoint). Mini tasks route to gpt-5.4-mini. Prompts are markdown templates in backend/prompts/ loaded by prompt_loader.py.

Azure endpoint: https://aipmress-ai-n8n.services.ai.azure.com/api/projects/aipmress-ai-n8n-OVH/openai/v1/
Both models are deployed and share the same base URL. AZURE_AI_API_KEY is required at startup.

Mini-routed features (via LLMUsageContext): summary, conversation_decision, key_themes, basic persona generation.
Main-routed features: persona_response, moderator, detailed persona gen/modification.
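
The routing rule amounts to a lookup from feature name to model. The sketch below is illustrative only: pick_model and MINI_FEATURES are hypothetical, and the key basic_persona_generation is an assumed spelling of "basic persona generation", not a confirmed identifier from llm_usage_context.py.

```python
MODEL_MAIN = "gpt-5.4"
MODEL_MINI = "gpt-5.4-mini"

# Feature names taken from the lists above; the exact keys are assumptions.
MINI_FEATURES = {
    "summary",
    "conversation_decision",
    "key_themes",
    "basic_persona_generation",
}


def pick_model(feature):
    # Anything not explicitly marked as a mini task goes to the main model.
    return MODEL_MINI if feature in MINI_FEATURES else MODEL_MAIN
```

Defaulting unknown features to the main model errs on the side of quality; the opposite default would err on the side of cost.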

Usage & Quota Tracking

llm_usage_context.py wraps LLM calls to record token usage as UsageEvent documents. app/models/quota.py defines per-user monthly USD limits (hard-cap safety net). The API returns HTTP 402 when a user's quota or credit balance is exceeded; src/lib/api.ts catches this and fires a quota_exceeded custom DOM event.

Credit System

credits_balance on the User model, credit_transactions collection as ledger. Atomic deduction via findAndModify with $gte guard. Pricing config in app_settings collection (60s cache). Trial credits granted on registration. Stripe Checkout for credit pack purchases — webhook at /api/billing/webhook.

Costs: persona creation = 2 cr, focus group run = 40 cr. Packs: Starter $49/50cr, Pro $199/220cr, Scale $499/600cr.
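
The guarded deduction can be sketched as a single find_one_and_update call, which is PyMongo's equivalent of findAndModify. deduct_credits is a hypothetical helper, FakeUsers is an in-memory stand-in so the example runs without MongoDB, and writing the ledger entry to credit_transactions is omitted.

```python
def deduct_credits(users, user_id, cost):
    """Atomically subtract `cost` credits; return False if the balance is too low.

    `users` is assumed to be a PyMongo-style collection. The $gte guard lives in
    the filter, so the check and the decrement happen in one server-side operation.
    """
    previous = users.find_one_and_update(
        {"_id": user_id, "credits_balance": {"$gte": cost}},
        {"$inc": {"credits_balance": -cost}},
    )
    return previous is not None


class FakeUsers:
    """Tiny in-memory stand-in that understands only the query shape used above."""

    def __init__(self, docs):
        self.docs = {d["_id"]: dict(d) for d in docs}

    def find_one_and_update(self, query, update):
        doc = self.docs.get(query["_id"])
        if doc is None or doc["credits_balance"] < query["credits_balance"]["$gte"]:
            return None  # no document matched the guard, nothing is modified
        before = dict(doc)
        doc["credits_balance"] += update["$inc"]["credits_balance"]
        return before
```

With a balance of 50 credits, one 40-credit focus group run succeeds and a second is refused, leaving the balance at 10 rather than going negative.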

Authentication

Custom JWT: app/auth/quart_jwt.py (not Flask-JWT-Extended — incompatible with Quart async). Email + password only (bcrypt). No SSO/Microsoft/MSAL. JWT stored in localStorage as auth_token; src/lib/api.ts attaches as Bearer and checks expiry before every request.
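
The pre-request expiry check lives in the TypeScript client, but the logic is the same in any language: decode the payload segment and compare exp with the current time. The Python sketch below assumes a standard three-part JWT with a numeric exp claim. jwt_expired and make_unsigned_token are hypothetical helpers, and the check deliberately does not verify the signature, which remains the server's job.

```python
import base64
import json
import time


def jwt_expired(token, now=None, leeway=0):
    """Return True if the token's exp claim has passed. Does NOT verify the signature."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    now = time.time() if now is None else now
    return payload["exp"] <= now + leeway


def make_unsigned_token(exp):
    # Demo helper only: builds a structurally valid token with a dummy signature.
    def enc(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    return f"{enc({'alg': 'HS256', 'typ': 'JWT'})}.{enc({'exp': exp})}.sig"
```

A positive leeway treats tokens that are about to expire as already expired, which avoids sending a request that would fail in flight.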

Code Style

  • TypeScript with strictNullChecks: false
  • @/ alias maps to src/
  • Asset URLs: always ${import.meta.env.BASE_URL}asset.png — base is /
  • Error feedback: sonner toast library (src/lib/toast.ts wrapper)

File Organisation

backend/
  app/
    routes/       auth, personas, focus_groups, ai_personas, focus_group_ai,
                  folders, tasks, admin, usage, billing
    services/     llm_service, ai_runner_service, task_manager,
                  autonomous_conversation_controller, conversation_*,
                  focus_group_*, persona_*, image_description_service,
                  llm_usage_context, customer_data_service, stripe_service
    models/       User, Persona, FocusGroup, Folder, UsageEvent, Quota,
                  ModelPricing, AppSettings, CreditTransaction
    auth/         quart_jwt.py — custom Quart-compatible JWT
    utils/        prompt_loader.py, discussion_guide_schema.py, rate_limiter.py
    prompts/      20 markdown LLM prompt templates
    websocket_manager_async.py   room-based async WebSocket manager
    extensions.py                socketio.AsyncServer singleton

src/
  pages/          Dashboard, FocusGroups, FocusGroupSession, Login,
                  SyntheticUsers, Admin, MyUsage, Billing
  components/
    focus-group-session/  DiscussionPanel, ParticipantPanel, ThemesPanel,
                          AutonomousDashboard, DiscussionGuideViewer, …
    persona/              PersonaEditor, PersonaProfile, PersonaModificationModal
    admin/                UsersTab, UsageTab, PricingTab, AnalyticsTab, CreditSettingsTab
    ui/                   shadcn-ui primitives + custom: GenerationProgressBar,
                          BulkExportProgressModal, MentionInput
  contexts/       AuthContext, WebSocketContextNew, NavigationContext
  hooks/          useTaskPolling, useWebSocket, usePersonaStorage,
                  useDiscussionGuideGeneration, useCancellableGeneration, …
  lib/            api.ts (all API calls), taskPolling.ts, taskCancellation.ts
  types/          persona.ts, cancellable.ts
  utils/          avatarUtils, discussionGuideMarkdown, mentionUtils

Environment Configuration

Setting          Development              Production
Base path        /                        /
API base         /api (proxied to 5137)   /api (Traefik routes to backend)
WebSocket path   /socket.io/              /socket.io/

Frontend: copy .env.development or .env.production to .env.

Backend (backend/.env — required keys, see backend/.env.example):

MONGO_URI=mongodb://localhost:27017/cohorta_db
SECRET_KEY=<random 32-byte hex>
JWT_SECRET_KEY=<random 32-byte hex>
AZURE_AI_ENDPOINT=https://aipmress-ai-n8n.services.ai.azure.com/api/projects/aipmress-ai-n8n-OVH/openai/v1/
AZURE_AI_API_KEY=<rotated key from Azure portal>
AZURE_AI_MODEL_MAIN=gpt-5.4
AZURE_AI_MODEL_MINI=gpt-5.4-mini
STRIPE_SECRET_KEY=<from Stripe dashboard>
STRIPE_WEBHOOK_SECRET=<from Stripe dashboard>
CORS_ALLOWED_ORIGINS=http://localhost:5173   # comma-separated in production

Generate secrets: python3 -c "import secrets; print(secrets.token_hex(32))"

Startup raises RuntimeError for any missing or weak-default secret/API key.
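
A fail-fast check of this kind might look like the sketch below. It is an assumption about the general shape, not the backend's actual code: validate_secrets, REQUIRED_KEYS and the WEAK_VALUES list are all hypothetical.

```python
REQUIRED_KEYS = ("SECRET_KEY", "JWT_SECRET_KEY", "AZURE_AI_API_KEY")

# Hypothetical examples of weak defaults; the real list lives in the backend.
WEAK_VALUES = {"", "changeme", "secret", "dev"}


def validate_secrets(env):
    """Raise RuntimeError naming every required key that is missing or weak."""
    problems = [
        key
        for key in REQUIRED_KEYS
        if env.get(key, "").strip().lower() in WEAK_VALUES
    ]
    if problems:
        raise RuntimeError("Missing or weak configuration: " + ", ".join(problems))
```

Calling validate_secrets(os.environ) before the app is built surfaces every problem in one error message instead of one per restart.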

Deployment

Production target: cohorta.ai-impress.com on aimpress (OVH) server via Traefik.

# Phase 6: Docker Compose + Traefik at /opt/03-business/cohorta/
docker compose up -d

Manual production backend start:

cd backend && source venv/bin/activate
hypercorn "app:create_app()" --bind 0.0.0.0:5137