cohorta/CLAUDE.md

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

### Frontend
- **Dev server**: `npm run dev` — Vite on port 5173, proxies `/api` → `localhost:5137`
- **Production build**: `npm run build`
- **Dev build**: `npm run build:dev`
- **Lint**: `npm run lint`

### Backend
- **Start**: `cd backend && source venv/bin/activate && python run.py` — Hypercorn ASGI on port 5137
- **Both at once**: `./start.sh`

### Backend sanity checks (after modifying Python files)
```bash
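# Run from backend/ so the `app` package is importable
cd backend && source venv/bin/activate
# Import the module you changed to catch syntax and import errors
python -c "import app.services.<module_name>"
# Confirm the app factory still builds
python -c "from app import create_app; create_app()"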
```

### Docker (production-style)
```bash
# Build frontend and copy to web root
docker compose --profile build up frontend

# Run MongoDB + backend
docker compose up mongo backend
```

## Architecture Overview

### ASGI Stack (critical detail)
`create_app()` returns a **`socketio.ASGIApp`** wrapping the Quart app — not the Quart app itself. Access `asgi_app.quart_app` for the inner Quart instance. This distinction matters in ASGI middleware and anywhere you access `app.config` directly.
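
A minimal sketch of code that tolerates either object, assuming the wrapper exposes the inner app as `quart_app` as described above. `resolve_quart_app` and the two stand-in classes are hypothetical names used only for illustration; they are not part of the repository.

```python
from typing import Any


def resolve_quart_app(app: Any) -> Any:
    """Return the inner Quart app if `app` is the ASGI wrapper, else `app` itself."""
    # The object returned by create_app() carries the Quart instance on
    # `.quart_app`; a bare Quart app has no such attribute.
    return getattr(app, "quart_app", app)


# Stand-ins so the sketch runs without Quart or python-socketio installed.
class FakeQuart:
    def __init__(self) -> None:
        self.config = {"TESTING": True}


class FakeASGIWrapper:
    def __init__(self, quart_app: FakeQuart) -> None:
        self.quart_app = quart_app
```

With a helper like this, `resolve_quart_app(asgi_app).config` works whether the caller holds the wrapper or the inner app.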

### Real-Time Communication
Socket.IO via `python-socketio` `AsyncServer` (ASGI mode). Frontend: `WebSocketContextNew.tsx` context → `websocketServiceNew.ts`. Backend: `websocket_manager_async.py` manages room-based messaging per focus group session.

`VITE_ENABLE_WEBSOCKET` is hardcoded in `vite.config.ts` to `true` in dev and `false` in production; it is **not** controlled by `.env`.

At startup `ws_mgr.set_main_loop(asyncio.get_running_loop())` must be called (done in `before_serving`) so cross-thread emits from the AI Runner land on the correct loop.
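
The reason the loop must be captured can be sketched with the standard library alone. This is an illustration of the general pattern, not the code in `websocket_manager_async.py`: `LoopBoundEmitter`, `emit_from_thread`, and `demo` are hypothetical names, and the real manager would await the Socket.IO server's emit instead of appending to a list.

```python
import asyncio
from concurrent.futures import Future
from typing import List, Optional, Tuple


class LoopBoundEmitter:
    """Hypothetical stand-in for the WebSocket manager's loop handling."""

    def __init__(self) -> None:
        self._main_loop: Optional[asyncio.AbstractEventLoop] = None
        self.sent: List[Tuple[str, str]] = []

    def set_main_loop(self, loop: asyncio.AbstractEventLoop) -> None:
        self._main_loop = loop

    async def _emit(self, room: str, message: str) -> None:
        # Placeholder for the real Socket.IO emit, which must run on the main loop.
        self.sent.append((room, message))

    def emit_from_thread(self, room: str, message: str) -> Future:
        """Schedule an emit on the main loop from any other thread."""
        if self._main_loop is None:
            raise RuntimeError("set_main_loop() was not called at startup")
        return asyncio.run_coroutine_threadsafe(
            self._emit(room, message), self._main_loop
        )


async def demo() -> List[Tuple[str, str]]:
    emitter = LoopBoundEmitter()
    emitter.set_main_loop(asyncio.get_running_loop())
    # Simulate the AI Runner thread: the worker blocks on the future while
    # the main loop stays free to run the scheduled emit.
    await asyncio.to_thread(
        lambda: emitter.emit_from_thread("session-1", "hello").result(timeout=5)
    )
    return emitter.sent
```

`asyncio.run_coroutine_threadsafe` is the piece that needs the loop reference; without the stored loop, the worker thread has no safe way to reach it.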

### AI Runner + Threading (Motor event-loop affinity)
`ai_runner_service.py` is a singleton owning a **dedicated OS thread** with a single asyncio event loop. All autonomous AI conversations run there.

- The AI runner creates its own `AsyncIOMotorClient` bound to that thread's loop.
- Regular API routes use synchronous `PyMongo` (from `app/db.py`).
- **Never share Motor clients between the AI runner thread and the ASGI/Quart thread.**
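
The dedicated-thread pattern described above can be sketched as follows. This is a simplified illustration, not the contents of `ai_runner_service.py`: `DedicatedLoopRunner`, `submit`, `stop`, and the thread name `ai-runner` are assumptions made for the example.

```python
import asyncio
import threading
from concurrent.futures import Future
from typing import Any, Coroutine


class DedicatedLoopRunner:
    """Owns one OS thread that runs one asyncio event loop."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._run, name="ai-runner", daemon=True
        )
        self._thread.start()

    def _run(self) -> None:
        asyncio.set_event_loop(self._loop)
        # Loop-bound resources (for example a Motor client) would be created
        # here, on this thread, and never handed to the ASGI thread.
        self._loop.run_forever()

    def submit(self, coro: Coroutine[Any, Any, Any]) -> Future:
        """Schedule a coroutine on the dedicated loop from any thread."""
        return asyncio.run_coroutine_threadsafe(coro, self._loop)

    def stop(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join(timeout=5)
        self._loop.close()


async def current_thread_name() -> str:
    return threading.current_thread().name
```

Submitting `current_thread_name()` and reading the returned future shows the coroutine running on the runner's thread rather than the caller's, which is why anything tied to an event loop has to be created on that side.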

### Autonomous Conversation Pipeline
```
ai_runner_service.py          — spawns coroutines on the dedicated loop
autonomous_conversation_controller.py — orchestrates the session
conversation_decision_service.py      — picks next speaker, wraps up
conversation_context_service.py       — maintains history/context window
conversation_state_manager.py         — in-memory state across turns
```

### Task Manager
`task_manager.py` singleton tracks cancellable asyncio tasks (persona generation, discussion guides, bulk exports). Exposed via `/api/tasks`. Frontend polls with `useTaskPolling.ts` / `src/lib/taskPolling.ts`. A background sweeper cleans up expired tasks.

Long-running AI operations return `task_id` immediately (HTTP 202); the caller polls `/api/tasks/<task_id>` for progress. `aiPersonasApi.generatePersonasFull` is the canonical example — 10 s timeout on the kick-off call, then polling.
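
The return-an-id-then-poll flow can be sketched with plain asyncio. This is a reduced illustration under assumptions, not the real `task_manager.py`: `TaskRegistry`, its method names, and the state strings are hypothetical, and expiry sweeping is left out.

```python
import asyncio
import uuid
from typing import Any, Coroutine, Dict


class TaskRegistry:
    """Hypothetical, simplified registry of cancellable asyncio tasks."""

    def __init__(self) -> None:
        self._tasks: Dict[str, asyncio.Task] = {}

    def start(self, coro: Coroutine[Any, Any, Any]) -> str:
        """Schedule the coroutine and return its id right away (the 202 step)."""
        task_id = uuid.uuid4().hex
        self._tasks[task_id] = asyncio.create_task(coro)
        return task_id

    def status(self, task_id: str) -> Dict[str, Any]:
        """Roughly what a poll of /api/tasks/<task_id> might report."""
        task = self._tasks.get(task_id)
        if task is None:
            return {"state": "not_found"}
        if task.cancelled():
            return {"state": "cancelled"}
        if not task.done():
            return {"state": "running"}
        if task.exception() is not None:
            return {"state": "failed", "error": str(task.exception())}
        return {"state": "completed", "result": task.result()}

    def cancel(self, task_id: str) -> bool:
        task = self._tasks.get(task_id)
        return task is not None and task.cancel()


async def demo() -> Dict[str, Any]:
    registry = TaskRegistry()

    async def generate() -> str:
        await asyncio.sleep(0.01)
        return "3 personas"

    async def never_finishes() -> None:
        await asyncio.sleep(3600)

    done_id = registry.start(generate())
    slow_id = registry.start(never_finishes())
    first_poll = registry.status(done_id)  # polled before the task has run
    registry.cancel(slow_id)
    await asyncio.sleep(0.05)  # let both tasks settle
    return {
        "first_poll": first_poll,
        "final": registry.status(done_id),
        "cancelled": registry.status(slow_id),
    }
```

The caller only ever holds the id, so a slow generation never ties up the HTTP request that started it.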

### Persona Generation — Two-Stage Pipeline
1. **Stage 1** (`/ai-personas/generate-basic-profiles`) — generates lightweight profiles from an audience brief; returns `task_id` immediately.
2. **Stage 2** (`/ai-personas/complete-and-save-persona`) — runs in parallel per profile to add full psychographic/behavioral detail and persist to MongoDB.

`aiPersonasApi.batchGenerateWithStages` in `src/lib/api.ts` orchestrates this client-side via `Promise.allSettled`; partial success (some personas fail) is handled gracefully.
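
The partial-success handling is implemented client-side in TypeScript; the sketch below shows the same idea in Python for illustration only, using `asyncio.gather(..., return_exceptions=True)` as the closest analogue of `Promise.allSettled`. `complete_and_save` and `run_stage_two` are hypothetical stand-ins, and the failure rule (an empty name) is invented for the demo.

```python
import asyncio
from typing import Any, Dict, List


async def complete_and_save(profile: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for one Stage 2 call."""
    await asyncio.sleep(0)
    if not profile.get("name"):
        raise ValueError("profile has no name")
    return {**profile, "saved": True}


async def run_stage_two(profiles: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    # return_exceptions=True keeps one failure from aborting the rest,
    # so every profile gets an outcome, as with Promise.allSettled.
    outcomes = await asyncio.gather(
        *(complete_and_save(p) for p in profiles), return_exceptions=True
    )
    saved = [o for o in outcomes if not isinstance(o, BaseException)]
    failed = [str(o) for o in outcomes if isinstance(o, BaseException)]
    return {"saved": saved, "failed": failed}
```

Splitting the outcomes into `saved` and `failed` lets the caller report the personas that succeeded while still surfacing the ones that did not.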

### LLM Integration
`llm_service.py` creates fresh clients per call — avoids event-loop mismatch in ASGI. Default model: `gpt-5.4` (Azure AI Foundry via OpenAI-compatible endpoint). Mini tasks route to `gpt-5.4-mini`. Prompts are markdown templates in `backend/prompts/` loaded by `prompt_loader.py`.

Azure endpoint: `https://aipmress-ai-n8n.services.ai.azure.com/api/projects/aipmress-ai-n8n-OVH/openai/v1/`
Both models are deployed behind the same base URL. `AZURE_AI_API_KEY` is required at startup.

Mini-routed features (via `LLMUsageContext`): `summary`, `conversation_decision`, `key_themes`, basic persona generation.
Main-routed features: `persona_response`, `moderator`, detailed persona gen/modification.
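
The routing rule above amounts to a lookup, sketched here under assumptions. `model_for` is a hypothetical helper, `basic_persona_generation` is an assumed identifier for "basic persona generation", and falling back to the main model for unlisted features is a guess rather than documented behaviour.

```python
MAIN_MODEL = "gpt-5.4"
MINI_MODEL = "gpt-5.4-mini"

# Feature names as listed above; "basic_persona_generation" is an assumed
# identifier, not one confirmed by the codebase.
MINI_FEATURES = frozenset(
    {"summary", "conversation_decision", "key_themes", "basic_persona_generation"}
)


def model_for(feature: str) -> str:
    """Pick the deployment name for a feature; unlisted features use the main model."""
    return MINI_MODEL if feature in MINI_FEATURES else MAIN_MODEL
```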

### Usage & Quota Tracking
`llm_usage_context.py` wraps LLM calls to record token usage as `UsageEvent` documents. `app/models/quota.py` defines per-user monthly USD limits (hard-cap safety net). The API returns HTTP **402** when a user's quota or credit balance is exceeded; `src/lib/api.ts` catches this and fires a `quota_exceeded` custom DOM event.
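
A rough sketch of the refusal logic, assuming the two conditions are checked independently. `check_spend`, the response body keys, and the choice to refuse once spend reaches the limit are all assumptions for illustration; only the 402 status comes from the description above.

```python
from typing import Any, Dict, Optional, Tuple

PAYMENT_REQUIRED = 402


def check_spend(
    spent_usd: float,
    monthly_limit_usd: float,
    credits_balance: int,
    cost_credits: int,
) -> Optional[Tuple[int, Dict[str, Any]]]:
    """Return (status, body) when the request must be refused, else None."""
    if spent_usd >= monthly_limit_usd:
        # Hard-cap safety net: monthly USD quota reached.
        return PAYMENT_REQUIRED, {"error": "quota_exceeded", "reason": "monthly_usd_limit"}
    if credits_balance < cost_credits:
        return PAYMENT_REQUIRED, {"error": "quota_exceeded", "reason": "insufficient_credits"}
    return None
```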

### Credit System
`credits_balance` on the `User` model, `credit_transactions` collection as ledger. Atomic deduction via `findAndModify` with `$gte` guard. Pricing config in `app_settings` collection (60s cache). Trial credits granted on registration. Stripe Checkout for credit pack purchases — webhook at `/api/billing/webhook`.
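
The guarded deduction can be sketched as a single filter-and-update pair. This is an illustration under assumptions, not the repository's implementation: `build_deduction` and `deduct_credits` are hypothetical names, the ledger write to `credit_transactions` is omitted, and `FakeUsers` exists only so the sketch runs without MongoDB. With PyMongo, the equivalent call is `Collection.find_one_and_update`.

```python
from typing import Any, Dict, Optional, Tuple


def build_deduction(user_id: Any, cost: int) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Filter and update for a guarded, single-document credit deduction."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    # The $gte guard makes the match fail when the balance is too low, so the
    # decrement can never push credits_balance below zero.
    query = {"_id": user_id, "credits_balance": {"$gte": cost}}
    update = {"$inc": {"credits_balance": -cost}}
    return query, update


def deduct_credits(users: Any, user_id: Any, cost: int) -> bool:
    """Return True if credits were deducted, False if the balance was too low."""
    query, update = build_deduction(user_id, cost)
    # find_one_and_update returns the matched document, or None on no match.
    return users.find_one_and_update(query, update) is not None


class FakeUsers:
    """In-memory stand-in that mimics the one collection method used above."""

    def __init__(self, balance: int) -> None:
        self.balance = balance

    def find_one_and_update(
        self, query: Dict[str, Any], update: Dict[str, Any]
    ) -> Optional[Dict[str, Any]]:
        if self.balance < query["credits_balance"]["$gte"]:
            return None
        before = {"_id": query["_id"], "credits_balance": self.balance}
        self.balance += update["$inc"]["credits_balance"]
        return before
```

Because the balance check and the decrement happen in one server-side operation, two concurrent requests cannot both spend the same credits.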

Costs: persona creation = 2 cr, focus group run = 40 cr. Packs: Starter $49/50cr, Pro $199/220cr, Scale $499/600cr.
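
As a quick worked example of what those figures imply, the sketch below derives the price per credit and the number of whole focus group runs each pack covers. The helper names are hypothetical; the numbers are the ones listed above.

```python
# name -> (price in USD, credits included), as listed above
PACKS = {"Starter": (49, 50), "Pro": (199, 220), "Scale": (499, 600)}
PERSONA_COST = 2
FOCUS_GROUP_COST = 40


def usd_per_credit(pack: str) -> float:
    """Pack price divided by credits, rounded to three decimals."""
    usd, credits = PACKS[pack]
    return round(usd / credits, 3)


def focus_groups_covered(pack: str) -> int:
    """Whole focus group runs a pack pays for if spent on nothing else."""
    return PACKS[pack][1] // FOCUS_GROUP_COST


def personas_covered(pack: str) -> int:
    """Persona creations a pack pays for if spent on nothing else."""
    return PACKS[pack][1] // PERSONA_COST
```

On these numbers a Starter pack works out to $0.98 per credit and covers one focus group run, while Scale works out to roughly $0.83 per credit and covers fifteen.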

### Authentication
Custom JWT: `app/auth/quart_jwt.py` (not Flask-JWT-Extended, which is incompatible with Quart's async model). Email + password only (bcrypt). No SSO/Microsoft/MSAL. The JWT is stored in `localStorage` as `auth_token`; `src/lib/api.ts` attaches it as a Bearer token and checks expiry before every request.
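
The pre-request expiry check lives in the TypeScript client; the sketch below shows the same idea in Python purely as an illustration. It reads the `exp` claim without verifying the signature, which is enough for a client-side "is this token stale" test but never for server-side authentication. All function names here are hypothetical.

```python
import base64
import json
import time
from typing import Any, Dict, Optional


def jwt_expiry(token: str) -> Optional[float]:
    """Read the `exp` claim from a JWT payload without verifying the signature."""
    try:
        payload_b64 = token.split(".")[1]
        # base64url payloads omit padding; restore it before decoding.
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        payload = json.loads(base64.urlsafe_b64decode(padded))
        return float(payload["exp"])
    except (IndexError, KeyError, TypeError, ValueError):
        return None


def is_expired(token: str, now: Optional[float] = None) -> bool:
    """Treat malformed tokens and tokens without `exp` as expired."""
    exp = jwt_expiry(token)
    current = time.time() if now is None else now
    return exp is None or exp <= current


def make_unsigned_token(exp: float) -> str:
    """Build a structurally valid token for the demo; the signature is a dummy."""

    def encode(obj: Dict[str, Any]) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    return f"{encode({'alg': 'HS256', 'typ': 'JWT'})}.{encode({'exp': exp})}.sig"
```

Treating anything unreadable as expired keeps the failure mode simple: the client asks the user to log in again rather than sending a token the backend will reject.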

## Code Style

- TypeScript with `strictNullChecks: false`
- `@/` alias maps to `src/`
- **Asset URLs**: always `${import.meta.env.BASE_URL}asset.png` — base is `/`
- Error feedback: `sonner` toast library (`src/lib/toast.ts` wrapper)

## File Organisation

```
backend/
  app/
    routes/       auth, personas, focus_groups, ai_personas, focus_group_ai,
                  folders, tasks, admin, usage, billing
    services/     llm_service, ai_runner_service, task_manager,
                  autonomous_conversation_controller, conversation_*,
                  focus_group_*, persona_*, image_description_service,
                  llm_usage_context, customer_data_service, stripe_service
    models/       User, Persona, FocusGroup, Folder, UsageEvent, Quota,
                  ModelPricing, AppSettings, CreditTransaction
    auth/         quart_jwt.py — custom Quart-compatible JWT
    utils/        prompt_loader.py, discussion_guide_schema.py, rate_limiter.py
    prompts/      20 markdown LLM prompt templates
    websocket_manager_async.py   room-based async WebSocket manager
    extensions.py                socketio.AsyncServer singleton

src/
  pages/          Dashboard, FocusGroups, FocusGroupSession, Login,
                  SyntheticUsers, Admin, MyUsage, Billing
  components/
    focus-group-session/  DiscussionPanel, ParticipantPanel, ThemesPanel,
                          AutonomousDashboard, DiscussionGuideViewer, …
    persona/              PersonaEditor, PersonaProfile, PersonaModificationModal
    admin/                UsersTab, UsageTab, PricingTab, AnalyticsTab, CreditSettingsTab
    ui/                   shadcn-ui primitives + custom: GenerationProgressBar,
                          BulkExportProgressModal, MentionInput
  contexts/       AuthContext, WebSocketContextNew, NavigationContext
  hooks/          useTaskPolling, useWebSocket, usePersonaStorage,
                  useDiscussionGuideGeneration, useCancellableGeneration, …
  lib/            api.ts (all API calls), taskPolling.ts, taskCancellation.ts
  types/          persona.ts, cancellable.ts
  utils/          avatarUtils, discussionGuideMarkdown, mentionUtils
```

## Environment Configuration

| Setting | Development | Production |
|---------|-------------|------------|
| Base path | `/` | `/` |
| API base | `/api` (proxied to 5137) | `/api` (Traefik routes to backend) |
| WebSocket path | `/socket.io/` | `/socket.io/` |

**Frontend**: copy `.env.development` or `.env.production` to `.env`.

**Backend** (`backend/.env` — required keys, see `backend/.env.example`):
```
MONGO_URI=mongodb://localhost:27017/cohorta_db
SECRET_KEY=<random 32-byte hex>
JWT_SECRET_KEY=<random 32-byte hex>
AZURE_AI_ENDPOINT=https://aipmress-ai-n8n.services.ai.azure.com/api/projects/aipmress-ai-n8n-OVH/openai/v1/
AZURE_AI_API_KEY=<rotated key from Azure portal>
AZURE_AI_MODEL_MAIN=gpt-5.4
AZURE_AI_MODEL_MINI=gpt-5.4-mini
STRIPE_SECRET_KEY=<from Stripe dashboard>
STRIPE_WEBHOOK_SECRET=<from Stripe dashboard>
CORS_ALLOWED_ORIGINS=http://localhost:5173   # comma-separated in production
```
Generate secrets: `python3 -c "import secrets; print(secrets.token_hex(32))"`

Startup raises `RuntimeError` for any missing or weak-default secret/API key.
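
A minimal sketch of such a startup check, under stated assumptions: the set of keys treated as secrets and the list of weak defaults below are guesses for illustration, and `find_weak_keys` / `validate_secrets` are hypothetical names rather than the backend's actual functions.

```python
from typing import List, Mapping

# Assumed subset of the keys above that are treated as secrets.
REQUIRED_KEYS = ("SECRET_KEY", "JWT_SECRET_KEY", "AZURE_AI_API_KEY")
# Assumed examples of weak defaults; the real list lives in the backend.
WEAK_VALUES = frozenset({"", "changeme", "secret", "dev"})


def find_weak_keys(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are missing or set to a known weak value."""
    return [
        key
        for key in REQUIRED_KEYS
        if env.get(key, "").strip().lower() in WEAK_VALUES
    ]


def validate_secrets(env: Mapping[str, str]) -> None:
    """Raise RuntimeError naming every missing or weak required key."""
    bad = find_weak_keys(env)
    if bad:
        raise RuntimeError("Missing or weak configuration: " + ", ".join(bad))
```

Calling `validate_secrets(os.environ)` early in the app factory would make a misconfigured deployment fail fast instead of starting with a guessable key.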

## Deployment

Production target: **`cohorta.ai-impress.com`** on the aimpress (OVH) server, served via Traefik.

```bash
# Phase 6: Docker Compose + Traefik at /opt/03-business/cohorta/
docker compose up -d
```

Manual production backend start:
```bash
cd backend && source venv/bin/activate
hypercorn "app:create_app()" --bind 0.0.0.0:5137
```