- Model renames: gpt-5.2 → gpt-5.4-2026-03-05, gemini-3-pro-preview → gemini-3.1-pro-preview; retire gpt-4.1 via alias fallback
- New: llm_usage_context.py (ContextVar-based attribution), model_pricing.py (tiered pricing + 60s cache), usage_event.py (append-only telemetry), quota.py (user/FG quota enforcement with 80% warning)
- Wire _record_usage into all 3 LLM methods; set_llm_context at every service entry point
- Fix admin_required decorator (was sync, never awaited User.find_by_id); add active_required and with_user_context decorators
- Inject user_id into ContextVar from JWT on every authenticated request
- Add DB indexes for usage_events, model_pricing, users collections
- Seed script for model pricing (gpt-5.4 single-tier, gemini-3.1 two-tier 200k threshold)
- Fix parse_json_response NameError (logger undefined at module level)
- 70 passing tests: conftest.py with sys.modules stubs, test_usage_infrastructure.py (52 tests), rewrite stale test_llm_service.py (18 tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

90 lines
5.3 KiB
Markdown
Executable file
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands
- **Dev Server**: `npm run dev` (port 5173, proxies `/api` → `localhost:5137`)
- **Build**: `npm run build` (use this to verify TypeScript compilation)
- **Dev Build**: `npm run build:dev` (development mode build)
- **Lint**: `npm run lint`
- **Backend**: `cd backend && python run.py` (Hypercorn ASGI on port 5137)

## Backend Testing
After modifying any Python files:
```bash
source backend/venv/bin/activate
python -c "import app.services.module_name"  # Test specific module
python -c "from app import create_app; create_app()"  # Test app creation
```

## Architecture Overview

### ASGI Stack (critical detail)
`create_app()` returns a **`socketio.ASGIApp`** wrapping a Quart app, not the Quart app itself. Accessing `app.quart_app` gives the inner Quart instance. This distinction matters whenever you write ASGI middleware or access app config directly.

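The wrapper/inner-app distinction can be sketched with stand-in classes. This is a minimal illustration, not the project's code: `FakeQuart`, `FakeASGIWrapper`, and the helper `get_quart_app` are hypothetical names, and only the `quart_app` attribute comes from the description above.

```python
from typing import Any


class FakeQuart:
    """Stand-in for the inner Quart app (assumption: only `config` matters here)."""

    def __init__(self) -> None:
        self.config: dict[str, Any] = {"DEBUG": False}


class FakeASGIWrapper:
    """Stand-in for the socketio.ASGIApp returned by create_app()."""

    def __init__(self, quart_app: FakeQuart) -> None:
        # Per the text above, the inner app is reachable as `.quart_app`.
        self.quart_app = quart_app


def get_quart_app(app: Any) -> Any:
    """Return the inner Quart app if `app` is the wrapper, else `app` itself."""
    return getattr(app, "quart_app", app)


wrapper = FakeASGIWrapper(FakeQuart())
inner = get_quart_app(wrapper)
inner.config["DEBUG"] = True  # config lives on the inner app, not the wrapper
```

A helper like this lets code that needs config accept either object, which may be convenient in tests that build the inner app directly.
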
### Real-Time Communication
Socket.IO via `python-socketio` `AsyncServer` (ASGI mode). The `WebSocketContextNew.tsx` context manages the client connection. `websocket_manager_async.py` handles room-based messaging for focus group sessions. At startup, call `ws_mgr.set_main_loop(asyncio.get_running_loop())` so that cross-thread emits from the AI Runner are scheduled on the main event loop rather than on the AI Runner's own loop.

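Why the main loop has to be registered can be shown with a standard-library sketch. `LoopBoundEmitter` and `emit_from_thread` are hypothetical stand-ins for the WebSocket manager; only `set_main_loop` is named in this file, and a real emit would go through the Socket.IO server rather than a list.

```python
import asyncio
from concurrent.futures import Future
from typing import Optional


class LoopBoundEmitter:
    """Hypothetical stand-in for the WebSocket manager's loop handling."""

    def __init__(self) -> None:
        self._main_loop: Optional[asyncio.AbstractEventLoop] = None
        self.sent: list[tuple[str, str]] = []

    def set_main_loop(self, loop: asyncio.AbstractEventLoop) -> None:
        self._main_loop = loop

    async def _emit(self, room: str, message: str) -> None:
        # A real manager would await the Socket.IO server's emit here.
        self.sent.append((room, message))

    def emit_from_thread(self, room: str, message: str) -> Future:
        """Schedule an emit on the main loop from any other thread."""
        if self._main_loop is None:
            raise RuntimeError("set_main_loop() was not called at startup")
        return asyncio.run_coroutine_threadsafe(
            self._emit(room, message), self._main_loop
        )


async def main() -> list[tuple[str, str]]:
    emitter = LoopBoundEmitter()
    emitter.set_main_loop(asyncio.get_running_loop())

    def worker() -> None:
        # Runs off the main loop, like the AI Runner thread.
        emitter.emit_from_thread("session-1", "hello").result(timeout=5)

    # to_thread keeps the main loop free to run the scheduled coroutine.
    await asyncio.to_thread(worker)
    return emitter.sent


result = asyncio.run(main())
```

`asyncio.run_coroutine_threadsafe` is the standard way to hand a coroutine to a loop owned by another thread; without a stored loop reference the worker thread has nothing to target.
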
> `VITE_ENABLE_WEBSOCKET` is hardcoded `true` in dev and `false` in production builds via `vite.config.ts`; it is not controlled by `.env`.

### AI Runner + Threading
`ai_runner_service.py` is a singleton that owns a **dedicated OS thread** with a single asyncio event loop. All autonomous AI conversations run in this thread. This solves Motor (AsyncIOMotorClient) event-loop affinity: Motor clients in the AI runner are bound to that loop, while regular API routes use synchronous PyMongo. Never share Motor clients between the two contexts.

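The thread-plus-loop pattern described above can be sketched with the standard library alone. `LoopThread`, `submit`, and `stop` are hypothetical names, the singleton aspect is omitted, and no Motor client is created; the sketch only shows that submitted coroutines run on the dedicated thread.

```python
import asyncio
import threading
from concurrent.futures import Future
from typing import Any, Coroutine


class LoopThread:
    """One OS thread that owns one asyncio event loop (illustrative sketch)."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def submit(self, coro: Coroutine[Any, Any, Any]) -> Future:
        """Run `coro` on the dedicated loop; callable from any thread."""
        return asyncio.run_coroutine_threadsafe(coro, self.loop)

    def stop(self) -> None:
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join(timeout=5)
        self.loop.close()


async def which_thread() -> str:
    # Loop-bound resources (such as a Motor client) would be created here,
    # so they attach to the dedicated loop rather than the caller's loop.
    return threading.current_thread().name


runner = LoopThread()
thread_name = runner.submit(which_thread()).result(timeout=5)
runner_thread_name = runner._thread.name
runner.stop()
```

Creating loop-bound clients inside coroutines submitted this way is one way to keep them attached to a single loop, which is the affinity constraint the section describes.
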
### Autonomous Conversation Pipeline
1. `ai_runner_service.py`: spawns coroutines on the dedicated thread's event loop
2. `autonomous_conversation_controller.py`: orchestrates the full session
3. `conversation_decision_service.py`: picks the next speaker
4. `conversation_context_service.py`: maintains history/state
5. `conversation_state_manager.py`: in-memory state across turns

### Task Manager
`task_manager.py` is a singleton tracking cancellable asyncio tasks (persona generation, discussion guides, etc.). Tasks are exposed via `/api/tasks` routes. A background sweeper cleans up completed/expired tasks. Frontend polling is handled by `useTaskPolling.ts`.

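A registry of cancellable tasks with a sweep step might look like the following. This is a sketch under assumptions: `TaskRegistry` and its methods are hypothetical, expiry is not modelled, and the sweep is called by hand rather than from a background loop.

```python
import asyncio
import uuid
from typing import Any, Coroutine, Dict


class TaskRegistry:
    """Hypothetical minimal registry of cancellable asyncio tasks."""

    def __init__(self) -> None:
        self._tasks: Dict[str, asyncio.Task] = {}

    def start(self, coro: Coroutine[Any, Any, Any]) -> str:
        """Schedule `coro` on the running loop and return its task id."""
        task_id = uuid.uuid4().hex
        self._tasks[task_id] = asyncio.create_task(coro)
        return task_id

    def cancel(self, task_id: str) -> bool:
        """Request cancellation; False if the id is unknown or already done."""
        task = self._tasks.get(task_id)
        return task is not None and task.cancel()

    def sweep(self) -> int:
        """Drop finished (including cancelled) tasks; return how many were removed."""
        done = [tid for tid, task in self._tasks.items() if task.done()]
        for tid in done:
            del self._tasks[tid]
        return len(done)

    def __len__(self) -> int:
        return len(self._tasks)


async def demo() -> tuple[bool, int, int]:
    registry = TaskRegistry()
    task_id = registry.start(asyncio.sleep(60))  # stands in for a long job
    cancelled = registry.cancel(task_id)
    await asyncio.sleep(0)  # let the cancellation propagate
    removed = registry.sweep()
    return cancelled, removed, len(registry)


outcome = asyncio.run(demo())
```

Returning an opaque id rather than the task object keeps the registry usable from HTTP routes, where a client can only pass a string back when polling or cancelling.
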
### LLM Integration
`llm_service.py` creates fresh clients per call (avoids event-loop mismatch in ASGI). Default model: **Google Gemini** via `google-genai`. Alternative: **OpenAI** (`AsyncOpenAI`). Both `GEMINI_API_KEY` and `OPENAI_API_KEY` must be set as env vars; startup fails if either is missing. Prompts are markdown templates in `/backend/prompts/` loaded by `prompt_loader.py`.

## Code Style
- TypeScript with `strictNullChecks: false`
- Functional components with hooks; local state via hooks, shared state via context/props
- `@/` alias maps to `src/`
- **URL construction**: always use `${import.meta.env.BASE_URL}asset.png`; the production base is `/semblance/`
- Error handling: try/catch + `sonner` toast for user feedback

## File Organization
```
backend/
  app/
    routes/      # Blueprints: auth, personas, focus-groups, ai-personas, focus-group-ai, folders, tasks
    services/    # Business logic: llm_service, ai_runner_service, task_manager, autonomous_*, conversation_*
    models/      # Data models: User, FocusGroup, Persona, Folder
    auth/        # Auth utilities (JWT helpers)
  prompts/       # LLM prompt markdown templates
  websocket_manager_async.py  # Room-based async WebSocket manager
  extensions.py               # socketio.AsyncServer singleton

src/
  pages/         # Route-level components (Dashboard, FocusGroups, FocusGroupSession, Login, SyntheticUsers)
  components/
    focus-group-session/  # Session UI panels (Discussion, Participant, Themes, etc.)
    persona/              # Persona management components
    ui/                   # shadcn-ui primitives
  contexts/      # AuthContext, WebSocketContextNew, NavigationContext
  hooks/         # useTaskPolling, useWebSocket, usePersonaStorage, useDiscussionGuideGeneration, etc.
  types/         # TypeScript type definitions
```

## Environment Configuration

| Setting | Development | Production |
|---------|-------------|------------|
| Base path | `/` | `/semblance/` |
| API base | `/api` (proxied to 5137) | `https://optical-dev.oliver.solutions/semblance_back/api` |
| WebSocket path | `/socket.io/` | `/semblance_back/socket.io/` |
| MSAL redirect | `http://localhost:5173/` | `https://optical-dev.oliver.solutions/semblance` |

Setup: copy `.env.development` or `.env.production` to `.env`. Backend requires `backend/.env` with `SECRET_KEY`, `JWT_SECRET_KEY`, `GEMINI_API_KEY`, `OPENAI_API_KEY`; startup raises `RuntimeError` if any are missing or use weak defaults.

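The fail-fast check on required settings could be approximated as below. The function name `validate_env` and the contents of `WEAK_VALUES` are assumptions made for illustration; this file does not show which default values the backend actually rejects.

```python
from typing import Iterable, Mapping

REQUIRED_KEYS = ("SECRET_KEY", "JWT_SECRET_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY")

# Assumption: the real list of rejected defaults is not shown in this file.
WEAK_VALUES = frozenset({"", "changeme", "secret", "dev"})


def validate_env(env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS) -> None:
    """Raise RuntimeError naming every missing or weak required setting."""
    bad = [
        key
        for key in required
        if env.get(key, "").strip().lower() in WEAK_VALUES
    ]
    if bad:
        raise RuntimeError(f"Missing or weak settings: {', '.join(bad)}")


good_env = {key: f"value-for-{key}" for key in REQUIRED_KEYS}
validate_env(good_env)  # passes silently

error_message = ""
try:
    validate_env({**good_env, "JWT_SECRET_KEY": "changeme", "OPENAI_API_KEY": ""})
except RuntimeError as exc:
    error_message = str(exc)
```

In an application the mapping passed in would typically be `os.environ`; collecting every offending key before raising means one restart is enough to see all configuration problems.
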
## Knowledge Wiki
A cross-project knowledge base is maintained automatically from all Claude Code sessions.
- **Index:** `/Users/ai_leed/Library/Mobile Documents/iCloud~md~obsidian/Documents/VadymSamoilenko/wiki/index.md`
- **Query:** `cd ~/.claude/memory-compiler && uv run python scripts/query.py "your question"`