semblance-dev/CLAUDE.md
Vadym Samoilenko 3e9ccafad2 Add LLM usage tracking infrastructure (Phases A-C)
- Model renames: gpt-5.2 → gpt-5.4-2026-03-05, gemini-3-pro-preview → gemini-3.1-pro-preview; retire gpt-4.1 via alias fallback
- New: llm_usage_context.py (ContextVar-based attribution), model_pricing.py (tiered pricing + 60s cache), usage_event.py (append-only telemetry), quota.py (user/FG quota enforcement with 80% warning)
- Wire _record_usage into all 3 LLM methods; set_llm_context at every service entry point
- Fix admin_required decorator (was sync, never awaited User.find_by_id); add active_required and with_user_context decorators
- Inject user_id into ContextVar from JWT on every authenticated request
- Add DB indexes for usage_events, model_pricing, users collections
- Seed script for model pricing (gpt-5.4 single-tier, gemini-3.1 two-tier 200k threshold)
- Fix parse_json_response NameError (logger undefined at module level)
- 70 passing tests: conftest.py with sys.modules stubs, test_usage_infrastructure.py (52 tests), rewrite stale test_llm_service.py (18 tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 18:08:27 +01:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Commands

  • Dev Server: npm run dev (port 5173, proxies /api → localhost:5137)
  • Build: npm run build (use this to verify TypeScript compilation)
  • Dev Build: npm run build:dev (development mode build)
  • Lint: npm run lint
  • Backend: cd backend && python run.py (Hypercorn ASGI on port 5137)

Backend Testing

After modifying any Python files:

source backend/venv/bin/activate
python -c "import app.services.module_name"        # Test specific module
python -c "from app import create_app; create_app()"  # Test app creation
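When several modules were touched, the one-liners above can be folded into a small import check. The following is a minimal sketch, not part of the repository: the helper name check_imports is hypothetical, and it only reports import-time errors, not runtime behaviour.

```python
import importlib
import traceback


def check_imports(module_names):
    """Try to import each module; return {name: error summary} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception:
            # Keep only the final traceback line, e.g. "ModuleNotFoundError: ..."
            failures[name] = traceback.format_exc().strip().splitlines()[-1]
    return failures


# Example call (module names are illustrative; list the ones you changed):
# check_imports(["app.services.llm_service", "app.services.task_manager"])
```

An empty dict means every listed module imported cleanly. Run it from backend/ with the virtualenv active so that the app package resolves.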

Architecture Overview

ASGI Stack (critical detail)

create_app() returns a socketio.ASGIApp wrapping a Quart app — not the Quart app itself. Accessing app.quart_app gives the inner Quart instance. This distinction matters whenever you write ASGI middleware or access app config directly.
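Code that may receive either the wrapper or the inner app can normalise it first. This is a small sketch that assumes only what is stated above (the wrapper exposes quart_app); the helper name get_quart_app is hypothetical.

```python
def get_quart_app(app):
    """Return the inner Quart app if `app` is the Socket.IO ASGI wrapper."""
    # The wrapper returned by create_app() is described as exposing the Quart
    # instance as `quart_app`; anything without that attribute is assumed to
    # already be the Quart app and is returned unchanged.
    return getattr(app, "quart_app", app)


# Illustrative use:
# asgi_app = create_app()
# quart_app = get_quart_app(asgi_app)
# secret = quart_app.config["SECRET_KEY"]
```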

Real-Time Communication

Socket.IO via python-socketio AsyncServer (ASGI mode). The WebSocketContextNew.tsx context manages the client connection. websocket_manager_async.py handles room-based messaging for focus group sessions. The WebSocket manager must call ws_mgr.set_main_loop(asyncio.get_running_loop()) at startup so that cross-thread emits from the AI Runner land on the right loop.
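The reason for set_main_loop can be illustrated with the standard library alone. The sketch below is not the code in websocket_manager_async.py: LoopBoundEmitter, emit_from_thread, and the sent list are hypothetical stand-ins. It only shows how a stored loop reference lets another thread schedule work on the main loop via asyncio.run_coroutine_threadsafe.

```python
import asyncio


class LoopBoundEmitter:
    """Hypothetical stand-in for the manager's main-loop handling."""

    def __init__(self):
        self._main_loop = None
        self.sent = []  # records emits so the sketch runs without Socket.IO

    def set_main_loop(self, loop):
        self._main_loop = loop

    async def _emit(self, room, event, payload):
        # A real manager would await the Socket.IO server's emit here.
        self.sent.append((room, event, payload))

    def emit_from_thread(self, room, event, payload):
        """Schedule an emit on the main loop from any other thread."""
        if self._main_loop is None:
            raise RuntimeError("set_main_loop() was not called at startup")
        return asyncio.run_coroutine_threadsafe(
            self._emit(room, event, payload), self._main_loop
        )


async def demo():
    emitter = LoopBoundEmitter()
    emitter.set_main_loop(asyncio.get_running_loop())

    def worker():
        future = emitter.emit_from_thread("room-1", "message", {"text": "hi"})
        future.result(timeout=5)  # block the worker thread until delivered

    # Run the worker off the main loop, as a separate runner thread would be.
    await asyncio.to_thread(worker)
    return emitter.sent
```

Without the stored loop, the worker thread would have no running loop of its own to emit on, which is the failure the startup call is meant to prevent.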

VITE_ENABLE_WEBSOCKET is hardcoded true in dev and false in production builds via vite.config.ts — it is not controlled by .env.

AI Runner + Threading

ai_runner_service.py is a singleton that owns a dedicated OS thread with a single asyncio event loop. All autonomous AI conversations run in this thread. This solves Motor (AsyncIOMotorClient) event-loop affinity: Motor clients in the AI runner are bound to that loop, while regular API routes use synchronous PyMongo. Never share Motor clients between the two contexts.
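The dedicated-thread pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of ai_runner_service.py: the LoopThread class and its method names are hypothetical, and no Motor client is created here.

```python
import asyncio
import threading


class LoopThread:
    """Hypothetical sketch: one OS thread owning one asyncio event loop."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        asyncio.set_event_loop(self._loop)
        self._loop.run_forever()

    def submit(self, coro):
        """Schedule a coroutine on the dedicated loop; returns a concurrent Future."""
        return asyncio.run_coroutine_threadsafe(coro, self._loop)

    def stop(self):
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def which_thread():
    # Loop-bound resources (such as a Motor client) would be created here,
    # inside the dedicated loop, and never handed to other threads.
    return threading.current_thread().name
```

Callers outside the thread only ever hold the Future returned by submit, so anything bound to the dedicated loop stays inside it.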

Autonomous Conversation Pipeline

  1. ai_runner_service.py — spawns coroutines on the dedicated thread's event loop
  2. autonomous_conversation_controller.py — orchestrates the full session
  3. conversation_decision_service.py — picks the next speaker
  4. conversation_context_service.py — maintains history/state
  5. conversation_state_manager.py — in-memory state across turns

Task Manager

task_manager.py is a singleton tracking cancellable asyncio tasks (persona generation, discussion guides, etc.). Tasks are exposed via /api/tasks routes. A background sweeper cleans up completed/expired tasks. Frontend polling is handled by useTaskPolling.ts.
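As a rough illustration of the tracking, cancellation, and sweeping described above, here is a minimal sketch. The class TaskRegistry, its method names, the status strings, and the TTL default are all hypothetical; the real task_manager.py may be structured quite differently.

```python
import asyncio
import time
import uuid


class TaskRegistry:
    """Hypothetical sketch of a registry for cancellable asyncio tasks."""

    def __init__(self, ttl_seconds=300.0):
        self._tasks = {}        # task_id -> asyncio.Task
        self._finished_at = {}  # task_id -> monotonic completion time
        self._ttl = ttl_seconds

    def start(self, coro):
        """Wrap a coroutine in a task (needs a running loop); return its id."""
        task_id = uuid.uuid4().hex
        task = asyncio.create_task(coro)
        self._tasks[task_id] = task
        task.add_done_callback(
            lambda _t, tid=task_id: self._finished_at.setdefault(tid, time.monotonic())
        )
        return task_id

    def status(self, task_id):
        task = self._tasks.get(task_id)
        if task is None:
            return "unknown"
        if task.cancelled():
            return "cancelled"
        if task.done():
            return "failed" if task.exception() else "completed"
        return "running"

    def cancel(self, task_id):
        task = self._tasks.get(task_id)
        if task is None:
            return False
        return task.cancel()  # False if the task had already finished

    def sweep(self):
        """Drop tasks that finished at least ttl seconds ago; return the count."""
        now = time.monotonic()
        expired = [t for t, done in self._finished_at.items() if now - done >= self._ttl]
        for task_id in expired:
            self._tasks.pop(task_id, None)
            self._finished_at.pop(task_id, None)
        return len(expired)
```

A background sweeper would call sweep() periodically, and a polling endpoint would map to status().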

LLM Integration

llm_service.py creates a fresh client for each call, which avoids event-loop mismatches under ASGI. Default model: Google Gemini via google-genai. Alternative: OpenAI (AsyncOpenAI). Both GEMINI_API_KEY and OPENAI_API_KEY must be set as env vars; startup fails if either is missing. Prompts are markdown templates in /backend/prompts/ loaded by prompt_loader.py.

Code Style

  • TypeScript with strictNullChecks: false
  • Functional components with hooks; local state via hooks, shared state via context/props
  • @/ alias maps to src/
  • URL construction: always use ${import.meta.env.BASE_URL}asset.png — production base is /semblance/
  • Error handling: try/catch + sonner toast for user feedback

File Organization

backend/
  app/
    routes/          # Blueprints: auth, personas, focus-groups, ai-personas, focus-group-ai, folders, tasks
    services/        # Business logic: llm_service, ai_runner_service, task_manager, autonomous_*, conversation_*
    models/          # Data models: User, FocusGroup, Persona, Folder
    auth/            # Auth utilities (JWT helpers)
    prompts/         # LLM prompt markdown templates
    websocket_manager_async.py  # Room-based async WebSocket manager
    extensions.py    # socketio.AsyncServer singleton

src/
  pages/             # Route-level components (Dashboard, FocusGroups, FocusGroupSession, Login, SyntheticUsers)
  components/
    focus-group-session/  # Session UI panels (Discussion, Participant, Themes, etc.)
    persona/         # Persona management components
    ui/              # shadcn-ui primitives
  contexts/          # AuthContext, WebSocketContextNew, NavigationContext
  hooks/             # useTaskPolling, useWebSocket, usePersonaStorage, useDiscussionGuideGeneration, etc.
  types/             # TypeScript type definitions

Environment Configuration

| Setting | Development | Production |
|---|---|---|
| Base path | / | /semblance/ |
| API base | /api (proxied to 5137) | https://optical-dev.oliver.solutions/semblance_back/api |
| WebSocket path | /socket.io/ | /semblance_back/socket.io/ |
| MSAL redirect | http://localhost:5173/ | https://optical-dev.oliver.solutions/semblance |

Setup: copy .env.development or .env.production to .env. The backend requires backend/.env with SECRET_KEY, JWT_SECRET_KEY, GEMINI_API_KEY, and OPENAI_API_KEY; startup raises RuntimeError if any are missing or use weak defaults.
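The startup check can be pictured roughly as below. This is a sketch under assumptions: the function name validate_env and the WEAK_VALUES set are hypothetical, and the backend's actual definition of a weak default is not documented here.

```python
import os

REQUIRED_KEYS = ("SECRET_KEY", "JWT_SECRET_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY")
# Hypothetical examples of values treated as weak; the real list may differ.
WEAK_VALUES = {"", "changeme", "secret", "dev"}


def validate_env(env=None):
    """Raise RuntimeError if a required key is missing or set to a weak value."""
    env = os.environ if env is None else env
    problems = [
        key for key in REQUIRED_KEYS
        # A missing key reads as "", which is in WEAK_VALUES and so is flagged.
        if env.get(key, "").strip().lower() in WEAK_VALUES
    ]
    if problems:
        raise RuntimeError("Missing or weak settings: " + ", ".join(problems))
```

Calling validate_env() before building the app makes a misconfigured backend/.env fail fast instead of surfacing later as an authentication or LLM error.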

Knowledge Wiki

A cross-project knowledge base is maintained automatically from all Claude Code sessions.

  • Index: /Users/ai_leed/Library/Mobile Documents/iCloud~md~obsidian/Documents/VadymSamoilenko/wiki/index.md
  • Query: cd ~/.claude/memory-compiler && uv run python scripts/query.py "your question"