---
name: "Semblance — Synthetic Society"
client: Oliver Internal
status: active
tech: [React, TypeScript, Python, Quart, Socket.IO, MongoDB, Gemini API, OpenAI API]
local_path: /Users/ai_leed/Documents/Projects/Oliver/semblance
deploy: ./deploy.sh
url: https://optical-dev.oliver.solutions/semblance/
server: optical-dev
tags: [oliver, ai, synthetic-personas, focus-group, insights, gcp, socketio]
created: 2026-04-14
last_commit: 2026-04-24
commits: 122
port: 5137
db: MongoDB 7
service: Docker Compose
---

## Overview

**Semblance** is an AI-powered research platform that generates synthetic consumer personas and orchestrates autonomous focus group sessions. Users build realistic buyer profiles (via AI generation or manual entry), moderate live discussions with LLM-powered persona responses, or enable AI autonomous mode to orchestrate multi-persona conversations with speaker selection and turn-taking. The platform extracts themes and session analytics in real time via Socket.IO, with full quota enforcement, usage tracking, and admin analytics for cost visibility.

## Tech Stack

- **Frontend:** React 18 + TypeScript, Vite, TanStack Query (server state), Socket.IO client, React Router DOM, MSAL (Microsoft SSO)
- **Backend:** Python 3.11+, Quart (async web framework), python-socketio, Motor (async MongoDB driver), PyMongo (sync driver)
- **Database:** MongoDB 7 (Docker container, 127.0.0.1:27017)
- **Infrastructure:** Docker Compose, Apache 2 (reverse proxy), systemd/Docker (process management)
- **AI/ML:** Google Gemini 2.0 Flash & Pro, OpenAI GPT-4o, unified LLM interface with token counting
- **Key libraries:** asyncio (concurrent AI threads), Hypercorn (ASGI server), pydantic (validation), PyJWT (auth)

## Architecture

```
Browser (React SPA)
├── TanStack Query + Socket.IO client
├── AuthContext (JWT in memory)
└── WebSocketContextNew (real-time events)
        ↓
Apache (optical-dev.oliver.solutions)
├── /semblance/ → static files (/var/www/html/semblance)
├── /semblance_back/api/ → proxy to 127.0.0.1:5137
└── /semblance_back/socket.io/ → WebSocket to 127.0.0.1:5137
        ↓
Hypercorn ASGI (port 5137)
├── Quart app (8 blueprints: auth, personas, focus-groups, ai-personas, etc.)
├── python-socketio AsyncServer (room-based WebSocket emit)
├── 19 Services (LLM, AI runner, persona generation, theme extraction, etc.)
└── Motor + PyMongo clients
        ↓
MongoDB 7 (10 collections: users, personas, focus_groups, usage_events, etc.)
```

**Core data flows:**

1. **Persona Generation (AI):** POST `/api/ai-personas/generate` → task_manager creates async job → ai_persona_service calls LLM (Gemini/OpenAI) with persona-basic/detailed-generation.md prompt → stores result in `personas` collection with allowlisted fields → UsageEvent records tokens + cost → WebSocket task_complete event to user.
2. **Manual Focus Group Message:** POST `/api/focus-groups//messages` → jwt_required + quota check → focus_group_response_service iterates personas → calls LLM for each persona response in parallel → stores in focus_group_messages collection → emits new_message WebSocket event.
3. **AI Autonomous Mode:** After session create, autonomous_conversation_controller spawns dedicated asyncio thread → conversation_decision_service selects next speaker → generates turn-by-turn responses → conversation_context_service maintains rolling context window → conversation_state_manager tracks turn count + termination logic → emits real-time events as messages arrive.
4. **Theme Extraction:** After session end, key_theme_service calls LLM to identify patterns → stores in focus_group_themes collection → focus_group_summary_service aggregates for session report.

## Dev Commands

```bash
# First-time setup
cd /Users/ai_leed/Documents/Projects/Oliver/semblance
npm install
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd ..
cp .env.development .env
cd backend && cp .env.example .env
# Edit backend/.env: fill SECRET_KEY, JWT_SECRET_KEY, GEMINI_API_KEY, OPENAI_API_KEY
nano .env
cd ..   # back to the project root before starting the app

# Run locally (combined)
./start.sh
# Backend: http://localhost:5137
# Frontend: http://localhost:5173 (proxies /api → backend)

# Or separate terminals:
# Terminal 1 — Backend:
cd backend && source venv/bin/activate && python run.py
# Terminal 2 — Frontend:
npm run dev

# Development tasks
npm run lint                            # Frontend lint
npm run build                           # Production build
npm run build:dev                       # Dev mode build
cd backend && python -m pytest tests/   # Backend tests
python scripts/seed_model_pricing.py    # Seed model costs locally

# Default local login: user / pass
```

## Deployment

- **Server:** optical-dev.oliver.solutions
- **Deploy:** `./deploy.sh` (on server, in `/opt/semblance/`)
- **URL:** https://optical-dev.oliver.solutions/semblance/
- **Port:** 5137 (backend, 127.0.0.1 only; public via Apache)
- **Service:** Docker Compose (`docker compose up -d --build`)
- **Local path:** /Users/ai_leed/Documents/Projects/Oliver/semblance

**Deploy process (automated by deploy.sh):**

1. Pre-flight: verify `backend/.env` has all 4 required keys (SECRET_KEY, JWT_SECRET_KEY, GEMINI_API_KEY, OPENAI_API_KEY)
2. `git pull`
3. Docker build frontend → copy dist to `/var/www/html/semblance`
4. Ensure Apache proxy config + reload
5. `docker compose up -d --build` (rebuild + restart mongo + backend)
6. Health check loop (30 retries, 10s intervals)
7. Run seed_model_pricing.py backfill
8. Verify both services healthy

**⚠️ CONSTRAINT:** Do NOT run `./deploy.sh` without explicit user instruction. Hard rule.
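
The pre-flight check (step 1) and the health check loop (step 6) of the deploy process can be sketched in Python as below. This is an illustration only, not the contents of `deploy.sh`, whose source is not reproduced in this note. The function names `missing_env_keys` and `wait_until_healthy`, the simple `KEY=value` parsing rules, and the injected `probe` callable are assumptions; only the four key names and the 30-retry / 10-second figures come from the steps listed above.

```python
import time
from typing import Callable, Iterable

# The four keys the pre-flight step is documented to require.
REQUIRED_KEYS = ("SECRET_KEY", "JWT_SECRET_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY")


def missing_env_keys(env_text: str, required: Iterable[str] = REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty in .env-style text.

    Assumes plain KEY=value lines; comments and blank lines are ignored.
    """
    present = set()
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # A key only counts if it has a non-empty value once quotes are removed.
        if value.strip().strip("'\""):
            present.add(key.strip())
    return [key for key in required if key not in present]


def wait_until_healthy(
    probe: Callable[[], bool],
    retries: int = 30,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `retries` times, sleeping `interval_s` between attempts.

    `probe` is a hypothetical callable that returns True once the service
    answers its health check; `sleep` is injectable so the loop can be tested.
    """
    for attempt in range(1, retries + 1):
        if probe():
            return True
        if attempt < retries:
            sleep(interval_s)
    return False


if __name__ == "__main__":
    sample = "SECRET_KEY=abc\nJWT_SECRET_KEY=\n# GEMINI_API_KEY=x\nOPENAI_API_KEY='k'\n"
    print(missing_env_keys(sample))  # ['JWT_SECRET_KEY', 'GEMINI_API_KEY']
```

Passing the probe in as a callable keeps the retry logic independent of how health is actually measured (an HTTP request, a `docker compose ps` check, or something else), which this note does not specify.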

## Environment Variables

**Frontend (.env):**

- `VITE_FRONTEND_BASE_URL` — public app URL (dev: `http://localhost:5173`, prod: `https://optical-dev.oliver.solutions/semblance`)
- `VITE_API_BASE_URL` — backend API root (dev: `/api` proxied by Vite, prod: `https://optical-dev.oliver.solutions/semblance_back/api`)
- `VITE_WEBSOCKET_PATH` — Socket.IO path (dev: `/socket.io/`, prod: `/semblance_back/socket.io/`)
- `VITE_MSAL_REDIRECT_URI` — Microsoft SSO redirect (dev: `http://localhost:5173/`, prod: `https://optical-dev.oliver.solutions/semblance`)
- `VITE_MSAL_POST_LOGOUT_REDIRECT_URI` — Post-logout URL (same as REDIRECT_URI)
- `VITE_ENABLE_LOCAL_LOGIN` — enable user/pass login (dev: `true`, prod: `false`)
- `VITE_ENABLE_WEBSOCKET_DEBUG` — WebSocket logging (dev: `true`, prod: `false`)

**Backend (backend/.env) — required:**

- `SECRET_KEY` — Quart session secret (generate: `python3 -c "import secrets; print(secrets.token_hex(32))"`)
- `JWT_SECRET_KEY` — JWT signing key (same generation as above)
- `GEMINI_API_KEY` — Google Gemini API key
- `OPENAI_API_KEY` — OpenAI API key

**Backend (optional):**

- `MONGODB_URI` — MongoDB connection string (default: `mongodb://localhost:27017/semblance`)
- `MSAL_CLIENT_ID`, `MSAL_CLIENT_SECRET`, `MSAL_TENANT_ID` — Microsoft SSO credentials (prod only)
- `QUART_ENV` — `development` or `production`

## API / Endpoints

**Key routes (all under `/api/`):**

- `POST /auth/login` — local auth (dev only)
- `POST /ai-personas/generate` — async persona generation (returns 202 + task_id)
- `GET /tasks/` — poll job status
- `POST /focus-groups` — create session
- `POST /focus-groups//messages` — send manual message
- `POST /focus-groups//autonomous-start` — enable AI mode
- `POST /focus-groups//autonomous-stop` — disable AI mode
- `GET /focus-groups/` — retrieve session (includes messages, themes)
- `POST /admin/users` — manage users & quotas (admin only)
- `GET /admin/usage` — analytics dashboard (admin only)

**WebSocket events:**

- `task_complete` — persona/theme generation finished
- `new_message` — focus group message from human or AI
- `mode_changed` — autonomous mode toggled
- `conversation_turn` — AI turn announcement (in autonomous mode)
- `session_summary` — final themes & analytics

## Known Issues

- **Live token extraction:** Missing `usage_metadata` in some LLM responses logs a warning but continues gracefully; "thinking tokens" are captured when available (o1-mini limitation).
- **Backfill pricing:** Requires the `--delete-existing-estimates` flag to recalculate; uses accumulated conversation context for estimation.
- **Admin filters:** ISO Z timestamp parsing previously crashed; fixed in commit 7b6a7c73; validate that all date filters are ISO 8601.
- **AI autonomous mode:** Historical race conditions (split-brain UI, cross-loop WebSocket emit) fixed in commits 283b31e7–b4978989; monitor logs for asyncio.TimeoutError.

## Timeline / Git History

| Date | Change |
|------|--------|
| 2026-04-24 | Add LLM usage tracking infrastructure (Phases A-C) |
| 2026-03-30 | Fix: task result not stored in useTaskPolling (false 'no personas' error) |
| 2026-03-23 | Fix AI autonomous mode: cross-loop WebSocket emit + polling fallback |
| 2026-03-23 | Allow document uploads (PDF, DOCX, TXT) as focus group assets |
| 2026-03-23 | **Critical:** Migrate task delivery WebSocket → HTTP polling (GCP 30s timeout) |
| 2026-03-23 | Fix all async LLM routes: bypass GCP 30s LB timeout |
| 2026-03-23 | Fix naive vs aware datetime crash + stuck AI mode |

## Sessions

### 2026-04-29 – Check Obsidian integration to ensure all
**Asked:** Check Obsidian integration to ensure all project changes are logged with sufficient detail for user understanding.
**Done:** Verified that the cloud-code script receives complete project context from HUCOM on startup including server and deployment information.

### 2026-04-24 – Check if the codebase has user
**Asked:** Check if the codebase has user management and token cost tracking by project/user, and create an implementation plan if missing.
**Done:** Analyzed the codebase and identified missing token cost tracking features; created a plan requiring token pricing models, usage logging, and cost aggregation endpoints.

### 2026-04-24 – Analyze the codebase for user management
**Asked:** Analyze the codebase for user management and token usage tracking by project and user, then create an implementation plan if missing.
**Done:** Identified gaps in token usage recording (missing warnings for None metadata and thinking model token handling) and provided fixes for accurate Gemini billing tracking.

### 2026-04-24 – Analyze the codebase and create a
**Asked:** Analyze the codebase and create a CLAUDE.md file with development commands and architecture overview.
**Done:** Created CLAUDE.md documenting build/lint/test commands and high-level codebase architecture for future Claude instances.

### 2026-04-24 – Review user management and token cost
**Asked:** Review user management and token cost tracking features, then create implementation plan if missing.
**Done:** Analyzed codebase and created CLAUDE.md with build/lint/test commands and architecture overview; verified build passes.

### 2026-04-24 – Analyze codebase for user management and
**Asked:** Analyze codebase for user management and token usage tracking with cost analytics by project and user, then create an implementation plan.
**Done:** Created backfill script for token usage events and executed it to generate 902 usage records across the system.

### 2026-04-24 – Analyze the codebase, create a CLAUDE.md
**Asked:** Analyze the codebase, create a CLAUDE.md file with setup commands and architecture docs, and assess token usage tracking across users and projects.
**Done:** Fixed persona data type errors in backfill script and deployed usage tracking script to backend container for testing.

### 2026-04-24 – Check if the codebase has user
**Asked:** Check if the codebase has user management and token usage tracking with cost analytics by project and user, and create an implementation plan if missing.
**Done:** Reviewed codebase for token tracking and cost management features; determined missing functionality and requested pricing information for used models to create implementation plan.

### 2026-04-24 – Create a CLAUDE.md documentation file and
**Asked:** Create a CLAUDE.md documentation file and analyze token usage/cost tracking across users and projects.
**Done:** Analyzed codebase structure, identified missing user management and token cost tracking system, and created implementation plan.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token usage tracking by project/user exists, and create implementation plan if missing.
**Done:** Analyzed data structure and identified missing user management and token cost tracking system; created implementation plan requiring LLM pricing data.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token cost tracking exist, and create an implementation plan if missing.
**Done:** Reviewed existing codebase and confirmed token usage tracking exists via `_record_usage` in LLMService; identified that historical data needs backfill using `backend/scripts/backfill_usage.py`.

### 2026-04-24 – Analyze the codebase and create a
**Asked:** Analyze the codebase and create a CLAUDE.md file with common commands and architecture overview, then check if user management and token cost tracking by project/user exists, and create an implementation plan if missing.
**Done:** Confirmed all 8 tasks are completed; verified logout functionality correctly clears localStorage and handles Microsoft SSO without backend deactivation.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token cost tracking by project/user exist, create implementation plan if missing.
**Done:** Confirmed all features already implemented and marked 8 tasks as completed.

### 2026-04-24 – Analyze codebase and create CLAUDE.md documentation,
**Asked:** Analyze codebase and create CLAUDE.md documentation, then implement user management and token usage tracking with cost breakdown.
**Done:** Created CLAUDE.md with build/test commands and architecture overview, implemented usage tracking endpoints and billing page with cost breakdown by feature and project.

### 2026-04-24 – Analyze codebase and create CLAUDE.md documentation
**Asked:** Analyze codebase and create CLAUDE.md documentation file with build commands and architecture overview, then check for user management and token cost tracking system.
**Done:** Identified missing token usage tracking features and created implementation plan including authentication decorators, usage API endpoints, WebSocket events, and billing UI components.

### 2026-04-24 – Create CLAUDE.md documentation and implement token
**Asked:** Create CLAUDE.md documentation and implement token usage tracking with cost management by user and project.
**Done:** Analyzed codebase architecture, documented development commands and setup, and created implementation plan for token tracking and cost analysis system.

### 2026-04-24 – Analyze codebase and create CLAUDE.md documentation
**Asked:** Analyze codebase and create CLAUDE.md documentation with build/test commands and architecture overview, then check for user management and token cost tracking system; if missing, provide implementation plan.
**Done:** Created admin panel backend with user management endpoints (list, update roles/quotas, enable/disable), usage analytics endpoints (summary by user/model/feature/day, raw event drill-down), and committed 14 admin files with all Phase D features.

### 2026-04-24 – Review user management and token cost
**Asked:** Review user management and token cost tracking systems, then create implementation plan if missing.
**Done:** Analyzed codebase and identified missing token cost tracking system; created implementation plan with required models and database schema.

### 2026-04-24 – Can you check if we have
**Asked:** Can you check if we have user management and token cost tracking by project and user, and create an implementation plan if missing?
**Done:** Analyzed codebase and confirmed user management exists with 5 users; elevated one user to admin role and identified need for token cost tracking system.

### 2026-04-24 – Analyze codebase architecture and check for
**Asked:** Analyze codebase architecture and check for user management and token cost tracking system.
**Done:** Created CLAUDE.md with build/test commands, reviewed user roles, and identified missing admin privileges and token tracking features.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token cost tracking by project and user exist, and create implementation plan if missing.
**Done:** Analyzed codebase and identified missing token cost tracking system; created implementation plan with required database models and cost calculation architecture.

### 2026-04-24 – Review user management, token cost tracking,
**Asked:** Review user management, token cost tracking, and spending analytics by project/user, then create an implementation plan if missing.
**Done:** Analyzed codebase and confirmed seed_model_pricing.py script is properly configured to run via docker compose with correct paths and environment variables.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token usage tracking with cost breakdown by project/user exists, and create an implementation plan if missing.
**Done:** Verified 70/70 tests passing and confirmed usage infrastructure foundation exists; provided implementation plan outline with model requirements pending pricing details.

### 2026-04-24 – Check if the codebase has user
**Asked:** Check if the codebase has user management and token cost tracking by project/user, and create an implementation plan if missing.
**Done:** Analyzed codebase structure and identified missing token cost tracking; created implementation plan requiring cost configuration for supported models.

### 2026-04-24 – Check for user management and token
**Asked:** Check for user management and token usage/cost tracking system, create implementation plan if missing.
**Done:** Confirmed three recording hooks are integrated and identified missing token cost tracking system requiring implementation plan with model pricing details.

### 2026-04-24 – Create and improve a CLAUDE.md documentation
**Asked:** Create and improve a CLAUDE.md documentation file for the codebase with commands and architecture overview.
**Done:** Updated CLAUDE.md with corrected paths, added npm dev command, clarified ASGI/socketio architecture, and documented async patterns.

### 2026-04-14 – Project catalogued
**Done:** Added to Obsidian second brain with full details.

---

## Change Log

| Date | Requested | Changed | Files |
|------|-----------|---------|-------|
| 2026-04-29 | Obsidian integration audit | Verified context transmission from HUCOM, checked for missing CLAUDE.md and Obsidian notes | cloud-code initialization, HUCOM data transfer, project logging system |
| 2026-04-24 | Token cost tracking | Add user/project cost models, usage logging, cost aggregation endpoints | CLAUDE.md |
| 2026-04-24 | Token usage tracking | Add warning for None metadata, handle thinking model tokens, implement project/user-level cost tracking | gemini_client.py, CLAUDE.md |
| 2026-04-24 | Documentation setup | Created CLAUDE.md with dev commands and architecture overview | CLAUDE.md |
| 2026-04-24 | Token cost tracking | Analyzed existing features, created implementation plan documentation | CLAUDE.md |
| 2026-04-24 | Token usage tracking | Added backfill_usage.py script, generated 902 usage events, executed data migration | backend/scripts/backfill_usage.py, database |
| 2026-04-24 | Token tracking assessment | Script deployed, persona field types corrected | backfill_usage.py, backend container |
| 2026-04-24 | Token tracking & cost analytics | Review codebase, identify missing user management and per-project cost tracking, request model pricing | CLAUDE.md |
| 2026-04-24 | Codebase analysis | CLAUDE.md creation, token tracking system design | CLAUDE.md, database schema, implementation plan |
| 2026-04-24 | User management & token tracking | Add user authentication, token usage logging, cost calculation per project/user, pricing configuration | CLAUDE.md, implementation plan (to be created) |
| 2026-04-24 | Token tracking system | Verified usage_events recording, identified backfill requirement for historical data | LLMService.ts, backfill_usage.py |
| 2026-04-24 | Token cost tracking | Create implementation plan for user management and token spending analytics | CLAUDE.md, Architecture docs |
| 2026-04-24 | Token cost tracking | Verified existing user management and token cost features, marked tasks completed | CLAUDE.md |
| 2026-04-24 | Token tracking system | Added usage endpoints, WebSocket emissions, billing page, navigation link | routes/usage.ts, pages/billing.tsx, middleware/auth.ts, navigation.tsx |
| 2026-04-24 | Token usage tracking | Add @active_required decorators, create usage.py routes, emit WebSocket events, build MyUsage page | backend/app/routes/usage.py, src/pages/MyUsage.tsx, WebSocket handlers |
| 2026-04-24 | Token tracking system | CLAUDE.md documentation, architecture overview, implementation plan for user management and cost tracking | CLAUDE.md, documentation files |
| 2026-04-24 | User management & token tracking | Added user list/update/disable endpoints, usage summary and event drill-down by user/model/feature/day, cost aggregation | backend/app/routes/admin.py |
| 2026-04-24 | Token cost tracking system | Add User, Project, TokenUsage, CostConfig models; create admin panel for cost management | CLAUDE.md, schema.sql, implementation-plan.md |
| 2026-04-24 | Token cost tracking system | Add token usage models, cost calculation by user/project, reporting dashboard | CLAUDE.md, User.model.ts, TokenUsage.model.ts, CostReport.tsx |
| 2026-04-24 | Codebase documentation and user audit | CLAUDE.md created, user role analysis completed | CLAUDE.md |
| 2026-04-24 | Token cost tracking | Database models for pricing/costs, cost calculation logic, reporting by project/user | CLAUDE.md, implementation plan document |
| 2026-04-24 | Token tracking system | Verified seed_model_pricing.py setup, docker compose exec command syntax | docker-compose.yml, scripts/seed_model_pricing.py |
| 2026-04-24 | Usage tracking system | LLMCallContext, cost breakdown by project/user, admin dashboard | backend/tests/test_usage_infrastructure.py, llm_service.py, conftest.py |
| 2026-04-24 | Token cost tracking | Add cost config per model, create Usage and Project models, add cost calculation to LLM service | models.py, llm_service.py, CLAUDE.md |
| 2026-04-24 | Token cost tracking system | Add tiktoken dependency, create user management module, implement cost tracking by project and user | requirements.txt, CLAUDE.md |
| 2026-04-24 | CLAUDE.md enhancement | Fixed wiki path, added npm run dev command, clarified ASGI architecture and async Motor/PyMongo distinction | CLAUDE.md |
| 2026-03-23 | Fix AI mode hanging on GCP | WebSocket → HTTP polling for all LLM routes | backend |
| 2026-03-23 | Add document upload support | PDF/DOCX/TXT as focus group assets | backend, frontend |

## Related

- [[modcomms/Mod Comms]] (same GCP timeout issue)
- [[olivas/OliVAS]]
- [[build-a-squad/Build A Squad]]