obsidian/01 Projects/semblance/Semblance.md

---
name: "Semblance — Synthetic Society"
client: Oliver Internal
status: active
tech: [React, TypeScript, Python, Quart, Socket.IO, MongoDB, Gemini API, OpenAI API]
local_path: /Users/ai_leed/Documents/Projects/Oliver/semblance
deploy: ./deploy.sh
url: https://optical-dev.oliver.solutions/semblance/
server: optical-dev
tags: [oliver, ai, synthetic-personas, focus-group, insights, gcp, socketio]
created: 2026-04-14
last_commit: 2026-04-24
commits: 122
port: 5137
db: MongoDB 7
service: Docker Compose
---

## Overview

**Semblance** is an AI-powered research platform that generates synthetic consumer personas and orchestrates autonomous focus group sessions. Users build realistic buyer profiles (via AI generation or manual entry), moderate live discussions with LLM-powered persona responses, or enable AI autonomous mode to orchestrate multi-persona conversations with speaker selection and turn-taking. The platform extracts themes and session analytics in real time via Socket.IO, with full quota enforcement, usage tracking, and admin analytics for cost visibility.

## Tech Stack

- **Frontend:** React 18 + TypeScript, Vite, TanStack Query (server state), Socket.IO client, React Router DOM, MSAL (Microsoft SSO)
- **Backend:** Python 3.11+, Quart (async web framework), python-socketio, Motor (async MongoDB driver), PyMongo (sync driver)
- **Database:** MongoDB 7 (Docker container, 127.0.0.1:27017)
- **Infrastructure:** Docker Compose, Apache 2 (reverse proxy), systemd/Docker (process management)
- **AI/ML:** Google Gemini 2.0 Flash & Pro, OpenAI GPT-4o, unified LLM interface with token counting
- **Key libraries:** asyncio (concurrent AI threads), Hypercorn (ASGI server), pydantic (validation), PyJWT (auth)

## Architecture

```
Browser (React SPA)
  ├── TanStack Query + Socket.IO client
  ├── AuthContext (JWT in memory)
  └── WebSocketContextNew (real-time events)
         ↓
Apache (optical-dev.oliver.solutions)
  ├── /semblance/ → static files (/var/www/html/semblance)
  ├── /semblance_back/api/ → proxy to 127.0.0.1:5137
  └── /semblance_back/socket.io/ → WebSocket to 127.0.0.1:5137
         ↓
Hypercorn ASGI (port 5137)
  ├── Quart app (8 blueprints: auth, personas, focus-groups, ai-personas, etc.)
  ├── python-socketio AsyncServer (room-based WebSocket emit)
  ├── 19 Services (LLM, AI runner, persona generation, theme extraction, etc.)
  └── Motor + PyMongo clients
         ↓
MongoDB 7 (10 collections: users, personas, focus_groups, usage_events, etc.)
```

**Core data flows:**

1. **Persona Generation (AI):** POST `/api/ai-personas/generate` → task_manager creates async job → ai_persona_service calls LLM (Gemini/OpenAI) with persona-basic/detailed-generation.md prompt → stores result in `personas` collection with allowlisted fields → UsageEvent records tokens + cost → WebSocket task_complete event to user.

2. **Manual Focus Group Message:** POST `/api/focus-groups/<id>/messages` → jwt_required + quota check → focus_group_response_service iterates personas → calls LLM for each persona response in parallel → stores in focus_group_messages collection → emits new_message WebSocket event.

3. **AI Autonomous Mode:** After session create, autonomous_conversation_controller spawns dedicated asyncio thread → conversation_decision_service selects next speaker → generates turn-by-turn responses → conversation_context_service maintains rolling context window → conversation_state_manager tracks turn count + termination logic → emits real-time events as messages arrive.

4. **Theme Extraction:** After session end, key_theme_service calls LLM to identify patterns → stores in focus_group_themes collection → focus_group_summary_service aggregates for session report.
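
The per-persona fan-out in flow 2 can be sketched with `asyncio.gather`. This is a minimal illustration, not the actual `focus_group_response_service` code: `fake_llm_reply` and `collect_persona_replies` are hypothetical names, and the LLM call is replaced by a canned reply so the example runs standalone.

```python
import asyncio


async def fake_llm_reply(persona: str, prompt: str) -> str:
    """Hypothetical stand-in for the real LLM call; returns a canned reply."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"{persona}: my view on '{prompt}'"


async def collect_persona_replies(personas: list[str], prompt: str) -> dict[str, str]:
    """Call the LLM once per persona concurrently and map persona -> reply.

    Failed calls are dropped so one bad response does not sink the whole turn
    (an assumption of this sketch, not a documented behaviour).
    """
    results = await asyncio.gather(
        *(fake_llm_reply(p, prompt) for p in personas),
        return_exceptions=True,
    )
    return {
        persona: result
        for persona, result in zip(personas, results)
        if not isinstance(result, BaseException)
    }


if __name__ == "__main__":
    replies = asyncio.run(collect_persona_replies(["Ana", "Ben"], "pricing"))
    for reply in replies.values():
        print(reply)
```

`asyncio.gather` returns results in the order the awaitables were passed, so zipping them back against `personas` keeps each reply attached to the right persona.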

## Dev Commands

```bash
# First-time setup
cd /Users/ai_leed/Documents/Projects/Oliver/semblance
npm install
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd ..
cp .env.development .env
cd backend && cp .env.example .env
# Edit backend/.env: fill SECRET_KEY, JWT_SECRET_KEY, GEMINI_API_KEY, OPENAI_API_KEY
nano .env

# Run locally (combined); start.sh is assumed to live in the project root
cd ..
./start.sh
# Backend: http://localhost:5137
# Frontend: http://localhost:5173 (proxies /api → backend)

# Or separate terminals:
# Terminal 1 — Backend:
cd backend && source venv/bin/activate && python run.py

# Terminal 2 — Frontend:
npm run dev

# Development tasks
npm run lint              # Frontend lint
npm run build             # Production build
npm run build:dev         # Dev mode build
cd backend && python -m pytest tests/  # Backend tests
python scripts/seed_model_pricing.py   # Seed model costs locally

# Default local login: user / pass
```

## Deployment

- **Server:** optical-dev.oliver.solutions
- **Deploy:** `./deploy.sh` (on server, in `/opt/semblance/`)
- **URL:** https://optical-dev.oliver.solutions/semblance/
- **Port:** 5137 (backend, 127.0.0.1 only; public via Apache)
- **Service:** Docker Compose (`docker compose up -d --build`)
- **Local path:** /Users/ai_leed/Documents/Projects/Oliver/semblance

**Deploy process (automated by deploy.sh):**
1. Pre-flight: verify `backend/.env` has all 4 required keys (SECRET_KEY, JWT_SECRET_KEY, GEMINI_API_KEY, OPENAI_API_KEY)
2. `git pull`
3. Docker build frontend → copy dist to `/var/www/html/semblance`
4. Ensure Apache proxy config + reload
5. `docker compose up -d --build` (rebuild + restart mongo + backend)
6. Health check loop (30 retries, 10s intervals)
7. Run seed_model_pricing.py backfill
8. Verify both services healthy
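
The health check loop in step 6 is implemented inside `deploy.sh`; the same retry pattern can be sketched in Python for reference. The function names below and the injectable `sleep` parameter are illustrative assumptions, not part of the project.

```python
import time
import urllib.request
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    retries: int = 30,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` up to `retries` times, sleeping `interval_s` between attempts."""
    for attempt in range(1, retries + 1):
        try:
            if probe():
                return True
        except OSError:
            pass  # service not accepting connections yet; treat as unhealthy
        if attempt < retries:
            sleep(interval_s)
    return False


def http_probe(url: str) -> bool:
    """Return True when `url` answers with HTTP 200 (the URL is supplied by the caller)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status == 200
```

A call such as `wait_until_healthy(lambda: http_probe("http://127.0.0.1:5137/api/health"))` would mirror the 30 x 10s loop; note that the `/api/health` path is a placeholder, since the actual health endpoint is not documented in this note.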

**⚠️ CONSTRAINT:** Do NOT run `./deploy.sh` without explicit user instruction. Hard rule.

## Environment Variables

**Frontend (.env):**
- `VITE_FRONTEND_BASE_URL` — public app URL (dev: `http://localhost:5173`, prod: `https://optical-dev.oliver.solutions/semblance`)
- `VITE_API_BASE_URL` — backend API root (dev: `/api` proxied by Vite, prod: `https://optical-dev.oliver.solutions/semblance_back/api`)
- `VITE_WEBSOCKET_PATH` — Socket.IO path (dev: `/socket.io/`, prod: `/semblance_back/socket.io/`)
- `VITE_MSAL_REDIRECT_URI` — Microsoft SSO redirect (dev: `http://localhost:5173/`, prod: `https://optical-dev.oliver.solutions/semblance`)
- `VITE_MSAL_POST_LOGOUT_REDIRECT_URI` — Post-logout URL (same as REDIRECT_URI)
- `VITE_ENABLE_LOCAL_LOGIN` — enable user/pass login (dev: `true`, prod: `false`)
- `VITE_ENABLE_WEBSOCKET_DEBUG` — WebSocket logging (dev: `true`, prod: `false`)

**Backend (backend/.env) — required:**
- `SECRET_KEY` — Quart session secret (generate: `python3 -c "import secrets; print(secrets.token_hex(32))"`)
- `JWT_SECRET_KEY` — JWT signing key (same generation as above)
- `GEMINI_API_KEY` — Google Gemini API key
- `OPENAI_API_KEY` — OpenAI API key

**Backend (optional):**
- `MONGODB_URI` — MongoDB connection string (default: `mongodb://localhost:27017/semblance`)
- `MSAL_CLIENT_ID`, `MSAL_CLIENT_SECRET`, `MSAL_TENANT_ID` — Microsoft SSO credentials (prod only)
- `QUART_ENV` — `development` or `production`
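
Before starting the backend, it can help to confirm that the four required keys are present and non-empty, similar in spirit to the pre-flight step in `deploy.sh`. The sketch below is a standalone approximation: `parse_env` and `missing_keys` are hypothetical helpers, and the parser only handles plain `KEY=VALUE` lines.

```python
from pathlib import Path

REQUIRED_KEYS = ("SECRET_KEY", "JWT_SECRET_KEY", "GEMINI_API_KEY", "OPENAI_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text: str, required: tuple[str, ...] = REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent or empty, in declaration order."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_file = Path("backend/.env")
    if env_file.exists():
        missing = missing_keys(env_file.read_text())
        print("missing:", ", ".join(missing) if missing else "none")
```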

## API / Endpoints

**Key routes (all under `/api/`):**
- `POST /auth/login` — local auth (dev only)
- `POST /ai-personas/generate` — async persona generation (returns 202 + task_id)
- `GET /tasks/<task_id>` — poll job status
- `POST /focus-groups` — create session
- `POST /focus-groups/<id>/messages` — send manual message
- `POST /focus-groups/<id>/autonomous-start` — enable AI mode
- `POST /focus-groups/<id>/autonomous-stop` — disable AI mode
- `GET /focus-groups/<id>` — retrieve session (includes messages, themes)
- `POST /admin/users` — manage users & quotas (admin only)
- `GET /admin/usage` — analytics dashboard (admin only)
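
Because `POST /ai-personas/generate` returns 202 with a `task_id`, a client has to poll `GET /tasks/<task_id>` until the job finishes. The sketch below shows one way to structure that loop. The payload shape is an assumption: the `status` field and its values (`pending`, `running`) are illustrative, and `fetch_status` is a hypothetical callable standing in for the HTTP request.

```python
import time
from typing import Any, Callable


def poll_task(
    fetch_status: Callable[[str], dict[str, Any]],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the task leaves its in-progress states or the timeout elapses.

    `fetch_status` stands in for GET /api/tasks/<task_id>; the field name
    "status" and the in-progress values are assumptions about the payload.
    """
    waited = 0.0
    while True:
        payload = fetch_status(task_id)
        if payload.get("status") not in ("pending", "running"):
            return payload
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} still running after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s
```

Each poll is a short request, which is what keeps this pattern clear of the 30s GCP load balancer timeout noted in the timeline.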

**WebSocket events:**
- `task_complete` — persona/theme generation finished
- `new_message` — focus group message from human or AI
- `mode_changed` — autonomous mode toggled
- `conversation_turn` — AI turn announcement (in autonomous mode)
- `session_summary` — final themes & analytics

## Known Issues

- **Live token extraction:** When an LLM response lacks `usage_metadata`, the service logs a warning and continues; "thinking tokens" are captured when available (o1-mini limitation).
- **Backfill pricing:** Recalculating requires the `--delete-existing-estimates` flag; estimates are based on accumulated conversation context.
- **Admin filters:** Parsing ISO timestamps with a `Z` suffix previously crashed (fixed in commit 7b6a7c73); validate that all date filters are ISO 8601.
- **AI autonomous mode:** Historical race conditions (split-brain UI, cross-loop WebSocket emit) fixed in commits 283b31e7–b4978989; monitor logs for `asyncio.TimeoutError`.
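
For the admin date filters, one defensive approach is to normalise every incoming timestamp to a timezone-aware UTC `datetime` before querying, which also avoids naive-versus-aware comparison errors. This is a sketch, not the fix from the commit above: `parse_iso_filter` is a hypothetical helper, and treating naive inputs as UTC is an assumption.

```python
from datetime import datetime, timezone


def parse_iso_filter(value: str) -> datetime:
    """Parse an ISO 8601 timestamp into a timezone-aware UTC datetime.

    A trailing "Z" is rewritten to "+00:00" so the call also works on Python
    versions before 3.11, where datetime.fromisoformat rejects "Z". Naive
    inputs are assumed to be UTC (an assumption of this sketch).
    """
    text = value.strip()
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)  # raises ValueError on malformed input
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Malformed input raises `ValueError`, which a route handler could translate into a 400 response rather than letting it surface as a server error.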

## Timeline / Git History
| Date | Change |
|------|--------|
| 2026-04-24 | Add LLM usage tracking infrastructure (Phases A-C) |
| 2026-03-30 | Fix: task result not stored in useTaskPolling (false 'no personas' error) |
| 2026-03-23 | Fix AI autonomous mode: cross-loop WebSocket emit + polling fallback |
| 2026-03-23 | Allow document uploads (PDF, DOCX, TXT) as focus group assets |
| 2026-03-23 | **Critical:** Migrate task delivery WebSocket → HTTP polling (GCP 30s timeout) |
| 2026-03-23 | Fix all async LLM routes: bypass GCP 30s LB timeout |
| 2026-03-23 | Fix naive vs aware datetime crash + stuck AI mode |

## Sessions
### 2026-04-29 – Check Obsidian integration to ensure all
**Asked:** Check Obsidian integration to ensure all project changes are logged with sufficient detail for user understanding.
**Done:** Verified that the cloud-code script receives complete project context from HUCOM on startup including server and deployment information.

### 2026-04-24 – Check if the codebase has user
**Asked:** Check if the codebase has user management and token cost tracking by project/user, and create an implementation plan if missing.
**Done:** Analyzed the codebase and identified missing token cost tracking features; created a plan requiring token pricing models, usage logging, and cost aggregation endpoints.

### 2026-04-24 – Analyze the codebase for user management
**Asked:** Analyze the codebase for user management and token usage tracking by project and user, then create an implementation plan if missing.
**Done:** Identified gaps in token usage recording (missing warnings for None metadata and thinking model token handling) and provided fixes for accurate Gemini billing tracking.

### 2026-04-24 – Analyze the codebase and create a
**Asked:** Analyze the codebase and create a CLAUDE.md file with development commands and architecture overview.
**Done:** Created CLAUDE.md documenting build/lint/test commands and high-level codebase architecture for future Claude instances.

### 2026-04-24 – Review user management and token cost
**Asked:** Review user management and token cost tracking features, then create implementation plan if missing.
**Done:** Analyzed codebase and created CLAUDE.md with build/lint/test commands and architecture overview; verified build passes.

### 2026-04-24 – Analyze codebase for user management and
**Asked:** Analyze codebase for user management and token usage tracking with cost analytics by project and user, then create an implementation plan.
**Done:** Created backfill script for token usage events and executed it to generate 902 usage records across the system.

### 2026-04-24 – Analyze the codebase, create a CLAUDE.md
**Asked:** Analyze the codebase, create a CLAUDE.md file with setup commands and architecture docs, and assess token usage tracking across users and projects.
**Done:** Fixed persona data type errors in backfill script and deployed usage tracking script to backend container for testing.

### 2026-04-24 – Check if the codebase has user
**Asked:** Check if the codebase has user management and token usage tracking with cost analytics by project and user, and create an implementation plan if missing.
**Done:** Reviewed codebase for token tracking and cost management features; determined missing functionality and requested pricing information for used models to create implementation plan.

### 2026-04-24 – Create a CLAUDE.md documentation file and
**Asked:** Create a CLAUDE.md documentation file and analyze token usage/cost tracking across users and projects.
**Done:** Analyzed codebase structure, identified missing user management and token cost tracking system, and created implementation plan.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token usage tracking by project/user exists, and create implementation plan if missing.
**Done:** Analyzed data structure and identified missing user management and token cost tracking system; created implementation plan requiring LLM pricing data.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token cost tracking exist, and create an implementation plan if missing.
**Done:** Reviewed existing codebase and confirmed token usage tracking exists via `_record_usage` in LLMService; identified that historical data needs backfill using `backend/scripts/backfill_usage.py`.

### 2026-04-24 – Analyze the codebase and create a
**Asked:** Analyze the codebase and create a CLAUDE.md file with common commands and architecture overview, then check if user management and token cost tracking by project/user exists, and create an implementation plan if missing.
**Done:** Confirmed all 8 tasks are completed; verified logout functionality correctly clears localStorage and handles Microsoft SSO without backend deactivation.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token cost tracking by project/user exist, create implementation plan if missing.
**Done:** Confirmed all features already implemented and marked 8 tasks as completed.

### 2026-04-24 – Analyze codebase and create CLAUDE.md documentation,
**Asked:** Analyze codebase and create CLAUDE.md documentation, then implement user management and token usage tracking with cost breakdown.
**Done:** Created CLAUDE.md with build/test commands and architecture overview, implemented usage tracking endpoints and billing page with cost breakdown by feature and project.

### 2026-04-24 – Analyze codebase and create CLAUDE.md documentation
**Asked:** Analyze codebase and create CLAUDE.md documentation file with build commands and architecture overview, then check for user management and token cost tracking system.
**Done:** Identified missing token usage tracking features and created implementation plan including authentication decorators, usage API endpoints, WebSocket events, and billing UI components.

### 2026-04-24 – Create CLAUDE.md documentation and implement token
**Asked:** Create CLAUDE.md documentation and implement token usage tracking with cost management by user and project.
**Done:** Analyzed codebase architecture, documented development commands and setup, and created implementation plan for token tracking and cost analysis system.

### 2026-04-24 – Analyze codebase and create CLAUDE.md documentation
**Asked:** Analyze codebase and create CLAUDE.md documentation with build/test commands and architecture overview, then check for user management and token cost tracking system; if missing, provide implementation plan.
**Done:** Created admin panel backend with user management endpoints (list, update roles/quotas, enable/disable), usage analytics endpoints (summary by user/model/feature/day, raw event drill-down), and committed 14 admin files with all Phase D features.

### 2026-04-24 – Review user management and token cost
**Asked:** Review user management and token cost tracking systems, then create implementation plan if missing.
**Done:** Analyzed codebase and identified missing token cost tracking system; created implementation plan with required models and database schema.

### 2026-04-24 – Can you check if we have
**Asked:** Can you check if we have user management and token cost tracking by project and user, and create an implementation plan if missing?
**Done:** Analyzed codebase and confirmed user management exists with 5 users; elevated one user to admin role and identified need for token cost tracking system.

### 2026-04-24 – Analyze codebase architecture and check for
**Asked:** Analyze codebase architecture and check for user management and token cost tracking system.
**Done:** Created CLAUDE.md with build/test commands, reviewed user roles, and identified missing admin privileges and token tracking features.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token cost tracking by project and user exist, and create implementation plan if missing.
**Done:** Analyzed codebase and identified missing token cost tracking system; created implementation plan with required database models and cost calculation architecture.

### 2026-04-24 – Review user management, token cost tracking,
**Asked:** Review user management, token cost tracking, and spending analytics by project/user, then create an implementation plan if missing.
**Done:** Analyzed codebase and confirmed seed_model_pricing.py script is properly configured to run via docker compose with correct paths and environment variables.

### 2026-04-24 – Check if user management and token
**Asked:** Check if user management and token usage tracking with cost breakdown by project/user exists, and create an implementation plan if missing.
**Done:** Verified 70/70 tests passing and confirmed usage infrastructure foundation exists; provided implementation plan outline with model requirements pending pricing details.

### 2026-04-24 – Check if the codebase has user
**Asked:** Check if the codebase has user management and token cost tracking by project/user, and create an implementation plan if missing.
**Done:** Analyzed codebase structure and identified missing token cost tracking; created implementation plan requiring cost configuration for supported models.

### 2026-04-24 – Check for user management and token
**Asked:** Check for user management and token usage/cost tracking system, create implementation plan if missing.
**Done:** Confirmed three recording hooks are integrated and identified missing token cost tracking system requiring implementation plan with model pricing details.

### 2026-04-24 – Create and improve a CLAUDE.md documentation
**Asked:** Create and improve a CLAUDE.md documentation file for the codebase with commands and architecture overview.
**Done:** Updated CLAUDE.md with corrected paths, added npm dev command, clarified ASGI/socketio architecture, and documented async patterns.

### 2026-04-14 – Project catalogued
**Done:** Added to Obsidian second brain with full details.

---
## Change Log
| Date | Requested | Changed | Files |
|------|-----------|---------|-------|
| 2026-04-29 | Obsidian integration audit | Verified context transmission from HUCOM, checked for missing CLAUDE.md and Obsidian notes | cloud-code initialization, HUCOM data transfer, project logging system |
| 2026-04-24 | Token cost tracking | Add user/project cost models, usage logging, cost aggregation endpoints | CLAUDE.md |
| 2026-04-24 | Token usage tracking | Add warning for None metadata, handle thinking model tokens, implement project/user-level cost tracking | gemini_client.py, CLAUDE.md |
| 2026-04-24 | Documentation setup | Created CLAUDE.md with dev commands and architecture overview | CLAUDE.md |
| 2026-04-24 | Token cost tracking | Analyzed existing features, created implementation plan documentation | CLAUDE.md |
| 2026-04-24 | Token usage tracking | Added backfill_usage.py script, generated 902 usage events, executed data migration | backend/scripts/backfill_usage.py, database |
| 2026-04-24 | Token tracking assessment | Script deployed, persona field types corrected | backfill_usage.py, backend container |
| 2026-04-24 | Token tracking & cost analytics | Review codebase, identify missing user management and per-project cost tracking, request model pricing | CLAUDE.md |
| 2026-04-24 | Codebase analysis | CLAUDE.md creation, token tracking system design | CLAUDE.md, database schema, implementation plan |
| 2026-04-24 | User management & token tracking | Add user authentication, token usage logging, cost calculation per project/user, pricing configuration | CLAUDE.md, implementation plan (to be created) |
| 2026-04-24 | Token tracking system | Verified usage_events recording, identified backfill requirement for historical data | LLMService.ts, backfill_usage.py |
| 2026-04-24 | Token cost tracking | Create implementation plan for user management and token spending analytics | CLAUDE.md, Architecture docs |
| 2026-04-24 | Token cost tracking | Verified existing user management and token cost features, marked tasks completed | CLAUDE.md |
| 2026-04-24 | Token tracking system | Added usage endpoints, WebSocket emissions, billing page, navigation link | routes/usage.ts, pages/billing.tsx, middleware/auth.ts, navigation.tsx |
| 2026-04-24 | Token usage tracking | Add @active_required decorators, create usage.py routes, emit WebSocket events, build MyUsage page | backend/app/routes/usage.py, src/pages/MyUsage.tsx, WebSocket handlers |
| 2026-04-24 | Token tracking system | CLAUDE.md documentation, architecture overview, implementation plan for user management and cost tracking | CLAUDE.md, documentation files |
| 2026-04-24 | User management & token tracking | Added user list/update/disable endpoints, usage summary and event drill-down by user/model/feature/day, cost aggregation | backend/app/routes/admin.py |
| 2026-04-24 | Token cost tracking system | Add User, Project, TokenUsage, CostConfig models; create admin panel for cost management | CLAUDE.md, schema.sql, implementation-plan.md |
| 2026-04-24 | Token cost tracking system | Add token usage models, cost calculation by user/project, reporting dashboard | CLAUDE.md, User.model.ts, TokenUsage.model.ts, CostReport.tsx |
| 2026-04-24 | Codebase documentation and user audit | CLAUDE.md created, user role analysis completed | CLAUDE.md |
| 2026-04-24 | Token cost tracking | Database models for pricing/costs, cost calculation logic, reporting by project/user | CLAUDE.md, implementation plan document |
| 2026-04-24 | Token tracking system | Verified seed_model_pricing.py setup, docker compose exec command syntax | docker-compose.yml, scripts/seed_model_pricing.py |
| 2026-04-24 | Usage tracking system | LLMCallContext, cost breakdown by project/user, admin dashboard | backend/tests/test_usage_infrastructure.py, llm_service.py, conftest.py |
| 2026-04-24 | Token cost tracking | Add cost config per model, create Usage and Project models, add cost calculation to LLM service | models.py, llm_service.py, CLAUDE.md |
| 2026-04-24 | Token cost tracking system | Add tiktoken dependency, create user management module, implement cost tracking by project and user | requirements.txt, CLAUDE.md |
| 2026-04-24 | CLAUDE.md enhancement | Fixed wiki path, added npm run dev command, clarified ASGI architecture and async Motor/PyMongo distinction | CLAUDE.md |
| 2026-03-23 | Fix AI mode hanging on GCP | WebSocket → HTTP polling for all LLM routes | backend |
| 2026-03-23 | Add document upload support | PDF/DOCX/TXT as focus group assets | backend, frontend |

## Related
- [[modcomms/Mod Comms]] (same GCP timeout issue)
- [[olivas/OliVAS]]
- [[build-a-squad/Build A Squad]]