Update README with comprehensive project documentation

Rewrites the outdated README to accurately reflect the current state of the
project: corrects framework references (Quart/Hypercorn instead of Flask/Gunicorn),
documents all major features (autonomous AI conversations, multi-model LLM support,
WebSocket communication, Microsoft SSO, theme extraction), updates the project
structure and tech stack, adds architecture overview, environment configuration
table, and deployment instructions using deploy.sh/systemd.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
michael 2026-02-24 10:50:29 -06:00
parent b1be8f8c38
commit 33fb25467e

README.md

@@ -1,41 +1,82 @@
# Semblance Synthetic Society
A platform for creating and managing synthetic personas for focus groups and market research.
## Project info
**URL**: https://lovable.dev/projects/ee7a424f-7f6c-4b5d-9645-e66074cea7d3
An AI-powered platform for creating synthetic personas and running autonomous focus group sessions. Build realistic consumer profiles, moderate live discussions or let AI drive the conversation, and extract actionable themes and insights — all in real time.
## Features
- Create and manage synthetic personas with detailed profiles
- Organize personas into focus groups
- Run interactive focus group sessions
- Analyze results and extract insights
- MongoDB-based backend for data persistence
- User authentication and access control
### Persona Management
- **AI Persona Generation** — Generate detailed synthetic personas from audience briefs using multi-model LLM support
- **Manual Creation** — Build custom personas with demographic, psychographic, and behavioral attributes
- **Folder Organization** — Organize personas into folders with drag-and-drop support
- **Bulk Export** — Export persona profiles individually or in bulk (PDF, summary formats)
- **Persona Modification** — Refine and adjust AI-generated personas after creation
### Focus Group Sessions
- **Manual Moderation** — Guide discussions in real time as a human moderator
- **Autonomous AI Mode** — Let the AI runner orchestrate multi-persona conversations with intelligent speaker selection and conversation flow decisions
- **Real-Time WebSocket Communication** — Live message streaming via Socket.IO with room-based session management
- **Discussion Guide Generation** — AI-generated structured discussion guides from research objectives
- **Theme Extraction** — Automatic identification and highlighting of key themes across session transcripts
- **Session Analytics** — Summary generation and insight extraction from completed sessions
- **AI Moderator** — AI-assisted moderation with probe generation and follow-up questions
### AI Integration
- **Multi-Model Support** — Google Gemini 3 Pro (default), OpenAI GPT-4.1, OpenAI GPT-5.2
- **Prompt Template System** — 20 markdown-based prompt templates for persona generation, conversation management, theme extraction, and more
- **Configurable Reasoning** — Adjustable reasoning effort for supported models
### Authentication
- **Local Login** — Username/password authentication with JWT tokens
- **Microsoft 365 SSO** — Azure AD integration via MSAL for enterprise environments
### Dashboard & Organization
- **Project Dashboard** — Overview of personas, focus groups, and recent activity
- **Folder System** — Hierarchical organization with drag-and-drop (DND Kit)
- **Charts & Analytics** — Visual insights with Recharts
## Tech Stack
### Frontend
- **Build**: Vite 5, TypeScript 5.5
- **UI**: React 18, Tailwind CSS, shadcn-ui (Radix UI), Lucide React icons
- **State**: TanStack Query, React Hook Form + Zod validation
- **Routing**: React Router DOM
- **Real-Time**: Socket.IO client
- **Charts**: Recharts
- **Drag & Drop**: DND Kit
- **Auth**: MSAL React (Azure AD)
### Backend
- **Framework**: Quart (async Python), Hypercorn ASGI server
- **Database**: MongoDB via PyMongo + Motor (async)
- **Real-Time**: python-socketio (AsyncServer)
- **AI/LLM**: OpenAI SDK, Google Generative AI (`google-genai`)
- **Auth**: Custom Quart-compatible JWT, MSAL (Microsoft), bcrypt
### Infrastructure
- **Database**: MongoDB
- **Deployment**: systemd service, automated `deploy.sh` script
- **Process**: Hypercorn ASGI (port 5137)
## Getting Started
### Prerequisites
- Node.js & npm installed - [install with nvm](https://github.com/nvm-sh/nvm#installing-and-updating)
- Python 3.8+ installed for the backend
- MongoDB installed and running locally (default configuration: mongodb://localhost:27017)
- **Node.js** (v18+) & npm — [install with nvm](https://github.com/nvm-sh/nvm#installing-and-updating)
- **Python 3.11+**
- **MongoDB** installed and running (default: `mongodb://localhost:27017`)
### Installation
```sh
# Step 1: Clone the repository
git clone <YOUR_GIT_URL>
# Clone the repository
git clone https://github.com/your-org/synthetic-society.git
cd synthetic-society
# Step 2: Navigate to the project directory
cd <YOUR_PROJECT_NAME>
# Step 3: Install frontend dependencies
# Install frontend dependencies
npm install
# Step 4: Install backend dependencies
# Set up the backend
cd backend
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
@@ -43,87 +84,151 @@ pip install -r requirements.txt
cd ..
```
### Environment Setup
Copy the appropriate environment file:
```sh
# For local development
cp .env.development .env
# For production
cp .env.production .env
```
See [Environment Configuration](#environment-configuration) for details on each setting.
### Running the Application
Use the provided start script to run both frontend and backend:
**Option 1 — Start script** (starts both frontend and backend):
```sh
./start.sh
```
The start script will:
1. Check for and start MongoDB if needed
2. Set up the Python virtual environment
3. Install dependencies
4. Populate the database with sample personas and focus groups
5. Start both the backend and frontend servers
Or run them separately:
**Option 2 — Run separately:**
```sh
# Start the backend
# Terminal 1: Start the backend
cd backend
source venv/bin/activate
python run.py
# In another terminal, run the frontend
# Terminal 2: Start the frontend
npm run dev
```
The frontend will be available at http://localhost:5173
The backend API is available at http://localhost:5137/api
- **Frontend**: http://localhost:5173
- **Backend API**: http://localhost:5137/api
### Default Login
- Username: user
- Password: pass
- **Username**: `user`
- **Password**: `pass`
## Technology Stack
### Frontend
- Vite
- TypeScript
- React
- React Router
- shadcn-ui
- Tailwind CSS
- Axios for API requests
### Backend
- Python
- Flask
- PyMongo (MongoDB client)
- JWT for authentication
> Local login is enabled by default in development. In production, Microsoft SSO is the primary auth method.
## Project Structure
- `/src`: Frontend source code
- `/components`: React components
- `/contexts`: React contexts for state management
- `/hooks`: Custom React hooks
- `/lib`: Utility functions and API client
- `/pages`: Main application pages
- `/types`: TypeScript type definitions
- `/backend`: Python backend
- `/app`: Flask application
- `/models`: Database models
- `/routes`: API endpoints
- `run.py`: Backend entry point
```
├── src/ # Frontend source
│ ├── components/
│ │ ├── ui/ # shadcn-ui components
│ │ ├── focus-group-session/ # Session UI (DiscussionPanel, ParticipantPanel, ThemesPanel)
│ │ ├── persona/ # Persona management components
│ │ ├── ai-recruiter/ # AI persona recruitment
│ │ ├── auth/ # Authentication components
│ │ └── dashboard/ # Dashboard components
│ ├── pages/ # Route pages (Dashboard, FocusGroups, Login, etc.)
│ ├── hooks/ # Custom hooks (WebSocket, personas, discussion guides)
│ ├── contexts/ # React contexts (Auth, WebSocket, Navigation)
│ ├── services/ # WebSocket service layer
│ ├── lib/ # API client, utilities
│ ├── types/ # TypeScript type definitions
│ ├── utils/ # Avatar, mention, discussion guide utilities
│ └── config/ # MSAL configuration
├── backend/
│ ├── run.py # Entry point (Hypercorn ASGI)
│ ├── app/
│ │ ├── routes/ # API endpoints (auth, personas, focus-groups, ai-personas, folders, tasks)
│ │ ├── services/ # Business logic (19 services)
│ │ ├── models/ # Data models (User, Persona, FocusGroup, Folder)
│ │ ├── auth/ # Custom Quart JWT implementation
│ │ └── utils/ # Prompt loader, discussion guide schema
│ └── prompts/ # LLM prompt templates (20 markdown files)
├── deploy.sh # Production deployment script
├── semblance.service # systemd service file
├── start.sh # Local development start script
├── .env.development # Development environment config
└── .env.production # Production environment config
```
## Architecture Overview
### Real-Time Communication
Socket.IO provides bidirectional WebSocket communication between the React frontend and Quart backend. The `websocket_manager_async.py` module manages room-based messaging so each focus group session operates in an isolated channel. Messages are broadcast to all participants in a room as they arrive.
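
The room model can be illustrated without any networking. The sketch below is a plain-Python stand-in, not the python-socketio API and not the contents of `websocket_manager_async.py`; `RoomRegistry`, `join`, `leave`, and `broadcast` are hypothetical names chosen for illustration. It shows only the isolation property: a message sent to one session's room reaches that room's members and nobody else.

```python
import asyncio
from collections import defaultdict


class RoomRegistry:
    """Hypothetical in-memory model of room-based message delivery."""

    def __init__(self):
        # room name -> {client id: that client's inbox queue}
        self._rooms = defaultdict(dict)

    def join(self, room, client_id):
        """Add a client to a room and return the queue it will read from."""
        inbox = asyncio.Queue()
        self._rooms[room][client_id] = inbox
        return inbox

    def leave(self, room, client_id):
        """Remove a client from a room; unknown clients are ignored."""
        self._rooms[room].pop(client_id, None)

    async def broadcast(self, room, message):
        """Deliver a message to every client in one room, and only that room."""
        members = self._rooms.get(room, {})
        for inbox in members.values():
            await inbox.put(message)
        return len(members)


async def demo():
    registry = RoomRegistry()
    alice = registry.join("session-1", "alice")
    bob = registry.join("session-1", "bob")
    carol = registry.join("session-2", "carol")

    delivered = await registry.broadcast("session-1", {"text": "Welcome"})
    # Number of recipients, then the pending message count per client.
    return delivered, alice.qsize(), bob.qsize(), carol.qsize()
```

Running `asyncio.run(demo())` delivers the message to the two clients in `session-1` and leaves the `session-2` inbox empty.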
### Autonomous Conversation System
Focus groups can run without human moderation:
- **AI Runner Service** (`ai_runner_service.py`) — Manages background task execution for autonomous sessions
- **Autonomous Conversation Controller** (`autonomous_conversation_controller.py`) — Orchestrates the multi-persona conversation loop
- **Conversation Decision Service** (`conversation_decision_service.py`) — Determines when to continue, who speaks next, and when to wrap up
- **Conversation Context Service** (`conversation_context_service.py`) — Maintains conversation state, history, and context window
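
The division of labor among these services can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the project's code: `ConversationContext`, `choose_next_speaker`, `should_wrap_up`, `run_session`, and `canned_reply` are hypothetical names, the turn limit is an invented stop condition, and the "fewest turns so far" rule is a deterministic placeholder for the AI-driven speaker selection described above.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationContext:
    """Hypothetical stand-in for the state the services share."""
    personas: list
    max_turns: int = 6
    history: list = field(default_factory=list)  # (speaker, text) pairs


def choose_next_speaker(ctx):
    """Pick the persona who has spoken least; ties go to list order.

    Placeholder rule only; the project describes AI-driven selection.
    """
    counts = {name: 0 for name in ctx.personas}
    for speaker, _ in ctx.history:
        counts[speaker] += 1
    return min(ctx.personas, key=lambda name: counts[name])


def should_wrap_up(ctx):
    """Assumed stop condition: end the session after max_turns replies."""
    return len(ctx.history) >= ctx.max_turns


def run_session(ctx, respond):
    """Loop: check the stop condition, choose a speaker, record the reply."""
    while not should_wrap_up(ctx):
        speaker = choose_next_speaker(ctx)
        ctx.history.append((speaker, respond(speaker, ctx.history)))
    return ctx.history


def canned_reply(speaker, history):
    # Placeholder for an LLM call that would use the conversation history.
    return f"{speaker} responds to turn {len(history)}"
```

For example, `run_session(ConversationContext(personas=["Ana", "Ben", "Cy"], max_turns=4), canned_reply)` cycles through the three personas once and then returns to the first.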
### LLM Integration Layer
The `llm_service.py` module provides a unified interface across multiple AI providers:
- **Google Gemini** (`gemini-3-pro-preview`) — Default model
- **OpenAI** (`gpt-4.1`, `gpt-5.2`) — Alternative models with reasoning effort support
- Prompt templates in `/backend/prompts/` are loaded by `prompt_loader.py` and injected at runtime
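
A unified interface of this kind usually reduces to two steps: render a prompt template, then dispatch on the model name. The sketch below assumes that shape and is not the actual `llm_service.py` or `prompt_loader.py`. The model identifiers come from the list above; the provider functions are fakes standing in for SDK calls, and the `$name` placeholder syntax, `render_prompt`, and `generate` are assumptions made for illustration.

```python
from string import Template
from typing import Callable, Dict


# Fake providers: stand-ins for real SDK calls, used only for illustration.
def fake_gemini(prompt: str) -> str:
    return f"[gemini] {prompt}"


def fake_openai(prompt: str) -> str:
    return f"[openai] {prompt}"


# Model names from the README mapped to the (fake) provider that serves them.
PROVIDERS: Dict[str, Callable[[str], str]] = {
    "gemini-3-pro-preview": fake_gemini,
    "gpt-4.1": fake_openai,
    "gpt-5.2": fake_openai,
}
DEFAULT_MODEL = "gemini-3-pro-preview"


def render_prompt(template_text: str, **values: str) -> str:
    """Fill $placeholders in a template; the $name syntax is an assumption."""
    return Template(template_text).substitute(**values)


def generate(template_text: str, model: str = DEFAULT_MODEL, **values: str) -> str:
    """Single entry point: render the prompt, then dispatch by model name."""
    if model not in PROVIDERS:
        raise ValueError(f"Unsupported model: {model}")
    return PROVIDERS[model](render_prompt(template_text, **values))
```

Callers pass a template and its values and optionally name a model, so switching providers does not change the calling code: `generate("Summarize $topic", topic="pricing")` goes to the default model, while `generate("Hi $name", model="gpt-4.1", name="Ana")` goes to the OpenAI stand-in.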
## Environment Configuration
| Setting | Development | Production |
|---|---|---|
| **Base Path** | `/` | `/semblance/` |
| **API Base URL** | `/api` (proxied to `:5137`) | `https://ai-sandbox.oliver.solutions/semblance_back/api` |
| **WebSocket Path** | `/socket.io/` | `/semblance_back/socket.io/` |
| **MSAL Redirect** | `http://localhost:5173/` | `https://ai-sandbox.oliver.solutions/semblance` |
| **Local Login** | Enabled | Disabled |
Environment files:
- **`.env.development`** — Local development settings
- **`.env.production`** — Production server settings
- **`.env`** — Active config (copy from the appropriate file above)
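
For reference, the differences between the two environments can be written down as typed settings. The values below are copied from the table above; the structure is only an illustration, since the project reads its configuration from `.env` and the field names used here (`base_path`, `api_base_url`, and so on) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for the settings listed in the table."""
    base_path: str
    api_base_url: str
    websocket_path: str
    msal_redirect: str
    local_login: bool


# Values copied from the Environment Configuration table.
SETTINGS = {
    "development": EnvSettings(
        base_path="/",
        api_base_url="/api",
        websocket_path="/socket.io/",
        msal_redirect="http://localhost:5173/",
        local_login=True,
    ),
    "production": EnvSettings(
        base_path="/semblance/",
        api_base_url="https://ai-sandbox.oliver.solutions/semblance_back/api",
        websocket_path="/semblance_back/socket.io/",
        msal_redirect="https://ai-sandbox.oliver.solutions/semblance",
        local_login=False,
    ),
}


def settings_for(env_name: str) -> EnvSettings:
    """Return the settings for an environment, rejecting unknown names."""
    try:
        return SETTINGS[env_name]
    except KeyError:
        raise ValueError(f"Unknown environment: {env_name}") from None
```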
## Deployment
The application is configured to be deployed at the `/semblance/` path. For hosting:
### Build
1. Build the frontend:
```sh
npm run build
```
```sh
npm run build # Production frontend build
```
2. Deploy the backend using a WSGI server like Gunicorn:
```sh
cd backend
gunicorn -w 4 "app:create_app()"
```
### Deploy
The project includes an automated deployment script:
```sh
./deploy.sh
```
This script handles:
1. Pulling latest code from git
2. Setting up the Python virtual environment
3. Installing backend dependencies
4. Building the frontend
5. Deploying built assets to the web server directory
6. Restarting the `semblance.service` systemd unit
### Manual Backend Start (Production)
```sh
cd backend
source venv/bin/activate
hypercorn "app:create_app()" --bind 0.0.0.0:5137
```
The included `semblance.service` file can be installed as a systemd unit for process management.
## Contributing