# Semblance Synthetic Society - Application Documentation
## Table of Contents
1. [Application Overview](#application-overview)
2. [Application Architecture](#application-architecture)
3. [User Activity Flow](#user-activity-flow)
4. [Data Model](#data-model)
5. [Technical Details and Specifications](#technical-details-and-specifications)
6. [API Structure](#api-structure)
7. [AI Integration](#ai-integration)
8. [Development and Deployment](#development-and-deployment)
9. [Security Considerations](#security-considerations)
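10. [Key Features](#key-features)
11. [Best Practices for Development Team](#best-practices-for-development-team)
12. [Future Enhancements](#future-enhancements)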
## Application Overview
Semblance Synthetic Society is an AI-powered platform for creating and managing synthetic personas for focus groups and market research. It enables researchers to:
- Create detailed synthetic personas with demographic profiles and personality traits
- Organize personas into focus groups
- Run AI-moderated focus group sessions with autonomous conversations
- Analyze results with real-time theme extraction and reporting
- Export comprehensive insights and recommendations
## Application Architecture
### High-Level Architecture
```mermaid
graph TB
subgraph "Frontend - React/TypeScript"
A["React App<br/>Vite + TypeScript"] --> B[Components]
B --> C[Pages]
B --> D["UI Components<br/>shadcn-ui"]
A --> E["State Management<br/>React Context + Hooks"]
A --> F["API Client<br/>Axios"]
end
subgraph "Backend - Python/Flask"
G[Flask API] --> H[Routes]
H --> I[Services]
I --> J[Models]
J --> K[(MongoDB)]
I --> L["Google Gemini AI<br/>LLM Integration"]
end
F -->|HTTP/REST| G
subgraph "Infrastructure"
M["Static Hosting<br/>Frontend Build"]
N["WSGI Server<br/>Backend API"]
O[MongoDB Instance]
end
```
### Technology Stack
#### Frontend
- **Framework**: React 18.3.1 with TypeScript
- **Build Tool**: Vite 5.4.1
- **Styling**: Tailwind CSS 3.4.11 with shadcn-ui components
- **Routing**: React Router DOM 6.26.2
- **State Management**: React Context API + Custom Hooks
- **HTTP Client**: Axios 1.6.2
- **Forms**: React Hook Form 7.53.0 with Zod validation
- **Authentication**: JWT with localStorage persistence
#### Backend
- **Framework**: Flask (Python)
- **Database**: MongoDB with PyMongo
- **Authentication**: Flask-JWT-Extended
- **AI Integration**: Google Generative AI (Gemini 2.5 Pro)
- **Server**: Hypercorn (ASGI server)
- **CORS**: Flask-CORS for cross-origin requests
## User Activity Flow
### Main User Journey
```mermaid
flowchart TD
A[Landing Page] --> B{Authenticated?}
B -->|No| C[Login Page]
B -->|Yes| D[Dashboard]
C --> E[Enter Credentials]
E --> F[JWT Authentication]
F --> D
D --> G[Synthetic Users]
D --> H[Focus Groups]
D --> I[Overview Stats]
G --> J[Create Persona]
G --> K[AI Recruiter]
G --> L[View/Edit Personas]
J --> M[Manual Creation Form]
K --> N[Bulk Generation]
H --> O[Create Focus Group]
H --> P[View Groups]
O --> Q[Select Participants]
O --> R[Configure Settings]
P --> S[Focus Group Session]
S --> T[AI Moderated Discussion]
S --> U[Real-time Analytics]
S --> V[Export Results]
```
### Focus Group Session Flow
```mermaid
sequenceDiagram
participant U as User
participant F as Frontend
participant B as Backend
participant AI as AI Service
participant DB as MongoDB
U->>F: Start Focus Group Session
F->>B: Initialize Session
B->>DB: Load Focus Group Data
B->>DB: Load Participant Personas
B-->>F: Return Session Data
U->>F: Ask Question
F->>B: Send Moderator Message
B->>AI: Generate AI Response
AI-->>B: AI Moderator Response
B->>DB: Store Message
B-->>F: Return Response
loop Autonomous Mode
B->>AI: Determine Next Action
AI-->>B: Next Participant/Question
B->>AI: Generate Participant Response
AI-->>B: Persona Response
B->>DB: Store Message
B-->>F: Stream Response
end
U->>F: Request Themes
F->>B: Extract Key Themes
B->>AI: Analyze Conversation
AI-->>B: Extracted Themes
B->>DB: Store Themes
B-->>F: Return Themes
```
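The autonomous loop in the diagram can be sketched roughly as follows. This is a simplified, synchronous illustration rather than the backend's actual code (which is Python): `decideNext`, `generateReply`, and `store` are hypothetical callbacks standing in for the AI service and MongoDB calls, which would be asynchronous in practice, and the `maxTurns` cap is an assumption.

```typescript
// Hypothetical message shape for this sketch only.
interface SessionMessage {
  senderId: string;
  text: string;
}

interface LoopDeps {
  // Stand-in for "Determine Next Action": next speaker id, or null to stop.
  decideNext: (history: SessionMessage[]) => string | null;
  // Stand-in for "Generate Participant Response".
  generateReply: (speakerId: string, history: SessionMessage[]) => string;
  // Stand-in for "Store Message".
  store: (message: SessionMessage) => void;
}

// Runs turns until the decision step returns null or maxTurns is reached.
function runAutonomousLoop(
  history: SessionMessage[],
  deps: LoopDeps,
  maxTurns: number
): SessionMessage[] {
  const transcript = history.slice(); // do not mutate the caller's array
  for (let turn = 0; turn < maxTurns; turn++) {
    const speakerId = deps.decideNext(transcript);
    if (speakerId === null) {
      break;
    }
    const message: SessionMessage = {
      senderId: speakerId,
      text: deps.generateReply(speakerId, transcript),
    };
    deps.store(message);
    transcript.push(message);
  }
  return transcript;
}
```

Injecting the three steps as callbacks keeps the loop itself testable without an LLM or a database.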
## Data Model
### MongoDB Collections
```mermaid
erDiagram
USERS ||--o{ PERSONAS : creates
USERS ||--o{ FOCUS_GROUPS : creates
FOCUS_GROUPS ||--o{ PERSONAS : includes
FOCUS_GROUPS ||--o{ MESSAGES : contains
FOCUS_GROUPS ||--o{ THEMES : generates
FOCUS_GROUPS ||--o{ NOTES : has
FOCUS_GROUPS ||--o{ REASONING : tracks
USERS {
ObjectId _id PK
string username
string email
string password_hash
string role
datetime created_at
}
PERSONAS {
ObjectId _id PK
string name
string age
string gender
string occupation
string location
number techSavviness
string personality
object oceanTraits
object thinkFeelDo
array goals
array frustrations
array motivations
string created_by FK
datetime created_at
}
FOCUS_GROUPS {
ObjectId _id PK
string name
string description
string objective
array participants FK
object discussionGuide
string status
string created_by FK
datetime created_at
}
MESSAGES {
ObjectId _id PK
string focus_group_id FK
string text
string type
string senderId
boolean highlighted
datetime created_at
}
THEMES {
ObjectId _id PK
string focus_group_id FK
string title
string description
array quotes
string source
datetime created_at
}
```
### Key TypeScript Interfaces
```typescript
// Persona Interface
interface Persona {
  id: string;
  _id?: string;
  name: string;
  age: string;
  gender: string;
  occupation: string;
  location: string;
  techSavviness: number;
  personality: string;
  oceanTraits?: {
    openness: number;
    conscientiousness: number;
    extraversion: number;
    agreeableness: number;
    neuroticism: number;
  };
  thinkFeelDo?: {
    thinks: string[];
    feels: string[];
    does: string[];
  };
  goals?: string[];
  frustrations?: string[];
  motivations?: string[];
  scenarios?: string[];
  // Additional fields...
}

// Focus Group Interface
interface FocusGroup {
  _id: string;
  name: string;
  description: string;
  objective: string;
  participants: string[];
  discussionGuide?: DiscussionGuide;
  status: string;
  created_at: string;
  created_by: string;
}

// Discussion Guide Structure
interface DiscussionGuide {
  introduction: string;
  sections: {
    id: string;
    title: string;
    duration: number;
    items: {
      id: string;
      type: 'question' | 'activity' | 'probe';
      content: string;
      notes?: string;
    }[];
  }[];
  conclusion: string;
}
```
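As a small illustration of how the discussion guide structure can be consumed, the sketch below sums section durations and counts guide items by type. The helper names `totalGuideDuration` and `countItemsByType` are hypothetical, and because the unit of `duration` is not specified here, the total is in whatever unit the guide itself uses. The interfaces are repeated in split form so the block stands alone.

```typescript
type GuideItemType = 'question' | 'activity' | 'probe';

interface GuideItem {
  id: string;
  type: GuideItemType;
  content: string;
  notes?: string;
}

interface GuideSection {
  id: string;
  title: string;
  duration: number;
  items: GuideItem[];
}

interface DiscussionGuide {
  introduction: string;
  sections: GuideSection[];
  conclusion: string;
}

// Sum of all section durations (unit as stored in the guide).
function totalGuideDuration(guide: DiscussionGuide): number {
  return guide.sections.reduce((sum, section) => sum + section.duration, 0);
}

// Number of questions, activities, and probes across all sections.
function countItemsByType(guide: DiscussionGuide): Record<GuideItemType, number> {
  const counts: Record<GuideItemType, number> = { question: 0, activity: 0, probe: 0 };
  for (const section of guide.sections) {
    for (const item of section.items) {
      counts[item.type] += 1;
    }
  }
  return counts;
}
```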
## Technical Details and Specifications
### Authentication Flow
```mermaid
sequenceDiagram
participant C as Client
participant F as Frontend
participant B as Backend
participant DB as Database
C->>F: Enter Credentials
F->>B: POST /api/auth/login
B->>DB: Verify User
alt Valid Credentials
B-->>F: JWT Token + User Data
F->>F: Store in localStorage
F-->>C: Redirect to Dashboard
else Invalid Credentials
B-->>F: 401 Unauthorized
F-->>C: Show Error
end
Note over F: All API Requests
F->>F: Add JWT to Headers
F->>B: API Request with Bearer Token
B->>B: Verify JWT
alt Valid Token
B-->>F: API Response
else Invalid Token
B-->>F: 401 Unauthorized
F->>F: Clear localStorage
F-->>C: Redirect to Login
end
```
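The token handling in this flow can be sketched as two small functions: one that attaches the Bearer token to request headers and one that reacts to a 401. This is an illustrative sketch, not the project's Axios setup. `TokenStore` is a hypothetical abstraction over `localStorage` so the code runs outside a browser, and the `token` key name is an assumption.

```typescript
// Minimal storage abstraction; in the app this role is played by localStorage.
interface TokenStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const TOKEN_KEY = 'token'; // assumed key name

// In-memory store for demonstration and tests.
function createMemoryStore(): TokenStore {
  const data: Record<string, string> = Object.create(null);
  return {
    getItem: (key) => {
      const value = data[key];
      return value === undefined ? null : value;
    },
    setItem: (key, value) => { data[key] = value; },
    removeItem: (key) => { delete data[key]; },
  };
}

// Adds "Authorization: Bearer <token>" when a token is stored.
function withAuthHeader(
  headers: Record<string, string>,
  store: TokenStore
): Record<string, string> {
  const token = store.getItem(TOKEN_KEY);
  if (token === null) {
    return headers;
  }
  return { ...headers, Authorization: 'Bearer ' + token };
}

// Mirrors the "Invalid Token" branch: clear the token and signal a redirect.
function handleStatus(status: number, store: TokenStore): 'ok' | 'redirect-to-login' {
  if (status === 401) {
    store.removeItem(TOKEN_KEY);
    return 'redirect-to-login';
  }
  return 'ok';
}
```

In an Axios-based client, these two functions would typically be called from a request interceptor and a response interceptor respectively.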
### State Management
State is managed at three levels:
1. **AuthContext**: A React Context that holds the user authentication state and JWT token and exposes login/logout functionality
2. **Local State**: Component-level state for UI interactions
3. **API State**: Server data is currently fetched with direct API calls; React Query could be integrated for server state management
## API Structure
### Authentication Endpoints
- `POST /api/auth/login` - User login
- `POST /api/auth/register` - User registration
- `GET /api/auth/me` - Get current user profile
### Persona Management
- `GET /api/personas` - List all personas
- `POST /api/personas` - Create new persona
- `GET /api/personas/:id` - Get persona details
- `PUT /api/personas/:id` - Update persona
- `DELETE /api/personas/:id` - Delete persona
### Focus Group Management
- `GET /api/focus-groups` - List all focus groups
- `POST /api/focus-groups` - Create new focus group
- `GET /api/focus-groups/:id` - Get focus group details
- `PUT /api/focus-groups/:id` - Update focus group
- `DELETE /api/focus-groups/:id` - Delete focus group
### AI Operations
- `POST /api/ai-personas/generate` - Generate synthetic personas
- `POST /api/focus-group-ai/:id/start` - Start AI moderation
- `POST /api/focus-group-ai/:id/stop` - Stop AI moderation
- `POST /api/focus-group-ai/:id/message` - Send message to AI moderator
- `GET /api/focus-group-ai/:id/status` - Get moderator status
- `POST /api/focus-group-ai/:id/themes` - Extract key themes
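
One way to keep these paths consistent on the frontend is a small typed path builder. The sketch below covers a subset of the endpoints listed above; `joinUrl` and `createEndpoints` are hypothetical helpers, and the base URL is passed in by the caller rather than read from configuration.

```typescript
// Joins a base URL and a path with exactly one slash between them.
function joinUrl(base: string, path: string): string {
  return base.replace(/\/+$/, '') + '/' + path.replace(/^\/+/, '');
}

// Builds endpoint URLs relative to the given API base.
function createEndpoints(base: string) {
  const withId = (prefix: string, id: string, suffix: string): string =>
    joinUrl(base, prefix + '/' + encodeURIComponent(id) + suffix);
  return {
    auth: {
      login: () => joinUrl(base, 'auth/login'),
      me: () => joinUrl(base, 'auth/me'),
    },
    personas: {
      list: () => joinUrl(base, 'personas'),
      detail: (id: string) => withId('personas', id, ''),
    },
    focusGroups: {
      list: () => joinUrl(base, 'focus-groups'),
      detail: (id: string) => withId('focus-groups', id, ''),
    },
    focusGroupAi: {
      start: (id: string) => withId('focus-group-ai', id, '/start'),
      stop: (id: string) => withId('focus-group-ai', id, '/stop'),
      message: (id: string) => withId('focus-group-ai', id, '/message'),
      status: (id: string) => withId('focus-group-ai', id, '/status'),
      themes: (id: string) => withId('focus-group-ai', id, '/themes'),
    },
  };
}
```

For example, `createEndpoints('/api').focusGroupAi.themes('abc')` yields `/api/focus-group-ai/abc/themes`.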
## AI Integration
### LLM Service Architecture
```mermaid
graph TD
A[API Routes] --> B[Service Layer]
B --> C[LLM Service]
C --> D[Google Gemini API]
B --> E[Prompt Templates]
E --> F[persona-generation.md]
E --> G[focus-group-response.md]
E --> H[theme-extraction.md]
E --> I[moderator-system.md]
C --> J[Response Processing]
J --> K[JSON Extraction]
J --> L[Error Handling]
J --> M[Rate Limiting]
```
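The "JSON Extraction" step exists because LLM output often wraps JSON in a Markdown code fence or surrounds it with prose. The following is a minimal TypeScript sketch of that technique, not the backend's actual Python implementation; `extractJson` is a hypothetical name, and the fallback heuristic (first opening bracket to last matching closing bracket) is an assumption.

```typescript
// Built by concatenation so this example does not contain a literal fence.
const FENCE = '`' + '`' + '`';
const FENCED_BLOCK = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE);

// Pulls a JSON value out of free-form model output and parses it.
// Throws if no JSON-looking span is found or if parsing fails.
function extractJson(raw: string): unknown {
  // Prefer the contents of a fenced block if the model used one.
  const fenced = raw.match(FENCED_BLOCK);
  const candidate = fenced && fenced[1] !== undefined ? fenced[1] : raw;

  // Otherwise fall back to the outermost object or array in the text.
  const start = candidate.search(/[\[{]/);
  if (start === -1) {
    throw new Error('No JSON found in model output');
  }
  const closer = candidate.charAt(start) === '{' ? '}' : ']';
  const end = candidate.lastIndexOf(closer);
  if (end < start) {
    throw new Error('Unterminated JSON in model output');
  }
  return JSON.parse(candidate.slice(start, end + 1));
}
```

Returning `unknown` forces callers to validate the parsed shape before using it, which is a reasonable default for model-generated data.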
### Key AI Features
1. **Persona Generation**: Uses Gemini to create detailed, realistic personas based on demographic parameters
2. **AI Moderation**: Autonomous focus group moderation with context awareness
3. **Response Generation**: Persona-specific responses based on personality profiles
4. **Theme Extraction**: Real-time analysis of conversation to identify key themes
5. **Conversation Flow**: AI determines next speakers and follow-up questions
### Prompt Engineering
The system uses structured prompts stored in `/backend/prompts/`:
- System prompts define AI behavior and constraints
- Template prompts use variable substitution for dynamic content
- Chain-of-thought reasoning is used for complex decisions
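
Variable substitution in a template prompt can be sketched as below. The `{{name}}` placeholder syntax and the `renderTemplate` helper are assumptions for illustration; the actual templates in `/backend/prompts/` may use a different syntax.

```typescript
// Replaces {{name}} placeholders with values from `vars`.
// Throws on a missing variable so an incomplete prompt is never sent.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error('Missing template variable: ' + name);
    }
    return String(vars[name]);
  });
}

// Example template text (hypothetical, not taken from the prompt files).
const examplePrompt = renderTemplate(
  'You are {{ name }}, a {{occupation}} from {{location}}.',
  { name: 'Ana', occupation: 'nurse', location: 'Lisbon' }
);
```

Failing loudly on a missing variable is a deliberate choice here: a prompt with an unfilled placeholder tends to produce confusing model output that is hard to trace back.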
## Development and Deployment
### Local Development Setup
```bash
# Frontend
npm install
npm run dev  # Development server at http://localhost:5173

# Backend
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python run.py  # API server at http://localhost:5137

# MongoDB
# Ensure MongoDB is running on localhost:27017
```
### Build and Deployment
```bash
# Frontend Build
npm run build  # Creates /dist folder

# Backend Deployment
cd backend
gunicorn -w 4 "app:create_app()"
```
### Environment Variables
Frontend (`.env`):
```
VITE_API_BASE_URL=/semblance_back/api
```
Backend:
```
GEMINI_API_KEY=your_api_key_here
MONGODB_URI=mongodb://localhost:27017/synthetic_society
JWT_SECRET_KEY=your_secret_key
```
## Security Considerations
1. **Authentication**: JWT-based authentication with token expiration
2. **Authorization**: Route-level protection with the Flask-JWT-Extended `@jwt_required()` decorator
3. **Data Validation**: Input validation on both frontend (Zod) and backend
4. **CORS**: Configured for specific origins in production
5. **API Keys**: Environment variables for sensitive configuration
6. **Password Security**: Bcrypt hashing for password storage
7. **Session Management**: Automatic logout on token expiration
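
Token expiration (items 1 and 7) can also be checked on the client before a request is sent. The sketch below assumes the JWT payload has already been decoded and relies only on the standard `exp` claim (seconds since the Unix epoch). Whether the frontend checks proactively like this or only reacts to 401 responses is not stated here, and `isTokenExpired` and `skewSeconds` are hypothetical names.

```typescript
// Subset of the registered JWT claims relevant to expiry.
interface JwtClaims {
  exp?: number; // expiration time, in seconds since the Unix epoch
}

// Returns true when the token must no longer be used.
// `skewSeconds` treats the token as expired slightly early to allow for clock drift.
function isTokenExpired(claims: JwtClaims, nowMs: number, skewSeconds: number = 0): boolean {
  if (claims.exp === undefined) {
    return false; // no expiry claim: nothing to check on the client
  }
  return nowMs / 1000 >= claims.exp - skewSeconds;
}
```

A caller would typically pass `Date.now()` as `nowMs` and log the user out when the function returns `true`.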
## Key Features
### Persona Management
- **Manual Creation**: Detailed form with 50+ attributes
- **AI Generation**: Bulk creation with customizable parameters
- **Persona Profiles**: Comprehensive view with attitudinal profile methodology
- **Folder Organization**: Group personas for easy management
- **Export/Import**: Download personas for backup or sharing
### Focus Group Sessions
- **Discussion Guide Editor**: Structured session planning
- **AI Moderation**: Autonomous or semi-autonomous modes
- **Real-time Participation**: Live conversation with AI personas
- **Theme Extraction**: Automatic identification of key insights
- **Note Taking**: Time-stamped notes linked to messages
- **Analytics Dashboard**: Visual representation of participation and sentiment
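
The participation side of the analytics dashboard reduces to counting messages per sender. The sketch below shows one way to compute counts and shares from stored messages; it uses only the `senderId` field from the `MESSAGES` collection, the function names are hypothetical, and sentiment is not covered.

```typescript
interface ChatMessage {
  senderId: string;
  text: string;
}

// Number of messages sent by each participant.
function participationBySender(messages: ChatMessage[]): Record<string, number> {
  // A prototype-free object avoids collisions with inherited keys.
  const counts: Record<string, number> = Object.create(null);
  for (const message of messages) {
    counts[message.senderId] = (counts[message.senderId] || 0) + 1;
  }
  return counts;
}

// Fraction of all messages sent by each participant (values sum to 1).
function participationShare(messages: ChatMessage[]): Record<string, number> {
  const counts = participationBySender(messages);
  const shares: Record<string, number> = Object.create(null);
  for (const senderId of Object.keys(counts)) {
    shares[senderId] = (counts[senderId] || 0) / messages.length;
  }
  return shares;
}
```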
### Data Analysis
- **Export Options**: PDF reports, CSV data, JSON backups
- **Theme Management**: Manual and AI-generated theme tracking
- **Conversation History**: Full transcript with highlighting
- **Reasoning Transparency**: View AI decision-making process
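
As an illustration of the CSV export option, the sketch below turns transcript messages into CSV text. The column set mirrors the `MESSAGES` fields from the data model; the function names and the exact export layout are assumptions, and quoting follows the common RFC 4180 convention.

```typescript
interface TranscriptRow {
  created_at: string;
  senderId: string;
  type: string;
  text: string;
  highlighted: boolean;
}

// Quotes a field if it contains a comma, quote, or line break,
// doubling any embedded quotes.
function escapeCsvField(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return '"' + value.replace(/"/g, '""') + '"';
  }
  return value;
}

// Serializes rows to CSV with a header line and CRLF line endings.
function messagesToCsv(rows: TranscriptRow[]): string {
  const header = ['created_at', 'senderId', 'type', 'text', 'highlighted'];
  const lines = [header.join(',')];
  for (const row of rows) {
    const fields = [
      row.created_at,
      row.senderId,
      row.type,
      row.text,
      String(row.highlighted),
    ];
    lines.push(fields.map(escapeCsvField).join(','));
  }
  return lines.join('\r\n');
}
```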
## Best Practices for Development Team
1. **Code Organization**: Follow the existing pattern of separating concerns (components, services, types)
2. **Type Safety**: Maintain TypeScript types for all data structures
3. **Error Handling**: Use try-catch blocks with user-friendly toast notifications
4. **API Consistency**: Follow RESTful conventions for new endpoints
5. **Component Reusability**: Utilize shadcn-ui components and create custom wrappers
6. **State Management**: Keep state as local as possible, lift only when necessary
7. **Performance**: Implement pagination for large datasets
8. **Testing**: Add unit tests for critical business logic
9. **Documentation**: Update API documentation when adding new endpoints
10. **Security**: Always validate user input and sanitize data
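
For item 7, pagination can be sketched as a pure helper over an in-memory array. In the real application the slicing would more likely happen in the database query; `paginate` and the `Page` shape are hypothetical, and clamping out-of-range page numbers is a design choice rather than established behavior.

```typescript
interface Page<T> {
  items: T[];
  page: number; // 1-based, after clamping
  pageSize: number;
  totalItems: number;
  totalPages: number;
}

// Returns one page of `all`, clamping `page` into the valid range.
function paginate<T>(all: T[], page: number, pageSize: number): Page<T> {
  if (pageSize < 1 || Math.floor(pageSize) !== pageSize) {
    throw new Error('pageSize must be a positive integer');
  }
  const totalPages = Math.max(1, Math.ceil(all.length / pageSize));
  const current = Math.min(Math.max(1, Math.floor(page)), totalPages);
  const start = (current - 1) * pageSize;
  return {
    items: all.slice(start, start + pageSize),
    page: current,
    pageSize: pageSize,
    totalItems: all.length,
    totalPages: totalPages,
  };
}
```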
## Future Enhancements
Potential areas for enhancement include:
1. **Real-time Updates**: WebSocket integration for live session updates
2. **Advanced Analytics**: More detailed sentiment analysis and reporting
3. **Multi-language Support**: Internationalization for global research
4. **Team Collaboration**: Multiple users per focus group session
5. **Template Library**: Pre-built discussion guides and persona archetypes
6. **API Rate Limiting**: Implement rate limiting for AI endpoints
7. **Caching Layer**: Redis for frequently accessed data
8. **Audit Logging**: Track all user actions for compliance
9. **Backup System**: Automated database backups
10. **Performance Monitoring**: Integration with monitoring tools