# Semblance Synthetic Society - Application Documentation

## Table of Contents

1. [Application Overview](#application-overview)
2. [Application Architecture](#application-architecture)
3. [User Activity Flow](#user-activity-flow)
4. [Data Model](#data-model)
5. [Technical Details and Specifications](#technical-details-and-specifications)
6. [API Structure](#api-structure)
7. [AI Integration](#ai-integration)
8. [Development and Deployment](#development-and-deployment)
9. [Security Considerations](#security-considerations)
10. [Key Features](#key-features)

## Application Overview

Semblance Synthetic Society is an AI-powered platform for creating and managing synthetic personas for focus groups and market research. It enables researchers to:

- Create detailed synthetic personas with demographic profiles and personality traits
- Organize personas into focus groups
- Run AI-moderated focus group sessions with autonomous conversations
- Analyze results with real-time theme extraction and reporting
- Export comprehensive insights and recommendations

## Application Architecture

### High-Level Architecture

```mermaid
graph TB
    subgraph "Frontend - React/TypeScript"
        A[React App<br/>Vite + TypeScript] --> B[Components]
        B --> C[Pages]
        B --> D[UI Components<br/>shadcn-ui]
        A --> E[State Management<br/>React Context + Hooks]
        A --> F[API Client<br/>Axios]
    end

    subgraph "Backend - Python/Flask"
        G[Flask API] --> H[Routes]
        H --> I[Services]
        I --> J[Models]
        J --> K[(MongoDB)]
        I --> L[Google Gemini AI<br/>LLM Integration]
    end

    F -->|HTTP/REST| G

    subgraph "Infrastructure"
        M[Static Hosting<br/>Frontend Build]
        N[WSGI Server<br/>Backend API]
        O[MongoDB Instance]
    end
```

### Technology Stack

#### Frontend

- **Framework**: React 18.3.1 with TypeScript
- **Build Tool**: Vite 5.4.1
- **Styling**: Tailwind CSS 3.4.11 with shadcn-ui components
- **Routing**: React Router DOM 6.26.2
- **State Management**: React Context API + Custom Hooks
- **HTTP Client**: Axios 1.6.2
- **Forms**: React Hook Form 7.53.0 with Zod validation
- **Authentication**: JWT with localStorage persistence

#### Backend

- **Framework**: Flask (Python)
- **Database**: MongoDB with PyMongo
- **Authentication**: Flask-JWT-Extended
- **AI Integration**: Google Generative AI (Gemini 2.5 Pro)
- **Server**: Hypercorn (ASGI server)
- **CORS**: Flask-CORS for cross-origin requests

## User Activity Flow

### Main User Journey

```mermaid
flowchart TD
    A[Landing Page] --> B{Authenticated?}
    B -->|No| C[Login Page]
    B -->|Yes| D[Dashboard]
    C --> E[Enter Credentials]
    E --> F[JWT Authentication]
    F --> D

    D --> G[Synthetic Users]
    D --> H[Focus Groups]
    D --> I[Overview Stats]

    G --> J[Create Persona]
    G --> K[AI Recruiter]
    G --> L[View/Edit Personas]
    J --> M[Manual Creation Form]
    K --> N[Bulk Generation]

    H --> O[Create Focus Group]
    H --> P[View Groups]
    O --> Q[Select Participants]
    O --> R[Configure Settings]
    P --> S[Focus Group Session]

    S --> T[AI Moderated Discussion]
    S --> U[Real-time Analytics]
    S --> V[Export Results]
```

### Focus Group Session Flow

```mermaid
sequenceDiagram
    participant U as User
    participant F as Frontend
    participant B as Backend
    participant AI as AI Service
    participant DB as MongoDB

    U->>F: Start Focus Group Session
    F->>B: Initialize Session
    B->>DB: Load Focus Group Data
    B->>DB: Load Participant Personas
    B-->>F: Return Session Data

    U->>F: Ask Question
    F->>B: Send Moderator Message
    B->>AI: Generate AI Response
    AI-->>B: AI Moderator Response
    B->>DB: Store Message
    B-->>F: Return Response

    loop Autonomous Mode
        B->>AI: Determine Next Action
        AI-->>B: Next Participant/Question
        B->>AI: Generate Participant Response
        AI-->>B: Persona Response
        B->>DB: Store Message
        B-->>F: Stream Response
    end

    U->>F: Request Themes
    F->>B: Extract Key Themes
    B->>AI: Analyze Conversation
    AI-->>B: Extracted Themes
    B->>DB: Store Themes
    B-->>F: Return Themes
```

## Data Model

### MongoDB Collections

```mermaid
erDiagram
    USERS ||--o{ PERSONAS : creates
    USERS ||--o{ FOCUS_GROUPS : creates
    FOCUS_GROUPS ||--o{ PERSONAS : includes
    FOCUS_GROUPS ||--o{ MESSAGES : contains
    FOCUS_GROUPS ||--o{ THEMES : generates
    FOCUS_GROUPS ||--o{ NOTES : has
    FOCUS_GROUPS ||--o{ REASONING : tracks

    USERS {
        ObjectId _id PK
        string username
        string email
        string password_hash
        string role
        datetime created_at
    }

    PERSONAS {
        ObjectId _id PK
        string name
        string age
        string gender
        string occupation
        string location
        number techSavviness
        string personality
        object oceanTraits
        object thinkFeelDo
        array goals
        array frustrations
        array motivations
        string created_by FK
        datetime created_at
    }

    FOCUS_GROUPS {
        ObjectId _id PK
        string name
        string description
        string objective
        array participants FK
        object discussionGuide
        string status
        string created_by FK
        datetime created_at
    }

    MESSAGES {
        ObjectId _id PK
        string focus_group_id FK
        string text
        string type
        string senderId
        boolean highlighted
        datetime created_at
    }

    THEMES {
        ObjectId _id PK
        string focus_group_id FK
        string title
        string description
        array quotes
        string source
        datetime created_at
    }
```

### Key TypeScript Interfaces

```typescript
// Persona Interface
interface Persona {
  id: string;
  _id?: string;
  name: string;
  age: string;
  gender: string;
  occupation: string;
  location: string;
  techSavviness: number;
  personality: string;
  oceanTraits?: {
    openness: number;
    conscientiousness: number;
    extraversion: number;
    agreeableness: number;
    neuroticism: number;
  };
  thinkFeelDo?: {
    thinks: string[];
    feels: string[];
    does: string[];
  };
  goals?: string[];
  frustrations?: string[];
  motivations?: string[];
  scenarios?: string[];
  // Additional fields...
}

// Focus Group Interface
interface FocusGroup {
  _id: string;
  name: string;
  description: string;
  objective: string;
  participants: string[];
  discussionGuide?: DiscussionGuide;
  status: string;
  created_at: string;
  created_by: string;
}

// Discussion Guide Structure
interface DiscussionGuide {
  introduction: string;
  sections: {
    id: string;
    title: string;
    duration: number;
    items: {
      id: string;
      type: 'question' | 'activity' | 'probe';
      content: string;
      notes?: string;
    }[];
  }[];
  conclusion: string;
}
```

## Technical Details and Specifications

### Authentication Flow

```mermaid
sequenceDiagram
    participant C as Client
    participant F as Frontend
    participant B as Backend
    participant DB as Database

    C->>F: Enter Credentials
    F->>B: POST /api/auth/login
    B->>DB: Verify User
    alt Valid Credentials
        B-->>F: JWT Token + User Data
        F->>F: Store in localStorage
        F-->>C: Redirect to Dashboard
    else Invalid Credentials
        B-->>F: 401 Unauthorized
        F-->>C: Show Error
    end

    Note over F: All API Requests
    F->>F: Add JWT to Headers
    F->>B: API Request with Bearer Token
    B->>B: Verify JWT
    alt Valid Token
        B-->>F: API Response
    else Invalid Token
        B-->>F: 401 Unauthorized
        F->>F: Clear localStorage
        F-->>C: Redirect to Login
    end
```

### State Management

The application uses React Context API for global state management:

1. **AuthContext**: Manages user authentication state, JWT token, and login/logout functionality
2. **Local State**: Component-level state for UI interactions
3. **API State**: React Query could be integrated for server state management (currently using direct API calls)

### API Structure

#### Authentication Endpoints

- `POST /api/auth/login` - User login
- `POST /api/auth/register` - User registration
- `GET /api/auth/me` - Get current user profile

#### Persona Management

- `GET /api/personas` - List all personas
- `POST /api/personas` - Create new persona
- `GET /api/personas/:id` - Get persona details
- `PUT /api/personas/:id` - Update persona
- `DELETE /api/personas/:id` - Delete persona

#### Focus Group Management

- `GET /api/focus-groups` - List all focus groups
- `POST /api/focus-groups` - Create new focus group
- `GET /api/focus-groups/:id` - Get focus group details
- `PUT /api/focus-groups/:id` - Update focus group
- `DELETE /api/focus-groups/:id` - Delete focus group

#### AI Operations

- `POST /api/ai-personas/generate` - Generate synthetic personas
- `POST /api/focus-group-ai/:id/start` - Start AI moderation
- `POST /api/focus-group-ai/:id/stop` - Stop AI moderation
- `POST /api/focus-group-ai/:id/message` - Send message to AI moderator
- `GET /api/focus-group-ai/:id/status` - Get moderator status
- `POST /api/focus-group-ai/:id/themes` - Extract key themes

## AI Integration

### LLM Service Architecture

```mermaid
graph TD
    A[API Routes] --> B[Service Layer]
    B --> C[LLM Service]
    C --> D[Google Gemini API]

    B --> E[Prompt Templates]
    E --> F[persona-generation.md]
    E --> G[focus-group-response.md]
    E --> H[theme-extraction.md]
    E --> I[moderator-system.md]

    C --> J[Response Processing]
    J --> K[JSON Extraction]
    J --> L[Error Handling]
    J --> M[Rate Limiting]
```

### Key AI Features

1. **Persona Generation**: Uses Gemini to create detailed, realistic personas based on demographic parameters
2. **AI Moderation**: Autonomous focus group moderation with context awareness
3. **Response Generation**: Persona-specific responses based on personality profiles
4. **Theme Extraction**: Real-time analysis of conversation to identify key themes
5. **Conversation Flow**: AI determines next speakers and follow-up questions

### Prompt Engineering

The system uses structured prompts stored in `/backend/prompts/`:

- System prompts define AI behavior and constraints
- Template prompts use variable substitution for dynamic content
- Chain-of-thought reasoning for complex decisions

## Development and Deployment

### Local Development Setup

```bash
# Frontend
npm install
npm run dev  # Development server at http://localhost:5173

# Backend
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python run.py  # API server at http://localhost:5137

# MongoDB
# Ensure MongoDB is running on localhost:27017
```

### Build and Deployment

```bash
# Frontend Build
npm run build  # Creates /dist folder

# Backend Deployment
cd backend
gunicorn -w 4 "app:create_app()"
```

### Environment Variables

Frontend (`.env`):

```
VITE_API_BASE_URL=/semblance_back/api
```

Backend:

```
GEMINI_API_KEY=your_api_key_here
MONGODB_URI=mongodb://localhost:27017/synthetic_society
JWT_SECRET_KEY=your_secret_key
```

## Security Considerations

1. **Authentication**: JWT-based authentication with token expiration
2. **Authorization**: Route-level protection with `@jwt_required` decorator
3. **Data Validation**: Input validation on both frontend (Zod) and backend
4. **CORS**: Configured for specific origins in production
5. **API Keys**: Environment variables for sensitive configuration
6. **Password Security**: Bcrypt hashing for password storage
7. **Session Management**: Automatic logout on token expiration

## Key Features

### Persona Management

- **Manual Creation**: Detailed form with 50+ attributes
- **AI Generation**: Bulk creation with customizable parameters
- **Persona Profiles**: Comprehensive view with attitudinal profile methodology
- **Folder Organization**: Group personas for easy management
- **Export/Import**: Download personas for backup or sharing

### Focus Group Sessions

- **Discussion Guide Editor**: Structured session planning
- **AI Moderation**: Autonomous or semi-autonomous modes
- **Real-time Participation**: Live conversation with AI personas
- **Theme Extraction**: Automatic identification of key insights
- **Note Taking**: Time-stamped notes linked to messages
- **Analytics Dashboard**: Visual representation of participation and sentiment

### Data Analysis

- **Export Options**: PDF reports, CSV data, JSON backups
- **Theme Management**: Manual and AI-generated theme tracking
- **Conversation History**: Full transcript with highlighting
- **Reasoning Transparency**: View AI decision-making process

## Best Practices for Development Team

1. **Code Organization**: Follow the existing pattern of separating concerns (components, services, types)
2. **Type Safety**: Maintain TypeScript types for all data structures
3. **Error Handling**: Use try-catch blocks with user-friendly toast notifications
4. **API Consistency**: Follow RESTful conventions for new endpoints
5. **Component Reusability**: Utilize shadcn-ui components and create custom wrappers
6. **State Management**: Keep state as local as possible, lift only when necessary
7. **Performance**: Implement pagination for large datasets
8. **Testing**: Add unit tests for critical business logic
9. **Documentation**: Update API documentation when adding new endpoints
10. **Security**: Always validate user input and sanitize data

## Future Enhancements

Based on the codebase analysis, potential areas for enhancement include:

1. **Real-time Updates**: WebSocket integration for live session updates
2. **Advanced Analytics**: More detailed sentiment analysis and reporting
3. **Multi-language Support**: Internationalization for global research
4. **Team Collaboration**: Multiple users per focus group session
5. **Template Library**: Pre-built discussion guides and persona archetypes
6. **API Rate Limiting**: Implement rate limiting for AI endpoints
7. **Caching Layer**: Redis for frequently accessed data
8. **Audit Logging**: Track all user actions for compliance
9. **Backup System**: Automated database backups
10. **Performance Monitoring**: Integration with monitoring tools
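
As an illustration of the API rate limiting enhancement listed above, the following is a minimal sketch of a token-bucket limiter that a backend route could consult before calling the LLM. It is not taken from the Semblance codebase: the `TokenBucket` class, its parameters, and the idea of keeping one bucket per user are assumptions made for this example, and a production setup with several worker processes would likely need shared storage rather than in-process state.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical in-memory token bucket (illustrative only, not project code)."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity                    # maximum burst size
        self.refill_per_second = refill_per_second  # sustained request rate
        self.clock = clock                          # injectable for testing
        self.tokens = float(capacity)               # start with a full bucket
        self.updated_at = clock()

    def allow(self) -> bool:
        """Consume one token if available; False means the request should be rejected."""
        now = self.clock()
        elapsed = now - self.updated_at
        self.updated_at = now
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A route handler for an AI endpoint could keep one bucket per authenticated user and return HTTP 429 whenever `allow()` is false. The injectable `clock` exists only so the refill behaviour can be unit tested without sleeping.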