semblance_backup/semblance_app_documentation.md
2025-12-19 19:26:16 +00:00

Semblance Synthetic Society - Application Documentation

Table of Contents

  1. Application Overview
  2. Application Architecture
  3. User Activity Flow
  4. Data Model
  5. Technical Details and Specifications
  6. API Structure
  7. AI Integration
  8. Development and Deployment
  9. Security Considerations
  10. Key Features
  11. Best Practices for Development Team
  12. Future Enhancements

Application Overview

Semblance Synthetic Society is an AI-powered platform for creating and managing synthetic personas for focus groups and market research. It enables researchers to:

  • Create detailed synthetic personas with demographic profiles and personality traits
  • Organize personas into focus groups
  • Run AI-moderated focus group sessions with autonomous conversations
  • Analyze results with real-time theme extraction and reporting
  • Export comprehensive insights and recommendations

Application Architecture

High-Level Architecture

graph TB
    subgraph "Frontend - React/TypeScript"
        A[React App<br/>Vite + TypeScript] --> B[Components]
        B --> C[Pages]
        B --> D[UI Components<br/>shadcn-ui]
        A --> E[State Management<br/>React Context + Hooks]
        A --> F[API Client<br/>Axios]
    end
    
    subgraph "Backend - Python/Flask"
        G[Flask API] --> H[Routes]
        H --> I[Services]
        I --> J[Models]
        J --> K[(MongoDB)]
        I --> L[Google Gemini AI<br/>LLM Integration]
    end
    
    F -->|HTTP/REST| G
    
    subgraph "Infrastructure"
        M[Static Hosting<br/>Frontend Build]
        N[WSGI Server<br/>Backend API]
        O[MongoDB Instance]
    end

Technology Stack

Frontend

  • Framework: React 18.3.1 with TypeScript
  • Build Tool: Vite 5.4.1
  • Styling: Tailwind CSS 3.4.11 with shadcn-ui components
  • Routing: React Router DOM 6.26.2
  • State Management: React Context API + Custom Hooks
  • HTTP Client: Axios 1.6.2
  • Forms: React Hook Form 7.53.0 with Zod validation
  • Authentication: JWT with localStorage persistence

Backend

  • Framework: Flask (Python)
  • Database: MongoDB with PyMongo
  • Authentication: Flask-JWT-Extended
  • AI Integration: Google Generative AI (Gemini 2.5 Pro)
  • Server: Hypercorn (an ASGI server that can also serve WSGI apps such as Flask)
  • CORS: Flask-CORS for cross-origin requests

User Activity Flow

Main User Journey

flowchart TD
    A[Landing Page] --> B{Authenticated?}
    B -->|No| C[Login Page]
    B -->|Yes| D[Dashboard]
    C --> E[Enter Credentials]
    E --> F[JWT Authentication]
    F --> D
    
    D --> G[Synthetic Users]
    D --> H[Focus Groups]
    D --> I[Overview Stats]
    
    G --> J[Create Persona]
    G --> K[AI Recruiter]
    G --> L[View/Edit Personas]
    
    J --> M[Manual Creation Form]
    K --> N[Bulk Generation]
    
    H --> O[Create Focus Group]
    H --> P[View Groups]
    
    O --> Q[Select Participants]
    O --> R[Configure Settings]
    
    P --> S[Focus Group Session]
    S --> T[AI Moderated Discussion]
    S --> U[Real-time Analytics]
    S --> V[Export Results]

Focus Group Session Flow

sequenceDiagram
    participant U as User
    participant F as Frontend
    participant B as Backend
    participant AI as AI Service
    participant DB as MongoDB
    
    U->>F: Start Focus Group Session
    F->>B: Initialize Session
    B->>DB: Load Focus Group Data
    B->>DB: Load Participant Personas
    B-->>F: Return Session Data
    
    U->>F: Ask Question
    F->>B: Send Moderator Message
    B->>AI: Generate AI Response
    AI-->>B: AI Moderator Response
    B->>DB: Store Message
    B-->>F: Return Response
    
    loop Autonomous Mode
        B->>AI: Determine Next Action
        AI-->>B: Next Participant/Question
        B->>AI: Generate Participant Response
        AI-->>B: Persona Response
        B->>DB: Store Message
        B-->>F: Stream Response
    end
    
    U->>F: Request Themes
    F->>B: Extract Key Themes
    B->>AI: Analyze Conversation
    AI-->>B: Extracted Themes
    B->>DB: Store Themes
    B-->>F: Return Themes
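
The autonomous loop in the diagram above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the application's code: pick_next_speaker, run_autonomous_turns, the generate_reply callback, the "participant" type value, and the in-memory messages list are hypothetical stand-ins for the AI service and the MongoDB messages collection.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Iterator


@dataclass
class Message:
    # Field names loosely follow the MESSAGES collection in the data model.
    focus_group_id: str
    sender_id: str
    text: str
    type: str  # "participant" below is an assumed value
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def pick_next_speaker(participants: list[str], messages: list[Message]) -> str:
    """Hypothetical 'Determine Next Action': choose whoever has spoken least."""
    counts = {p: 0 for p in participants}
    for m in messages:
        if m.sender_id in counts:
            counts[m.sender_id] += 1
    # min() keeps list order on ties, so the earliest participant wins.
    return min(participants, key=lambda p: counts[p])


def run_autonomous_turns(
    focus_group_id: str,
    participants: list[str],
    messages: list[Message],
    generate_reply: Callable[[str, list[Message]], str],
    max_turns: int,
) -> Iterator[Message]:
    """Yield one participant message per turn, storing each before yielding."""
    for _ in range(max_turns):
        speaker = pick_next_speaker(participants, messages)
        text = generate_reply(speaker, messages)  # stand-in for the LLM call
        message = Message(focus_group_id, speaker, text, "participant")
        messages.append(message)  # "Store Message"
        yield message             # "Stream Response"
```

Because the function is a generator, a caller can forward each message to the frontend as soon as it is produced, which mirrors the "Stream Response" step. In the real system the speaker choice is delegated to the AI service rather than a fixed heuristic.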

Data Model

MongoDB Collections

erDiagram
    USERS ||--o{ PERSONAS : creates
    USERS ||--o{ FOCUS_GROUPS : creates
    FOCUS_GROUPS ||--o{ PERSONAS : includes
    FOCUS_GROUPS ||--o{ MESSAGES : contains
    FOCUS_GROUPS ||--o{ THEMES : generates
    FOCUS_GROUPS ||--o{ NOTES : has
    FOCUS_GROUPS ||--o{ REASONING : tracks
    
    USERS {
        ObjectId _id PK
        string username
        string email
        string password_hash
        string role
        datetime created_at
    }
    
    PERSONAS {
        ObjectId _id PK
        string name
        string age
        string gender
        string occupation
        string location
        number techSavviness
        string personality
        object oceanTraits
        object thinkFeelDo
        array goals
        array frustrations
        array motivations
        string created_by FK
        datetime created_at
    }
    
    FOCUS_GROUPS {
        ObjectId _id PK
        string name
        string description
        string objective
        array participants FK
        object discussionGuide
        string status
        string created_by FK
        datetime created_at
    }
    
    MESSAGES {
        ObjectId _id PK
        string focus_group_id FK
        string text
        string type
        string senderId
        boolean highlighted
        datetime created_at
    }
    
    THEMES {
        ObjectId _id PK
        string focus_group_id FK
        string title
        string description
        array quotes
        string source
        datetime created_at
    }

Key TypeScript Interfaces

// Persona Interface
interface Persona {
  id: string;
  _id?: string;
  name: string;
  age: string;
  gender: string;
  occupation: string;
  location: string;
  techSavviness: number;
  personality: string;
  oceanTraits?: {
    openness: number;
    conscientiousness: number;
    extraversion: number;
    agreeableness: number;
    neuroticism: number;
  };
  thinkFeelDo?: {
    thinks: string[];
    feels: string[];
    does: string[];
  };
  goals?: string[];
  frustrations?: string[];
  motivations?: string[];
  scenarios?: string[];
  // Additional fields...
}

// Focus Group Interface
interface FocusGroup {
  _id: string;
  name: string;
  description: string;
  objective: string;
  participants: string[];
  discussionGuide?: DiscussionGuide;
  status: string;
  created_at: string;
  created_by: string;
}

// Discussion Guide Structure
interface DiscussionGuide {
  introduction: string;
  sections: {
    id: string;
    title: string;
    duration: number;
    items: {
      id: string;
      type: 'question' | 'activity' | 'probe';
      content: string;
      notes?: string;
    }[];
  }[];
  conclusion: string;
}
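
To show how the DiscussionGuide structure might be consumed on the Python side, the sketch below walks a guide dictionary and summarizes it. The summarize_guide helper and the sample data are hypothetical; the source does not state the unit of duration, so the total is reported without one.

```python
from collections import Counter

# Item types allowed by the DiscussionGuide interface above.
ALLOWED_ITEM_TYPES = {"question", "activity", "probe"}


def summarize_guide(guide: dict) -> dict:
    """Count sections and items and sum the section durations (hypothetical helper)."""
    sections = guide.get("sections", [])
    item_types = Counter()
    for section in sections:
        for item in section.get("items", []):
            if item["type"] not in ALLOWED_ITEM_TYPES:
                raise ValueError(f"unknown item type: {item['type']!r}")
            item_types[item["type"]] += 1
    return {
        "section_count": len(sections),
        "total_duration": sum(s["duration"] for s in sections),
        "items_by_type": dict(item_types),
    }


# Made-up sample data shaped like the DiscussionGuide interface.
example_guide = {
    "introduction": "Welcome and ground rules.",
    "sections": [
        {
            "id": "s1",
            "title": "Warm-up",
            "duration": 10,
            "items": [
                {"id": "i1", "type": "question", "content": "How do you shop today?"},
                {"id": "i2", "type": "probe", "content": "What makes that frustrating?"},
            ],
        },
        {
            "id": "s2",
            "title": "Concept review",
            "duration": 20,
            "items": [
                {"id": "i3", "type": "activity", "content": "Rank the three concepts."},
                {"id": "i4", "type": "question", "content": "Which would you pay for?"},
            ],
        },
    ],
    "conclusion": "Thanks and wrap-up.",
}

summary = summarize_guide(example_guide)
```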

Technical Details and Specifications

Authentication Flow

sequenceDiagram
    participant C as Client
    participant F as Frontend
    participant B as Backend
    participant DB as Database
    
    C->>F: Enter Credentials
    F->>B: POST /api/auth/login
    B->>DB: Verify User
    alt Valid Credentials
        B-->>F: JWT Token + User Data
        F->>F: Store in localStorage
        F-->>C: Redirect to Dashboard
    else Invalid Credentials
        B-->>F: 401 Unauthorized
        F-->>C: Show Error
    end
    
    Note over F: All API Requests
    F->>F: Add JWT to Headers
    F->>B: API Request with Bearer Token
    B->>B: Verify JWT
    alt Valid Token
        B-->>F: API Response
    else Invalid Token
        B-->>F: 401 Unauthorized
        F->>F: Clear localStorage
        F-->>C: Redirect to Login
    end

State Management

The application manages state at three levels:

  1. AuthContext: A React Context that holds the user authentication state and JWT token and exposes login/logout functions
  2. Local State: Component-level state for UI interactions
  3. API State: Server data is fetched with direct API calls; React Query is not currently used but could be integrated for server state management

API Structure

Authentication Endpoints

  • POST /api/auth/login - User login
  • POST /api/auth/register - User registration
  • GET /api/auth/me - Get current user profile

Persona Management

  • GET /api/personas - List all personas
  • POST /api/personas - Create new persona
  • GET /api/personas/:id - Get persona details
  • PUT /api/personas/:id - Update persona
  • DELETE /api/personas/:id - Delete persona

Focus Group Management

  • GET /api/focus-groups - List all focus groups
  • POST /api/focus-groups - Create new focus group
  • GET /api/focus-groups/:id - Get focus group details
  • PUT /api/focus-groups/:id - Update focus group
  • DELETE /api/focus-groups/:id - Delete focus group

AI Operations

  • POST /api/ai-personas/generate - Generate synthetic personas
  • POST /api/focus-group-ai/:id/start - Start AI moderation
  • POST /api/focus-group-ai/:id/stop - Stop AI moderation
  • POST /api/focus-group-ai/:id/message - Send message to AI moderator
  • GET /api/focus-group-ai/:id/status - Get moderator status
  • POST /api/focus-group-ai/:id/themes - Extract key themes
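
As a rough illustration of calling these endpoints from a script, the sketch below builds an authenticated request with the Python standard library. The build_request helper is hypothetical, and API_ROOT assumes the local backend address given under Local Development Setup; the request body shape is also an assumption.

```python
import json
import urllib.request

# Assumption: the local development address of the backend.
API_ROOT = "http://localhost:5137"


def build_request(method: str, path: str, token=None, body=None) -> urllib.request.Request:
    """Prepare (but do not send) a JSON request, attaching the JWT as a Bearer token."""
    headers = {"Accept": "application/json"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(API_ROOT + path, data=data, headers=headers, method=method)


# Example: create a persona (the field name is taken from the Persona model).
request = build_request("POST", "/api/personas", token="<jwt>", body={"name": "Ada"})
```

Passing the prepared object to urllib.request.urlopen(request) would send it. A 401 response corresponds to the expired or invalid token case described in the authentication flow.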

AI Integration

LLM Service Architecture

graph TD
    A[API Routes] --> B[Service Layer]
    B --> C[LLM Service]
    C --> D[Google Gemini API]
    
    B --> E[Prompt Templates]
    E --> F[persona-generation.md]
    E --> G[focus-group-response.md]
    E --> H[theme-extraction.md]
    E --> I[moderator-system.md]
    
    C --> J[Response Processing]
    J --> K[JSON Extraction]
    J --> L[Error Handling]
    J --> M[Rate Limiting]

Key AI Features

  1. Persona Generation: Uses Gemini to create detailed, realistic personas based on demographic parameters
  2. AI Moderation: Autonomous focus group moderation with context awareness
  3. Response Generation: Persona-specific responses based on personality profiles
  4. Theme Extraction: Real-time analysis of conversation to identify key themes
  5. Conversation Flow: AI determines next speakers and follow-up questions

Prompt Engineering

The system uses structured prompts stored in /backend/prompts/:

  • System prompts define AI behavior and constraints
  • Template prompts use variable substitution for dynamic content
  • Chain-of-thought prompts elicit step-by-step reasoning for complex decisions
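
Variable substitution in a template prompt can be sketched with Python's string.Template. The template text, the $placeholder syntax, and the render_prompt helper are assumptions for illustration; the actual files in /backend/prompts/ may use a different format.

```python
from string import Template

# Hypothetical template text standing in for a file such as focus-group-response.md.
PERSONA_RESPONSE_TEMPLATE = Template(
    "You are $name, a $age-year-old $occupation from $location.\n"
    "Personality: $personality\n"
    "Question from the moderator: $question\n"
    "Answer in character, in two or three sentences."
)


def render_prompt(template: Template, **values: str) -> str:
    """Fill every placeholder; substitute() raises KeyError if one is missing."""
    return template.substitute(**values)


prompt = render_prompt(
    PERSONA_RESPONSE_TEMPLATE,
    name="Maya",
    age="34",
    occupation="nurse",
    location="Leeds",
    personality="pragmatic and skeptical of new apps",
    question="How do you currently track your shifts?",
)
```

Using substitute() rather than safe_substitute() is a deliberate choice here: a missing persona field fails loudly instead of sending a prompt with an unfilled placeholder to the model.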

Development and Deployment

Local Development Setup

# Frontend
npm install
npm run dev  # Development server at http://localhost:5173

# Backend
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python run.py  # API server at http://localhost:5137

# MongoDB
# Ensure MongoDB is running on localhost:27017

Build and Deployment

# Frontend Build
npm run build  # Creates /dist folder

# Backend Deployment
cd backend
gunicorn -w 4 "app:create_app()"  # 4 worker processes; calls the create_app() factory in the app module

Environment Variables

Frontend (.env):

VITE_API_BASE_URL=/semblance_back/api

Backend:

GEMINI_API_KEY=your_api_key_here
MONGODB_URI=mongodb://localhost:27017/synthetic_society
JWT_SECRET_KEY=your_secret_key

Security Considerations

  1. Authentication: JWT-based authentication with token expiration
  2. Authorization: Route-level protection with @jwt_required decorator
  3. Data Validation: Input validation on both frontend (Zod) and backend
  4. CORS: Configured for specific origins in production
  5. API Keys: Environment variables for sensitive configuration
  6. Password Security: Bcrypt hashing for password storage
  7. Session Management: Automatic logout on token expiration

Key Features

Persona Management

  • Manual Creation: Detailed form with 50+ attributes
  • AI Generation: Bulk creation with customizable parameters
  • Persona Profiles: Comprehensive view with attitudinal profile methodology
  • Folder Organization: Group personas for easy management
  • Export/Import: Download personas for backup or sharing

Focus Group Sessions

  • Discussion Guide Editor: Structured session planning
  • AI Moderation: Autonomous or semi-autonomous modes
  • Real-time Participation: Live conversation with AI personas
  • Theme Extraction: Automatic identification of key insights
  • Note Taking: Time-stamped notes linked to messages
  • Analytics Dashboard: Visual representation of participation and sentiment

Data Analysis

  • Export Options: PDF reports, CSV data, JSON backups
  • Theme Management: Manual and AI-generated theme tracking
  • Conversation History: Full transcript with highlighting
  • Reasoning Transparency: View AI decision-making process

Best Practices for Development Team

  1. Code Organization: Follow the existing pattern of separating concerns (components, services, types)
  2. Type Safety: Maintain TypeScript types for all data structures
  3. Error Handling: Use try-catch blocks with user-friendly toast notifications
  4. API Consistency: Follow RESTful conventions for new endpoints
  5. Component Reusability: Utilize shadcn-ui components and create custom wrappers
  6. State Management: Keep state as local as possible, lift only when necessary
  7. Performance: Implement pagination for large datasets
  8. Testing: Add unit tests for critical business logic
  9. Documentation: Update API documentation when adding new endpoints
  10. Security: Always validate user input and sanitize data

Future Enhancements

Potential areas for enhancement include:

  1. Real-time Updates: WebSocket integration for live session updates
  2. Advanced Analytics: More detailed sentiment analysis and reporting
  3. Multi-language Support: Internationalization for global research
  4. Team Collaboration: Multiple users per focus group session
  5. Template Library: Pre-built discussion guides and persona archetypes
  6. API Rate Limiting: Implement rate limiting for AI endpoints
  7. Caching Layer: Redis for frequently accessed data
  8. Audit Logging: Track all user actions for compliance
  9. Backup System: Automated database backups
  10. Performance Monitoring: Integration with monitoring tools