vault backup: 2026-05-18 19:18:46

Vadym Samoilenko 2026-05-18 19:18:46 +01:00
parent 9bc03db45e
commit 87034bf055
5 changed files with 564 additions and 1 deletions


@ -17,6 +17,6 @
"repelStrength": 10,
"linkStrength": 1,
"linkDistance": 250,
-"scale": 0.29109721960434015,
+"scale": 0.7932897629230251,
"close": true
}


@ -0,0 +1,180 @@
---
auto_generated: true
manual_updated_at: 2026-05-18
modified: 2026-05-18
---
# Semblance Synthetic Society — Developer Manual
## Architecture Overview
Semblance is a full-stack web application with a React frontend and a Node.js backend, communicating with a MongoDB database and external LLM providers. The application is containerized using Docker and Docker Compose for development and production environments.
### High-Level Diagram
```
[Frontend (React/TS)] <--> [Backend (Node/Express)] <--> [MongoDB]
^ ^
| |
[LLM Providers] [Auth Middleware]
```
### Tech Stack
- **Frontend:**
- React 18+ with TypeScript
- React Router DOM for routing
- TanStack Query for server state management
- Socket.IO Client for real-time WebSocket communication
- Sonner for toast notifications
- Azure MSAL React for Microsoft Authentication
- Tailwind CSS (implied by UI component structure)
- **Backend:**
- Node.js
- Express (implied by API structure)
- Mongoose (MongoDB ODM)
- Socket.IO Server
- JWT for local authentication
- **Infrastructure:**
- Docker & Docker Compose
- MongoDB 7
## Local Setup
### Prerequisites
- Node.js 20+
- Docker and Docker Compose
- npm or yarn
### Installation Steps
1. **Clone the Repository**
```bash
git clone <repo-url>
cd semblance
```
2. **Configure Environment Variables**
- Copy `backend/.env.example` to `backend/.env`:
```bash
cp backend/.env.example backend/.env
```
- Edit `backend/.env` to set:
- `MONGO_URI`: Connection string to MongoDB (defaults to `mongodb://mongo:27017/semblance_db` in Docker).
- `JWT_SECRET`: A strong secret for JWT token signing.
- `GEMINI_API_KEY` and `OPENAI_API_KEY`: API keys for Google Gemini and OpenAI.
- `MSAL_CLIENT_ID` and `MSAL_TENANT_ID`: Client ID and tenant ID for Microsoft Authentication.
3. **Start Infrastructure**
```bash
docker-compose up -d mongo
```
This starts the MongoDB service with persistent volumes.
4. **Build and Start Backend**
```bash
docker-compose up --build backend
```
The backend will connect to MongoDB and start on port `5137`.
5. **Build and Start Frontend**
```bash
docker-compose --profile build up --build frontend
```
The frontend build writes its output to `./backend/dist-out` (or your configured dist directory). For local development with hot reload, run the frontend locally with `npm run dev` instead of using the Docker build profile.
6. **Access the Application**
- Frontend: `http://localhost:5173` (if running locally) or via the configured proxy.
- Backend API: `http://localhost:5137`.
- MongoDB: `127.0.0.1:27017`.
## Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| `MONGO_URI` | MongoDB connection string | Yes |
| `JWT_SECRET` | Secret key for JWT signing | Yes |
| `GEMINI_API_KEY` | Google Gemini API key | Yes |
| `OPENAI_API_KEY` | OpenAI API key | Yes |
| `MSAL_CLIENT_ID` | Azure AD App Client ID | Yes |
| `MSAL_TENANT_ID` | Azure AD Tenant ID | Yes |
| `NODE_ENV` | Environment (development/production) | No |
## Key Services & Entry Points
### Frontend Entry Points
- **`src/main.tsx`**: Root component initialization.
- **`src/App.tsx`**: Defines routing structure, providers (Auth, WebSocket, QueryClient), and route definitions.
- **Routes:**
- `/`: Index page.
- `/login`: Authentication page.
- `/synthetic-users`: Persona management.
- `/focus-groups`: Session management.
- `/dashboard`: Overview analytics.
- `/admin`: Admin panel (protected).
- **`src/contexts/`**:
- `AuthContext.tsx`: Handles local login and Microsoft SSO state. Uses `localStorage` for token persistence.
- `NavigationContext.tsx`: Manages app state for navigation history and focus group context, persisted in `localStorage`.
- `WebSocketContextNew.tsx`: Initializes a singleton Socket.IO instance. Uses a module-level `socketInitialized` flag to prevent re-initialization.
### Backend Entry Points
- **`backend/`**: Contains the Express server, routes, and controllers.
- **Services:**
- `llm`: Handles interactions with Gemini and OpenAI models.
- `aiRunner`: Orchestrates autonomous focus group logic.
- `themeExtractor`: Processes session transcripts for key themes.
### WebSocket Implementation
- **Service:** `src/services/websocketServiceNew.ts` (referenced in context).
- **Pattern:** Singleton pattern via `socketInitialized` flag.
- **Authentication:** The socket connection is authenticated with a JWT, read either from the `auth_token` entry in `localStorage` or from `AuthContext`.
## API Reference
### Authentication Endpoints
- `POST /api/auth/login`: Local login. Expects `{ username, password }`. Returns JWT.
- `POST /api/auth/microsoft`: Microsoft SSO callback. Expects ID token. Returns JWT and user details.
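
As a rough illustration of the local login call, the sketch below builds (but does not send) the request using only the Python standard library. The base URL reuses the backend port from Local Setup. The helper name `build_login_request` is hypothetical, the JSON content type is an assumption, and the shape of the response body (including the name of the field that carries the JWT) is not documented here, so check the backend before relying on it.

```python
import json
import urllib.request


def build_login_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a POST request for /api/auth/login with a JSON body."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/auth/login",
        data=body,
        # Assumption: the backend expects a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_login_request("http://localhost:5137", "alice", "example-password")
    print(request.get_method(), request.full_url)
    # To send it against a running backend: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the call easy to inspect or unit-test without a running server.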
### Persona Endpoints
- `GET /api/personas`: List all personas.
- `POST /api/personas`: Create a new persona.
- `PUT /api/personas/:id`: Update a persona.
- `GET /api/personas/:id`: Get persona details.
### Focus Group Endpoints
- `POST /api/focus-groups`: Create a session.
- `GET /api/focus-groups/:id/stream`: WebSocket events for session data.
### Admin Endpoints
- `GET /api/admin/users`: List users.
- `DELETE /api/admin/users/:id`: Remove a user.
## Deployment
### Production Build
1. Ensure `MONGO_URI` points to the production MongoDB instance.
2. Set `NODE_ENV=production`.
3. Build the frontend using the Docker compose build profile:
```bash
docker-compose --profile build up --build frontend
```
4. Deploy the backend container and frontend static assets to your web server (e.g., Nginx) or use a PaaS.
### Docker Compose Notes
- The `frontend` service in `docker-compose.yml` is configured for **build** only. It copies the built assets to `/app/dist-out`. The backend should serve these static files in production.
- Volumes are used for MongoDB data persistence and backend uploads.
## Known Gotchas
1. **WebSocket Singleton Issue:**
`WebSocketContextNew.tsx` uses a module-level `socketInitialized` flag. If the app is hot-reloaded in development, the flag may not reset, causing connection issues. In production, this is generally safe, but ensure server-side disconnects are handled if the client navigates away.
2. **Navigation State Persistence:**
The `NavigationContext` persists state to `localStorage`. If a user switches browsers or clears storage, navigation history (like "Previous Route") will be lost. This may cause back-button issues if not handled gracefully in UI components.
3. **Microsoft SSO Redirect:**
The MSAL `handleRedirectPromise` is called in `AuthContext` on mount. Ensure the redirect URI is correctly registered in Azure AD and matches the application's configuration.
4. **Mention Parsing Regex:**
The `parseMentions` function uses a specific regex for `@mentions`. Be aware that complex names with special characters or punctuation may not be parsed correctly. The regex stops at conjunctions (`and`, `or`) and non-word boundaries.
5. **Port Binding:**
Backend and MongoDB are bound to `127.0.0.1` in `docker-compose.yml`. This ensures they are not exposed to the public internet, but requires port forwarding or a reverse proxy for remote access.


@ -0,0 +1,100 @@
---
auto_generated: true
manual_updated_at: 2026-05-18
modified: 2026-05-18
---
# Semblance Synthetic Society — User Manual
## What This Tool Does
Semblance is an AI-powered platform designed to simulate consumer behavior through synthetic personas. It allows researchers, marketers, and product teams to create realistic AI-driven user profiles and run autonomous or moderated focus group sessions. The platform extracts actionable insights, themes, and analytics from these simulated conversations in real time.
### Key Capabilities
- **Synthetic Persona Management:** Create, edit, and organize AI-generated or manually built consumer profiles.
- **Autonomous Focus Groups:** Run self-driving conversations where AI personas interact with each other based on research objectives.
- **Moderated Sessions:** Act as a human moderator while AI assists with probes and follow-up questions.
- **Real-Time Insights:** View live transcripts, theme extraction, and sentiment analysis during sessions.
## Who Uses It
- **Market Researchers:** To test concepts, messages, or products without recruiting real participants.
- **Product Managers:** To validate feature ideas and user flows with synthetic user personas.
- **UX Researchers:** To run rapid, low-cost qualitative research rounds.
- **Marketing Teams:** To refine messaging and identify key themes from consumer perspectives.
## How to Access
1. Navigate to the application URL provided by your organization.
2. Log in using one of the supported methods:
- **Local Login:** Enter your registered username and password.
- **Microsoft SSO:** Click the "Sign in with Microsoft" button to authenticate via your corporate Azure AD account.
3. Once authenticated, you will be directed to the Dashboard or the last viewed page.
## Main Workflows
### Workflow 1: Creating Synthetic Personas
1. **Navigate to Synthetic Users:**
- Click "Synthetic Users" in the left-hand navigation bar.
2. **Create a New Persona:**
- Click the "Create Persona" button.
- **Option A: AI Generation:**
- Select "Generate with AI."
- Provide a brief description of the target audience (e.g., "Millennial parents interested in organic skincare").
- Choose an LLM model (Default: Google Gemini 3 Pro; Options: OpenAI GPT-4.1, GPT-5.2).
- Click "Generate." The AI will populate demographic, psychographic, and behavioral attributes.
- **Option B: Manual Creation:**
- Select "Manual Creation."
- Fill in attributes such as name, age, occupation, interests, and behavioral traits manually.
3. **Refine and Save:**
- Adjust any AI-generated details if necessary.
- Save the persona.
4. **Organize Personas:**
- Use drag-and-drop to move personas into folders for better organization.
5. **Export Profiles:**
- Select one or more personas.
- Click "Export" and choose PDF or summary format to download the persona profiles.
### Workflow 2: Running a Focus Group Session
1. **Start a Session:**
- Navigate to "Focus Groups" and click "Create Session."
- Select the personas to participate.
2. **Configure Session Settings:**
- **Mode:** Choose between **Manual Moderation** (you guide the chat) or **Autonomous AI Mode** (AI drives the conversation).
- **Discussion Guide:** Optionally, ask the AI to generate a structured discussion guide based on your research objectives.
3. **Run the Session:**
- **Autonomous Mode:** Click "Start Session." Watch as AI personas converse in real time via WebSocket. The AI moderator may intervene with probes or follow-up questions.
- **Manual Mode:** Type questions or prompts to guide the personas. Use the AI Assistant panel for suggested follow-ups.
4. **Analyze Results:**
- During the session, observe real-time theme extraction and highlighting.
- After the session ends, review the session analytics, including summary generation, key themes, and sentiment scores.
### Workflow 3: Managing Your Account
1. **Check Usage:**
- Go to "My Usage" to view your token consumption and API limits.
2. **Admin Tasks (Admins Only):**
- Access the "Admin" panel to manage users, system settings, and monitor system health.
## FAQ
**Q: How accurate are the synthetic personas?**
A: Personas are generated by the LLM you select (Gemini, GPT-4.1, or GPT-5.2). You can refine them manually after generation to better fit your specific research needs.
**Q: Can I interrupt an autonomous session?**
A: Yes. Even in Autonomous Mode, if you have moderator privileges, you can type prompts to steer the conversation.
**Q: What data is stored?**
A: Persona data, session transcripts, and analytics are stored in the MongoDB database. Ensure you comply with your organization's data privacy policies when creating personas.
**Q: How do I export session insights?**
A: Session summaries and theme extractions are available post-session. You can often export these as reports or view them in the analytics dashboard.


@ -0,0 +1,176 @@
---
auto_generated: true
manual_updated_at: 2026-05-18
modified: 2026-05-18
---
# Developer Manual: Accessible Video Processing Platform
## Architecture Overview
The platform follows a microservices-oriented architecture deployed via Docker Compose, separated into frontend, API, and background workers.
### High-Level Diagram
```mermaid
graph TD
User[User Browser] -->|HTTPS/WS| Frontend["Frontend (React)"]
Frontend -->|REST API| API[FastAPI/Gunicorn]
Frontend -->|WebSocket| API
API -->|Task Publish| Redis[(Redis Broker)]
API -->|Cache/Session| Redis
API -->|Data Query| MongoDB[(MongoDB)]
Redis -->|Consume| Worker[Celery Workers]
Worker -->|Process| FFmpeg[FFmpeg Worker]
Worker -->|Process| TTS[TTS Worker]
Worker -->|Process| Whisper[Whisper Worker]
Worker -->|Write| MongoDB
```
## Tech Stack
- **Frontend**: React 18+, TypeScript, Tailwind CSS, React Query, MSAL (Microsoft Authentication).
- **Backend**: Python, FastAPI, Gunicorn, Celery.
- **Database**: MongoDB 7.0.
- **Broker/Cache**: Redis 7.
- **AI/ML**: Gemini 2.5 Pro (Gemini API), Whisper (Speech-to-Text), Google Cloud TTS, ElevenLabs.
- **Monitoring**: Sentry (Error/Performance Tracking).
- **Infrastructure**: Docker, Docker Compose.
## Local Setup
### Prerequisites
- Docker and Docker Compose.
- Python 3.10+.
- Node.js 18+.
- Google Cloud API Credentials (for Translate, TTS).
- Google AI Studio API Key (for Gemini).
- Microsoft Entra ID (Azure AD) App Registration (for MSAL).
### Step 1: Environment Variables
Create a `.env` file in the root directory based on `.env.example`.
```bash
# Database
MONGODB_DB=accessible_video
# Redis
REDIS_URL=redis://redis:6379/0
# AI Services
GEMINI_API_KEY=your_gemini_key
GOOGLE_CLOUD_PROJECT_ID=your_project_id
GOOGLE_APPLICATION_CREDENTIALS=./path/to/key.json
# Auth
MSAL_CLIENT_ID=your_azure_client_id
MSAL_TENANT_ID=your_azure_tenant_id
# Sentry
SENTRY_DSN=your_sentry_dsn
```
### Step 2: Start Infrastructure
```bash
docker-compose up -d mongodb redis
```
### Step 3: Backend Setup
```bash
cd backend
pip install -r requirements.txt
# Copy config if necessary
cp config/mongod.conf.example config/mongod.conf
```
### Step 4: Frontend Setup
```bash
cd frontend
npm install
# Ensure .env.development or .env.local is configured with VITE_API_URL
npm run dev
```
### Step 5: Start Workers
In separate terminal windows:
```bash
# Default Worker
celery -A celery_app worker -Q default,ingest,notify,render -c 4
# Dedicated Workers
celery -A celery_app worker -Q tts -c 8
celery -A celery_app worker -Q ffmpeg -c 1
celery -A celery_app worker -Q whisper -c 1
```
## Environment Variables
| Variable | Description | Required |
| :--- | :--- | :--- |
| `MONGODB_DB` | MongoDB database name | Yes |
| `REDIS_URL` | Redis connection string | Yes |
| `GEMINI_API_KEY` | API key for Gemini 2.5 Pro | Yes |
| `MSAL_CLIENT_ID` | Azure AD Client ID | Yes |
| `MSAL_TENANT_ID` | Azure AD Tenant ID | Yes |
| `SENTRY_DSN` | Sentry Data Source Name | Optional |
| `VITE_APP_ENV` | App environment (dev/prod) | Yes |
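
A small preflight check can catch a missing variable before any service starts. The following is a minimal sketch, assuming the "Required" column above is accurate. The names `REQUIRED_VARS` and `missing_vars` are hypothetical and not part of the codebase.

```python
import os
from typing import Mapping

# Variables marked "Yes" in the table above; SENTRY_DSN is optional and omitted.
REQUIRED_VARS = (
    "MONGODB_DB",
    "REDIS_URL",
    "GEMINI_API_KEY",
    "MSAL_CLIENT_ID",
    "MSAL_TENANT_ID",
    "VITE_APP_ENV",
)


def missing_vars(env: Mapping[str, str]) -> list[str]:
    """Return required variable names that are unset or blank, in table order."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the check easy to test.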
## Key Services & Entry Points
### 1. Frontend (`frontend/src/main.tsx`)
- **Auth**: Initializes MSAL `PublicClientApplication` for Microsoft SSO.
- **Monitoring**: Initializes Sentry with PII filtering.
- **State**: Provides global contexts for Toasts, Notifications, and WebSocket connections.
### 2. WebSocket (`frontend/src/contexts/GlobalWebSocketContext.tsx`)
- Manages real-time job status updates.
- Uses `useJobStatusWebSocket` hook.
- Dispatches toast notifications based on `getStatusMessageConfig`.
### 3. Backend API (`backend/app/main.py` inferred)
- **FastAPI** handles REST requests.
- **Celery** offloads heavy processing:
- `ingest`: Video metadata extraction.
- `whisper`: Speech-to-text transcription.
- `render`: TTS synthesis and FFmpeg video compositing.
- `notify`: WebSocket dispatching.
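
The queue split above could be expressed as a routing table. The sketch below is illustrative only: the queue names come from this manual, but every task name is a hypothetical placeholder, and the `default` fallback assumes the project sets its default queue to `default` (as the worker command suggests). Celery's `task_routes` setting accepts a mapping of this shape, so in a real project the dict would typically be assigned to `app.conf.task_routes`.

```python
# Queue names are taken from this manual; task names are hypothetical.
TASK_ROUTES = {
    "tasks.extract_metadata": {"queue": "ingest"},
    "tasks.transcribe_audio": {"queue": "whisper"},
    "tasks.synthesize_speech": {"queue": "tts"},
    "tasks.composite_video": {"queue": "ffmpeg"},
    "tasks.render_outputs": {"queue": "render"},
    "tasks.push_status": {"queue": "notify"},
}


def queue_for(task_name: str) -> str:
    """Look up the queue for a task, falling back to the default queue."""
    return TASK_ROUTES.get(task_name, {}).get("queue", "default")


if __name__ == "__main__":
    for name in ("tasks.transcribe_audio", "tasks.cleanup"):
        print(name, "->", queue_for(name))
```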
### 4. Database Models (MongoDB)
- **Job**: Core entity tracking processing status, assets, and metadata.
- **User/Client/Reviewer**: RBAC models.
- **AuditLog**: Immutable log of reviewer actions.
## API Reference
*Note: Full OpenAPI spec available at `/docs` in running API.*
### Authentication
- **POST /auth/login**: Triggers MSAL redirect flow.
- **GET /auth/callback**: Handles MSAL token exchange.
### Jobs
- **POST /jobs**: Create a new accessibility job.
- Body: `{ video_id, target_languages, services: ["cc", "ad"] }`
- **GET /jobs**: List jobs with filters (status, owner).
- **GET /jobs/{job_id}**: Get job details and progress.
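
To make the `POST /jobs` body concrete, here is a minimal sketch that serializes it. The helper `build_job_payload` is hypothetical, and treating `cc` and `ad` as the only valid service codes is an assumption based on the example body above. The OpenAPI spec at `/docs` is the authoritative source.

```python
import json

# Service codes shown in the example body; whether others exist is not documented here.
KNOWN_SERVICES = {"cc", "ad"}


def build_job_payload(video_id: str, target_languages: list[str], services: list[str]) -> str:
    """Serialize a POST /jobs body, rejecting service codes not shown in the docs."""
    unknown = sorted(set(services) - KNOWN_SERVICES)
    if unknown:
        raise ValueError(f"Unknown service codes: {unknown}")
    return json.dumps(
        {"video_id": video_id, "target_languages": target_languages, "services": services}
    )


if __name__ == "__main__":
    print(build_job_payload("example-video-id", ["es", "fr"], ["cc", "ad"]))
```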
### Quality Control
- **GET /qc/list**: Get jobs awaiting review.
- **PUT /qc/{job_id}/approve**: Approve job.
- **PUT /qc/{job_id}/reject**: Reject job with comments.
- **PUT /qc/{job_id}/edit-vtt**: Update VTT content.
## Deployment
1. **Build Images**: Ensure Docker images are up to date.
2. **Production Env**: Set `VITE_APP_ENV=production` and `SENTRY_DSN`.
3. **Security**: Ensure `HTTPS` is enforced at the reverse proxy (Nginx/AWS ALB).
4. **Scale Workers**: Increase concurrency in `docker-compose.yml` or use Kubernetes for horizontal scaling.
## Known Gotchas
1. **FFmpeg Concurrency**: The `ffmpeg-worker` is configured with `concurrency=1`. Do not scale this worker beyond 1 instance per CPU core to avoid file locking issues.
2. **TTS Quotas**: Google Cloud TTS and ElevenLabs have rate limits. Monitor usage and implement backoff strategies in the TTS worker.
3. **WebSocket Reconnects**: The frontend handles auto-reconnects. Ensure the backend WebSocket server is stable; unexpected disconnections may require a manual page refresh if reconnect fails.
4. **Signed URLs**: Generated download links expire after 24 hours. If a user tries to download later, they must re-request the link from the API.
5. **MSAL Config**: Ensure `msalConfig.ts` matches the Azure AD redirect URIs exactly, including the `basename` (`/video-accessibility`) if deployed to a sub-path.
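
For the TTS quota gotcha, one common backoff strategy is capped exponential backoff with full jitter. The sketch below is not the TTS worker's actual code: `RateLimitError` is a hypothetical stand-in for whatever error the provider's client raises on a rate limit, and the delay values are placeholders to tune against real quotas.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # wait a random time up to the cap
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry logic can be tested without real delays. In a Celery task, the task's own retry mechanism may be a better fit than blocking the worker in a loop.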


@ -0,0 +1,107 @@
---
auto_generated: true
manual_updated_at: 2026-05-18
modified: 2026-05-18
---
# User Manual: Accessible Video Processing Platform
## What This Tool Does
The **Accessible Video Processing Platform** is an AI-powered solution designed to make video content inclusive and accessible to diverse audiences. It automates the complex workflow of generating closed captions, audio descriptions (with voiceovers), and multi-language translations for video content.
Instead of manually creating accessibility assets, users upload a video, and the platform uses advanced AI models (Gemini 2.5 Pro, Google Cloud TTS, Whisper) to generate high-quality, synchronized accessibility layers. The platform includes a robust quality control (QC) workflow where human reviewers verify and edit the generated content before final delivery.
## Who Uses It
This platform serves three primary user roles:
1. **Clients/Content Creators**: Users who upload videos and manage their accessibility projects. They initiate jobs, track progress, and download final accessible assets.
2. **Reviewers**: Linguists or accessibility experts who review generated captions and audio descriptions, make necessary edits, and approve the content for delivery.
3. **Administrators**: System managers who oversee user accounts, organizational settings, audit logs, and glossary management.
## How to Access
### Prerequisites
- A web browser (Chrome, Firefox, Edge, or Safari).
- Valid organizational credentials (Microsoft Account/Enterprise SSO).
### Login
1. Navigate to the application URL.
2. Click **Login**. You will be redirected to the Microsoft sign-in page.
3. Enter your organizational credentials to sign in.
4. Upon successful authentication, you will be redirected to the **Dashboard**.
## Main Workflows
### Workflow 1: Creating a New Accessibility Job (For Clients)
1. **Navigate to Upload**: From the Dashboard, click **New Job**.
2. **Upload Video**:
- Drag and drop your video file into the upload area or click to browse.
- The system validates the file format and size.
3. **Configure Settings**:
- **Source Language**: Select the original language of the video.
- **Target Languages**: Select one or more languages for translation (supports 50+ languages).
- **Services**: Check boxes for required services:
- *Closed Captions (CC)*
- *Audio Descriptions (AD)*
- *Multi-language Translation*
4. **Submit Job**: Click **Start Processing**.
- You will be taken to the **Jobs List** where the new job appears.
- A toast notification confirms the submission.
### Workflow 2: Tracking Job Progress
1. **Job List View**: View the status of all your jobs in real time.
2. **Real-Time Updates**:
- Use the **WebSocket** status bar to see live progress.
- Toast notifications will pop up when status changes occur (e.g., "Caption Generation Complete").
3. **Job Details**: Click on a specific job ID to view:
- Detailed progress bars for each AI service (Whisper, Gemini, TTS).
- Logs of the processing stages.
- Estimated time remaining.
### Workflow 3: Quality Control Review (For Reviewers)
1. **Access QC Queue**: Navigate to **QC List** from the admin menu.
2. **Review Item**: Click on a job marked as **Ready for Review**.
3. **Edit Captions**:
- Open the **VTT Editor**.
- Play the video and read the synchronized captions.
- Click on any caption line to edit the text directly.
- Save changes to update the VTT file.
4. **Review Audio Descriptions**:
- Listen to the generated TTS audio.
- If the tone is inappropriate or inaccurate, flag the section for re-generation or manual note.
5. **Approve/Reject**:
- Click **Approve** to send the assets for final delivery.
- Click **Reject** with a comment to send it back to AI processing for regeneration.
6. **Audit Trail**: All review actions are logged automatically in the **Audit Log** for compliance.
### Workflow 4: Downloading Final Assets
1. **Navigate to Downloads**: Go to the **Downloads** page.
2. **Select Completed Jobs**: Filter by status "Approved".
3. **Download Assets**:
- Click the download icon next to each asset:
- *VTT File*: For captions.
- *Audio File*: For audio descriptions.
- Assets are delivered via secure, time-limited signed URLs.
## FAQ
**Q: How long does processing take?**
A: Processing time depends on video length and the number of target languages. Typically, captions take 2-5 minutes per hour of video, while audio descriptions may take longer due to TTS synthesis.
**Q: Can I edit captions before they are processed?**
A: You can provide a **Glossary** in the settings to influence the AI's terminology. However, the primary editing window is during the Quality Control phase.
**Q: Why is my job stuck on "Processing"?**
A: Check the **Job Detail** logs. If it remains stuck for more than 10 minutes, try refreshing the page. If the issue persists, contact an Administrator to check the worker queues.
**Q: How are my files secured?**
A: Files are stored in Google Cloud Storage with signed URLs that expire after 24 hours. The platform also enforces strict Role-Based Access Control (RBAC).
**Q: Who can approve my work?**
A: Only users with the **Reviewer** or **Admin** role can approve jobs. Clients cannot self-approve their own submissions.