vault backup: 2026-05-18 19:18:46
parent
9bc03db45e
commit
87034bf055
5 changed files with 564 additions and 1 deletions
2
.obsidian/graph.json
vendored
|
|
@ -17,6 +17,6 @@
|
|||
"repelStrength": 10,
|
||||
"linkStrength": 1,
|
||||
"linkDistance": 250,
|
||||
"scale": 0.29109721960434015,
|
||||
"scale": 0.7932897629230251,
|
||||
"close": true
|
||||
}
|
||||
180
01 Projects/semblance/DEVELOPER_MANUAL.md
Normal file
|
|
@ -0,0 +1,180 @@
|
|||
---
|
||||
auto_generated: true
|
||||
manual_updated_at: 2026-05-18
|
||||
modified: 2026-05-18
|
||||
---
|
||||
|
||||
# Semblance Synthetic Society — Developer Manual
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
Semblance is a full-stack web application with a React frontend and a Node.js backend, communicating with a MongoDB database and external LLM providers. The application is containerized using Docker and Docker Compose for development and production environments.
|
||||
|
||||
### High-Level Diagram
|
||||
```
|
||||
[Frontend (React/TS)] <--> [Backend (Node/Express)] <--> [MongoDB]
|
||||
^ ^
|
||||
| |
|
||||
[LLM Providers] [Auth Middleware]
|
||||
```
|
||||
|
||||
### Tech Stack
|
||||
- **Frontend:**
|
||||
- React 18+ with TypeScript
|
||||
- React Router DOM for routing
|
||||
- TanStack Query for server state management
|
||||
- Socket.IO Client for real-time WebSocket communication
|
||||
- Sonner for toast notifications
|
||||
- Azure MSAL React for Microsoft Authentication
|
||||
- Tailwind CSS (implied by UI component structure)
|
||||
- **Backend:**
|
||||
- Node.js
|
||||
- Express (implied by API structure)
|
||||
- Mongoose (MongoDB ODM)
|
||||
- Socket.IO Server
|
||||
- JWT for local authentication
|
||||
- **Infrastructure:**
|
||||
- Docker & Docker Compose
|
||||
- MongoDB 7
|
||||
|
||||
## Local Setup
|
||||
|
||||
### Prerequisites
|
||||
- Node.js 20+
|
||||
- Docker and Docker Compose
|
||||
- npm or yarn
|
||||
|
||||
### Installation Steps
|
||||
|
||||
1. **Clone the Repository**
|
||||
```bash
git clone <repo-url>
cd semblance
```
|
||||
|
||||
2. **Configure Environment Variables**
|
||||
- Copy `backend/.env.example` to `backend/.env`:
|
||||
```bash
cp backend/.env.example backend/.env
```
|
||||
- Edit `backend/.env` to set:
|
||||
- `MONGO_URI`: Connection string to MongoDB (defaults to `mongodb://mongo:27017/semblance_db` in Docker).
|
||||
- `JWT_SECRET`: A strong secret for JWT token signing.
|
||||
- `GEMINI_API_KEY` and `OPENAI_API_KEY`: API keys for Google Gemini and OpenAI.
|
||||
- `MSAL_CLIENT_ID` and `MSAL_TENANT_ID`: Client ID and tenant ID for Microsoft Authentication.
|
||||
|
||||
3. **Start Infrastructure**
|
||||
```bash
docker-compose up -d mongo
```
|
||||
This starts the MongoDB service with persistent volumes.
|
||||
|
||||
4. **Build and Start Backend**
|
||||
```bash
docker-compose up --build backend
```
|
||||
The backend will connect to MongoDB and start on port `5137`.
|
||||
|
||||
5. **Build and Start Frontend**
|
||||
```bash
docker-compose --profile build up --build frontend
```
|
||||
The frontend build writes its output to `./backend/dist-out` (or your configured dist directory). For local development with hot reload, run the frontend locally with `npm run dev` instead of using the Docker build profile.
|
||||
|
||||
6. **Access the Application**
|
||||
- Frontend: `http://localhost:5173` (if running locally) or via the configured proxy.
|
||||
- Backend API: `http://localhost:5137`.
|
||||
- MongoDB: `127.0.0.1:27017`.
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Variable | Description | Required |
|----------|-------------|----------|
| `MONGO_URI` | MongoDB connection string | Yes |
| `JWT_SECRET` | Secret key for JWT signing | Yes |
| `GEMINI_API_KEY` | Google Gemini API key | Yes |
| `OPENAI_API_KEY` | OpenAI API key | Yes |
| `MSAL_CLIENT_ID` | Azure AD App Client ID | Yes |
| `MSAL_TENANT_ID` | Azure AD Tenant ID | Yes |
| `NODE_ENV` | Environment (development/production) | No |
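A missing required variable tends to surface only when the backend first needs it, so a pre-flight check can save a debugging round. The following is a minimal sketch of such a check, assuming the required names are exactly those marked "Yes" in the table; the helper `missing_required` is hypothetical and not part of the repository. The same check could equally be written in the Node backend's startup code.

```python
import os

# Names marked "Required: Yes" in the Environment Variables table.
REQUIRED_VARS = (
    "MONGO_URI",
    "JWT_SECRET",
    "GEMINI_API_KEY",
    "OPENAI_API_KEY",
    "MSAL_CLIENT_ID",
    "MSAL_TENANT_ID",
)


def missing_required(env):
    """Return the required variable names that are unset or blank in `env`.

    `env` is any mapping of names to string values, for example os.environ.
    """
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_required(os.environ)` before starting the backend returns an empty list when everything is set, and otherwise the names that still need a value in `backend/.env`.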
|
||||
|
||||
## Key Services & Entry Points
|
||||
|
||||
### Frontend Entry Points
|
||||
- **`src/main.tsx`**: Root component initialization.
|
||||
- **`src/App.tsx`**: Defines routing structure, providers (Auth, WebSocket, QueryClient), and route definitions.
|
||||
- **Routes:**
|
||||
- `/`: Index page.
|
||||
- `/login`: Authentication page.
|
||||
- `/synthetic-users`: Persona management.
|
||||
- `/focus-groups`: Session management.
|
||||
- `/dashboard`: Overview analytics.
|
||||
- `/admin`: Admin panel (protected).
|
||||
- **`src/contexts/`**:
|
||||
- `AuthContext.tsx`: Handles local login and Microsoft SSO state. Uses `localStorage` for token persistence.
|
||||
- `NavigationContext.tsx`: Manages app state for navigation history and focus group context, persisted in `localStorage`.
|
||||
- `WebSocketContextNew.tsx`: Initializes a singleton Socket.IO instance. Uses a module-level `socketInitialized` flag to prevent re-initialization.
|
||||
|
||||
### Backend Entry Points
|
||||
- **`backend/`**: Contains the Express server, routes, and controllers.
|
||||
- **Services:**
|
||||
- `llm`: Handles interactions with Gemini and OpenAI models.
|
||||
- `aiRunner`: Orchestrates autonomous focus group logic.
|
||||
- `themeExtractor`: Processes session transcripts for key themes.
|
||||
|
||||
### WebSocket Implementation
|
||||
- **Service:** `src/services/websocketServiceNew.ts` (referenced in context).
|
||||
- **Pattern:** Singleton pattern via `socketInitialized` flag.
|
||||
- **Authentication:** The socket connection is authenticated with a JWT (`auth_token`) read from `localStorage` or from `AuthContext`.
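The module-level flag pattern is language-agnostic, so it can be illustrated outside the frontend. The sketch below is written in Python with hypothetical names (`FakeSocket`, `get_socket`, `reset_socket`); the real code lives in `WebSocketContextNew.tsx` and uses the Socket.IO client, which is not reproduced here.

```python
# Module-level state: it survives for as long as the module stays loaded.
_socket_initialized = False
_socket = None


class FakeSocket:
    """Placeholder for a real client connection (hypothetical)."""

    def __init__(self, token):
        self.token = token


def get_socket(token):
    """Create the connection on the first call and reuse it afterwards."""
    global _socket_initialized, _socket
    if not _socket_initialized:
        _socket = FakeSocket(token)
        _socket_initialized = True
    return _socket


def reset_socket():
    """Clear the module-level state, e.g. on logout or between tests."""
    global _socket_initialized, _socket
    _socket_initialized = False
    _socket = None
```

Note that in this sketch a second call with a different token returns the existing connection unchanged. A guard of this kind therefore needs an explicit reset path whenever the token can change, which is one reason to pair the flag with a teardown function.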
|
||||
|
||||
## API Reference
|
||||
|
||||
### Authentication Endpoints
|
||||
- `POST /api/auth/login`: Local login. Expects `{ username, password }`. Returns JWT.
|
||||
- `POST /api/auth/microsoft`: Microsoft SSO callback. Expects ID token. Returns JWT and user details.
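As an illustration of the local login call, here is a minimal client-side sketch using only the Python standard library. It assumes the backend is reachable on `http://localhost:5137` as in the setup steps and that the endpoint accepts a JSON body; the name of the field that carries the JWT in the response is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request

API_BASE = "http://localhost:5137"  # backend port from the Local Setup section


def build_login_request(username, password):
    """Build (but do not send) a POST /api/auth/login request.

    The JSON content type is an assumption; the manual only states that the
    endpoint expects { username, password }.
    """
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + "/api/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running, the request can be sent with `urllib.request.urlopen(build_login_request(...))` and the response body decoded as JSON.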
|
||||
|
||||
### Persona Endpoints
|
||||
- `GET /api/personas`: List all personas.
|
||||
- `POST /api/personas`: Create a new persona.
|
||||
- `PUT /api/personas/:id`: Update a persona.
|
||||
- `GET /api/personas/:id`: Get persona details.
|
||||
|
||||
### Focus Group Endpoints
|
||||
- `POST /api/focus-groups`: Create a session.
|
||||
- `GET /api/focus-groups/:id/stream`: Delivers session data for the given focus group as WebSocket events.
|
||||
|
||||
### Admin Endpoints
|
||||
- `GET /api/admin/users`: List users.
|
||||
- `DELETE /api/admin/users/:id`: Remove a user.
|
||||
|
||||
## Deployment
|
||||
|
||||
### Production Build
|
||||
1. Ensure `MONGO_URI` points to the production MongoDB instance.
|
||||
2. Set `NODE_ENV=production`.
|
||||
3. Build the frontend using the Docker compose build profile:
|
||||
```bash
docker-compose --profile build up --build frontend
```
|
||||
4. Deploy the backend container and frontend static assets to your web server (e.g., Nginx) or use a PaaS.
|
||||
|
||||
### Docker Compose Notes
|
||||
- The `frontend` service in `docker-compose.yml` is configured for **build** only. It copies the built assets to `/app/dist-out`. The backend should serve these static files in production.
|
||||
- Volumes are used for MongoDB data persistence and backend uploads.
|
||||
|
||||
## Known Gotchas
|
||||
|
||||
1. **WebSocket Singleton Issue:**
|
||||
The `WebSocketContextNew.tsx` uses a module-level `socketInitialized` flag. If the app is hot-reloaded in development, the flag may not reset, causing connection issues. In production, this is generally safe, but ensure server-side disconnects are handled if the client navigates away.
|
||||
|
||||
2. **Navigation State Persistence:**
|
||||
The `NavigationContext` persists state to `localStorage`. If a user switches browsers or clears storage, navigation history (like "Previous Route") will be lost. This may cause back-button issues if not handled gracefully in UI components.
|
||||
|
||||
3. **Microsoft SSO Redirect:**
|
||||
The MSAL `handleRedirectPromise` is called in `AuthContext` on mount. Ensure the redirect URI is correctly registered in Azure AD and matches the application's configuration.
|
||||
|
||||
4. **Mention Parsing Regex:**
|
||||
The `parseMentions` function uses a specific regex for `@mentions`. Be aware that complex names with special characters or punctuation may not be parsed correctly. The regex stops at conjunctions (`and`, `or`) and non-word boundaries.
|
||||
|
||||
5. **Port Binding:**
|
||||
Backend and MongoDB are bound to `127.0.0.1` in `docker-compose.yml`. This ensures they are not exposed to the public internet, but requires port forwarding or a reverse proxy for remote access.
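To make gotcha 4 concrete, the sketch below shows how a mention regex that stops at `and`/`or` and at non-word characters behaves. The pattern is hypothetical and written in Python for illustration; it is not the regex used by `parseMentions`, which is not shown in this manual.

```python
import re

# Hypothetical pattern: "@", one word, then further space-separated words
# as long as the next word is not the conjunction "and" or "or".
MENTION_RE = re.compile(r"@(\w+(?: (?!(?:and|or)\b)\w+)*)")


def parse_mentions(text):
    """Return the names captured after each "@" in `text`."""
    return MENTION_RE.findall(text)


# Stops at the conjunction and at punctuation:
#   parse_mentions("Thanks @Jane Doe and @Bob.")  -> ["Jane Doe", "Bob"]
# Truncates at an apostrophe, because "'" is not a word character:
#   parse_mentions("Ask @O'Neil, please")         -> ["O"]
```

A pattern like this also swallows ordinary words that follow a name, so `"@Bob what do you think?"` yields `"Bob what do you think"`. Matching candidates against the list of known persona names is one way to avoid both failure modes.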
|
||||
100
01 Projects/semblance/USER_MANUAL.md
Normal file
|
|
@ -0,0 +1,100 @@
|
|||
---
|
||||
auto_generated: true
|
||||
manual_updated_at: 2026-05-18
|
||||
modified: 2026-05-18
|
||||
---
|
||||
|
||||
# Semblance Synthetic Society — User Manual
|
||||
|
||||
## What This Tool Does
|
||||
|
||||
Semblance is an AI-powered platform designed to simulate consumer behavior through synthetic personas. It allows researchers, marketers, and product teams to create realistic AI-driven user profiles and run autonomous or moderated focus group sessions. The platform extracts actionable insights, themes, and analytics from these simulated conversations in real time.
|
||||
|
||||
### Key Capabilities
|
||||
- **Synthetic Persona Management:** Create, edit, and organize AI-generated or manually built consumer profiles.
|
||||
- **Autonomous Focus Groups:** Run self-driving conversations where AI personas interact with each other based on research objectives.
|
||||
- **Moderated Sessions:** Act as a human moderator while AI assists with probes and follow-up questions.
|
||||
- **Real-Time Insights:** View live transcripts, theme extraction, and sentiment analysis during sessions.
|
||||
|
||||
## Who Uses It
|
||||
|
||||
- **Market Researchers:** To test concepts, messages, or products without recruiting real participants.
|
||||
- **Product Managers:** To validate feature ideas and user flows with synthetic user personas.
|
||||
- **UX Researchers:** To run rapid, low-cost qualitative research rounds.
|
||||
- **Marketing Teams:** To refine messaging and identify key themes from consumer perspectives.
|
||||
|
||||
## How to Access
|
||||
|
||||
1. Navigate to the application URL provided by your organization.
|
||||
2. Log in using one of the supported methods:
|
||||
- **Local Login:** Enter your registered username and password.
|
||||
- **Microsoft SSO:** Click the "Sign in with Microsoft" button to authenticate via your corporate Azure AD account.
|
||||
3. Once authenticated, you will be directed to the Dashboard or the last viewed page.
|
||||
|
||||
## Main Workflows
|
||||
|
||||
### Workflow 1: Creating Synthetic Personas
|
||||
|
||||
1. **Navigate to Synthetic Users:**
|
||||
- Click "Synthetic Users" in the left-hand navigation bar.
|
||||
|
||||
2. **Create a New Persona:**
|
||||
- Click the "Create Persona" button.
|
||||
- **Option A: AI Generation:**
|
||||
- Select "Generate with AI."
|
||||
- Provide a brief description of the target audience (e.g., "Millennial parents interested in organic skincare").
|
||||
- Choose an LLM model (Default: Google Gemini 3 Pro; Options: OpenAI GPT-4.1, GPT-5.2).
|
||||
- Click "Generate." The AI will populate demographic, psychographic, and behavioral attributes.
|
||||
- **Option B: Manual Creation:**
|
||||
- Select "Manual Creation."
|
||||
- Fill in attributes such as name, age, occupation, interests, and behavioral traits manually.
|
||||
|
||||
3. **Refine and Save:**
|
||||
- Adjust any AI-generated details if necessary.
|
||||
- Save the persona.
|
||||
|
||||
4. **Organize Personas:**
|
||||
- Use drag-and-drop to move personas into folders for better organization.
|
||||
|
||||
5. **Export Profiles:**
|
||||
- Select one or more personas.
|
||||
- Click "Export" and choose PDF or summary format to download the persona profiles.
|
||||
|
||||
### Workflow 2: Running a Focus Group Session
|
||||
|
||||
1. **Start a Session:**
|
||||
- Navigate to "Focus Groups" and click "Create Session."
|
||||
- Select the personas to participate.
|
||||
|
||||
2. **Configure Session Settings:**
|
||||
- **Mode:** Choose between **Manual Moderation** (you guide the chat) or **Autonomous AI Mode** (AI drives the conversation).
|
||||
- **Discussion Guide:** Optionally, ask the AI to generate a structured discussion guide based on your research objectives.
|
||||
|
||||
3. **Run the Session:**
|
||||
- **Autonomous Mode:** Click "Start Session." Watch as AI personas converse in real-time via WebSocket. The AI moderator may intervene with probes or follow-up questions.
|
||||
- **Manual Mode:** Type questions or prompts to guide the personas. Use the AI Assistant panel for suggested follow-ups.
|
||||
|
||||
4. **Analyze Results:**
|
||||
- During the session, observe real-time theme extraction and highlighting.
|
||||
- After the session ends, review the session analytics, including summary generation, key themes, and sentiment scores.
|
||||
|
||||
### Workflow 3: Managing Your Account
|
||||
|
||||
1. **Check Usage:**
|
||||
- Go to "My Usage" to view your token consumption and API limits.
|
||||
2. **Admin Tasks (Admins Only):**
|
||||
- Access the "Admin" panel to manage users, system settings, and monitor system health.
|
||||
|
||||
## FAQ
|
||||
|
||||
**Q: How accurate are the synthetic personas?**
|
||||
A: Personas are generated using multi-model LLMs (Gemini, GPT-4.1, GPT-5.2). You can refine them manually after generation to better fit your specific research needs.
|
||||
|
||||
**Q: Can I interrupt an autonomous session?**
|
||||
A: Yes. Even in Autonomous Mode, if you have moderator privileges, you can type prompts to steer the conversation.
|
||||
|
||||
**Q: What data is stored?**
|
||||
A: Persona data, session transcripts, and analytics are stored in the MongoDB database. Ensure you comply with your organization's data privacy policies when creating personas.
|
||||
|
||||
**Q: How do I export session insights?**
|
||||
A: Session summaries and theme extractions are available post-session. You can often export these as reports or view them in the analytics dashboard.
|
||||
176
01 Projects/video-accessibility-old/DEVELOPER_MANUAL.md
Normal file
|
|
@ -0,0 +1,176 @@
|
|||
---
|
||||
auto_generated: true
|
||||
manual_updated_at: 2026-05-18
|
||||
modified: 2026-05-18
|
||||
---
|
||||
|
||||
# Developer Manual: Accessible Video Processing Platform
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
The platform follows a microservices-oriented architecture deployed via Docker Compose, separated into frontend, API, and background workers.
|
||||
|
||||
### High-Level Diagram
|
||||
|
||||
```mermaid
graph TD
    User[User Browser] -->|HTTPS/WS| Frontend["Frontend (React)"]
    Frontend -->|REST API| API[FastAPI/Gunicorn]
    Frontend -->|WebSocket| API
    API -->|Task Publish| Redis[(Redis Broker)]
    API -->|Cache/Session| Redis
    API -->|Data Query| MongoDB[(MongoDB)]
    Redis -->|Consume| Worker[Celery Workers]
    Worker -->|Process| FFmpeg[FFmpeg Worker]
    Worker -->|Process| TTS[TTS Worker]
    Worker -->|Process| Whisper[Whisper Worker]
    Worker -->|Write| MongoDB
```
|
||||
|
||||
## Tech Stack
|
||||
|
||||
- **Frontend**: React 18+, TypeScript, Tailwind CSS, React Query, MSAL (Microsoft Authentication).
|
||||
- **Backend**: Python, FastAPI, Gunicorn, Celery.
|
||||
- **Database**: MongoDB 7.0.
|
||||
- **Broker/Cache**: Redis 7.
|
||||
- **AI/ML**: Gemini 2.5 Pro (Gemini API), Whisper (Speech-to-Text), Google Cloud TTS, ElevenLabs.
|
||||
- **Monitoring**: Sentry (Error/Performance Tracking).
|
||||
- **Infrastructure**: Docker, Docker Compose.
|
||||
|
||||
## Local Setup
|
||||
|
||||
### Prerequisites
|
||||
- Docker and Docker Compose.
|
||||
- Python 3.10+.
|
||||
- Node.js 18+.
|
||||
- Google Cloud API Credentials (for Translate, TTS).
|
||||
- Google AI Studio API Key (for Gemini).
|
||||
- Microsoft Entra ID (Azure AD) App Registration (for MSAL).
|
||||
|
||||
### Step 1: Environment Variables
|
||||
Create a `.env` file in the root directory based on `.env.example`.
|
||||
|
||||
```bash
# Database
MONGODB_DB=accessible_video

# Redis
REDIS_URL=redis://redis:6379/0

# AI Services
GEMINI_API_KEY=your_gemini_key
GOOGLE_CLOUD_PROJECT_ID=your_project_id
GOOGLE_APPLICATION_CREDENTIALS=./path/to/key.json

# Auth
MSAL_CLIENT_ID=your_azure_client_id
MSAL_TENANT_ID=your_azure_tenant_id

# Sentry
SENTRY_DSN=your_sentry_dsn
```
|
||||
|
||||
### Step 2: Start Infrastructure
|
||||
```bash
docker-compose up -d mongodb redis
```
|
||||
|
||||
### Step 3: Backend Setup
|
||||
```bash
cd backend
pip install -r requirements.txt
# Copy config if necessary
cp config/mongod.conf.example config/mongod.conf
```
|
||||
|
||||
### Step 4: Frontend Setup
|
||||
```bash
cd frontend
npm install
# Ensure .env.development or .env.local is configured with VITE_API_URL
npm run dev
```
|
||||
|
||||
### Step 5: Start Workers
|
||||
In separate terminal windows:
|
||||
```bash
# Default Worker
celery -A celery_app worker -Q default,ingest,notify,render -c 4

# Dedicated Workers
celery -A celery_app worker -Q tts -c 8
celery -A celery_app worker -Q ffmpeg -c 1
celery -A celery_app worker -Q whisper -c 1
```
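The four commands above follow one pattern, so the queue layout can be kept in a single table and the command lines derived from it. This is a minimal sketch with the queue names and concurrency values copied from the commands; the `WORKERS` table and `worker_command` helper are hypothetical and not part of the repository.

```python
# Queue sets and concurrency values copied from the worker commands above.
WORKERS = {
    "default": {"queues": ["default", "ingest", "notify", "render"], "concurrency": 4},
    "tts": {"queues": ["tts"], "concurrency": 8},
    "ffmpeg": {"queues": ["ffmpeg"], "concurrency": 1},
    "whisper": {"queues": ["whisper"], "concurrency": 1},
}


def worker_command(name, app="celery_app"):
    """Return the argv list that starts the named worker."""
    spec = WORKERS[name]
    return [
        "celery", "-A", app, "worker",
        "-Q", ",".join(spec["queues"]),
        "-c", str(spec["concurrency"]),
    ]
```

Each returned list can be passed to `subprocess.Popen` from a small launcher script, which keeps the concurrency settings in one place when they need tuning.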
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Variable | Description | Required |
| :--- | :--- | :--- |
| `MONGODB_DB` | MongoDB database name | Yes |
| `REDIS_URL` | Redis connection string | Yes |
| `GEMINI_API_KEY` | API key for Gemini 2.5 Pro | Yes |
| `MSAL_CLIENT_ID` | Azure AD Client ID | Yes |
| `MSAL_TENANT_ID` | Azure AD Tenant ID | Yes |
| `SENTRY_DSN` | Sentry Data Source Name | Optional |
| `VITE_APP_ENV` | App environment (dev/prod) | Yes |
|
||||
|
||||
## Key Services & Entry Points
|
||||
|
||||
### 1. Frontend (`frontend/src/main.tsx`)
|
||||
- **Auth**: Initializes MSAL `PublicClientApplication` for Microsoft SSO.
|
||||
- **Monitoring**: Initializes Sentry with PII filtering.
|
||||
- **State**: Provides global contexts for Toasts, Notifications, and WebSocket connections.
|
||||
|
||||
### 2. WebSocket (`frontend/src/contexts/GlobalWebSocketContext.tsx`)
|
||||
- Manages real-time job status updates.
|
||||
- Uses `useJobStatusWebSocket` hook.
|
||||
- Dispatches toast notifications based on `getStatusMessageConfig`.
|
||||
|
||||
### 3. Backend API (`backend/app/main.py`, inferred)
|
||||
- **FastAPI** handles REST requests.
|
||||
- **Celery** offloads heavy processing:
|
||||
- `ingest`: Video metadata extraction.
|
||||
- `whisper`: Speech-to-text transcription.
|
||||
- `render`: TTS synthesis and FFmpeg video compositing.
|
||||
- `notify`: WebSocket dispatching.
|
||||
|
||||
### 4. Database Models (MongoDB)
|
||||
- **Job**: Core entity tracking processing status, assets, and metadata.
|
||||
- **User/Client/Reviewer**: RBAC models.
|
||||
- **AuditLog**: Immutable log of reviewer actions.
|
||||
|
||||
## API Reference
|
||||
|
||||
*Note: Full OpenAPI spec available at `/docs` in running API.*
|
||||
|
||||
### Authentication
|
||||
- **POST /auth/login**: Triggers MSAL redirect flow.
|
||||
- **GET /auth/callback**: Handles MSAL token exchange.
|
||||
|
||||
### Jobs
|
||||
- **POST /jobs**: Create a new accessibility job.
|
||||
- Body: `{ video_id, target_languages, services: ["cc", "ad"] }`
|
||||
- **GET /jobs**: List jobs with filters (status, owner).
|
||||
- **GET /jobs/{job_id}**: Get job details and progress.
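For the `POST /jobs` body, a client can validate its input before sending. The sketch below assumes that `"cc"` and `"ad"` are the only valid service values (they are the only ones shown in the example body) and that at least one target language is required; both rules, the `build_job_payload` helper, and the sample language code are assumptions for illustration.

```python
import json

ALLOWED_SERVICES = {"cc", "ad"}  # the two values shown in the example body


def build_job_payload(video_id, target_languages, services):
    """Return the JSON string for a POST /jobs request body."""
    unknown = sorted(set(services) - ALLOWED_SERVICES)
    if unknown:
        raise ValueError("Unknown services: " + ", ".join(unknown))
    if not target_languages:
        raise ValueError("At least one target language is required")
    return json.dumps(
        {
            "video_id": video_id,
            "target_languages": list(target_languages),
            "services": list(services),
        }
    )
```

The authoritative schema is the OpenAPI spec at `/docs`; if it differs from these assumptions, the spec wins.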
|
||||
|
||||
### Quality Control
|
||||
- **GET /qc/list**: Get jobs awaiting review.
|
||||
- **PUT /qc/{job_id}/approve**: Approve job.
|
||||
- **PUT /qc/{job_id}/reject**: Reject job with comments.
|
||||
- **PUT /qc/{job_id}/edit-vtt**: Update VTT content.
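The request body for `edit-vtt` is not documented in this manual, so the following sketch covers only the local step a reviewer tool would perform before calling it: replacing the text of one cue in a WebVTT document. It assumes cues are separated by blank lines and identified by their `-->` timing line; the `replace_cue_text` helper is hypothetical.

```python
def replace_cue_text(vtt, cue_index, new_text):
    """Return `vtt` with the text of the cue at `cue_index` replaced.

    Cue indices are zero-based and count only blocks with a "-->" timing line,
    so the "WEBVTT" header block is skipped.
    """
    blocks = vtt.strip().split("\n\n")
    cue_positions = [i for i, block in enumerate(blocks) if "-->" in block]
    position = cue_positions[cue_index]
    lines = blocks[position].split("\n")
    timing_at = next(i for i, line in enumerate(lines) if "-->" in line)
    # Keep an optional cue identifier and the timing line, swap the text.
    blocks[position] = "\n".join(lines[: timing_at + 1] + [new_text])
    return "\n\n".join(blocks) + "\n"
```

The timing line is left untouched, so the edited captions stay synchronized with the video.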
|
||||
|
||||
## Deployment
|
||||
|
||||
1. **Build Images**: Ensure Docker images are up to date.
|
||||
2. **Production Env**: Set `VITE_APP_ENV=production` and `SENTRY_DSN`.
|
||||
3. **Security**: Ensure `HTTPS` is enforced at the reverse proxy (Nginx/AWS ALB).
|
||||
4. **Scale Workers**: Increase concurrency in `docker-compose.yml` or use Kubernetes for horizontal scaling.
|
||||
|
||||
## Known Gotchas
|
||||
|
||||
1. **FFmpeg Concurrency**: The `ffmpeg-worker` is configured with `concurrency=1`. Do not scale this worker beyond 1 instance per CPU core to avoid file locking issues.
|
||||
2. **TTS Quotas**: Google Cloud TTS and ElevenLabs have rate limits. Monitor usage and implement backoff strategies in the TTS worker.
|
||||
3. **WebSocket Reconnects**: The frontend handles auto-reconnects. Ensure the backend WebSocket server is stable; unexpected disconnections may require a manual page refresh if reconnect fails.
|
||||
4. **Signed URLs**: Generated download links expire after 24 hours. If a user tries to download later, they must re-request the link from the API.
|
||||
5. **MSAL Config**: Ensure `msalConfig.ts` matches the Azure AD redirect URIs exactly, including the `basename` (`/video-accessibility`) if deployed to a sub-path.
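For gotcha 2, one common backoff strategy is exponential backoff with full jitter. The sketch below assumes the TTS client call is wrapped so that quota responses raise a single exception type; `RateLimitError` and `call_with_backoff` are hypothetical names, and real Google Cloud TTS or ElevenLabs clients raise their own exception classes.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised by a TTS client wrapper on a quota response."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call `fn`, retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller (or Celery) handle it
            # The delay ceiling doubles each attempt: 1s, 2s, 4s, ... capped.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random time up to the ceiling so parallel
            # workers do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists so the function can be tested without waiting. With 8 concurrent TTS workers, the jitter matters more than the exact base delay, because it spreads retries out instead of sending them in bursts.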
|
||||
107
01 Projects/video-accessibility-old/USER_MANUAL.md
Normal file
|
|
@ -0,0 +1,107 @@
|
|||
---
|
||||
auto_generated: true
|
||||
manual_updated_at: 2026-05-18
|
||||
modified: 2026-05-18
|
||||
---
|
||||
|
||||
# User Manual: Accessible Video Processing Platform
|
||||
|
||||
## What This Tool Does
|
||||
|
||||
The **Accessible Video Processing Platform** is an AI-powered solution designed to make video content inclusive and accessible to diverse audiences. It automates the complex workflow of generating closed captions, audio descriptions (with voiceovers), and multi-language translations for video content.
|
||||
|
||||
Instead of manually creating accessibility assets, users upload a video, and the platform uses advanced AI models (Gemini 2.5 Pro, Google Cloud TTS, Whisper) to generate high-quality, synchronized accessibility layers. The platform includes a robust quality control (QC) workflow where human reviewers verify and edit the generated content before final delivery.
|
||||
|
||||
## Who Uses It
|
||||
|
||||
This platform serves three primary user roles:
|
||||
|
||||
1. **Clients/Content Creators**: Users who upload videos and manage their accessibility projects. They initiate jobs, track progress, and download final accessible assets.
|
||||
2. **Reviewers**: Linguists or accessibility experts who review generated captions and audio descriptions, make necessary edits, and approve the content for delivery.
|
||||
3. **Administrators**: System managers who oversee user accounts, organizational settings, audit logs, and glossary management.
|
||||
|
||||
## How to Access
|
||||
|
||||
### Prerequisites
|
||||
- A web browser (Chrome, Firefox, Edge, or Safari).
|
||||
- Valid organizational credentials (Microsoft Account/Enterprise SSO).
|
||||
|
||||
### Login
|
||||
1. Navigate to the application URL.
|
||||
2. Click **Login**. You will be redirected to the Microsoft Authentication Library (MSAL) login screen.
|
||||
3. Enter your organizational credentials to sign in.
|
||||
4. Upon successful authentication, you will be redirected to the **Dashboard**.
|
||||
|
||||
## Main Workflows
|
||||
|
||||
### Workflow 1: Creating a New Accessibility Job (For Clients)
|
||||
|
||||
1. **Navigate to Upload**: From the Dashboard, click **New Job**.
|
||||
2. **Upload Video**:
|
||||
- Drag and drop your video file into the upload area or click to browse.
|
||||
- The system validates the file format and size.
|
||||
3. **Configure Settings**:
|
||||
- **Source Language**: Select the original language of the video.
|
||||
- **Target Languages**: Select one or more languages for translation (supports 50+ languages).
|
||||
- **Services**: Check boxes for required services:
|
||||
- *Closed Captions (CC)*
|
||||
- *Audio Descriptions (AD)*
|
||||
- *Multi-language Translation*
|
||||
4. **Submit Job**: Click **Start Processing**.
|
||||
- You will be taken to the **Jobs List** where the new job appears.
|
||||
- A toast notification confirms the submission.
|
||||
|
||||
### Workflow 2: Tracking Job Progress
|
||||
|
||||
1. **Job List View**: View the status of all your jobs in real-time.
|
||||
2. **Real-Time Updates**:
|
||||
- Use the **WebSocket** status bar to see live progress.
|
||||
- Toast notifications will pop up when status changes occur (e.g., "Caption Generation Complete").
|
||||
3. **Job Details**: Click on a specific job ID to view:
|
||||
- Detailed progress bars for each AI service (Whisper, Gemini, TTS).
|
||||
- Logs of the processing stages.
|
||||
- Estimated time remaining.
|
||||
|
||||
### Workflow 3: Quality Control Review (For Reviewers)
|
||||
|
||||
1. **Access QC Queue**: Navigate to **QC List** from the admin menu.
|
||||
2. **Review Item**: Click on a job marked as **Ready for Review**.
|
||||
3. **Edit Captions**:
|
||||
- Open the **VTT Editor**.
|
||||
- Play the video and read the synchronized captions.
|
||||
- Click on any caption line to edit the text directly.
|
||||
- Save changes to update the VTT file.
|
||||
4. **Review Audio Descriptions**:
|
||||
- Listen to the generated TTS audio.
|
||||
- If the tone is inappropriate or the description is inaccurate, flag the section for regeneration or add a manual note.
|
||||
5. **Approve/Reject**:
|
||||
- Click **Approve** to send the assets for final delivery.
|
||||
- Click **Reject** with a comment to send it back to AI processing for regeneration.
|
||||
6. **Audit Trail**: All review actions are logged automatically in the **Audit Log** for compliance.
|
||||
|
||||
### Workflow 4: Downloading Final Assets
|
||||
|
||||
1. **Navigate to Downloads**: Go to the **Downloads** page.
|
||||
2. **Select Completed Jobs**: Filter by status "Approved".
|
||||
3. **Download Assets**:
|
||||
- Click the download icon next to each asset:
|
||||
- *VTT File*: For captions.
|
||||
- *Audio File*: For audio descriptions.
|
||||
- Assets are delivered via secure, time-limited signed URLs.
|
||||
|
||||
## FAQ
|
||||
|
||||
**Q: How long does processing take?**
|
||||
A: Processing time depends on video length and the number of target languages. Typically, captions take 2-5 minutes per hour of video, while audio descriptions may take longer due to TTS synthesis.
|
||||
|
||||
**Q: Can I edit captions before they are processed?**
|
||||
A: You can provide a **Glossary** in the settings to influence the AI's terminology. However, the primary editing window is during the Quality Control phase.
|
||||
|
||||
**Q: Why is my job stuck on "Processing"?**
|
||||
A: Check the **Job Detail** logs. If it remains stuck for more than 10 minutes, try refreshing the page. If the issue persists, contact an Administrator to check the worker queues.
|
||||
|
||||
**Q: How are my files secured?**
|
||||
A: Files are stored in Google Cloud Storage with signed URLs that expire after 24 hours. The platform also enforces strict Role-Based Access Control (RBAC).
|
||||
|
||||
**Q: Who can approve my work?**
|
||||
A: Only users with the **Reviewer** or **Admin** role can approve jobs. Clients cannot self-approve their own submissions.
|
||||