# Video Accessibility Platform - Technical Documentation *Generated: August 24, 2025* ## Table of Contents 1. [Application Architecture](#application-architecture) 2. [Data Models](#data-models) 3. [Detailed Features & Functions](#detailed-features--functions) 4. [Process Flows](#process-flows) 5. [User Journey](#user-journey) 6. [Technical Concepts](#technical-concepts) 7. [Security Architecture](#security-architecture) 8. [API Specifications](#api-specifications) --- ## Application Architecture ### High-Level Architecture ```mermaid graph TB subgraph "Client Layer" UI[React SPA
TypeScript + Vite] CDN[Google Cloud CDN
Static Asset Delivery] end subgraph "API Layer" API[FastAPI Backend
Python 3.11+] AUTH[JWT Authentication
RBAC Authorization] end subgraph "Processing Layer" CELERY[Celery Workers
Background Processing] REDIS[Redis Queue
Task Management] end subgraph "Data Layer" MONGO[MongoDB Atlas
Document Database] GCS[Google Cloud Storage
File Storage] end subgraph "AI Services" GEMINI[Gemini 2.5 Pro
Video Analysis] TRANSLATE[Google Translate
Localization] TTS[Text-to-Speech
Google TTS + ElevenLabs] end subgraph "External Services" EMAIL[SendGrid
Email Notifications] MONITORING[OpenTelemetry
Observability] end UI --> API API --> AUTH API --> CELERY API --> MONGO API --> GCS CELERY --> REDIS CELERY --> GEMINI CELERY --> TRANSLATE CELERY --> TTS CELERY --> EMAIL API --> MONITORING UI --> CDN ``` ### Technology Stack Overview #### Backend Stack - **Framework**: FastAPI with async/await support - **Language**: Python 3.11+ - **Database**: MongoDB Atlas with Motor async driver - **Queue System**: Redis + Celery for distributed task processing - **Cloud Storage**: Google Cloud Storage with signed URL security - **AI Services**: Gemini 2.5 Pro, Google Translate, Google/ElevenLabs TTS - **Authentication**: JWT with HttpOnly refresh cookies and RBAC - **Observability**: OpenTelemetry tracing, Sentry error tracking, Prometheus metrics - **Validation**: Pydantic models with strict typing #### Frontend Stack - **Framework**: React 18 with Vite build system - **Language**: TypeScript for type safety - **Routing**: React Router v6 with role-based route guards - **State Management**: TanStack Query (server state) + Zustand (minimal UI state) - **Styling**: Tailwind CSS with responsive design principles - **Forms**: React Hook Form with Zod validation - **File Upload**: React Dropzone with real-time progress tracking #### Infrastructure - **Containerization**: Docker with optimized multi-stage builds - **Deployment**: Google Cloud Run for API and worker services - **CDN**: Google Cloud CDN for frontend asset delivery - **Database**: MongoDB Atlas (fully managed) - **Caching**: Redis (managed service) - **Infrastructure as Code**: Terraform for complete stack management ### Monorepo Structure ``` video_accessibility/ ├── backend/ # FastAPI Python backend │ ├── app/ │ │ ├── api/ # API route handlers │ │ ├── models/ # Pydantic data models │ │ ├── services/ # Business logic services │ │ ├── tasks/ # Celery background tasks │ │ └── core/ # Configuration and utilities │ ├── tests/ # Backend test suites │ └── requirements.txt # Python dependencies ├── frontend/ # React TypeScript SPA │ ├── src/ │ │ ├── components/ # Reusable UI components │ │ ├── routes/ # Page-level components │ │ ├── lib/ # Utilities and configurations │ │ └── types/ # TypeScript type definitions │ ├── public/ # Static assets │ └── package.json # Node.js dependencies ├── infra/ # Infrastructure as Code │ ├── cloud-run/ # Google Cloud Run deployment │ └── cloud-cdn/ # CDN configuration ├── examples/ # Test video files for development └── docker-compose.yml # Local development environment ``` --- ## Data Models ### Job State Machine ```mermaid stateDiagram-v2 [*] --> created: Client uploads video created --> ingesting: Worker starts processing ingesting --> ai_processing: Video uploaded to Gemini ai_processing --> pending_qc: AI generation complete pending_qc --> approved_english: QC approves content pending_qc --> rejected: QC rejects content rejected --> [*]: Job terminated approved_english --> translating: Multi-language processing translating --> tts_generating: Translation complete tts_generating --> pending_final_review: TTS synthesis complete pending_final_review --> completed: Final approval pending_final_review --> rejected: Final rejection completed --> [*]: Assets delivered to client ``` ### Core Data Models #### Job Document (MongoDB) ```javascript { "_id": ObjectId("..."), "client_id": ObjectId("..."), "title": "Marketing Video Q3 2025", "source": { "filename": "marketing_q3.mp4", "original_filename": "Marketing Video - Final Cut.mp4", "gcs_uri": "gs://accessible-video/job123/source.mp4", "duration_s": 180.5, "language": "en" }, "requested_outputs": { "captions_vtt": true, "audio_description_vtt": true, "audio_description_mp3": true, "languages": ["es", "fr", "de"], "transcreation": ["es"] // Cultural adaptation vs direct translation }, "status": "pending_qc", // State machine value "review": { "notes": "Minor timing adjustments needed", "reviewer_id": ObjectId("..."), "history": [ { "at": "2025-08-24T10:30:00Z", "status": "pending_qc", "by": "reviewer@company.com", "notes": "Ready for QC review" } ] }, "outputs": { "en": { "captions_vtt_gcs": "gs://accessible-video/job123/en/captions.vtt", "ad_vtt_gcs": "gs://accessible-video/job123/en/ad.vtt", "ad_mp3_gcs": "gs://accessible-video/job123/en/ad.mp3" }, "es": { "captions_vtt_gcs": "gs://accessible-video/job123/es/captions.vtt", "ad_vtt_gcs": "gs://accessible-video/job123/es/ad.vtt", "ad_mp3_gcs": "gs://accessible-video/job123/es/ad.mp3", "origin": "transcreate", // vs "translate" "qa_notes": "Cultural references adapted for Spanish market" } }, "ai": { "ingestion_json": { "captions": [...], "audio_descriptions": [...], "confidence": 0.94 }, "confidence": 0.94 }, "created_at": "2025-08-24T09:00:00Z", "updated_at": "2025-08-24T10:35:00Z" } ``` #### User Document (MongoDB) ```javascript { "_id": ObjectId("..."), "email": "client@company.com", "hashed_password": "$2b$12$...", "full_name": "Jane Smith", "role": "client", // client | reviewer | admin "is_active": true, "created_at": "2025-01-15T08:00:00Z" } ``` ### File Storage Structure (Google Cloud Storage) ``` gs://accessible-video/{jobId}/ ├── source.mp4 # Original uploaded video ├── en/ # English outputs │ ├── captions.vtt # English closed captions │ ├── ad.vtt # English audio description script │ └── ad.mp3 # English audio description audio ├── es/ # Spanish outputs (example) │ ├── captions.vtt # Spanish captions (translated/transcreated) │ ├── ad.vtt # Spanish audio description script │ └── ad.mp3 # Spanish TTS audio └── {additional_languages}/ # Additional language folders └── ... # Same structure per language ``` ### API Data Models (Pydantic) Key backend models ensuring type safety: - **JobResponse**: Complete job data for API responses - **CreateJobRequest**: Video upload with metadata - **UpdateJobRequest**: Partial job updates - **ReviewAction**: QC approval/rejection with notes - **VttUpdateRequest**: Caption editing operations - **DownloadResponse**: Signed URL generation - **UserResponse**: User data without sensitive fields --- ## Detailed Features & Functions ### 1. Video Upload & Processing System #### Upload Features - **Drag-and-drop interface** with React Dropzone - **Progress tracking** with real-time upload percentage - **File validation** (format, size, duration limits) - **Secure multipart upload** to Google Cloud Storage - **Metadata extraction** using ffprobe for video properties #### AI Processing Pipeline - **Gemini 2.5 Pro integration** for content analysis - **Structured JSON output** with self-healing parsing - **Dual output generation**: Closed captions + Audio descriptions - **Confidence scoring** for quality assessment - **Error recovery** mechanisms for failed AI calls ### 2. Quality Control (QC) System #### VTT Editor Component ```typescript interface VttEditorProps { jobId: string; language: string; type: 'captions' | 'audio_description'; readonly?: boolean; } ``` Features: - **Rich text editing** of VTT caption content - **Timing preservation** during text modifications - **Bulk timing adjustments** for synchronization fixes - **Preview integration** with HTML5 video player - **Validation** of WebVTT format compliance - **Auto-save** functionality with conflict resolution #### Review Workflow - **Assignment system** for reviewer workload management - **Approval/rejection** with mandatory review notes - **History tracking** for audit trail and accountability - **Quality metrics** tracking for process improvement ### 3. Multi-Language Translation System #### Translation Options - **Standard Translation**: Google Cloud Translate for direct language conversion - **Transcreation**: Gemini-powered cultural adaptation for marketing content - **VTT Structure Preservation**: Maintains timing and formatting across languages #### Process Flow 1. QC approval triggers translation pipeline 2. Parallel processing for multiple target languages 3. VTT timing synchronization across all languages 4. Quality validation before TTS synthesis ### 4. Text-to-Speech (TTS) Integration #### Supported Providers - **Google Cloud TTS**: High-quality natural voices - **ElevenLabs**: Premium AI voices for enhanced quality - **Voice Configuration**: Per-language voice selection #### Audio Processing - **Per-cue synthesis**: Individual audio clips for each VTT segment - **Timing synchronization**: Audio duration matches VTT timing - **Cross-fade stitching**: Seamless audio transitions - **Format optimization**: MP3 output with appropriate bitrates ### 5. Role-Based Access Control (RBAC) #### User Roles & Permissions ```mermaid graph TD subgraph "Role Hierarchy" ADMIN[Admin
Full system access] REVIEWER[Reviewer
QC operations] CLIENT[Client
Own jobs only] end subgraph "Client Permissions" C1[Create jobs] C2[View own jobs] C3[Download outputs] C4[Update job metadata] end subgraph "Reviewer Permissions" R1[View all jobs] R2[Edit VTT content] R3[Approve/reject jobs] R4[Access QC dashboard] end subgraph "Admin Permissions" A1[User management] A2[Bulk operations] A3[System configuration] A4[Analytics access] end CLIENT --> C1 CLIENT --> C2 CLIENT --> C3 CLIENT --> C4 REVIEWER --> C1 REVIEWER --> C2 REVIEWER --> C3 REVIEWER --> C4 REVIEWER --> R1 REVIEWER --> R2 REVIEWER --> R3 REVIEWER --> R4 ADMIN --> C1 ADMIN --> C2 ADMIN --> C3 ADMIN --> C4 ADMIN --> R1 ADMIN --> R2 ADMIN --> R3 ADMIN --> R4 ADMIN --> A1 ADMIN --> A2 ADMIN --> A3 ADMIN --> A4 ``` ### 6. File Management & Security #### Secure File Access - **Signed URLs** with 24-hour expiration for all file downloads - **CORS configuration** allowing cross-origin access for authenticated users - **Content-type validation** ensuring proper file handling - **Audit logging** for all file access operations #### Storage Optimization - **Lifecycle policies** for automatic cleanup of expired files - **Compression** for VTT text files - **CDN integration** for fast global file delivery ### 7. Notification System #### Email Integration (SendGrid) - **HTML email templates** with professional styling - **Signed URL embedding** for secure file access - **Notification triggers**: Job completion, approval, rejection - **Expiration reminders** for download links #### Real-time Updates - **WebSocket connections** for live status updates (planned) - **Browser notifications** for important events - **In-app notification center** for user alerts ### 8. Dashboard & Analytics #### Client Dashboard - **Recent jobs overview** with status indicators - **Upload statistics** and processing times - **Download history** with expiration tracking - **Usage analytics** for billing and optimization #### Admin Analytics - **System performance metrics** (processing times, error rates) - **User activity tracking** and engagement metrics - **Resource utilization** monitoring - **Quality control statistics** and reviewer performance --- ## Process Flows ### Complete Video Processing Pipeline ```mermaid sequenceDiagram participant Client participant Frontend participant API participant GCS as Cloud Storage participant Celery as Worker Queue participant Gemini as AI Service participant Reviewer participant TTS as Text-to-Speech participant Email as Notification Client->>Frontend: Upload video file Frontend->>API: POST /jobs (multipart) API->>GCS: Store video file API->>Celery: Queue ingestion task API-->>Frontend: Job created (status: created) Celery->>GCS: Download video Celery->>Gemini: Process video with AI Gemini-->>Celery: Return captions + audio descriptions Celery->>GCS: Upload VTT files Celery->>API: Update job (status: pending_qc) Reviewer->>Frontend: Access QC dashboard Frontend->>API: GET /admin/qc API-->>Frontend: Jobs pending review Reviewer->>Frontend: Edit and approve VTT content Frontend->>API: POST /jobs/{id}/actions/approve_english API->>Celery: Queue translation task Celery->>API: Process translations Celery->>TTS: Generate audio descriptions TTS-->>Celery: Return MP3 files Celery->>GCS: Upload translated assets Celery->>API: Update job (status: pending_final_review) Reviewer->>Frontend: Final review and approval Frontend->>API: POST /jobs/{id}/actions/complete API->>Email: Send completion notification Email-->>Client: Notification with download links API->>API: Update job (status: completed) ``` ### Quality Control Workflow ```mermaid flowchart TD START([Job AI Processing Complete]) --> PENDING[Status: pending_qc] PENDING --> ASSIGN[Assign to Reviewer] ASSIGN --> REVIEW[Reviewer Opens VTT Editor] REVIEW --> EDIT{Edit Required?} EDIT -->|Yes| MODIFY[Modify VTT Content] MODIFY --> VALIDATE[Validate VTT Format] VALIDATE --> SAVE[Auto-save Changes] SAVE --> DECISION{Approval Decision} EDIT -->|No| DECISION DECISION -->|Approve| APPROVE[POST approve_english] DECISION -->|Reject| REJECT[POST reject with notes] APPROVE --> TRANSLATION[Trigger Translation Pipeline] TRANSLATION --> FINAL_REVIEW[Status: pending_final_review] REJECT --> REJECTED[Status: rejected] REJECTED --> NOTIFY_CLIENT[Email Client with Notes] NOTIFY_CLIENT --> END([Job Terminated]) FINAL_REVIEW --> FINAL_DECISION{Final Approval?} FINAL_DECISION -->|Approve| COMPLETE[Status: completed] FINAL_DECISION -->|Reject| FINAL_REJECT[Status: rejected] COMPLETE --> NOTIFY_COMPLETION[Email with Download Links] NOTIFY_COMPLETION --> END FINAL_REJECT --> NOTIFY_CLIENT ``` ### Authentication & Authorization Flow ```mermaid sequenceDiagram participant User participant Frontend participant API participant Auth as Auth Service participant DB as Database User->>Frontend: Enter credentials Frontend->>API: POST /auth/login API->>DB: Validate user credentials DB-->>API: User data + role API->>Auth: Generate JWT access token API->>Auth: Generate refresh token Auth-->>API: Tokens created API->>API: Set refresh token as HttpOnly cookie API-->>Frontend: Access token + user data Frontend->>Frontend: Store access token in memory Note over Frontend,API: Subsequent API requests Frontend->>API: Request with Authorization header API->>Auth: Validate access token Auth-->>API: Token valid + user claims API->>API: Check RBAC permissions API-->>Frontend: Authorized response Note over Frontend,API: Token refresh flow API-->>Frontend: 401 Unauthorized (expired token) Frontend->>API: POST /auth/refresh (with cookie) API->>Auth: Validate refresh token Auth->>Auth: Generate new access token Auth-->>API: New access token API-->>Frontend: New access token Frontend->>API: Retry original request ``` --- ## User Journey ### Client User Journey ```mermaid journey title Video Accessibility Client Journey section Video Upload Login to Platform: 5: Client Navigate to New Job: 4: Client Select Video File: 3: Client Fill Job Details: 3: Client Upload with Progress: 4: Client Receive Confirmation: 5: Client section Processing Wait Receive Processing Email: 4: Client Check Job Status: 3: Client Wait for QC Complete: 2: Client section Review & Download Receive Completion Email: 5: Client Access Download Links: 4: Client Download VTT Files: 5: Client Download Audio Files: 5: Client Validate Output Quality: 4: Client ``` ### Reviewer User Journey ```mermaid journey title Reviewer Quality Control Journey section Daily Workflow Login to QC Dashboard: 4: Reviewer View Pending Jobs Queue: 4: Reviewer Select Job for Review: 3: Reviewer section Content Review Open VTT Editor: 4: Reviewer Review Generated Captions: 3: Reviewer Edit Text Content: 2: Reviewer Adjust Timing if Needed: 2: Reviewer Preview with Video Player: 4: Reviewer section Decision Making Assess Content Quality: 3: Reviewer Add Review Notes: 3: Reviewer Make Approval Decision: 4: Reviewer Submit Review: 5: Reviewer section Final Review Review Translated Content: 3: Reviewer Validate TTS Output: 4: Reviewer Final Quality Check: 3: Reviewer Complete Job Approval: 5: Reviewer ``` ### Detailed User Flow Scenarios #### Scenario 1: Successful Job Completion 1. **Client Upload** (5 minutes) - User drags video file to upload zone - Fills required metadata (title, target languages) - Monitors upload progress to completion - Receives job ID and initial confirmation 2. **AI Processing** (10-30 minutes, automated) - Celery worker picks up ingestion task - Video uploaded to Gemini 2.5 Pro - AI generates captions and audio descriptions - VTT files created and stored in GCS - Job status updated to "pending_qc" 3. **Quality Control** (15-45 minutes) - Reviewer receives notification of pending job - Opens VTT editor for content review - Makes text corrections and timing adjustments - Previews content with integrated video player - Approves job triggering translation pipeline 4. **Multi-Language Processing** (20-60 minutes, automated) - Translation worker processes all target languages - TTS service generates audio descriptions - All output files uploaded to structured GCS paths - Job status updated to "pending_final_review" 5. **Final Review & Delivery** (10-20 minutes) - Reviewer validates all translated outputs - Performs final quality check on audio files - Approves job for completion - Client receives email with signed download URLs #### Scenario 2: Quality Control Rejection 1. **Initial Processing** (same as Scenario 1, steps 1-2) 2. **QC Rejection** (10-15 minutes) - Reviewer identifies quality issues - Adds detailed notes explaining problems - Rejects job with specific feedback - Job status changed to "rejected" 3. **Client Notification & Resolution** - Client receives rejection email with reviewer notes - Client can contact support for clarification - New job may need to be created with improved source material --- ## Technical Concepts ### Asynchronous Processing Architecture #### FastAPI Async/Await Pattern The backend leverages Python's async/await functionality for high-concurrency request handling: ```python @router.post("/jobs/{job_id}/vtt") async def update_vtt( job_id: str, vtt_data: VttUpdateRequest, current_user: User = Depends(get_current_user) ): # Non-blocking database operations job = await job_service.get_job(job_id) await job_service.update_vtt_content(job, vtt_data) return await job_service.validate_and_return(job) ``` #### Celery Distributed Task Processing Background tasks are handled by Celery workers for scalable processing: ```python @celery_app.task(bind=True, max_retries=3) def ingest_and_process_with_ai(self, job_id: str): try: # Long-running AI processing result = gemini_service.process_video(job_id) return result except Exception as exc: # Exponential backoff retry raise self.retry(exc=exc, countdown=60 * (2 ** self.request.retries)) ``` ### AI Integration Patterns #### Self-Healing JSON Parsing Gemini AI responses are parsed with fallback mechanisms: ```python def parse_ai_response(raw_response: str) -> dict: try: return json.loads(raw_response) except json.JSONDecodeError: # Self-healing: Extract JSON from markdown code blocks json_match = re.search(r'```json\s*(\{.*?\})\s*```', raw_response, re.DOTALL) if json_match: return json.loads(json_match.group(1)) raise InvalidAIResponseError("Could not parse AI response") ``` #### Structured AI Prompts Gemini prompts are designed for consistent JSON output: ```python GEMINI_PROMPT_TEMPLATE = """ Analyze this video and generate accessibility content in this exact JSON format: { "captions": [ {"start": "00:00:00.000", "end": "00:00:05.000", "text": "Caption text here"} ], "audio_descriptions": [ {"start": "00:00:00.000", "end": "00:00:05.000", "text": "Visual description here"} ], "confidence": 0.95 } Requirements: - Use WebVTT timestamp format (HH:MM:SS.mmm) - Ensure no overlapping time ranges - Maximum 32 characters per caption line - Audio descriptions should be concise and descriptive """ ``` ### WebVTT Processing & Validation #### VTT Format Handling The platform maintains strict WebVTT compliance: ```python def validate_vtt_format(vtt_content: str) -> bool: lines = vtt_content.strip().split('\n') if not lines[0].startswith('WEBVTT'): return False # Validate timestamp format and sequence for i, line in enumerate(lines): if '-->' in line: timestamps = line.split('-->') if not is_valid_timestamp(timestamps[0].strip()) or \ not is_valid_timestamp(timestamps[1].strip()): return False return True ``` #### Timing Preservation During Translation Translation maintains original VTT timing structure: ```python def translate_vtt_preserving_timing(vtt_content: str, target_language: str) -> str: cues = parse_vtt_cues(vtt_content) # Extract only text content for translation text_to_translate = [cue.text for cue in cues] translated_text = translate_service.translate_batch(text_to_translate, target_language) # Reconstruct VTT with original timing for i, cue in enumerate(cues): cue.text = translated_text[i] return serialize_to_vtt(cues) ``` ### Security Implementation Details #### JWT Token Strategy ```python # Access Token (15-minute lifespan, stored in memory) access_token_data = { "sub": user.email, "role": user.role, "exp": datetime.utcnow() + timedelta(minutes=15) } # Refresh Token (7-day lifespan, HttpOnly cookie) refresh_token_data = { "sub": user.email, "type": "refresh", "exp": datetime.utcnow() + timedelta(days=7) } ``` #### Signed URL Generation ```python def generate_signed_url(blob_name: str, expiration: int = 24*60*60) -> str: """Generate signed URL with 24-hour expiration""" bucket = gcs_client.bucket(GCS_BUCKET_NAME) blob = bucket.blob(blob_name) return blob.generate_signed_url( version="v4", expiration=datetime.utcnow() + timedelta(seconds=expiration), method="GET" ) ``` ### Real-time Communication Patterns #### TanStack Query for State Management ```typescript // Automatic cache management and background refetching const { data: job, isLoading, error } = useQuery({ queryKey: ['job', jobId], queryFn: () => api.getJob(jobId), refetchInterval: 5000, // Poll every 5 seconds for status updates staleTime: 1000 * 60, // Consider data fresh for 1 minute }); // Optimistic updates for better UX const updateJobMutation = useMutation({ mutationFn: api.updateJob, onMutate: async (newJobData) => { // Cancel outgoing refetches await queryClient.cancelQueries(['job', jobId]); // Snapshot previous value const previousJob = queryClient.getQueryData(['job', jobId]); // Optimistically update cache queryClient.setQueryData(['job', jobId], newJobData); return { previousJob }; }, onError: (err, newJobData, context) => { // Rollback on error queryClient.setQueryData(['job', jobId], context.previousJob); }, }); ``` ### Error Handling & Resilience #### Exponential Backoff Retry Pattern ```python class RetryableTaskMixin: def retry_with_backoff(self, exc, base_delay=60): """Exponential backoff with jitter""" delay = base_delay * (2 ** self.request.retries) jitter = random.uniform(0, 0.1) * delay return self.retry(exc=exc, countdown=int(delay + jitter)) @celery_app.task(bind=True, base=RetryableTaskMixin, max_retries=5) def process_with_gemini(self, job_id: str): try: return gemini_service.process(job_id) except (ConnectionError, TimeoutError) as exc: return self.retry_with_backoff(exc) ``` #### Circuit Breaker Pattern for External Services ```python class CircuitBreaker: def __init__(self, failure_threshold=5, recovery_timeout=60): self.failure_threshold = failure_threshold self.recovery_timeout = recovery_timeout self.failure_count = 0 self.last_failure_time = None self.state = 'CLOSED' # CLOSED, OPEN, HALF_OPEN def call(self, func, *args, **kwargs): if self.state == 'OPEN': if time.time() - self.last_failure_time > self.recovery_timeout: self.state = 'HALF_OPEN' else: raise CircuitBreakerOpenError("Service unavailable") try: result = func(*args, **kwargs) self.reset() return result except Exception as e: self.record_failure() raise e ``` ### Performance Optimization Strategies #### Database Indexing Strategy ```javascript // MongoDB indexes for optimal query performance db.jobs.createIndex({ "status": 1, "created_at": -1 }); // Job list queries db.jobs.createIndex({ "client_id": 1 }); // Client-specific filtering db.jobs.createIndex({ "review.reviewer_id": 1 }); // Reviewer workload db.users.createIndex({ "email": 1 }, { unique: true }); // User authentication ``` #### Caching Strategy ```python # Redis caching for frequently accessed data @lru_cache(maxsize=1000) def get_user_permissions(user_id: str) -> List[str]: """Cache user permissions for 5 minutes""" return permission_service.get_permissions(user_id) # Cache invalidation on user role changes def update_user_role(user_id: str, new_role: UserRole): user_service.update_role(user_id, new_role) get_user_permissions.cache_clear() # Invalidate cache ``` --- ## Security Architecture ### Multi-Layer Security Model ```mermaid graph TB subgraph "Client Layer Security" CSP[Content Security Policy] CORS[CORS Configuration] HTTPS[HTTPS Enforcement] end subgraph "API Layer Security" JWT[JWT Authentication] RBAC[Role-Based Access Control] RATE[Rate Limiting] INPUT[Input Validation] end subgraph "Data Layer Security" ENCRYPT[Data Encryption at Rest] SIGNED[Signed URLs] IAM[IAM Policies] end subgraph "Infrastructure Security" VPC[Private VPC] FIREWALL[Cloud Firewall] SECRETS[Secret Management] end CSP --> JWT CORS --> RBAC HTTPS --> RATE JWT --> ENCRYPT RBAC --> SIGNED RATE --> IAM INPUT --> VPC ENCRYPT --> FIREWALL SIGNED --> SECRETS ``` ### Authentication Security #### Token Storage Strategy - **Access Tokens**: Stored in JavaScript memory (not localStorage) to prevent XSS attacks - **Refresh Tokens**: Stored in HttpOnly cookies to prevent JavaScript access - **Token Rotation**: Automatic rotation on refresh to limit exposure window - **Secure Transmission**: All tokens transmitted over HTTPS only #### Password Security ```python # Strong password hashing with bcrypt def hash_password(password: str) -> str: salt = bcrypt.gensalt(rounds=12) # Computational cost of 2^12 return bcrypt.hashpw(password.encode('utf-8'), salt).decode('utf-8') def verify_password(password: str, hashed: str) -> bool: return bcrypt.checkpw(password.encode('utf-8'), hashed.encode('utf-8')) ``` ### Authorization Implementation #### Role-Based Middleware ```python def require_roles(*allowed_roles: UserRole): """Decorator for endpoint authorization""" def decorator(func): async def wrapper(*args, **kwargs): current_user = kwargs.get('current_user') if not current_user or current_user.role not in allowed_roles: raise HTTPException( status_code=403, detail=f"Access denied. Required roles: {allowed_roles}" ) return await func(*args, **kwargs) return wrapper return decorator # Usage example @router.delete("/jobs/{job_id}") async def delete_job( job_id: str, current_user: User = Depends(require_roles(UserRole.ADMIN, UserRole.CLIENT)) ): # Additional ownership check for clients if current_user.role == UserRole.CLIENT: job = await job_service.get_job(job_id) if job.client_id != current_user.id: raise HTTPException(status_code=404, detail="Job not found") await job_service.delete_job(job_id) ``` ### Data Protection Measures #### Input Validation & Sanitization ```python class VttUpdateRequest(BaseModel): content: str = Field(..., max_length=100000) # Limit size language: str = Field(..., regex=r'^[a-z]{2}(-[A-Z]{2})?$') # ISO language codes @validator('content') def validate_vtt_format(cls, v): if not v.startswith('WEBVTT'): raise ValueError('Invalid VTT format') return bleach.clean(v, tags=[], strip=True) # Remove any HTML ``` #### File Upload Security ```python ALLOWED_VIDEO_TYPES = {'video/mp4', 'video/quicktime', 'video/x-msvideo'} MAX_FILE_SIZE = 5 * 1024 * 1024 * 1024 # 5GB async def validate_upload(file: UploadFile): # Check file type if file.content_type not in ALLOWED_VIDEO_TYPES: raise HTTPException(400, "Invalid file type") # Check file size if file.size > MAX_FILE_SIZE: raise HTTPException(400, "File too large") # Scan file header for additional validation header = await file.read(1024) await file.seek(0) # Reset for actual upload if not is_valid_video_header(header): raise HTTPException(400, "Invalid video file") ``` ### Infrastructure Security #### Google Cloud Security Configuration ```yaml # Cloud Run security settings metadata: annotations: run.googleapis.com/ingress: all run.googleapis.com/ingress-status: all spec: template: metadata: annotations: autoscaling.knative.dev/maxScale: "10" run.googleapis.com/execution-environment: gen2 run.googleapis.com/cpu-throttling: "false" spec: containerConcurrency: 100 timeoutSeconds: 300 serviceAccountName: video-accessibility-runner@project.iam.gserviceaccount.com containers: - image: gcr.io/project/api:latest env: - name: ENVIRONMENT value: production resources: limits: cpu: 2000m memory: 4Gi securityContext: runAsNonRoot: true runAsUser: 1000 ``` ### Audit Logging #### Security Event Logging ```python class SecurityLogger: @staticmethod def log_authentication_attempt(email: str, success: bool, ip: str): logger.info( "Authentication attempt", extra={ "event_type": "auth_attempt", "email": email, "success": success, "ip_address": ip, "timestamp": datetime.utcnow() } ) @staticmethod def log_permission_denied(user_id: str, resource: str, action: str): logger.warning( "Permission denied", extra={ "event_type": "permission_denied", "user_id": user_id, "resource": resource, "action": action, "timestamp": datetime.utcnow() } ) ``` --- ## API Specifications ### Authentication Endpoints #### POST `/api/v1/auth/login` **Request Body:** ```json { "email": "user@example.com", "password": "secure_password" } ``` **Response:** ```json { "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...", "user": { "id": "user_123", "email": "user@example.com", "full_name": "John Doe", "role": "client" } } ``` **Response Headers:** ``` Set-Cookie: refresh_token=; HttpOnly; Secure; SameSite=Lax; Path=/api/v1/auth/refresh; Max-Age=604800 ``` #### POST `/api/v1/auth/refresh` **Headers:** ``` Cookie: refresh_token= ``` **Response:** ```json { "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9..." } ``` ### Job Management Endpoints #### POST `/api/v1/jobs` **Content-Type:** `multipart/form-data` **Form Fields:** - `title`: string (required) - Job title - `language`: string (required) - Source video language (ISO 639-1) - `target_languages`: array of strings - Target languages for translation - `captions_vtt`: boolean - Generate closed captions - `audio_description_vtt`: boolean - Generate audio description script - `audio_description_mp3`: boolean - Generate audio description audio - `transcreation_languages`: array of strings - Languages requiring cultural adaptation - `file`: file (required) - Video file (MP4, MOV, AVI) **Response:** ```json { "id": "job_123", "title": "Marketing Video Q3 2025", "status": "created", "source": { "filename": "marketing_q3.mp4", "original_filename": "Marketing Video - Final Cut.mp4", "duration_s": 180.5, "language": "en" }, "requested_outputs": { "captions_vtt": true, "audio_description_vtt": true, "audio_description_mp3": true, "languages": ["es", "fr"], "transcreation": ["es"] }, "created_at": "2025-08-24T10:00:00Z" } ``` #### GET `/api/v1/jobs` **Query Parameters:** - `status`: string (optional) - Filter by job status - `mine`: boolean (optional) - Show only user's jobs (clients only) - `page`: integer (optional, default: 1) - Page number - `limit`: integer (optional, default: 20) - Items per page **Response:** ```json { "jobs": [ { "id": "job_123", "title": "Marketing Video Q3 2025", "status": "pending_qc", "created_at": "2025-08-24T10:00:00Z", "client": { "id": "user_456", "full_name": "Jane Smith", "email": "jane@company.com" } } ], "total": 1, "page": 1, "limit": 20, "pages": 1 } ``` #### GET `/api/v1/jobs/{job_id}` **Response:** ```json { "id": "job_123", "title": "Marketing Video Q3 2025", "status": "pending_qc", "source": { "filename": "marketing_q3.mp4", "original_filename": "Marketing Video - Final Cut.mp4", "gcs_uri": "gs://accessible-video/job_123/source.mp4", "duration_s": 180.5, "language": "en" }, "requested_outputs": { "captions_vtt": true, "audio_description_vtt": true, "audio_description_mp3": true, "languages": ["es", "fr"], "transcreation": ["es"] }, "outputs": { "en": { "captions_vtt_gcs": "gs://accessible-video/job_123/en/captions.vtt", "ad_vtt_gcs": "gs://accessible-video/job_123/en/ad.vtt" } }, "review": { "notes": "Content looks good, minor timing adjustments made", "reviewer_id": "reviewer_789", "history": [ { "at": "2025-08-24T11:30:00Z", "status": "pending_qc", "by": "reviewer@company.com", "notes": "Assigned for QC review" } ] }, "ai": { "confidence": 0.94, "ingestion_json": { "captions": [...], "audio_descriptions": [...] } }, "created_at": "2025-08-24T10:00:00Z", "updated_at": "2025-08-24T11:35:00Z" } ``` ### Job Action Endpoints #### POST `/api/v1/jobs/{job_id}/actions/approve_english` **Authorization:** Reviewer or Admin role required **Request Body:** ```json { "notes": "Content approved after minor timing adjustments" } ``` **Response:** ```json { "message": "Job approved for translation processing", "job": { "id": "job_123", "status": "translating", "review": { "notes": "Content approved after minor timing adjustments", "reviewer_id": "reviewer_789" } } } ``` #### POST `/api/v1/jobs/{job_id}/actions/reject` **Authorization:** Reviewer or Admin role required **Request Body:** ```json { "notes": "Audio quality is poor, please provide higher quality source video" } ``` **Response:** ```json { "message": "Job rejected", "job": { "id": "job_123", "status": "rejected", "review": { "notes": "Audio quality is poor, please provide higher quality source video", "reviewer_id": "reviewer_789" } } } ``` ### File Operation Endpoints #### GET `/api/v1/jobs/{job_id}/downloads` **Authorization:** Job owner, Reviewer, or Admin **Response:** ```json { "expires_at": "2025-08-25T12:00:00Z", "downloads": { "source": { "url": "https://storage.googleapis.com/accessible-video/job_123/source.mp4?X-Goog-Algorithm=...", "filename": "marketing_q3.mp4", "size_bytes": 52428800 }, "outputs": { "en": { "captions_vtt": { "url": "https://storage.googleapis.com/accessible-video/job_123/en/captions.vtt?X-Goog-Algorithm=...", "filename": "captions_en.vtt", "size_bytes": 8192 }, "ad_vtt": { "url": "https://storage.googleapis.com/accessible-video/job_123/en/ad.vtt?X-Goog-Algorithm=...", "filename": "audio_description_en.vtt", "size_bytes": 6144 }, "ad_mp3": { "url": "https://storage.googleapis.com/accessible-video/job_123/en/ad.mp3?X-Goog-Algorithm=...", "filename": "audio_description_en.mp3", "size_bytes": 2097152 } }, "es": { "captions_vtt": { "url": "https://storage.googleapis.com/accessible-video/job_123/es/captions.vtt?X-Goog-Algorithm=...", "filename": "captions_es.vtt", "size_bytes": 9216 } } } } } ``` #### GET `/api/v1/jobs/{job_id}/vtt` **Authorization:** Reviewer or Admin **Query Parameters:** - `language`: string (required) - Language code (e.g., "en", "es") - `type`: string (required) - VTT type ("captions" or "audio_description") **Response:** ```json { "content": "WEBVTT\n\n00:00:00.000 --> 00:00:05.000\nOpening scene with mountain landscape.\n\n00:00:05.000 --> 00:00:10.000\nNarrator introduces the topic of accessibility.", "language": "en", "type": "captions", "last_modified": "2025-08-24T11:35:00Z" } ``` #### PATCH `/api/v1/jobs/{job_id}/vtt` **Authorization:** Reviewer or Admin **Request Body:** ```json { "content": "WEBVTT\n\n00:00:00.000 --> 00:00:05.000\nUpdated caption text here.\n\n00:00:05.000 --> 00:00:10.000\nNarrator introduces the topic of accessibility.", "language": "en", "type": "captions" } ``` **Response:** ```json { "message": "VTT content updated successfully", "validation": { "valid": true, "cue_count": 2, "total_duration": "00:00:10.000" } } ``` ### Error Response Format All API endpoints return consistent error responses: ```json { "error": { "code": "VALIDATION_ERROR", "message": "Request validation failed", "details": [ { "field": "email", "message": "Invalid email format" } ] }, "request_id": "req_123456789" } ``` **Common HTTP Status Codes:** - `200` - Success - `201` - Created - `400` - Bad Request (validation errors) - `401` - Unauthorized (invalid/expired token) - `403` - Forbidden (insufficient permissions) - `404` - Not Found - `409` - Conflict (duplicate resource) - `422` - Unprocessable Entity (business logic error) - `429` - Too Many Requests (rate limited) - `500` - Internal Server Error --- ## Conclusion This video accessibility platform represents a comprehensive, production-ready solution that combines cutting-edge AI technology with human quality control workflows. The architecture emphasizes scalability, security, and user experience while maintaining strict accessibility standards and multi-language support. Key architectural strengths include: - **Scalable Processing**: Async FastAPI backend with distributed Celery workers - **Robust AI Integration**: Self-healing AI response parsing with confidence scoring - **Enterprise Security**: Multi-layered security with JWT, RBAC, and signed URLs - **Quality Assurance**: Human-in-the-loop workflows with comprehensive audit trails - **Developer Experience**: Full TypeScript integration and comprehensive API documentation - **Operational Excellence**: Complete observability with OpenTelemetry, metrics, and error tracking The platform successfully addresses the complex challenges of automated video accessibility while maintaining the flexibility needed for diverse client requirements and quality standards.