# Functional Requirements: Accessible Video Processing Platform

## Purpose

This document specifies what the system must do from a user perspective. Implementation details belong in [architecture.md](architecture.md). Non-functional requirements (performance, security) belong in [architecture.md](architecture.md#security-model).

---

## User Roles

| Role | Who | Primary action |
|------|-----|---------------|
| CLIENT | Paying customer | Upload videos, download deliverables |
| REVIEWER | Oliver internal | QC approve/reject captions + audio description |
| LINGUIST | Language specialist | Review and approve translated content per language |
| PM | Project manager | Assign linguists, give final approval, monitor all jobs |
| ADMIN | Platform operator | Manage users, view audit log, configure platform |

---

## Core Features

### R-01: Video Upload

| Requirement | Detail |
|-------------|--------|
| R-01.1 | Client can upload an MP4 video file |
| R-01.2 | File is stored in GCS; client receives a job ID |
| R-01.3 | Upload progress is displayed in real time |
| R-01.4 | System validates file type (MP4 only) and size on upload |
| R-01.5 | Upload creates a job record in CREATED state |

### R-02: AI Processing Pipeline

| Requirement | Detail |
|-------------|--------|
| R-02.1 | System automatically generates closed captions in VTT format using Gemini 2.5 Pro |
| R-02.2 | System generates audio description VTT (scene descriptions for blind/low-vision viewers) |
| R-02.3 | System generates SDH captions (includes sound effects and speaker IDs) |
| R-02.4 | System generates a descriptive transcript |
| R-02.5 | All generated VTT files are validated for correct format before advancing to QC |
| R-02.6 | Job status is updated in real time via WebSocket during processing |

### R-03: Quality Control Workflow

| Requirement | Detail |
|-------------|--------|
| R-03.1 | Reviewer can view video with captions side-by-side |
| R-03.2 | Reviewer can edit individual VTT cues (text + timing) inline |
| R-03.3 | Reviewer can approve English content (advances to APPROVED_ENGLISH) |
| R-03.4 | Reviewer can reject a job with a reason (advances to REJECTED) |
| R-03.5 | Reviewer can send QC feedback without full rejection (advances to QC_FEEDBACK) |
| R-03.6 | All QC actions are recorded in the audit log with timestamp and user ID |
| R-03.7 | VTT edits create a version snapshot before overwriting (version history maintained) |

### R-04: Per-Language QC

| Requirement | Detail |
|-------------|--------|
| R-04.1 | PM can assign a specific linguist to each output language |
| R-04.2 | Linguist can approve or reject their assigned language |
| R-04.3 | Language statuses are independent: approving French does not affect German |
| R-04.4 | Linguist cannot approve a language not assigned to them |
| R-04.5 | Job advances to PENDING_FINAL_REVIEW only when all languages are approved |
| R-04.6 | Per-language QC actions are recorded in the audit log |

### R-05: Translation and TTS

| Requirement | Detail |
|-------------|--------|
| R-05.1 | System translates captions and AD into all requested output languages |
| R-05.2 | System applies cultural transcreation (Gemini-assisted) where configured |
| R-05.3 | System uses client-specific glossary terms in translation prompts |
| R-05.4 | System synthesises audio description audio via Google TTS or ElevenLabs |
| R-05.5 | TTS is performed per cue to preserve timing |
| R-05.6 | TTS failures result in TTS_FAILED state; manual retry is supported |

### R-06: Glossary Management

| Requirement | Detail |
|-------------|--------|
| R-06.1 | Admin can create, read, update, and delete glossary terms per client organisation |
| R-06.2 | Glossary terms specify source term, target language, preferred translation |
| R-06.3 | System uses exact match first, then vector similarity (≥0.75) for retrieval |
| R-06.4 | Up to 50 terms are injected per translation prompt |
| R-06.5 | Glossary embeddings are indexed in Atlas Vector Search |

### R-07: VTT Version Control

| Requirement | Detail |
|-------------|--------|
| R-07.1 | System creates a snapshot before each VTT edit save |
| R-07.2 | Reviewer can view version history with author, timestamp, and diff |
| R-07.3 | Reviewer can restore any previous version |
| R-07.4 | Concurrent edit conflict is detected and reported to the later editor |

### R-08: Final Review and Delivery

| Requirement | Detail |
|-------------|--------|
| R-08.1 | PM can view all final deliverable files before approving |
| R-08.2 | PM approval triggers client notification email |
| R-08.3 | Email contains signed GCS download URLs (24h expiry) |
| R-08.4 | Client can download captions, audio descriptions, and accessible video |
| R-08.5 | Job status advances to COMPLETED after PM approval |

### R-09: Authentication and Access Control

| Requirement | Detail |
|-------------|--------|
| R-09.1 | Users authenticate via email/password (local) or Microsoft SSO (enterprise) |
| R-09.2 | Access tokens are valid for 15 minutes; refresh tokens for 7 days |
| R-09.3 | Refresh tokens are stored in HttpOnly cookies only |
| R-09.4 | All API endpoints enforce RBAC: role checked server-side on every request |
| R-09.5 | Login is rate-limited to 5 attempts per 5-minute window |

### R-10: Audit Logging

| Requirement | Detail |
|-------------|--------|
| R-10.1 | Every state-changing action by a reviewer, linguist, or PM creates an audit log entry |
| R-10.2 | Audit log entries contain: actor user ID, action type, job ID, timestamp, before/after state |
| R-10.3 | Admin can view the full audit log filtered by user, job, or date range |
| R-10.4 | Audit log entries are immutable once written |

### R-11: Real-time Notifications

| Requirement | Detail |
|-------------|--------|
| R-11.1 | Job status changes are pushed to connected clients via WebSocket |
| R-11.2 | WebSocket events are org-scoped: users only receive events for their organisation |
| R-11.3 | WebSocket connection recovers automatically after disconnect (exponential backoff) |

---

## Out of Scope (Current Version)

| Feature | Reason |
|---------|--------|
| Automated transcription (Whisper) | Gemini handles transcription; Whisper worker exists but not active |
| CI/CD pipeline | Manual deploy via scripts; CI exists but does not run full test suite |
| Load testing | Not implemented; deferred to Phase 7 |
| Multi-tenant billing | Cost tracking via oliver-cost-tracker SDK (read-only dashboard) |

---

## Maintenance

**Update triggers:** New feature scope confirmed, requirement changed by stakeholder, QC workflow changes.

**Verification:** Every R-XX.X requirement maps to at least one test in [tests/README.md](../../tests/README.md) or [/tmp/audit/test-plan.md](/tmp/audit/test-plan.md).
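
## Illustrative Sketch: Glossary Retrieval (R-06.3, R-06.4)

Requirements R-06.3 and R-06.4 describe a two-stage lookup: exact matches first, then vector-similarity matches at or above 0.75, capped at 50 terms per prompt. The sketch below shows one way that ordering could work. It is not the platform's implementation: `GlossaryTerm`, `select_glossary_terms`, and the in-memory list are hypothetical names introduced for illustration. It assumes that "exact match" means a case-insensitive substring match against the caption text and that "vector similarity" means cosine similarity. In production the similarity stage would presumably be served by Atlas Vector Search (R-06.5) rather than computed in Python.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence

SIMILARITY_THRESHOLD = 0.75  # from R-06.3
MAX_TERMS_PER_PROMPT = 50    # from R-06.4


@dataclass(frozen=True)
class GlossaryTerm:
    """Hypothetical record mirroring the fields listed in R-06.2."""
    source_term: str
    target_language: str
    preferred_translation: str
    embedding: Sequence[float]  # assumed to be precomputed elsewhere


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_glossary_terms(
    text: str,
    text_embedding: Sequence[float],
    target_language: str,
    glossary: List[GlossaryTerm],
) -> List[GlossaryTerm]:
    """Return exact matches first, then similar terms, capped at 50."""
    candidates = [t for t in glossary if t.target_language == target_language]
    lowered = text.lower()

    # Stage 1: exact match (assumption: case-insensitive substring).
    exact = [t for t in candidates if t.source_term.lower() in lowered]
    exact_sources = {t.source_term for t in exact}

    # Stage 2: similarity >= threshold, best match first, skipping stage-1 hits.
    scored = [
        (cosine_similarity(text_embedding, t.embedding), t)
        for t in candidates
        if t.source_term not in exact_sources
    ]
    similar = [
        t
        for score, t in sorted(scored, key=lambda pair: pair[0], reverse=True)
        if score >= SIMILARITY_THRESHOLD
    ]

    return (exact + similar)[:MAX_TERMS_PER_PROMPT]
```

Placing exact matches ahead of similarity matches means the cap in R-06.4 trims the weakest similarity hits first, which seems to be the intent of "exact match first".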