video-accessibility/docs/project/architecture.md
Vadym Samoilenko a3b300b76a docs: add canonical documentation + audit cleanup
2026-04-29 14:22:51 +01:00

Architecture — Accessible Video Processing Platform

System Overview

Three-tier monorepo: React SPA frontend → FastAPI backend → Celery worker pool. Persistent stores are MongoDB Atlas (documents) and Redis (queue + cache). All AI processing happens asynchronously in Celery tasks. All file I/O is via GCS signed URLs.

Browser → Apache → FastAPI (sync surface)
                 → Celery Workers (async AI pipeline)
                 → MongoDB Atlas (job state)
                 → Redis (task queue + rate limit state)
                 → GCS (video + VTT files)

Job State Machine

16 states. Transitions are one-directional except for the QC feedback loop.

| State | Description | Triggered by |
|---|---|---|
| CREATED | Job record created | Upload complete |
| INGESTING | Worker has picked up the job | Celery task start |
| AI_PROCESSING | Gemini 2.5 Pro generating VTT | Ingestion complete |
| PENDING_QC | VTT ready for reviewer | AI processing done |
| QC_FEEDBACK | Reviewer sent feedback, not rejected | Reviewer action |
| APPROVED_ENGLISH | English content approved | QC approve (EN) |
| APPROVED_SOURCE | Source language approved | QC approve (source) |
| TRANSLATING | Google Translate + transcreation running | Approval triggers |
| TTS_GENERATING | Per-cue audio synthesis in progress | Translation done |
| TTS_FAILED | TTS service error; manual retry required | ElevenLabs/Google error |
| RENDERING_VIDEO | FFmpeg compositing accessible video | TTS done |
| RENDER_FAILED | FFmpeg error; manual retry required | FFmpeg error |
| RENDERING_QC | Rendering complete, awaiting final QC | Render done |
| PENDING_FINAL_REVIEW | PM reviewing final deliverables | QC approved |
| REJECTED | Job permanently rejected | Reviewer action |
| COMPLETED | Client notified, signed URLs delivered | PM final approval |

Terminal states: COMPLETED, REJECTED. Manual-retry states: TTS_FAILED, RENDER_FAILED. Feedback loop: QC_FEEDBACK → (fix) → PENDING_QC.
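
The state groupings above can be sketched as a small Python enum with a transition guard. This is a minimal illustration, not the contents of models/job.py: the state names come from the table, but `ALLOWED_TRANSITIONS` and `can_transition` are hypothetical, and the map is deliberately partial, covering only the early pipeline steps inferred from the "Triggered by" column plus the documented feedback loop.

```python
from enum import Enum


class JobStatus(str, Enum):
    """The 16 job states listed in the table above."""
    CREATED = "CREATED"
    INGESTING = "INGESTING"
    AI_PROCESSING = "AI_PROCESSING"
    PENDING_QC = "PENDING_QC"
    QC_FEEDBACK = "QC_FEEDBACK"
    APPROVED_ENGLISH = "APPROVED_ENGLISH"
    APPROVED_SOURCE = "APPROVED_SOURCE"
    TRANSLATING = "TRANSLATING"
    TTS_GENERATING = "TTS_GENERATING"
    TTS_FAILED = "TTS_FAILED"
    RENDERING_VIDEO = "RENDERING_VIDEO"
    RENDER_FAILED = "RENDER_FAILED"
    RENDERING_QC = "RENDERING_QC"
    PENDING_FINAL_REVIEW = "PENDING_FINAL_REVIEW"
    REJECTED = "REJECTED"
    COMPLETED = "COMPLETED"


TERMINAL_STATES = frozenset({JobStatus.COMPLETED, JobStatus.REJECTED})
MANUAL_RETRY_STATES = frozenset({JobStatus.TTS_FAILED, JobStatus.RENDER_FAILED})

# Assumption: a partial map inferred from the table, not the real transition rules.
ALLOWED_TRANSITIONS = {
    JobStatus.CREATED: {JobStatus.INGESTING},
    JobStatus.INGESTING: {JobStatus.AI_PROCESSING},
    JobStatus.AI_PROCESSING: {JobStatus.PENDING_QC},
    JobStatus.QC_FEEDBACK: {JobStatus.PENDING_QC},  # documented feedback loop
}


def can_transition(current: JobStatus, target: JobStatus) -> bool:
    """Return True if the sketch's map allows current -> target."""
    if current in TERMINAL_STATES:
        return False  # terminal states never transition
    # States missing from the partial map are treated as having no known transitions.
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

A real guard would need an entry for every non-terminal state, including the re-entry path out of the manual-retry states, which this document does not specify.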


Component Map

Backend (backend/app/)

| Layer | Path | Responsibility |
|---|---|---|
| API routes | api/v1/routes_*.py | HTTP + WebSocket endpoints, RBAC enforcement |
| Core | core/security.py | JWT encode/decode, password hashing |
| Core | core/authz.py | RBAC permission checks, MembershipContext |
| Core | core/dependencies.py | FastAPI DI: get_current_user, get_database |
| Core | core/config.py | Pydantic settings from env vars |
| Models | models/job.py | Job document schema + JobStatus enum (16 states) |
| Models | models/user.py | User document with roles |
| Services | services/gemini.py | Gemini 2.5 Pro API wrapper |
| Services | services/gcs.py | GCS V4 signed URLs, upload/download |
| Services | services/language_qc.py | Per-language QC state machine |
| Services | services/glossary_service.py | Hybrid exact + vector glossary retrieval |
| Services | services/audit_logger.py | Audit trail: all state-changing actions |
| Services | services/microsoft_auth.py | Microsoft SSO JWKS validation |
| Services | services/websocket.py | WebSocket connection manager |
| Tasks | tasks/ingest_and_ai.py | Main ingestion Celery task |
| Tasks | tasks/translate_and_synthesize.py | Translation + TTS pipeline |
| Tasks | tasks/ffmpeg_operations.py | Video rendering |
| Middleware | middleware/rate_limiting.py | Redis-backed request throttling |
| Middleware | middleware/validation.py | MongoDB injection protection |

Frontend (frontend/src/)

| Layer | Path | Responsibility |
|---|---|---|
| Routes | routes/auth/ | Login, refresh, Microsoft SSO |
| Routes | routes/jobs/ | Job list, job detail, VTT editor |
| Routes | routes/admin/ | QC dashboard, audit log, user management |
| Routes | routes/org/ | Organisation settings, invite members |
| Hooks | hooks/useJob.tsx | Job state + API calls |
| Hooks | hooks/useJobStatusWebSocket.ts | WS connection with backoff reconnect |
| Contexts | contexts/GlobalWebSocketContext.tsx | WS singleton per session |
| Contexts | contexts/NotificationContext.tsx | Toast notifications |
| Lib | lib/auth.ts | JWT in-memory store, refresh flow |
| Lib | lib/api.ts | Axios instance with auth interceptor |
| Components | components/VttEditor.tsx | Inline VTT editing with preview |
| Components | components/VideoWithCaptions.tsx | Multi-language video player |
| Components | components/Layout/Sidebar.tsx | Role-aware navigation |

Auth Architecture

| Token | Storage | TTL | Purpose |
|---|---|---|---|
| Access token | JS memory (React context) | 15 min | Bearer for all API calls |
| Refresh token | HttpOnly cookie | 7 days | Obtain new access tokens |

Token flow: Login → both tokens issued → access token in memory → on expiry, silent refresh via cookie → new access token in memory. On logout, both tokens revoked.

Critical: get_current_user() in dependencies.py must reject refresh tokens used as Bearer tokens (type check on payload).
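
The type check described above can be sketched on an already-decoded JWT payload. This is a minimal illustration under stated assumptions: the claim name `type` follows the wording above, while the values `"access"` and `"refresh"`, the `TokenTypeError` class, and the `require_access_token` helper are hypothetical and not taken from dependencies.py.

```python
class TokenTypeError(Exception):
    """Raised when a non-access token is presented as a Bearer token."""


def require_access_token(payload: dict) -> dict:
    """Return the payload only if it belongs to an access token.

    Assumption: tokens carry a "type" claim set to "access" or "refresh".
    A missing claim is rejected too, so untyped tokens cannot slip through.
    """
    token_type = payload.get("type")
    if token_type != "access":
        raise TokenTypeError(f"expected an access token, got {token_type!r}")
    return payload
```

Rejecting on anything other than an explicit `"access"` value, rather than only on `"refresh"`, keeps the check fail-closed if new token types are added later.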


RBAC Matrix

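| Resource | CLIENT | REVIEWER | LINGUIST | PM | ADMIN |
|---|---|---|---|---|---|
| Upload video | | | | | |
| View own jobs | | | | | |
| View all jobs | | | | | |
| Edit VTT | | | | | |
| QC approve/reject | | | | | |
| Assign linguist | | | | | |
| Final review | | | | | |
| User management | | | | | |
| Audit log | | | | | |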

Implementation: authz.py provides MembershipContext, require_org_role(role), and require_platform_admin().
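
As a rough sketch of how such role guards might be shaped, the example below uses plain Python with no FastAPI wiring. Only the three names come from this document; the fields on `MembershipContext`, the exact-match role comparison, and the use of `PermissionError` are assumptions made for illustration, and the real signatures in authz.py may differ.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class MembershipContext:
    """Hypothetical shape: who the caller is and their role in one organisation."""
    user_id: str
    org_id: str
    role: str
    is_platform_admin: bool = False


Guard = Callable[[MembershipContext], MembershipContext]


def require_org_role(role: str) -> Guard:
    """Build a guard that passes only callers holding exactly `role`."""
    def check(ctx: MembershipContext) -> MembershipContext:
        if ctx.role != role:
            raise PermissionError(f"role {role} required, caller has {ctx.role}")
        return ctx
    return check


def require_platform_admin() -> Guard:
    """Build a guard that passes only platform administrators."""
    def check(ctx: MembershipContext) -> MembershipContext:
        if not ctx.is_platform_admin:
            raise PermissionError("platform admin required")
        return ctx
    return check
```

Returning a guard from a factory mirrors the call shape `require_org_role(role)` shown above and would let the same check be reused across routes.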


Security Model

| Control | Implementation | File |
|---|---|---|
| Rate limiting | Redis-backed, 5 req/5 min on login | middleware/rate_limiting.py |
| Input validation | MongoDB operator blocklist + Pydantic | middleware/validation.py |
| File access | GCS V4 signed URLs, 24h expiry | services/gcs.py |
| Audit trail | Every state-changing action logged | services/audit_logger.py |
| Secrets | GCP Secret Manager in production | core/secrets_config.py |
| Error messages | Generic HTTP errors, no internal detail | routes_auth.py |
| Token type check | Reject refresh tokens as Bearer | core/dependencies.py |
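
The input-validation row can be illustrated with a recursive scan for MongoDB operators in a request payload. This is a simplified sketch, not the logic in middleware/validation.py: it assumes that rejecting every dict key starting with `$` is acceptable, whereas the actual blocklist may name specific operators. The function name `contains_mongo_operator` is hypothetical.

```python
from typing import Any


def contains_mongo_operator(value: Any) -> bool:
    """Return True if any dict key anywhere in the payload starts with '$'."""
    if isinstance(value, dict):
        for key, inner in value.items():
            # Operators such as $ne, $gt, or $where always appear as keys.
            if isinstance(key, str) and key.startswith("$"):
                return True
            if contains_mongo_operator(inner):
                return True
        return False
    if isinstance(value, (list, tuple)):
        return any(contains_mongo_operator(item) for item in value)
    return False  # scalars cannot carry operators
```

For example, a login body of `{"email": {"$ne": None}}` would be flagged, while `{"email": "a@b.c"}` would pass through to Pydantic validation.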

Known gaps (from security audit 2026-04-29): Login endpoint currently bypasses rate limiting (debugging artifact — must be fixed before launch). Microsoft SSO uses synchronous requests.get() in async context.
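
One common way to address the second gap is to move the blocking call off the event loop. The sketch below uses `asyncio.to_thread` (Python 3.9+); `fetch_jwks_blocking` is a hypothetical stand-in that sleeps instead of calling `requests.get()`, so the example runs without network access. Switching to an async HTTP client would be another option.

```python
import asyncio
import time


def fetch_jwks_blocking() -> dict:
    """Stand-in for the synchronous HTTP call; sleeps to simulate network I/O."""
    time.sleep(0.05)
    return {"keys": []}


async def fetch_jwks() -> dict:
    """Run the blocking fetch in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(fetch_jwks_blocking)


async def main() -> dict:
    # Other coroutines can make progress while the fetch runs in its thread.
    jwks, _ = await asyncio.gather(fetch_jwks(), asyncio.sleep(0.01))
    return jwks
```

Calling `asyncio.run(main())` returns the stand-in key set while the second coroutine runs concurrently.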


Glossary Retrieval (Hybrid)

Two-pass retrieval for translation prompt injection:

| Pass | Method | Threshold | Limit |
|---|---|---|---|
| 1 (Exact) | String match on source term | | All matches |
| 2 (Vector) | Atlas Vector Search on embedding | ≥ 0.75 similarity | Top 20 |

Merged result: exact matches first, then vector matches, deduplicated, truncated to 50 terms. Injected as a block in the Gemini translation prompt.
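
The merge step can be sketched in a few lines of Python. The thresholds and limits are the ones stated above; the function name `merge_glossary`, its parameters, and the representation of vector hits as `(term, similarity)` pairs are assumptions for illustration rather than the interface of glossary_service.py.

```python
def merge_glossary(
    exact: list[str],
    vector: list[tuple[str, float]],
    min_similarity: float = 0.75,
    vector_limit: int = 20,
    max_terms: int = 50,
) -> list[str]:
    """Merge exact and vector hits: exact first, deduplicated, capped at max_terms."""
    # Pass 2 filter: keep hits at or above the threshold, best first, top N only.
    kept = sorted(
        (hit for hit in vector if hit[1] >= min_similarity),
        key=lambda hit: hit[1],
        reverse=True,
    )[:vector_limit]

    merged: list[str] = []
    seen: set[str] = set()
    for term in list(exact) + [term for term, _ in kept]:
        if term not in seen:  # drop vector hits already found by exact match
            seen.add(term)
            merged.append(term)
    return merged[:max_terms]
```

Because exact matches are appended first, truncation at 50 terms drops the lowest-ranked vector matches before it ever drops an exact match.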

Index: glossary_embedding_index in MongoDB Atlas.


WebSocket Architecture

  • Server: services/websocket.py (ConnectionManager class), org-scoped broadcasts
  • Client: hooks/useJobStatusWebSocket.ts — exponential backoff reconnect
  • Auth: WS upgrade requires valid access token
  • Events: broadcast_to_org(org_id, event) — no cross-tenant leakage
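
Org-scoped broadcasting can be sketched with an in-memory registry keyed by organisation. Only the names `ConnectionManager` and `broadcast_to_org(org_id, event)` come from this document; the `connect`/`disconnect` methods, the returned count, and the `FakeSocket` test double (standing in for a WebSocket object with an async `send_json` method) are assumptions.

```python
import asyncio
from collections import defaultdict


class FakeSocket:
    """Test double standing in for a real WebSocket connection."""

    def __init__(self) -> None:
        self.received: list[dict] = []

    async def send_json(self, event: dict) -> None:
        self.received.append(event)


class ConnectionManager:
    """Tracks sockets per organisation so events never cross tenants."""

    def __init__(self) -> None:
        self._by_org: dict[str, set] = defaultdict(set)

    def connect(self, org_id: str, socket) -> None:
        self._by_org[org_id].add(socket)

    def disconnect(self, org_id: str, socket) -> None:
        self._by_org[org_id].discard(socket)

    async def broadcast_to_org(self, org_id: str, event: dict) -> int:
        """Send the event to every socket in one organisation; return the count."""
        # Copy to a list so a disconnect during sending cannot mutate the set mid-loop.
        sockets = list(self._by_org.get(org_id, ()))
        for socket in sockets:
            await socket.send_json(event)
        return len(sockets)
```

Keying the registry by `org_id` means a broadcast only ever iterates over one tenant's sockets, which is what rules out cross-tenant delivery in this sketch.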

Maintenance

Update triggers: New job state added, auth flow change, new service integrated, RBAC change.

Verification: State machine table matches JobStatus enum in models/job.py. RBAC matrix matches authz.py role checks.