video-accessibility/AGENTS.md
Vadym Samoilenko a3b300b76a docs: add canonical documentation + audit cleanup
- AGENTS.md: canonical project entry point (Quick Nav, pipeline, constraints)
- docs/: complete docs tree — architecture, API spec, DB schema, infra,
  runbook, requirements, tech stack, principles, reference ADRs, guides,
  tasks backlog, testing strategy
- tests/README.md: test commands, structure, known gaps
- README.md / CLAUDE.md / DEPLOYMENT.md: updated with canonical doc links
- .archive/: backup of pre-documentation-pipeline originals
- backend/uv.lock: uv dependency lockfile
- Delete committed __pycache__ .pyc files (should have been gitignored)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 14:22:51 +01:00


# Accessible Video Processing Platform — Project Entry Point
<!-- SCOPE: root | owner: ln-111 | generated: 2026-04-29 -->
## What Is This Project
An AI-powered SaaS platform that generates legally required accessibility assets from video files: closed captions, audio descriptions, SDH captions, and descriptive transcripts. Outputs pass through a human QC workflow before client delivery. Translation into 50+ languages and cultural transcreation are built in.
**Client:** Oliver Internal
**Server:** optical-web-1
**Status:** 85% production-ready
---
## Quick Navigation
| Need | Go to |
|------|-------|
| Architecture, data flow, state machine | [docs/project/architecture.md](docs/project/architecture.md) |
| Tech stack versions and config | [docs/project/tech_stack.md](docs/project/tech_stack.md) |
| API endpoint reference | [docs/project/api_spec.md](docs/project/api_spec.md) |
| Database collections and indexes | [docs/project/database_schema.md](docs/project/database_schema.md) |
| Infrastructure inventory | [docs/project/infrastructure.md](docs/project/infrastructure.md) |
| Runbook — deploy, restart, rollback | [docs/project/runbook.md](docs/project/runbook.md) |
| Functional requirements | [docs/project/requirements.md](docs/project/requirements.md) |
| Development principles | [docs/principles.md](docs/principles.md) |
| Reference — ADRs, guides, research | [docs/reference/README.md](docs/reference/README.md) |
| Task management | [docs/tasks/README.md](docs/tasks/README.md) |
| Test strategy and commands | [tests/README.md](tests/README.md) |
| Documentation hub | [docs/README.md](docs/README.md) |
---
## Entry Points by Audience
| Audience | Start here |
|----------|-----------|
| New developer | [docs/project/runbook.md](docs/project/runbook.md) → local setup section |
| Reviewer / QC | [docs/project/requirements.md](docs/project/requirements.md) → QC workflow section |
| DevOps | [docs/project/infrastructure.md](docs/project/infrastructure.md) + [docs/project/runbook.md](docs/project/runbook.md) |
| Security reviewer | [docs/project/architecture.md](docs/project/architecture.md) → security section |
| AI agent | Read this file → pick a topic in Quick Navigation → read the linked doc → synthesize |
---
## Core Pipeline (one-line summary per stage)
| Stage | What happens | Key file |
|-------|-------------|---------|
| Upload | MP4 → GCS + MongoDB job record | `routes_files.py` |
| Ingestion | Celery worker transcribes with Gemini 2.5 Pro | `tasks/ingest_and_ai.py` |
| AI Processing | VTT generated, validated, stored in GCS | `services/gemini.py` |
| QC Review | Reviewer edits VTT, approves or rejects | `services/language_qc.py` |
| Translation | Google Translate + transcreation per language | `tasks/translate_and_synthesize.py` |
| TTS | Per-cue audio synthesis (Google TTS / ElevenLabs) | `services/tts.py` |
| Final Review | PM approves deliverables | `routes_language_qc.py` |
| Delivery | Signed GCS URLs emailed to client | `services/emailer.py` |
See full state machine (16 states) in [docs/project/architecture.md](docs/project/architecture.md#job-state-machine).
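
As a rough illustration of the stage order in the table above, the sketch below models the eight stages as an ordered enum with a helper that returns the following stage. The names `Stage` and `next_stage` are hypothetical and do not come from the codebase. The sketch covers only the linear happy path and ignores the reject branch of QC Review; the actual 16-state machine is defined in the architecture doc.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """The eight pipeline stages from the table, in table order (illustrative only)."""

    UPLOAD = "Upload"
    INGESTION = "Ingestion"
    AI_PROCESSING = "AI Processing"
    QC_REVIEW = "QC Review"
    TRANSLATION = "Translation"
    TTS = "TTS"
    FINAL_REVIEW = "Final Review"
    DELIVERY = "Delivery"


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None once Delivery is reached."""
    ordered = list(Stage)  # Enum iteration preserves definition order
    index = ordered.index(current)
    return ordered[index + 1] if index + 1 < len(ordered) else None
```

For example, `next_stage(Stage.QC_REVIEW)` returns `Stage.TRANSLATION`, and `next_stage(Stage.DELIVERY)` returns `None`.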
---
## Development Commands
| Action | Command |
|--------|---------|
| Start local (Docker + Vite) | `./scripts/run-local.sh` |
| Rebuild after code change | `./scripts/run-local.sh --rebuild` |
| Stop all local services | `./scripts/run-local.sh --stop` |
| Backend lint | `cd backend && ruff check .` |
| Backend type-check | `cd backend && mypy .` (run in Docker container) |
| Frontend lint | `cd frontend && npm run lint` |
| Frontend type-check | `cd frontend && npm run type-check` |
| Backend tests | `cd backend && poetry run pytest` |
| Frontend tests | `cd frontend && npm run test` |
| E2E tests | `cd frontend && npm run test:e2e` |
---
## Key Constraints
- **NO SSH to optical-web-1** without explicit user instruction — hard rule in CLAUDE.md
- **Access tokens in memory only** (not localStorage) — auth architecture constraint
- **Refresh tokens in HttpOnly cookies** — security requirement
- **Signed GCS URLs** expire in 24h — do not cache or store URLs
- **RBAC enforced server-side** — never trust client-supplied role claims
- **All reviewer actions emit audit log entries** — compliance requirement
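
The RBAC and audit-log constraints above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `AuditLog`, `SERVER_ROLES`, and `approve_vtt` are hypothetical names, and the in-memory dict stands in for whatever server-side role store the backend actually uses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditLog:
    """Hypothetical in-memory audit sink; a real one would persist entries."""

    entries: list = field(default_factory=list)

    def record(self, user_id: str, action: str, job_id: str) -> None:
        self.entries.append({
            "user_id": user_id,
            "action": action,
            "job_id": job_id,
            "at": datetime.now(timezone.utc).isoformat(),
        })


# Hypothetical server-side role store, keyed by user id.
SERVER_ROLES = {"u-1": "reviewer", "u-2": "client"}


def approve_vtt(user_id: str, job_id: str, claimed_role: str, log: AuditLog) -> bool:
    """Approve a VTT only if the server-side role allows it.

    `claimed_role` is whatever the client sent; it is deliberately ignored.
    """
    role = SERVER_ROLES.get(user_id)  # trust the server's record only
    if role != "reviewer":
        return False
    log.record(user_id, "approve_vtt", job_id)  # every reviewer action is logged
    return True
```

A request from `u-2` that claims the `reviewer` role is rejected and logs nothing, while a request from `u-1` succeeds and appends one audit entry regardless of the role it claims.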
---
## Maintenance
**Update triggers:** A new route is added, the deployment target changes, a key dependency version changes, or a new team member is onboarded.
**Verification:** All links in Quick Navigation resolve, and the commands in Development Commands match the current contents of `scripts/`.
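
One way to automate the link half of the verification step is a small script that collects relative Markdown links and reports any that do not exist on disk. The sketch below assumes it is run against a file at the repository root. The helper name `broken_links` is hypothetical, and the script checks only that each target file exists, not that heading anchors such as `#job-state-machine` are present.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](docs/README.md#anchor).
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_links(markdown_file: Path) -> list[str]:
    """Return relative link targets in `markdown_file` that do not exist on disk."""
    root = markdown_file.parent
    missing = []
    for target in LINK_PATTERN.findall(markdown_file.read_text(encoding="utf-8")):
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue  # only local files are checked
        path_part = target.split("#", 1)[0]  # drop any heading anchor
        if not (root / path_part).exists():
            missing.append(target)
    return missing
```

Calling `broken_links(Path("AGENTS.md"))` from the repository root returns an empty list when every local link resolves.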
<!-- END SCOPE: root -->