# Accessible Video Processing Platform

A comprehensive AI-powered platform for generating accessible video content with closed captions, audio descriptions, and multi-language translations. It covers the complete workflow from video upload to final delivery, including quality control.

**Documentation:** See [AGENTS.md](AGENTS.md) for full navigation, or [docs/README.md](docs/README.md) for the documentation hub.

## ✅ Current Status: **Production-Ready** (85% Complete)

**Lines of Code:** 20,471 total (12,198 backend + 8,273 frontend)

## 🚀 Key Features Implemented

### Core Functionality ✅
- **AI-Powered Processing**: Complete Gemini 2.5 Pro integration for intelligent caption and audio description generation
- **Multi-Language Pipeline**: Google Translate + cultural transcreation with 50+ language support
- **Quality Control Workflow**: Full reviewer approval/rejection system with VTT editing capabilities
- **Audio Description TTS**: Google Cloud TTS and ElevenLabs integration with audio synthesis
- **Real-time Updates**: WebSocket-powered job status tracking and notifications
- **Advanced Video Player**: Multi-language caption support with timeline navigation
- **Role-Based Access Control**: Complete CLIENT/REVIEWER/ADMIN role system

### Security & Infrastructure ✅
- **JWT Authentication**: Secure access/refresh token system with HttpOnly cookies
- **Audit Logging**: Comprehensive audit trail for all reviewer actions
- **Signed URLs**: Secure Google Cloud Storage file access (24h expiry)
- **Input Validation**: Complete request validation and sanitization
- **HTTPS/CORS**: Production-ready security configuration

### User Experience ✅
- **Responsive Design**: Mobile-first Tailwind CSS implementation
- **Real-time Feedback**: Live job progress tracking and notifications
- **Advanced File Management**: Drag-and-drop uploads with progress indicators
- **VTT Editor**: Inline caption editing with live preview
- **Download Portal**: Secure asset delivery with organized file structure

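The captions handled by the VTT Editor are WebVTT files, which pair `HH:MM:SS.mmm` time ranges with caption text. The sketch below shows how a list of cues could be serialized into that format. The `Cue` class and both function names are hypothetical and are not taken from this codebase.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption cue (hypothetical structure, for illustration only)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as a WebVTT timestamp in HH:MM:SS.mmm form."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def render_vtt(cues: list[Cue]) -> str:
    """Serialize cues into a WebVTT document, rejecting inverted timings."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        if cue.end <= cue.start:
            raise ValueError(f"cue must end after it starts: {cue}")
        lines.append(f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}")
        lines.append(cue.text)
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)
```

Rejecting cues whose end time is not after their start time is one simple validation an editor could run before saving; a real validator would likely also check for overlapping or out-of-order cues.
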
## 🛠 Tech Stack

### Backend (FastAPI + Python 3.11)
- **FastAPI 0.115.0** - Modern async web framework with OpenAPI documentation
- **Celery 5.3.4** - Distributed task queue with Redis broker
- **MongoDB 7.0** - Document database with replica set support
- **Redis 7.2** - Caching and message queuing
- **Google Cloud Platform** - Storage, AI services, Secret Manager, TTS
- **Pydantic 2.5** - Data validation and serialization
- **OpenTelemetry** - Observability and monitoring
- **Sentry** - Error tracking and performance monitoring

### Frontend (React 19 + TypeScript)
- **React 19.1.1** - Modern UI framework with latest features
- **Vite 7.1.2** - Lightning-fast build tool and dev server
- **TypeScript 5.8** - Full type safety throughout application
- **TanStack Query 5.85** - Advanced server state management with caching
- **React Router 7.8** - Client-side routing with protected routes
- **Tailwind CSS 4.1** - Utility-first CSS framework
- **Zustand 5.0** - Lightweight client state management
- **React Hook Form + Zod** - Form handling with schema validation

## 🏗 Architecture Overview

### Complete Job Processing Pipeline ✅
```
Upload → Ingestion → AI Processing → QC Review → Translation → TTS → Final Review → Delivery
  ↓          ↓             ↓             ↓            ↓         ↓         ↓            ↓
 GCS    Gemini 2.5  VTT Generation     Human       Google   Text-to-   Reviewer     Email +
Storage     Pro      + Validation     Review      Translate  Speech    Approval    Downloads
```

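The pipeline above can be read as a linear state machine with two review gates. The following is a minimal sketch of that idea, not the project's actual job model: the `Stage` names mirror the diagram, while `next_stage` and the rule that a rejection sends the job back one stage are assumptions made for illustration.

```python
from enum import Enum


class Stage(Enum):
    """Pipeline stages, named after the diagram (hypothetical enum)."""
    UPLOAD = "upload"
    INGESTION = "ingestion"
    AI_PROCESSING = "ai_processing"
    QC_REVIEW = "qc_review"
    TRANSLATION = "translation"
    TTS = "tts"
    FINAL_REVIEW = "final_review"
    DELIVERY = "delivery"


# Enum members iterate in definition order, which is the pipeline order.
ORDER = list(Stage)

# Stages where a human decision is required.
REVIEW_STAGES = {Stage.QC_REVIEW, Stage.FINAL_REVIEW}


def next_stage(current: Stage, approved: bool = True) -> Stage:
    """Return the stage that follows `current`.

    Assumption: a rejection at a review stage returns the job to the
    preceding stage for rework. The real rejection policy may differ.
    """
    idx = ORDER.index(current)
    if current in REVIEW_STAGES and not approved:
        return ORDER[idx - 1]
    if current is Stage.DELIVERY:
        raise ValueError("delivery is the final stage")
    return ORDER[idx + 1]
```

Keeping the transitions in one function makes illegal jumps, such as going from upload straight to delivery, impossible to express by accident.
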
### System Architecture
- **Monorepo Structure**: `/backend`, `/frontend`, `/infra` with clear separation
- **Microservices Ready**: Modular FastAPI services with proper dependency injection
- **Event-Driven**: WebSocket real-time updates with connection management
- **Scalable Workers**: Celery task queue with auto-retry and error recovery
- **Secure by Design**: RBAC, signed URLs, audit logging, input validation

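Auto-retry for background work usually means re-running a failed task after an increasing delay. The sketch below illustrates that pattern in plain Python; it is not Celery code and is not taken from this repository. Celery offers comparable behaviour through task options such as `autoretry_for` and `retry_backoff`, but whether the workers here use them is not stated. The function name and defaults are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(
    task: Callable[[], T],
    max_retries: int = 3,
    base: float = 1.0,
    cap: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `task`, retrying on any exception with exponential backoff.

    The delay before retry n (counting from 0) is min(base * 2**n, cap)
    seconds. `sleep` is injectable so the schedule can be tested without
    actually waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(min(base * 2 ** attempt, cap))
    raise AssertionError("unreachable")
```

Capping the delay keeps a long outage from pushing retries hours into the future, and re-raising the final error lets the caller mark the job as failed.
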
## 🚀 Getting Started

### Prerequisites
- **Python 3.11+** (backend development)
- **Node.js 20.19+** (frontend development; Vite 7 does not support Node.js 18)
- **Docker & Docker Compose** (required for local development)
- **Google Cloud Project** with APIs enabled (for video processing)

### 🐳 Local Development with Docker (Recommended)

This is the recommended approach for local development. Backend services run in Docker containers, while the frontend runs via the Vite dev server for fast hot-reload.

#### Initial Setup
```bash
# 1. Clone the repository
git clone <repository>
cd video_accessibility

# 2. Copy and configure environment files
cp .env.prod.example .env.local
# Edit .env.local with your API keys and settings

# 3. Set up frontend environment
cp frontend/.env.example frontend/.env.local
# The defaults should work for local development

# 4. Ensure GCP credentials are in place
# Copy your GCP service account JSON to: ./secrets/gcp-credentials.json
```

#### Starting the Development Environment

**Step 1: Start Backend Services (Docker)**
```bash
# Start API, Worker, MongoDB, and Redis in Docker
./scripts/run-local.sh

# Services will be available at:
# - API: http://localhost:8003
# - API Docs: http://localhost:8003/docs
# - MongoDB: mongodb://localhost:27017
# - Redis: redis://localhost:6379
```

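After the backend services start, it can be useful to confirm that each port is accepting connections before launching the frontend. The following is a small standard-library sketch for that check. The ports come from the service list in Step 1; the function and constant names are hypothetical and this script is not part of the repository.

```python
import socket

# Ports taken from the Step 1 service list.
LOCAL_SERVICES = {"api": 8003, "mongodb": 27017, "redis": 6379}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in LOCAL_SERVICES.items()}
```

Calling `check_services()` returns a dictionary keyed by service name. An open port only shows that something is listening, so `docker compose logs -f` remains the place to look when a service is up but misbehaving.
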
**Step 2: Start Frontend (Vite Dev Server)**
```bash
# In a separate terminal
cd frontend
npm install  # First time only
npm run dev

# Frontend will be available at:
# - Application: http://localhost:6001/video-accessibility
```

#### Useful Commands
```bash
# View logs
docker compose logs -f api     # API logs
docker compose logs -f worker  # Worker logs
docker compose logs -f         # All logs

# Restart a service
docker compose restart api
docker compose restart worker

# Rebuild and restart (after code changes)
./scripts/run-local.sh --rebuild

# Stop all services
./scripts/run-local.sh --stop
# or
docker compose down
```

#### Test User Credentials (Local Development Only)

For testing different user roles locally:

```
Admin:      admin@example.com / admin
Production: production@example.com / production
Reviewer:   reviewer@example.com / reviewer
Client:     client@example.com / client123
```

**Note**: These test users are only for local development. Production uses Microsoft authentication.

### Alternative: Native Development (Without Docker)

For development without Docker, you'll need to run each service manually:

```bash
# Terminal 1: MongoDB
mongod --dbpath ./data/db

# Terminal 2: Redis
redis-server

# Terminal 3: Backend API
cd backend
poetry install
poetry run uvicorn app.main:app --reload --port 8000

# Terminal 4: Celery Worker
cd backend
poetry run celery -A app.tasks worker --loglevel=info

# Terminal 5: Frontend
cd frontend
npm install
npm run dev
```

**Note**: The Docker approach is strongly recommended as it ensures consistency and simplifies setup.

### Testing & Quality
```bash
# Backend tests + linting
cd backend
poetry run pytest
poetry run ruff check .
poetry run mypy .

# Frontend tests + linting
cd frontend
npm run test
npm run test:e2e
npm run lint
npm run type-check
```

## 📁 Project Structure

```
video_accessibility/                    # Root monorepo
├── backend/                            # FastAPI Python backend (12,198 LOC)
│   ├── app/
│   │   ├── api/v1/                     # REST API endpoints
│   │   │   ├── auth.py                 # JWT authentication
│   │   │   ├── jobs.py                 # Job CRUD & workflow
│   │   │   ├── admin.py                # Admin operations
│   │   │   └── files.py                # File management
│   │   ├── core/                       # Core configuration
│   │   ├── models/                     # Database models
│   │   ├── schemas/                    # Pydantic request/response schemas
│   │   ├── services/                   # External service integrations
│   │   │   ├── gemini.py               # AI processing
│   │   │   ├── gcs.py                  # Google Cloud Storage
│   │   │   ├── translation.py          # Multi-language support
│   │   │   └── tts.py                  # Text-to-speech
│   │   ├── tasks/                      # Celery background workers
│   │   ├── middleware/                 # Request processing
│   │   └── telemetry/                  # Observability
│   ├── tests/                          # Comprehensive test suite
│   └── Dockerfile                      # Container configuration
├── frontend/                           # React TypeScript SPA (8,273 LOC)
│   ├── src/
│   │   ├── routes/                     # Page components
│   │   │   ├── auth/                   # Login system
│   │   │   ├── jobs/                   # Job management
│   │   │   ├── qc/                     # Quality control
│   │   │   └── admin/                  # Admin interface
│   │   ├── components/                 # Reusable UI components
│   │   │   ├── VideoWithCaptions.tsx   # Advanced video player
│   │   │   ├── VttEditor.tsx           # Caption editing
│   │   │   └── UploadDropzone.tsx      # File upload
│   │   ├── lib/                        # Utilities and API client
│   │   ├── hooks/                      # Custom React hooks
│   │   └── types/                      # TypeScript definitions
│   ├── tests/                          # Unit + E2E tests
│   ├── .env.local                      # Local development config
│   └── Dockerfile                      # Container configuration
├── scripts/
│   ├── run-local.sh                    # Local development startup
│   ├── deploy.sh                       # Production deployment
│   ├── full-deploy.sh                  # Full production rebuild
│   └── build-frontend.sh               # Frontend build script
├── docker-compose.yml                  # Base Docker configuration
├── docker-compose.local.yml            # Local development overrides
├── docker-compose.prod.yml             # Production overrides
├── .env.local                          # Local environment variables
├── .env.production                     # Production environment variables
├── CLAUDE.md                           # Development guidelines
└── video_accessibility_development_plan.txt  # Complete specification
```

## ⚙️ Configuration

### Environment Variables
**Backend** (`backend/.env`):
```bash
# Database
MONGODB_URL=mongodb://admin:password@localhost:27017/accessible_video
REDIS_URL=redis://localhost:6379/0

# Authentication
JWT_SECRET_KEY=your-jwt-secret
JWT_REFRESH_SECRET_KEY=your-refresh-secret

# AI Services
GEMINI_API_KEY=your-gemini-key
ELEVENLABS_API_KEY=your-elevenlabs-key

# Google Cloud
GCS_BUCKET_NAME=your-bucket-name
GOOGLE_CLOUD_PROJECT=your-project-id

# Email
SENDGRID_API_KEY=your-sendgrid-key

# Monitoring
SENTRY_DSN=your-sentry-dsn
```

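Failing fast on missing configuration avoids confusing errors later in the pipeline. The sketch below shows one way to load and validate a few of the backend variables using only the standard library. The `Settings` class, the `load_settings` function, and the choice of which variables are required are assumptions for illustration; the backend lists Pydantic in its stack and may handle settings differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these four are mandatory; the real backend may require more.
REQUIRED = ("MONGODB_URL", "REDIS_URL", "JWT_SECRET_KEY", "JWT_REFRESH_SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    """Validated subset of the backend configuration (hypothetical)."""
    mongodb_url: str
    redis_url: str
    jwt_secret_key: str
    jwt_refresh_secret_key: str
    sentry_dsn: Optional[str] = None  # treated as optional in this sketch


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from `env`, raising if a required variable is unset or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required environment variables: {', '.join(missing)}")
    return Settings(
        mongodb_url=env["MONGODB_URL"],
        redis_url=env["REDIS_URL"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
        jwt_refresh_secret_key=env["JWT_REFRESH_SECRET_KEY"],
        sentry_dsn=env.get("SENTRY_DSN") or None,
    )
```

Passing the environment as a parameter keeps the loader easy to test with a plain dictionary, and reporting every missing name at once saves repeated restart cycles.
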
**Frontend** (`frontend/.env`):
```bash
# Port 8000 matches the native setup; the Docker setup serves the API on port 8003
VITE_API_URL=http://localhost:8000
VITE_SENTRY_DSN=your-sentry-dsn
VITE_ENVIRONMENT=development
```

### Google Cloud Setup
1. **Create GCP Project** with billing enabled
2. **Enable APIs**:
   - Cloud Storage API
   - Cloud Translation API
   - Cloud Text-to-Speech API
   - Vertex AI API (for Gemini)
   - Secret Manager API
3. **Create Service Account** with roles:
   - Storage Admin
   - AI Platform Admin
   - Secret Manager Admin
4. **Download JSON key** and set `GOOGLE_APPLICATION_CREDENTIALS` to the path of that file

## 🚢 Deployment Options

### Production Architecture (Google Cloud)
- **Frontend**: Cloud Storage + Cloud CDN (static hosting)
- **Backend API**: Cloud Run (serverless, auto-scaling)
- **Workers**: Cloud Run (Celery with Redis)
- **Database**: MongoDB Atlas (managed)
- **Queue**: Cloud Memorystore (Redis)
- **Storage**: Google Cloud Storage
- **Monitoring**: Cloud Monitoring + Sentry

### Docker Production
```bash
# Start the production containers in the background (images are built if missing)
docker compose -f docker-compose.prod.yml up -d
```

## 🔒 Security Features

### Implemented Security ✅
- **JWT Authentication**: Access (15min) + refresh (7 days) token rotation
- **RBAC System**: CLIENT/REVIEWER/ADMIN roles with endpoint protection
- **Secure Storage**: HttpOnly cookies for refresh tokens
- **File Security**: Signed URLs with 24h expiry, no client access to raw files
- **Input Validation**: Comprehensive Pydantic validation on all endpoints
- **Audit Logging**: Complete trail of all reviewer actions and system events
- **CORS Protection**: Configured for production domains
- **Rate Limiting**: Request throttling and validation middleware

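To make the two token lifetimes concrete, the sketch below issues and verifies HS256-signed tokens with a 15-minute access lifetime and a 7-day refresh lifetime, using only the standard library. It is an illustration of the mechanism, not the project's authentication code: the function names and the `type` claim are hypothetical, and production code should rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Lifetimes from the security list: 15 minutes and 7 days, in seconds.
TTL_SECONDS = {"access": 15 * 60, "refresh": 7 * 24 * 60 * 60}


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def encode_token(subject: str, kind: str, secret: str, now: Optional[float] = None) -> str:
    """Issue a token of `kind` ('access' or 'refresh') for `subject`."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "type": kind, "iat": issued, "exp": issued + TTL_SECONDS[kind]}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def decode_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Verify the signature and expiry, then return the payload."""
    signing_input, _, signature = token.rpartition(".")
    expected = _sign(signing_input, secret)
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if (time.time() if now is None else now) >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

A short access lifetime limits the damage from a leaked token, while the longer-lived refresh token, kept in an HttpOnly cookie, lets the client obtain new access tokens without logging in again.
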
## 🔧 API Documentation

### Key Endpoints Implemented
```
POST  /api/v1/auth/login           # Authentication
POST  /api/v1/jobs                 # Create job with file upload
GET   /api/v1/jobs                 # List jobs (filtered by role)
GET   /api/v1/jobs/{id}            # Job details with real-time status
POST  /api/v1/jobs/{id}/actions/*  # Workflow actions (approve/reject/complete)
GET   /api/v1/jobs/{id}/vtt        # VTT content retrieval
PATCH /api/v1/jobs/{id}/vtt        # VTT editing and updates
GET   /api/v1/jobs/{id}/downloads  # Signed download URLs
WS    /api/v1/ws/jobs/{id}         # Real-time job status updates
```

**OpenAPI Documentation**: http://localhost:8000/docs (native setup) or http://localhost:8003/docs (Docker setup)

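Role-based filtering, as in the job listing endpoint, comes down to deciding which records a given user may see. The sketch below shows one plausible policy. The role names come from this README, but the `User` and `Job` classes, both functions, and the rule that clients see only their own jobs are assumptions; the actual policy is defined by the backend.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    CLIENT = "client"
    REVIEWER = "reviewer"
    ADMIN = "admin"


@dataclass
class User:
    id: str
    role: Role


@dataclass
class Job:
    id: str
    owner_id: str


def visible_jobs(user: User, jobs: list[Job]) -> list[Job]:
    """Assumed policy: clients see their own jobs; reviewers and admins see all."""
    if user.role is Role.CLIENT:
        return [job for job in jobs if job.owner_id == user.id]
    return list(jobs)


def require_role(user: User, *allowed: Role) -> None:
    """Raise PermissionError unless the user holds one of the allowed roles."""
    if user.role not in allowed:
        names = ", ".join(role.name for role in allowed)
        raise PermissionError(f"requires one of: {names}")
```

A workflow action such as approve or reject could call `require_role(user, Role.REVIEWER, Role.ADMIN)` before touching the job, which keeps the permission rule next to the code it protects.
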
## 🎯 Development Status

### ✅ Completed (Production Ready)
- **User Management**: Full authentication, RBAC, password management
- **Job Pipeline**: Complete video processing workflow with state machine
- **Quality Control**: VTT editor, approval workflows, reviewer dashboards
- **Real-time Features**: WebSocket updates, live notifications
- **Multi-language**: Translation pipeline with cultural transcreation
- **File Management**: Secure uploads, downloads, asset validation
- **Admin Features**: User management, system monitoring, audit logs

### ⚠️ Needs Attention (Minor)
- **Integration Tests**: Framework exists but needs completion
- **Email Templates**: Service implemented, templates may need customization
- **Performance Testing**: No load testing implemented yet
- **Documentation**: API docs complete, user guides could be enhanced

### 🎯 Recommended Next Steps
1. **Complete integration test suite** for end-to-end validation
2. **Performance testing** with realistic video processing loads
3. **Production deployment** configuration and CI/CD pipeline
4. **User documentation** and training materials
5. **Monitoring dashboards** for production operations

## 📚 Development Resources

- **Complete Specification**: `video_accessibility_development_plan.txt`
- **Development Guidelines**: `CLAUDE.md`
- **API Documentation**: http://localhost:8000/docs (native setup) or http://localhost:8003/docs (Docker setup), when running
- **Test Coverage Reports**: `backend/htmlcov/` (after running tests)