Accessible Video Processing Platform
An AI-powered platform for generating accessible video content including closed captions, audio descriptions, and multi-language translations.
Features
- AI-Powered Processing: Uses Gemini 2.5 Pro for intelligent caption and audio description generation
- Multi-Language Support: Automatic translation and cultural transcreation
- Quality Control Workflow: Built-in review and approval process
- Audio Description: Text-to-speech generation for voiceovers
- Secure File Handling: Google Cloud Storage with signed URLs
- Role-Based Access: Client, reviewer, and admin roles with appropriate permissions
Tech Stack
Backend
- FastAPI - Modern Python web framework
- Celery - Distributed task queue for video processing
- MongoDB - Document database for job and user data
- Redis - Task queue broker and caching
- Google Cloud Services - Storage, AI, and TTS
Frontend
- React 18 - UI framework
- Vite - Fast build tool and dev server
- TypeScript - Type safety
- TanStack Query - Data fetching and caching
- Tailwind CSS - Utility-first styling
Getting Started
Prerequisites
- Python 3.11+
- Node.js 18+
- Poetry (for Python dependency management)
- MongoDB (Atlas recommended)
- Redis
- Google Cloud Project with required APIs enabled
Installation
- Clone and set up the environment:
  git clone <repository>
  cd accessible-video
  make setup-env
- Install dependencies:
  make install
- Configure environment variables:
  - Update backend/.env with your database, API keys, and service credentials
  - Update frontend/.env with your API base URL
Development
Start all services (requires tmux):
make dev
Or start services individually:
# Terminal 1 - Backend API
make dev-backend
# Terminal 2 - Frontend SPA
make dev-frontend
# Terminal 3 - Celery Worker
make dev-worker
The application will be available at:
- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
Testing
# Run all tests
make test-backend
make test-frontend
# Lint code
make lint
Architecture
Job Processing Pipeline
- Upload: Client uploads MP4 video
- Ingestion: Video is processed and analyzed by Gemini 2.5 Pro
- QC Review: Human reviewer approves/rejects English captions and audio descriptions
- Translation: Approved content is translated to target languages
- TTS Generation: Audio descriptions are converted to speech
- Final Review: Reviewer approves final multi-language assets
- Delivery: Client receives email with download links
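The stage order above can be sketched as a small state machine. This is only an illustration, not the project's actual implementation: the `JobStage` enum, the `next_stage` helper, and the choice to keep a rejected job at its review stage are all assumptions, since the README does not say what happens on rejection.

```python
from enum import Enum


class JobStage(Enum):
    """Hypothetical stage names mirroring the pipeline steps listed above."""
    UPLOAD = "upload"
    INGESTION = "ingestion"
    QC_REVIEW = "qc_review"
    TRANSLATION = "translation"
    TTS_GENERATION = "tts_generation"
    FINAL_REVIEW = "final_review"
    DELIVERY = "delivery"


# Enum members iterate in definition order, which matches the pipeline order.
PIPELINE_ORDER = list(JobStage)

# Stages where a human reviewer approves or rejects before the job moves on.
REVIEW_STAGES = {JobStage.QC_REVIEW, JobStage.FINAL_REVIEW}


def next_stage(current, approved=True):
    """Return the stage that follows `current`, or None after delivery.

    Assumption: a rejection keeps the job at the same review stage.
    """
    if current in REVIEW_STAGES and not approved:
        return current
    index = PIPELINE_ORDER.index(current)
    if index + 1 >= len(PIPELINE_ORDER):
        return None
    return PIPELINE_ORDER[index + 1]
```

Keeping the order in a single list means a stage can be added or moved in one place, and the two review gates stay explicit rather than being buried in task code.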
File Structure
backend/ # FastAPI application
├── app/
│ ├── api/ # REST API routes
│ ├── core/ # Configuration and shared utilities
│ ├── models/ # Pydantic data models
│ ├── services/ # External service integrations
│ ├── tasks/ # Celery background tasks
│ └── prompts/ # AI prompt templates
└── tests/ # Test suite
frontend/ # React SPA
├── src/
│ ├── components/ # Reusable UI components
│ ├── routes/ # Page components
│ ├── lib/ # Utilities and API client
│ ├── hooks/ # Custom React hooks
│ └── types/ # TypeScript definitions
└── public/ # Static assets
Configuration
Required Environment Variables
Backend (.env):
- MONGODB_URI - MongoDB connection string
- REDIS_URL - Redis connection string
- JWT_SECRET - Secret for JWT token signing
- GEMINI_API_KEY - Google Gemini API key
- GCS_BUCKET - Google Cloud Storage bucket name
- SENDGRID_API_KEY - SendGrid for email notifications
Frontend (.env):
- VITE_API_BASE_URL - Backend API URL
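A startup check for the backend variables might look like the sketch below. The variable names come from the list above; the `missing_backend_vars` helper is hypothetical and is not part of the codebase as far as this README shows.

```python
import os

# Backend variables named in this README.
REQUIRED_BACKEND_VARS = (
    "MONGODB_URI",
    "REDIS_URL",
    "JWT_SECRET",
    "GEMINI_API_KEY",
    "GCS_BUCKET",
    "SENDGRID_API_KEY",
)


def missing_backend_vars(environ=None):
    """Return the required variable names that are unset or empty.

    `environ` defaults to the process environment; passing a dict makes
    the check easy to exercise in tests.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_BACKEND_VARS if not env.get(name)]
```

Calling `missing_backend_vars()` before the API or worker starts, and exiting with the returned names in the error message, surfaces a misconfigured backend/.env early instead of at the first database or storage call.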
Google Cloud Setup
- Create a GCP project
- Enable required APIs:
- Cloud Storage API
- Cloud Translation API
- Cloud Text-to-Speech API
- Vertex AI API (for Gemini)
- Create a service account with appropriate permissions
- Download the service account key and set GOOGLE_APPLICATION_CREDENTIALS to the path of the key file
Deployment
The application is designed for deployment on Google Cloud:
- Backend: Cloud Run with auto-scaling
- Workers: Cloud Run with Celery
- Frontend: Cloud Storage + Cloud CDN
- Database: MongoDB Atlas
- Queue: Cloud Memorystore (Redis)
See the infra/ directory for deployment configurations.
Security
- JWT authentication with refresh token rotation
- Role-based access control (RBAC)
- Signed URLs for secure file access
- Audit logging for all reviewer actions
- HTTPS enforcement in production
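As a rough illustration of the role-based access control item, the sketch below maps the three roles named in this README (client, reviewer, admin) to permission sets. The permission strings and the `is_allowed` helper are hypothetical; the real permission model is not described here.

```python
from enum import Enum


class Role(Enum):
    CLIENT = "client"
    REVIEWER = "reviewer"
    ADMIN = "admin"


# Hypothetical permission names, chosen only for illustration.
PERMISSIONS = {
    Role.CLIENT: frozenset({"job:upload", "job:download"}),
    Role.REVIEWER: frozenset({"job:review"}),
    Role.ADMIN: frozenset(
        {"job:upload", "job:download", "job:review", "user:manage"}
    ),
}


def is_allowed(role, permission):
    """Return True if `role` has been granted `permission`."""
    return permission in PERMISSIONS.get(role, frozenset())
```

A check like this would typically run after the JWT has been validated and the caller's role extracted, for example inside a request dependency that rejects the call when `is_allowed` returns False.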
Development Guide
Always refer to the complete development plan in video_accessibility_development_plan.txt for detailed specifications and requirements. The CLAUDE.md file contains additional development guidelines and phase-by-phase implementation details.