# Veo 3.1 Video Generator

A full-stack web application for generating AI videos using Google's Veo 3.1 models. Generate videos from text prompts with advanced features including frame interpolation, reference images, and customizable parameters.

## Quick Start

```bash
# Clone and navigate to project
cd veo3_poc

# Ensure service-account.json is in the root directory

# Run development servers (both frontend and backend)
./run-dev.sh

# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:7394
```

## Architecture

- **Frontend**: React 18 + Vite + Material-UI 5 + Montserrat typography
- **Backend**: Flask 3.0 + Google Gen AI SDK 1.47.0
- **Authentication**: Microsoft Azure AD SSO (MSAL 2.0)
- **Storage**: Google Cloud Storage for temporary video and image files
- **Deployment**: Systemd service + Apache reverse proxy

## Features

### Core Video Generation

- **Text-to-Video Generation**: Create videos from descriptive text prompts
- **Image-to-Video Generation**: Upload first frame images to guide video generation
- **Dual Model Support**: Choose between two models:
  - **Veo 3.1** (Standard): High-quality with advanced features - $0.40/sec
  - **Veo 3.1 Fast**: Optimized speed with frame interpolation - $0.15/sec

### Veo 3.1 Advanced Features

- **Frame Interpolation**: Upload both first and last frames to generate smooth transitions between them (8-second videos only)
- **Reference Images**: Guide video content with up to 3 reference images for consistent characters, objects, or styles (16:9 aspect ratio, 8-second videos, Standard model only)
- **Conditional UI**: Advanced features automatically appear/disappear based on selected model capabilities

### Job Management

- **Multi-Video Generation**: Generate 1-4 videos per request with batch processing
- **Unlimited Job Queue**: Submit unlimited video generation jobs with FIFO processing
- **Advanced Job Management**: Cancel, retry, and delete jobs with complete cleanup
- **Real-time Queue Visualization**: Live status updates with three-section queue display

### Customizable Parameters

- Video length (4, 6, or 8 seconds)
- Aspect ratio (16:9 landscape or 9:16 portrait)
- Person generation policy (allow/don't allow)
- Custom seed values for reproducible results
- Audio generation toggle

### Additional Features

- **Intelligent File Management**: Auto-cleanup after download, comprehensive GCS cleanup
- **Usage Tracking**: Webhook integration for monitoring generation requests
- **Development Mode**: Local development with authentication bypass

## Prerequisites

- Python 3.13+ (or 3.8+)
- Node.js 16+
- Google Cloud Project with Veo 3.1 API access
- Google Cloud Storage bucket
- Service account JSON key with appropriate permissions
- Microsoft Azure AD application configured (for production SSO)

## Setup Instructions

### Backend Setup

1. Navigate to the backend directory:
   ```bash
   cd backend
   ```
2. Create and activate virtual environment:
   ```bash
   python -m venv venv
   source venv/bin/activate  # Linux/Mac
   # or
   venv\Scripts\activate  # Windows
   ```
3. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
4. Configure environment variables:
   ```bash
   # For development
   cp .env.development .env

   # For production
   cp .env.production .env

   # Edit .env with your specific configuration if needed
   ```
5. Run in development:
   ```bash
   python app.py
   ```

### Frontend Setup

1. Navigate to the frontend directory:
   ```bash
   cd frontend
   ```
2. Install dependencies:
   ```bash
   npm install
   ```
3. Configure environment variables:
   ```bash
   # For development
   cp .env.development .env

   # For production
   cp .env.production .env

   # Edit .env with your specific configuration if needed
   ```
4. Run in development:
   ```bash
   npm run dev
   ```
5. Build for production:
   ```bash
   npm run build
   ```

## Production Deployment

### Backend Deployment (systemd service)

1. Copy the backend files to your server
2. Update paths in `veo-video-generator.service`
3. Copy the service file, then enable and start the service:
   ```bash
   sudo cp veo-video-generator.service /etc/systemd/system/
   sudo systemctl daemon-reload
   sudo systemctl enable veo-video-generator
   sudo systemctl start veo-video-generator
   ```

### Frontend Deployment

1. Build the frontend:
   ```bash
   cd frontend
   npm run build
   ```
2. Copy `dist/` contents to your web server directory:
   ```bash
   cp -r dist/* /path/to/your/web/server/veo/
   ```

### Apache Configuration

Add the Apache configuration to your virtual host. Update paths as needed.

#### Required Apache Modules

Ensure these modules are enabled:

```bash
sudo a2enmod proxy
sudo a2enmod proxy_http
sudo a2enmod rewrite
sudo a2enmod headers
sudo a2enmod expires
sudo systemctl restart apache2
```

#### Configuration Files

1. **Main Apache Config**: Use `apache.conf` for virtual host configuration
2. **Frontend .htaccess**: Copy `apache-htaccess.txt` to `/path/to/your/web/server/veo/.htaccess`

## Project Structure

```
veo3_poc/
├── backend/                           # Flask backend application
│   ├── routes/                        # API and health check endpoints
│   │   ├── api.py                     # Main API routes (generate, status, download, cleanup)
│   │   └── health.py                  # Health check endpoints
│   ├── utils/                         # Utility modules
│   │   ├── auth.py                    # Google Cloud authentication
│   │   └── storage.py                 # GCS operations and image processing
│   ├── app.py                         # Flask app initialization and CORS config
│   ├── config.py                      # Configuration management
│   ├── video_generator.py             # Core Veo 3.1 integration logic
│   ├── requirements.txt               # Python dependencies
│   ├── .env.development               # Development environment config
│   ├── .env.production                # Production environment config
│   └── temp_downloads/                # Temporary video storage
├── frontend/                          # React frontend application
│   ├── src/
│   │   ├── components/                # React components
│   │   │   ├── VideoForm.jsx          # Main video generation form
│   │   │   ├── VideoGenerator.jsx     # Top-level container
│   │   │   ├── ProgressIndicator.jsx  # Status display
│   │   │   ├── Layout.jsx             # App layout wrapper
│   │   │   ├── AuthGuard.jsx          # Authentication wrapper
│   │   │   └── DevAuthWrapper.jsx     # Dev mode auth bypass
│   │   ├── config/                    # MSAL configuration
│   │   ├── services/                  # API service layer
│   │   ├── hooks/                     # Custom React hooks
│   │   └── App.jsx                    # Main app component
│   ├── .env.development               # Development environment config
│   ├── .env.production                # Production environment config
│   └── package.json                   # Node.js dependencies
├── service-account.json               # Google Cloud service account key
├── run-dev.sh                         # Development startup script
├── apache.conf                        # Apache virtual host configuration
├── apache-htaccess.txt                # Frontend .htaccess rules
└── veo-video-generator.service        # Systemd service definition
```

## Configuration

### Environment File Structure

The application uses environment-specific configuration files:

**Backend:**

- `.env.development` - Debug mode, localhost CORS, development settings
- `.env.production` - Production mode, strict CORS, optimized for deployment
- `.env` - Active environment file (copy from development or production)

**Frontend:**

- `.env.development` - Localhost API, authentication bypass (`VITE_DEV_MODE=true`)
- `.env.production` - Production API, MSAL authentication enabled (`VITE_DEV_MODE=false`)
- `.env` - Active environment file (copy from development or production)

### Backend Environment Variables

| Variable | Description | Default/Example |
|----------|-------------|-----------------|
| `PROJECT_ID` | Google Cloud project ID | `optical-414516` |
| `REGION` | Google Cloud region | `us-central1` |
| `MODEL_ID` | Default Veo model identifier | `veo-3.0-generate-preview` |
| `MODEL_FAST_ID` | Default Veo Fast model identifier | `veo-3.0-fast-generate-preview` |
| `OUTPUT_GCS_BUCKET_NAME` | GCS bucket for temporary storage | `optical-veo3-test` |
| `SERVICE_ACCOUNT_KEY_PATH` | Path to service account JSON | `./service-account.json` |
| `PORT` | Backend server port | `7394` |
| `FLASK_ENV` | Environment mode | `development` or `production` |
| `FLASK_DEBUG` | Debug mode | `True` or `False` |
| `FRONTEND_URL` | Frontend URL for CORS | `http://localhost:3000` or production URL |
| `WEBHOOK_URL` | Usage tracking webhook URL | Optional |
| `WEBHOOK_ENABLED` | Enable usage tracking | `true` or `false` |

**Available Models:**

- `veo-3.1-generate-preview` - Veo 3.1 Standard (with advanced features)
- `veo-3.1-fast-generate-preview` - Veo 3.1 Fast (frame interpolation only)

### Frontend Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `VITE_API_BASE_URL` | Backend API URL | `http://localhost:7394` |
| `VITE_APP_TITLE` | Application title | `Veo Video Generator (Dev)` |
| `VITE_DEV_MODE` | Development mode flag | `true` or `false` |
| `VITE_MSAL_CLIENT_ID` | Azure AD client ID | `dd434534-...` |
| `VITE_MSAL_AUTHORITY` | Azure AD authority URL | `https://login.microsoftonline.com/...` |
| `VITE_MSAL_REDIRECT_URI` | Authentication redirect URI | `http://localhost:3000` |

## Key Dependencies

### Backend

- `flask==3.0.0` - Web framework
- `flask-cors==4.0.0` - Cross-origin resource sharing
- `google-genai==1.47.0` - Google Gen AI SDK for Veo 3.1 (with advanced features support)
- `google-cloud-storage==2.12.0` - GCS file operations
- `google-cloud-aiplatform==1.38.0` - Vertex AI platform
- `hypercorn==0.15.0` - ASGI server for production
- `python-dotenv==1.0.0` - Environment configuration
- `Pillow==10.1.0` - Image processing and format conversion

### Frontend

- `react==18.2.0` - UI framework
- `@mui/material==5.15.1` - Material-UI component library
- `@azure/msal-react==2.0.7` - Microsoft authentication
- `axios==1.6.2` - HTTP client
- `vite==5.0.8` - Build tool and dev server
- `@fontsource/montserrat==5.0.16` - Typography

## API Endpoints

### Main API Routes (`/api`)

| Method | Endpoint | Description | Request Body |
|--------|----------|-------------|--------------|
| `POST` | `/api/generate` | Start video generation | `{ prompt, model_name, video_length_sec, aspect_ratio, person_generation, sampleCount, seed, generate_audio, image, lastFrame, referenceImage1, referenceImage2, referenceImage3 }` |
| `GET` | `/api/status/` | Check generation status | - |
| `GET` | `/api/download/` | Download completed content (auto-deletes job) | - |
| `GET` | `/api/download//video/` | Download individual video | - |
| `GET` | `/api/user-jobs` | Get all jobs for user | Query: `user_email` |
| `GET` | `/api/queue-status` | Get overall queue status | - |
| `POST` | `/api/cancel/` | Cancel queued/processing job | - |
| `POST` | `/api/retry/` | Retry failed/cancelled job | - |
| `DELETE` | `/api/delete/` | Delete job completely | - |
| `DELETE` | `/api/cleanup/` | Manual cleanup of temp files | - |

**Veo 3.1 Image Parameters:**

- `image` - First frame image (optional, all models)
- `lastFrame` - Last frame image for interpolation (optional, Veo 3.1 only, requires 8-second duration)
- `referenceImage1`, `referenceImage2`, `referenceImage3` - Reference images for content guidance (optional, Veo 3.1 Standard only, requires 16:9 aspect ratio and 8-second duration)

### Health Check Routes

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/health` | Detailed health check with configuration info |
| `GET` | `/ping` | Simple ping response |

## Video Generation Lifecycle

### Job Submission & Queuing

1. **User Input**: User provides prompt, optional images (first frame, last frame, reference images), and generation parameters (1-4 videos)
2. **Job Creation**: Backend creates unique job ID and validates parameters including Veo 3.1 feature constraints
3. **Queue Management**: Job added to global FIFO queue (unlimited per user)
4. **Queue Position**: Job displayed in "In Queue" section with position indicator

### Processing Pipeline

5. **Queue Processing**: Background thread picks next job when processing slot available (max 2 concurrent)
6. **Status Transition**: Job moves from "In Queue" to "Currently Processing" section
7. **Image Processing** (if provided):
   - First frame: Validated, converted to JPEG, uploaded to GCS (all models)
   - Last frame: Processed for frame interpolation (Veo 3.1 only)
   - Reference images: Up to 3 images processed for content guidance (Veo 3.1 Standard only)
8. **API Calls**: Multiple requests sent to Google Gen AI SDK with appropriate parameters for selected model
9. **Backend Polling**: Long-running operations polled every 30 seconds with retry logic
10. **Progress Updates**: Frontend polls status every 2 seconds for real-time updates

### Completion & Cleanup

11. **Video Download**: Completed videos downloaded from GCS to local temp storage
12. **File Packaging**: Multiple videos and images packaged into downloadable zip
13. **User Download**: Videos served to user with multiple download options
14. **Auto-cleanup**: Job automatically deleted 5 seconds after successful download

### Job Management Actions

- **Cancel**: Remove from queue or stop active processing
- **Retry**: Re-queue failed/cancelled jobs with original parameters
- **Delete**: Complete removal of job data, local files, and GCS resources
- **Download Options**: Individual videos or complete zip package

## Security

- CORS configured for specific frontend domain(s)
- Azure AD SSO authentication in production (bypassed in dev mode)
- Automatic cleanup of temporary files after download
- Service account with minimal required GCS permissions
- Secure headers in Apache configuration
- Backend service runs as non-root user in production

## Monitoring and Logging

### Backend Logs

```bash
# View systemd service logs (production)
sudo journalctl -u veo-video-generator -f

# View Flask app logs (development)
# Logs printed to terminal running app.py
```

### Frontend Logs

- Browser console for React errors
- Network tab for API request/response debugging
- Apache access logs: `/var/log/apache2/access.log`

### Usage Tracking

- Webhook integration sends generation requests to configured endpoint
- Tracks: user email, prompt, model, timestamp
- Can be disabled via `WEBHOOK_ENABLED=false`

## Troubleshooting

### Common Issues

| Issue | Possible Cause | Solution |
|-------|----------------|----------|
| **Authentication fails** | Azure AD misconfiguration | Verify `VITE_MSAL_CLIENT_ID`, `VITE_MSAL_AUTHORITY`, and redirect URIs match Azure AD app |
| **Backend connection error** | Service not running or CORS issue | Check `systemctl status veo-video-generator` and `FRONTEND_URL` in backend `.env` |
| **Video generation fails** | Invalid credentials or API access | Verify service account permissions and Veo 3.1 APIs are enabled in GCP |
| **Image upload rejected** | Invalid format or size | Ensure image is <10MB and meets minimum 720x720 resolution |
| **Download hangs** | GCS permission issue | Check service account has `storage.objects.get` permission on bucket |
| **Model not found** | Wrong region or model ID | Verify Veo 3.1 is available in specified `REGION` |
| **Reference images fail** | Wrong model or constraints | Reference images require Veo 3.1 Standard model, 16:9 aspect ratio, and 8-second duration |
| **Last frame fails** | Wrong constraints | Last frame interpolation requires Veo 3.1 model (Standard or Fast) and 8-second duration |
| **SDK parameter error** | Outdated SDK version | Ensure `google-genai>=1.47.0` is installed for Veo 3.1 features |

### Veo 3.1 Feature Requirements

**Frame Interpolation (Last Frame):**

- ✅ Supported models: `veo-3.1-generate-preview`, `veo-3.1-fast-generate-preview`
- ✅ Required duration: 8 seconds
- ✅ Supported aspect ratios: 16:9, 9:16

**Reference Images:**

- ✅ Supported model: `veo-3.1-generate-preview` (Standard only, NOT Fast)
- ✅ Required duration: 8 seconds
- ✅ Required aspect ratio: 16:9 only
- ✅ Maximum images: 3 reference images
- ❌ Not supported in: Veo 3.1 Fast

### Debug Mode

Enable detailed logging in development:

```bash
# Backend: set in .env
FLASK_DEBUG=True

# Frontend: check the browser console with React DevTools
```

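When debugging a rejected request, it can help to check the Veo 3.1 feature requirements listed above programmatically. The following is a minimal sketch, not the backend's actual validation code: the function name `check_veo31_constraints` and its parameters are hypothetical, and only the model IDs and the rules themselves come from this README.

```python
from typing import List

# Model IDs as listed under "Available Models" in this README.
STANDARD_MODEL = "veo-3.1-generate-preview"
FAST_MODEL = "veo-3.1-fast-generate-preview"


def check_veo31_constraints(
    model_name: str,
    video_length_sec: int,
    aspect_ratio: str,
    has_last_frame: bool,
    reference_image_count: int,
) -> List[str]:
    """Return human-readable constraint violations (empty list if none).

    Hypothetical helper: it mirrors the README's requirement lists and is
    not taken from the real backend.
    """
    errors = []

    # Frame interpolation: Veo 3.1 Standard or Fast, 8-second duration.
    if has_last_frame:
        if model_name not in (STANDARD_MODEL, FAST_MODEL):
            errors.append("Last frame requires a Veo 3.1 model")
        if video_length_sec != 8:
            errors.append("Last frame requires an 8-second duration")

    # Reference images: Standard model only, 8 seconds, 16:9, at most 3.
    if reference_image_count > 0:
        if model_name != STANDARD_MODEL:
            errors.append("Reference images require Veo 3.1 Standard")
        if video_length_sec != 8:
            errors.append("Reference images require an 8-second duration")
        if aspect_ratio != "16:9":
            errors.append("Reference images require a 16:9 aspect ratio")
        if reference_image_count > 3:
            errors.append("At most 3 reference images are allowed")

    return errors


if __name__ == "__main__":
    # A Fast-model request with one reference image violates one rule.
    for problem in check_veo31_constraints(FAST_MODEL, 8, "16:9", False, 1):
        print(problem)
```

Returning every violation at once, rather than stopping at the first, lets a caller show the user all the fields that need to change in a single response.
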
## Development

### Local Development Setup

For local testing without authentication:

1. **Quick Start** (runs both backend and frontend):
   ```bash
   ./run-dev.sh
   ```
2. **Manual Start**:

   **Backend** (Terminal 1):
   ```bash
   cd backend
   cp .env.development .env
   python app.py
   ```

   **Frontend** (Terminal 2):
   ```bash
   cd frontend
   npm run dev
   ```

### Development Features

- **Authentication Bypass**: MSAL/SSO automatically bypassed when `VITE_DEV_MODE=true`
- **CORS**: Configured for `localhost:3000` and `127.0.0.1:3000`
- **Hot Reload**: Vite dev server auto-reloads frontend on file changes
- **Debug Mode**: Flask runs with detailed error pages and auto-reload
- **Mock User**: Shows "Dev User" in the interface header

### Development URLs

- Backend API: `http://localhost:7394`
- Frontend: `http://localhost:3000`
- No authentication required in dev mode

## Additional Files

- **`user_docs.md`**: Comprehensive user documentation and feature guide
- **`CLAUDE.md`**: AI assistant guidance for working with this codebase
- **`extract_usage_logs.sh`**: Script for extracting usage data from webhook logs
- **`veo3.zip`**: Archive of production deployment artifacts
- **`.gitignore`**: Git exclusions (includes `.env`, `node_modules`, `temp_downloads`, etc.)

## Video Generation Architecture

### Job Queue System

- **Global Queue**: FIFO processing with unlimited submissions per user
- **Concurrent Processing**: Maximum 2 jobs processing simultaneously
- **Status Tracking**: In-memory job status dictionary (consider Redis for scaling)
- **User Limits**: No queue limits, but 1-4 videos per individual request

### Queue Display Sections

1. **Currently Processing**: Jobs actively generating videos (highlighted in blue)
2. **In Queue**: Jobs waiting for processing slots (highlighted in orange)
3. **History**: Completed, failed, or cancelled jobs (standard styling)

### File Management

- **Local Storage**: `temp_downloads/job_{job_id}/` for each job
- **GCS Integration**: Temporary images uploaded to `temp_images/` bucket path
- **Auto-cleanup**: Jobs deleted 5 seconds after successful download
- **Manual Cleanup**: Complete job deletion via delete button
- **Download Formats**: Individual MP4s or complete ZIP packages

### Job Actions by Status

- **Queued**: Cancel, Delete
- **Processing**: Cancel, Delete
- **Failed/Cancelled**: Retry, Delete
- **Completed**: Download All, Download Individual Videos (auto-deletes after download)

## Notes

- The original `veo.py` standalone script has been replaced by the full-stack application
- **Dual model support**: Veo 3.1 Standard and Veo 3.1 Fast
- **Veo 3.1 advanced features**: Frame interpolation and reference images with conditional UI
- Multi-video generation support (1-4 videos per request)
- Unlimited job submissions with intelligent queue management
- Complete job lifecycle management with cancel/retry/delete functionality
- Generated videos are automatically cleaned up after download
- Image uploads are automatically converted to JPEG format regardless of input format
- The application uses in-memory job status tracking (consider Redis for production scaling)
- SDK upgraded to `google-genai==1.47.0` for Veo 3.1 feature support

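The queue behaviour described under Video Generation Architecture (a global FIFO queue, at most two jobs processing at once, and an in-memory status dictionary) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the application's code: `submit_job`, `cancel_job`, `worker`, `start_workers`, and the status strings are hypothetical names chosen to match the statuses this README mentions.

```python
import queue
import threading
import uuid

MAX_CONCURRENT = 2  # README: maximum 2 jobs processing simultaneously

job_status = {}               # in-memory status dictionary: job_id -> status
status_lock = threading.Lock()
job_queue = queue.Queue()     # unbounded FIFO queue shared by all users


def submit_job(work):
    """Queue a callable and return its job ID (hypothetical helper)."""
    job_id = uuid.uuid4().hex
    with status_lock:
        job_status[job_id] = "queued"
    job_queue.put((job_id, work))
    return job_id


def cancel_job(job_id):
    """Mark a still-queued job as cancelled; workers skip it when dequeued."""
    with status_lock:
        if job_status.get(job_id) == "queued":
            job_status[job_id] = "cancelled"
            return True
    return False


def worker():
    """Process jobs in FIFO order until the process exits."""
    while True:
        job_id, work = job_queue.get()
        with status_lock:
            cancelled = job_status[job_id] == "cancelled"
            if not cancelled:
                job_status[job_id] = "processing"
        if not cancelled:
            try:
                work()
                result = "completed"
            except Exception:
                result = "failed"
            with status_lock:
                job_status[job_id] = result
        job_queue.task_done()


def start_workers():
    """Start one daemon thread per processing slot."""
    for _ in range(MAX_CONCURRENT):
        threading.Thread(target=worker, daemon=True).start()


if __name__ == "__main__":
    start_workers()
    first = submit_job(lambda: None)
    job_queue.join()              # wait until every queued job is handled
    print(job_status[first])
```

Because `job_status` lives in process memory, it is lost on restart and cannot be shared between processes, which is why the notes above suggest Redis for production scaling.
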