# Veo 3.1 Video Generator

A full-stack web application for generating AI videos using Google's Veo 3.1 models. Generate videos from text prompts with advanced features including frame interpolation, reference images, and customizable parameters.

## Quick Start

```bash
# Clone and navigate to project
cd veo3_poc

# Ensure service-account.json is in the root directory

# Run development servers (both frontend and backend)
./run-dev.sh

# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:7394
```

## Architecture

- **Frontend**: React 18 + Vite + Material-UI 5 + Montserrat typography
- **Backend**: Flask 3.0 + Google Gen AI SDK 1.47.0
- **Authentication**: Microsoft Azure AD SSO (MSAL 2.0)
- **Storage**: Google Cloud Storage for temporary video and image files
- **Deployment**: Systemd service + Apache reverse proxy

## Features

### Core Video Generation
- **Text-to-Video Generation**: Create videos from descriptive text prompts
- **Image-to-Video Generation**: Upload first frame images to guide video generation
- **Dual Model Support**: Choose between two models:
  - **Veo 3.1** (Standard): High-quality with advanced features - $0.40/sec
  - **Veo 3.1 Fast**: Optimized speed with frame interpolation - $0.15/sec
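
As a rough illustration of the per-second pricing above, the sketch below estimates the cost of a single request. It assumes the listed rate applies to every generated second of every video in the request and that cost scales linearly; `RATES_USD_PER_SEC` and `estimate_cost` are hypothetical names and are not part of the application.

```python
from decimal import Decimal

# Per-second rates quoted in the model list above (USD).
RATES_USD_PER_SEC = {
    "veo-3.1-generate-preview": Decimal("0.40"),
    "veo-3.1-fast-generate-preview": Decimal("0.15"),
}


def estimate_cost(model_name: str, video_length_sec: int, sample_count: int = 1) -> Decimal:
    """Estimate the cost of one request, assuming linear per-second billing."""
    rate = RATES_USD_PER_SEC[model_name]
    return rate * video_length_sec * sample_count
```

Under these assumptions, four 8-second videos on the Standard model come to 0.40 × 8 × 4 = $12.80.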

### Veo 3.1 Advanced Features
- **Frame Interpolation**: Upload both first and last frames to generate smooth transitions between them (8-second videos only)
- **Reference Images**: Guide video content with up to 3 reference images for consistent characters, objects, or styles (16:9 aspect ratio, 8-second videos, Standard model only)
- **Conditional UI**: Advanced features automatically appear/disappear based on selected model capabilities

### Job Management
- **Multi-Video Generation**: Generate 1-4 videos per request with batch processing
- **Unlimited Job Queue**: Submit unlimited video generation jobs with FIFO processing
- **Advanced Job Management**: Cancel, retry, and delete jobs with complete cleanup
- **Real-time Queue Visualization**: Live status updates with three-section queue display

### Customizable Parameters
- Video length (4, 6, or 8 seconds)
- Aspect ratio (16:9 landscape or 9:16 portrait)
- Person generation policy (allow/don't allow)
- Custom seed values for reproducible results
- Audio generation toggle

### Additional Features
- **Intelligent File Management**: Auto-cleanup after download, comprehensive GCS cleanup
- **Usage Tracking**: Webhook integration for monitoring generation requests
- **Development Mode**: Local development with authentication bypass

## Prerequisites

- Python 3.13+ (or 3.8+)
- Node.js 16+
- Google Cloud Project with Veo 3.1 API access
- Google Cloud Storage bucket
- Service account JSON key with appropriate permissions
- Microsoft Azure AD application configured (for production SSO)

## Setup Instructions

### Backend Setup

1. Navigate to the backend directory:
```bash
cd backend
```

2. Create and activate virtual environment:
```bash
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
```

3. Install dependencies:
```bash
pip install -r requirements.txt
```

4. Configure environment variables:
```bash
# For development
cp .env.development .env

# For production
cp .env.production .env
# Edit .env with your specific configuration if needed
```

5. Run in development:
```bash
python app.py
```

### Frontend Setup

1. Navigate to the frontend directory:
```bash
cd frontend
```

2. Install dependencies:
```bash
npm install
```

3. Configure environment variables:
```bash
# For development
cp .env.development .env

# For production
cp .env.production .env
# Edit .env with your specific configuration if needed
```

4. Run in development:
```bash
npm run dev
```

5. Build for production:
```bash
npm run build
```

## Production Deployment

### Backend Deployment (systemd service)

1. Copy the backend files to your server
2. Update paths in `veo-video-generator.service`
3. Copy service file:
```bash
sudo cp veo-video-generator.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable veo-video-generator
sudo systemctl start veo-video-generator
```

### Frontend Deployment

1. Build the frontend:
```bash
cd frontend
npm run build
```

2. Copy `dist/` contents to your web server directory:
```bash
cp -r dist/* /path/to/your/web/server/veo/
```

### Apache Configuration

Add the Apache configuration to your virtual host. Update paths as needed.

#### Required Apache Modules
Ensure these modules are enabled:
```bash
sudo a2enmod proxy
sudo a2enmod proxy_http
sudo a2enmod rewrite
sudo a2enmod headers
sudo a2enmod expires
sudo systemctl restart apache2
```

#### Configuration Files
1. **Main Apache Config**: Use `apache.conf` for virtual host configuration
2. **Frontend .htaccess**: Copy `apache-htaccess.txt` to `/path/to/your/web/server/veo/.htaccess`

## Project Structure

```
veo3_poc/
├── backend/                        # Flask backend application
│   ├── routes/                     # API and health check endpoints
│   │   ├── api.py                  # Main API routes (generate, status, download, cleanup)
│   │   └── health.py               # Health check endpoints
│   ├── utils/                      # Utility modules
│   │   ├── auth.py                 # Google Cloud authentication
│   │   └── storage.py              # GCS operations and image processing
│   ├── app.py                      # Flask app initialization and CORS config
│   ├── config.py                   # Configuration management
│   ├── video_generator.py          # Core Veo 3.1 integration logic
│   ├── requirements.txt            # Python dependencies
│   ├── .env.development            # Development environment config
│   ├── .env.production             # Production environment config
│   └── temp_downloads/             # Temporary video storage
├── frontend/                       # React frontend application
│   ├── src/
│   │   ├── components/             # React components
│   │   │   ├── VideoForm.jsx           # Main video generation form
│   │   │   ├── VideoGenerator.jsx      # Top-level container
│   │   │   ├── ProgressIndicator.jsx   # Status display
│   │   │   ├── Layout.jsx              # App layout wrapper
│   │   │   ├── AuthGuard.jsx           # Authentication wrapper
│   │   │   └── DevAuthWrapper.jsx      # Dev mode auth bypass
│   │   ├── config/                 # MSAL configuration
│   │   ├── services/               # API service layer
│   │   ├── hooks/                  # Custom React hooks
│   │   └── App.jsx                 # Main app component
│   ├── .env.development            # Development environment config
│   ├── .env.production             # Production environment config
│   └── package.json                # Node.js dependencies
├── service-account.json            # Google Cloud service account key
├── run-dev.sh                      # Development startup script
├── apache.conf                     # Apache virtual host configuration
├── apache-htaccess.txt             # Frontend .htaccess rules
└── veo-video-generator.service     # Systemd service definition
```

## Configuration

### Environment File Structure

The application uses environment-specific configuration files:

**Backend:**
- `.env.development` - Debug mode, localhost CORS, development settings
- `.env.production` - Production mode, strict CORS, optimized for deployment
- `.env` - Active environment file (copy from development or production)

**Frontend:**
- `.env.development` - Localhost API, authentication bypass (`VITE_DEV_MODE=true`)
- `.env.production` - Production API, MSAL authentication enabled (`VITE_DEV_MODE=false`)
- `.env` - Active environment file (copy from development or production)

### Backend Environment Variables

| Variable | Description | Default/Example |
|----------|-------------|-----------------|
| `PROJECT_ID` | Google Cloud project ID | `optical-414516` |
| `REGION` | Google Cloud region | `us-central1` |
| `MODEL_ID` | Default Veo model identifier | `veo-3.0-generate-preview` |
| `MODEL_FAST_ID` | Default Veo Fast model identifier | `veo-3.0-fast-generate-preview` |
| `OUTPUT_GCS_BUCKET_NAME` | GCS bucket for temporary storage | `optical-veo3-test` |
| `SERVICE_ACCOUNT_KEY_PATH` | Path to service account JSON | `./service-account.json` |
| `PORT` | Backend server port | `7394` |
| `FLASK_ENV` | Environment mode | `development` or `production` |
| `FLASK_DEBUG` | Debug mode | `True` or `False` |
| `FRONTEND_URL` | Frontend URL for CORS | `http://localhost:3000` or production URL |
| `WEBHOOK_URL` | Usage tracking webhook URL | Optional |
| `WEBHOOK_ENABLED` | Enable usage tracking | `true` or `false` |

**Available Models:**
- `veo-3.1-generate-preview` - Veo 3.1 Standard (with advanced features)
- `veo-3.1-fast-generate-preview` - Veo 3.1 Fast (frame interpolation only)
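
To illustrate how the variables in the table might be consumed, here is a minimal sketch of a loader that reads them from the environment. It is not the project's actual `config.py`; the `BackendConfig` class, the `load_config` helper, and the choice of which variables are required are all assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendConfig:
    project_id: str
    region: str
    model_id: str
    output_bucket: str
    port: int
    webhook_enabled: bool


def load_config(env=None) -> BackendConfig:
    """Build a config object from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    return BackendConfig(
        project_id=env["PROJECT_ID"],                 # treated as required here
        region=env.get("REGION", "us-central1"),
        model_id=env["MODEL_ID"],                     # treated as required here
        output_bucket=env["OUTPUT_GCS_BUCKET_NAME"],  # treated as required here
        port=int(env.get("PORT", "7394")),
        webhook_enabled=env.get("WEBHOOK_ENABLED", "false").lower() == "true",
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.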

### Frontend Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `VITE_API_BASE_URL` | Backend API URL | `http://localhost:7394` |
| `VITE_APP_TITLE` | Application title | `Veo Video Generator (Dev)` |
| `VITE_DEV_MODE` | Development mode flag | `true` or `false` |
| `VITE_MSAL_CLIENT_ID` | Azure AD client ID | `dd434534-...` |
| `VITE_MSAL_AUTHORITY` | Azure AD authority URL | `https://login.microsoftonline.com/...` |
| `VITE_MSAL_REDIRECT_URI` | Authentication redirect URI | `http://localhost:3000` |

## Key Dependencies

### Backend
- `flask==3.0.0` - Web framework
- `flask-cors==4.0.0` - Cross-origin resource sharing
- `google-genai==1.47.0` - Google Gen AI SDK for Veo 3.1 (with advanced features support)
- `google-cloud-storage==2.12.0` - GCS file operations
- `google-cloud-aiplatform==1.38.0` - Vertex AI platform
- `hypercorn==0.15.0` - ASGI server for production
- `python-dotenv==1.0.0` - Environment configuration
- `Pillow==10.1.0` - Image processing and format conversion

### Frontend
- `react==18.2.0` - UI framework
- `@mui/material==5.15.1` - Material-UI component library
- `@azure/msal-react==2.0.7` - Microsoft authentication
- `axios==1.6.2` - HTTP client
- `vite==5.0.8` - Build tool and dev server
- `@fontsource/montserrat==5.0.16` - Typography

## API Endpoints

### Main API Routes (`/api`)

| Method | Endpoint | Description | Request Body |
|--------|----------|-------------|--------------|
| `POST` | `/api/generate` | Start video generation | `{ prompt, model_name, video_length_sec, aspect_ratio, person_generation, sampleCount, seed, generate_audio, image, lastFrame, referenceImage1, referenceImage2, referenceImage3 }` |
| `GET` | `/api/status/<job_id>` | Check generation status | - |
| `GET` | `/api/download/<job_id>` | Download completed content (auto-deletes job) | - |
| `GET` | `/api/download/<job_id>/video/<index>` | Download individual video | - |
| `GET` | `/api/user-jobs` | Get all jobs for user | Query: `user_email` |
| `GET` | `/api/queue-status` | Get overall queue status | - |
| `POST` | `/api/cancel/<job_id>` | Cancel queued/processing job | - |
| `POST` | `/api/retry/<job_id>` | Retry failed/cancelled job | - |
| `DELETE` | `/api/delete/<job_id>` | Delete job completely | - |
| `DELETE` | `/api/cleanup/<job_id>` | Manual cleanup of temp files | - |

**Veo 3.1 Image Parameters:**
- `image` - First frame image (optional, all models)
- `lastFrame` - Last frame image for interpolation (optional, Veo 3.1 only, requires 8-second duration)
- `referenceImage1`, `referenceImage2`, `referenceImage3` - Reference images for content guidance (optional, Veo 3.1 Standard only, requires 16:9 aspect ratio and 8-second duration)
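
The following sketch shows one way a client could assemble a text-only request body for `POST /api/generate` using the field names from the table above. The helper `build_generate_payload` is hypothetical, and several details are assumptions: that the endpoint accepts a JSON body, and that the allowed lengths, aspect ratios, and the 1-4 video limit are worth checking client-side. The image fields and `person_generation` are left out because this README does not state how they are encoded.

```python
import json

ALLOWED_LENGTHS = (4, 6, 8)             # seconds, per "Customizable Parameters"
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16")


def build_generate_payload(prompt, model_name, video_length_sec=8,
                           aspect_ratio="16:9", sample_count=1,
                           seed=None, generate_audio=True):
    """Return a JSON string for a text-only generation request."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if video_length_sec not in ALLOWED_LENGTHS:
        raise ValueError("video_length_sec must be 4, 6, or 8")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError("aspect_ratio must be 16:9 or 9:16")
    if not 1 <= sample_count <= 4:
        raise ValueError("sampleCount must be between 1 and 4")

    payload = {
        "prompt": prompt,
        "model_name": model_name,
        "video_length_sec": video_length_sec,
        "aspect_ratio": aspect_ratio,
        "sampleCount": sample_count,
        "generate_audio": generate_audio,
    }
    if seed is not None:  # only send a seed when the caller wants reproducibility
        payload["seed"] = seed
    return json.dumps(payload)
```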

### Health Check Routes

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/health` | Detailed health check with configuration info |
| `GET` | `/ping` | Simple ping response |

## Video Generation Lifecycle

### Job Submission & Queuing
1. **User Input**: User provides prompt, optional images (first frame, last frame, reference images), and generation parameters (1-4 videos)
2. **Job Creation**: Backend creates unique job ID and validates parameters including Veo 3.1 feature constraints
3. **Queue Management**: Job added to global FIFO queue (unlimited per user)
4. **Queue Position**: Job displayed in "In Queue" section with position indicator

### Processing Pipeline
5. **Queue Processing**: Background thread picks next job when processing slot available (max 2 concurrent)
6. **Status Transition**: Job moves from "In Queue" to "Currently Processing" section
7. **Image Processing** (if provided):
   - First frame: Validated, converted to JPEG, uploaded to GCS (all models)
   - Last frame: Processed for frame interpolation (Veo 3.1 only)
   - Reference images: Up to 3 images processed for content guidance (Veo 3.1 Standard only)
8. **API Calls**: Multiple requests sent to Google Gen AI SDK with appropriate parameters for selected model
9. **Backend Polling**: Long-running operations polled every 30 seconds with retry logic
10. **Progress Updates**: Frontend polls status every 2 seconds for real-time updates
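
The polling described in steps 9 and 10 can be sketched as a generic loop. This is an illustration only, not the code in `video_generator.py`: the function `poll_until_done`, the status strings, the choice of `ConnectionError` as the retryable error, and the retry limit are all assumptions. The `fetch_status` callable stands in for whatever actually queries the operation.

```python
import time


def poll_until_done(fetch_status, interval_sec=30.0, max_retries=3, sleep=time.sleep):
    """Poll `fetch_status` until it reports a terminal state.

    `fetch_status` is a caller-supplied callable returning a status string.
    Transient connection errors are retried; more than `max_retries`
    consecutive failures re-raise the last error.
    """
    terminal = {"completed", "failed", "cancelled"}  # assumed status names
    consecutive_errors = 0
    while True:
        try:
            status = fetch_status()
            consecutive_errors = 0
        except ConnectionError:
            consecutive_errors += 1
            if consecutive_errors > max_retries:
                raise
            status = None
        if status in terminal:
            return status
        sleep(interval_sec)
```

The same loop covers both cadences mentioned above by changing `interval_sec` (30 for the backend, 2 for a client). Injecting `sleep` keeps the function testable without real delays.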

### Completion & Cleanup
11. **Video Download**: Completed videos downloaded from GCS to local temp storage
12. **File Packaging**: Multiple videos and images packaged into downloadable zip
13. **User Download**: Videos served to user with multiple download options
14. **Auto-cleanup**: Job automatically deleted 5 seconds after successful download

### Job Management Actions
- **Cancel**: Remove from queue or stop active processing
- **Retry**: Re-queue failed/cancelled jobs with original parameters
- **Delete**: Complete removal of job data, local files, and GCS resources
- **Download Options**: Individual videos or complete zip package

## Security

- CORS configured for specific frontend domain(s)
- Azure AD SSO authentication in production (bypassed in dev mode)
- Automatic cleanup of temporary files after download
- Service account with minimal required GCS permissions
- Secure headers in Apache configuration
- Backend service runs as non-root user in production

## Monitoring and Logging

### Backend Logs
```bash
# View systemd service logs (production)
sudo journalctl -u veo-video-generator -f

# View Flask app logs (development)
# Logs printed to terminal running app.py
```

### Frontend Logs
- Browser console for React errors
- Network tab for API request/response debugging
- Apache access logs: `/var/log/apache2/access.log`

### Usage Tracking
- Webhook integration sends generation requests to configured endpoint
- Tracks: user email, prompt, model, timestamp
- Can be disabled via `WEBHOOK_ENABLED=false`
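
As a sketch of what a usage-tracking event could look like, the snippet below builds a JSON body with the four tracked fields and decides whether tracking is active. The JSON key names, the ISO 8601 timestamp format, and both helper functions are assumptions; the README only states which values are tracked and that `WEBHOOK_ENABLED` toggles the feature.

```python
import json
from datetime import datetime, timezone


def build_usage_event(user_email, prompt, model, now=None):
    """Return the JSON body for one usage event (key names are assumed)."""
    now = now or datetime.now(timezone.utc)
    return json.dumps({
        "user_email": user_email,
        "prompt": prompt,
        "model": model,
        "timestamp": now.isoformat(),
    })


def should_send(env):
    """Tracking is active only when enabled and a webhook URL is configured."""
    enabled = env.get("WEBHOOK_ENABLED", "false").lower() == "true"
    return enabled and bool(env.get("WEBHOOK_URL"))
```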

## Troubleshooting

### Common Issues

| Issue | Possible Cause | Solution |
|-------|----------------|----------|
| **Authentication fails** | Azure AD misconfiguration | Verify `VITE_MSAL_CLIENT_ID`, `VITE_MSAL_AUTHORITY`, and redirect URIs match Azure AD app |
| **Backend connection error** | Service not running or CORS issue | Check `systemctl status veo-video-generator` and `FRONTEND_URL` in backend `.env` |
| **Video generation fails** | Invalid credentials or API access | Verify service account permissions and Veo 3.1 APIs are enabled in GCP |
| **Image upload rejected** | Invalid format or size | Ensure image is <10MB and meets minimum 720x720 resolution |
| **Download hangs** | GCS permission issue | Check service account has `storage.objects.get` permission on bucket |
| **Model not found** | Wrong region or model ID | Verify Veo 3.1 is available in specified `REGION` |
| **Reference images fail** | Wrong model or constraints | Reference images require Veo 3.1 Standard model, 16:9 aspect ratio, and 8-second duration |
| **Last frame fails** | Wrong constraints | Last frame interpolation requires Veo 3.1 model (Standard or Fast) and 8-second duration |
| **SDK parameter error** | Outdated SDK version | Ensure `google-genai>=1.47.0` is installed for Veo 3.1 features |

### Veo 3.1 Feature Requirements

**Frame Interpolation (Last Frame):**
- ✅ Supported models: `veo-3.1-generate-preview`, `veo-3.1-fast-generate-preview`
- ✅ Required duration: 8 seconds
- ✅ Supported aspect ratios: 16:9, 9:16

**Reference Images:**
- ✅ Supported model: `veo-3.1-generate-preview` (Standard only, NOT Fast)
- ✅ Required duration: 8 seconds
- ✅ Required aspect ratio: 16:9 only
- ✅ Maximum images: 3 reference images
- ❌ Not supported in: Veo 3.1 Fast
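
The requirements above translate directly into a small validation routine. The sketch below is one possible encoding of those rules and is not taken from the backend; the function name `validate_features` and its parameters are hypothetical.

```python
STANDARD = "veo-3.1-generate-preview"
FAST = "veo-3.1-fast-generate-preview"


def validate_features(model_name, video_length_sec, aspect_ratio,
                      has_last_frame=False, reference_image_count=0):
    """Return a list of constraint violations (empty when the request is valid)."""
    errors = []
    if has_last_frame:
        if model_name not in (STANDARD, FAST):
            errors.append("last frame requires a Veo 3.1 model")
        if video_length_sec != 8:
            errors.append("last frame requires an 8-second duration")
    if reference_image_count > 0:
        if model_name != STANDARD:
            errors.append("reference images require Veo 3.1 Standard")
        if video_length_sec != 8:
            errors.append("reference images require an 8-second duration")
        if aspect_ratio != "16:9":
            errors.append("reference images require a 16:9 aspect ratio")
        if reference_image_count > 3:
            errors.append("at most 3 reference images are allowed")
    return errors
```

Returning every violation at once, rather than raising on the first, lets a form report all problems to the user in a single round trip.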

### Debug Mode

Enable detailed logging in development:
```bash
# Backend: set this in backend/.env
FLASK_DEBUG=True

# Frontend: check the browser console with React DevTools
```

## Development

### Local Development Setup

For local testing without authentication:

1. **Quick Start** (runs both backend and frontend):
```bash
./run-dev.sh
```

2. **Manual Start**:

**Backend** (Terminal 1):
```bash
cd backend
cp .env.development .env
python app.py
```

**Frontend** (Terminal 2):
```bash
cd frontend
npm run dev
```

### Development Features

- **Authentication Bypass**: MSAL/SSO automatically bypassed when `VITE_DEV_MODE=true`
- **CORS**: Configured for `localhost:3000` and `127.0.0.1:3000`
- **Hot Reload**: Vite dev server auto-reloads frontend on file changes
- **Debug Mode**: Flask runs with detailed error pages and auto-reload
- **Mock User**: Shows "Dev User" in the interface header

### Development URLs

- Backend API: `http://localhost:7394`
- Frontend: `http://localhost:3000`
- No authentication required in dev mode

## Additional Files

- **`user_docs.md`**: Comprehensive user documentation and feature guide
- **`CLAUDE.md`**: AI assistant guidance for working with this codebase
- **`extract_usage_logs.sh`**: Script for extracting usage data from webhook logs
- **`veo3.zip`**: Archive of production deployment artifacts
- **`.gitignore`**: Git exclusions (includes `.env`, `node_modules`, `temp_downloads`, etc.)

## Video Generation Architecture

### Job Queue System
- **Global Queue**: FIFO processing with unlimited submissions per user
- **Concurrent Processing**: Maximum 2 jobs processing simultaneously
- **Status Tracking**: In-memory job status dictionary (consider Redis for scaling)
- **User Limits**: No queue limits, but 1-4 videos per individual request
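
The queue behavior described above (FIFO order, at most two jobs processing at once) can be modeled with a small in-memory structure. This is a simplified, single-threaded sketch for illustration; the class `JobQueue` and its methods are hypothetical, and the real backend uses a background thread that this example does not reproduce.

```python
from collections import deque

MAX_CONCURRENT = 2  # processing slots, as described above


class JobQueue:
    """Minimal FIFO queue with a fixed number of processing slots."""

    def __init__(self, max_concurrent=MAX_CONCURRENT):
        self.max_concurrent = max_concurrent
        self.waiting = deque()    # job IDs in submission order
        self.processing = set()   # job IDs currently occupying a slot

    def submit(self, job_id):
        """Enqueue a job and start it immediately if a slot is free."""
        self.waiting.append(job_id)
        self._fill_slots()

    def finish(self, job_id):
        """Release a job's slot and promote the next waiting job."""
        self.processing.discard(job_id)
        self._fill_slots()

    def position(self, job_id):
        """1-based queue position, or None if the job is not waiting."""
        try:
            return self.waiting.index(job_id) + 1
        except ValueError:
            return None

    def _fill_slots(self):
        while self.waiting and len(self.processing) < self.max_concurrent:
            self.processing.add(self.waiting.popleft())
```

Because all state lives in process memory, it is lost on restart, which is why the list above suggests Redis for scaling.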

### Queue Display Sections
1. **Currently Processing**: Jobs actively generating videos (highlighted in blue)
2. **In Queue**: Jobs waiting for processing slots (highlighted in orange)
3. **History**: Completed, failed, or cancelled jobs (standard styling)

### File Management
- **Local Storage**: `temp_downloads/job_{job_id}/` for each job
- **GCS Integration**: Temporary images uploaded to `temp_images/` bucket path
- **Auto-cleanup**: Jobs deleted 5 seconds after successful download
- **Manual Cleanup**: Complete job deletion via delete button
- **Download Formats**: Individual MP4s or complete ZIP packages
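
One way to implement a delayed cleanup like the 5-second auto-delete above is a timer thread that removes the job's local directory. The sketch below covers local files only and is an assumption about the mechanism, not the backend's actual code; `job_dir` and `schedule_cleanup` are hypothetical helpers, and GCS cleanup is out of scope.

```python
import shutil
import threading
from pathlib import Path


def job_dir(base, job_id):
    """Per-job directory, following the temp_downloads/job_{job_id}/ layout."""
    return Path(base) / f"job_{job_id}"


def schedule_cleanup(base, job_id, delay_sec=5.0):
    """Delete the job's local files after `delay_sec`; returns the started timer."""
    timer = threading.Timer(
        delay_sec,
        shutil.rmtree,
        args=(job_dir(base, job_id),),
        kwargs={"ignore_errors": True},  # tolerate a directory that is already gone
    )
    timer.daemon = True  # do not keep the process alive just for cleanup
    timer.start()
    return timer
```

The delay gives the download response time to finish streaming before the files disappear. Calling `cancel()` on the returned timer aborts a pending cleanup.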

### Job Actions by Status
- **Queued**: Cancel, Delete
- **Processing**: Cancel, Delete
- **Failed/Cancelled**: Retry, Delete
- **Completed**: Download All, Download Individual Videos (auto-deletes after download)
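
The status-to-action rules above fit naturally into a lookup table. In this sketch the lowercase status and action identifiers are assumed for illustration and may not match the strings the API actually uses.

```python
# Allowed actions per job status, mirroring the list above.
ALLOWED_ACTIONS = {
    "queued": {"cancel", "delete"},
    "processing": {"cancel", "delete"},
    "failed": {"retry", "delete"},
    "cancelled": {"retry", "delete"},
    "completed": {"download_all", "download_video"},
}


def can_perform(status, action):
    """Return True if `action` is permitted for a job in `status`."""
    return action in ALLOWED_ACTIONS.get(status, set())
```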

## Notes

- The original `veo.py` standalone script has been replaced by the full-stack application
- **Dual model support**: Veo 3.1 (Standard & Fast)
- **Veo 3.1 advanced features**: Frame interpolation and reference images with conditional UI
- Multi-video generation support (1-4 videos per request)
- Unlimited job submissions with intelligent queue management
- Complete job lifecycle management with cancel/retry/delete functionality
- Generated videos are automatically cleaned up after download
- Image uploads are automatically converted to JPEG format regardless of input format
- The application uses in-memory job status tracking (consider Redis for production scaling)
- SDK upgraded to `google-genai==1.47.0` for Veo 3.1 feature support