veo3/README.md
2025-09-30 09:55:04 -05:00

# Veo 3.0 Video Generator
A full-stack web application for generating AI videos using Google's Veo 3.0 model. Generate videos from text prompts or reference images with customizable parameters including video length, aspect ratio, and person generation settings.
## Quick Start
```bash
# From your clone of the repository, enter the project root
cd veo3_poc
# Ensure service-account.json is in the root directory
# Run development servers (both frontend and backend)
./run-dev.sh
# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:7394
```
## Architecture
- **Frontend**: React 18 + Vite + Material-UI 5 + Montserrat typography
- **Backend**: Flask 3.0 + Google Gen AI SDK 1.17.0
- **Authentication**: Microsoft Azure AD SSO (MSAL 2.0)
- **Storage**: Google Cloud Storage for temporary video files
- **Deployment**: Systemd service + Apache reverse proxy
## Features
- **Text-to-Video Generation**: Create videos from descriptive text prompts
- **Image-to-Video Generation**: Upload reference images to guide video generation
- **Dual Model Support**: Choose between Veo 3.0 (high-quality) and Veo 3.0 Fast (optimized for speed)
- **Customizable Parameters**:
  - Video length (1-60 seconds)
  - Aspect ratio (16:9 landscape or 9:16 portrait)
  - Person generation policy (allow/don't allow)
- **Real-time Progress Tracking**: Live status updates during video generation
- **Automatic File Management**: Completed videos are fetched from GCS for download, and temporary files are removed afterwards
- **Usage Tracking**: Webhook integration for monitoring generation requests
- **Development Mode**: Local development with authentication bypass
## Prerequisites
- Python 3.8 or newer (3.13+ recommended)
- Node.js 16+
- Google Cloud Project with Veo 3.0 API access
- Google Cloud Storage bucket
- Service account JSON key with appropriate permissions
- Microsoft Azure AD application configured (for production SSO)
## Setup Instructions
### Backend Setup
1. Navigate to the backend directory:
```bash
cd backend
```
2. Create and activate virtual environment:
```bash
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Configure environment variables:
```bash
# For development
cp .env.development .env
# For production
cp .env.production .env
# Edit .env with your specific configuration if needed
```
5. Run in development:
```bash
python app.py
```
### Frontend Setup
1. Navigate to the frontend directory:
```bash
cd frontend
```
2. Install dependencies:
```bash
npm install
```
3. Configure environment variables:
```bash
# For development
cp .env.development .env
# For production
cp .env.production .env
# Edit .env with your specific configuration if needed
```
4. Run in development:
```bash
npm run dev
```
5. Build for production:
```bash
npm run build
```
## Production Deployment
### Backend Deployment (systemd service)
1. Copy the backend files to your server
2. Update paths in `veo-video-generator.service`
3. Install, enable, and start the service:
```bash
sudo cp veo-video-generator.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable veo-video-generator
sudo systemctl start veo-video-generator
```
### Frontend Deployment
1. Build the frontend:
```bash
cd frontend
npm run build
```
2. Copy `dist/` contents to your web server directory:
```bash
cp -r dist/* /path/to/your/web/server/veo/
```
### Apache Configuration
Add the Apache configuration to your virtual host. Update paths as needed.
#### Required Apache Modules
Ensure these modules are enabled:
```bash
sudo a2enmod proxy
sudo a2enmod proxy_http
sudo a2enmod rewrite
sudo a2enmod headers
sudo a2enmod expires
sudo systemctl restart apache2
```
#### Configuration Files
1. **Main Apache Config**: Use `apache.conf` for virtual host configuration
2. **Frontend .htaccess**: Copy `apache-htaccess.txt` to `/path/to/your/web/server/veo/.htaccess`
## Project Structure
```
veo3_poc/
├── backend/  # Flask backend application
│   ├── routes/  # API and health check endpoints
│   │   ├── api.py  # Main API routes (generate, status, download, cleanup)
│   │   └── health.py  # Health check endpoints
│   ├── utils/  # Utility modules
│   │   ├── auth.py  # Google Cloud authentication
│   │   └── storage.py  # GCS operations and image processing
│   ├── app.py  # Flask app initialization and CORS config
│   ├── config.py  # Configuration management
│   ├── video_generator.py  # Core Veo 3.0 integration logic
│   ├── requirements.txt  # Python dependencies
│   ├── .env.development  # Development environment config
│   ├── .env.production  # Production environment config
│   └── temp_downloads/  # Temporary video storage
├── frontend/  # React frontend application
│   ├── src/
│   │   ├── components/  # React components
│   │   │   ├── VideoForm.jsx  # Main video generation form
│   │   │   ├── VideoGenerator.jsx  # Top-level container
│   │   │   ├── ProgressIndicator.jsx  # Status display
│   │   │   ├── Layout.jsx  # App layout wrapper
│   │   │   ├── AuthGuard.jsx  # Authentication wrapper
│   │   │   └── DevAuthWrapper.jsx  # Dev mode auth bypass
│   │   ├── config/  # MSAL configuration
│   │   ├── services/  # API service layer
│   │   ├── hooks/  # Custom React hooks
│   │   └── App.jsx  # Main app component
│   ├── .env.development  # Development environment config
│   ├── .env.production  # Production environment config
│   └── package.json  # Node.js dependencies
├── service-account.json  # Google Cloud service account key
├── run-dev.sh  # Development startup script
├── apache.conf  # Apache virtual host configuration
├── apache-htaccess.txt  # Frontend .htaccess rules
└── veo-video-generator.service  # Systemd service definition
```
## Configuration
### Environment File Structure
The application uses environment-specific configuration files:
**Backend:**
- `.env.development` - Debug mode, localhost CORS, development settings
- `.env.production` - Production mode, strict CORS, optimized for deployment
- `.env` - Active environment file (copy from development or production)
**Frontend:**
- `.env.development` - Localhost API, authentication bypass (`VITE_DEV_MODE=true`)
- `.env.production` - Production API, MSAL authentication enabled (`VITE_DEV_MODE=false`)
- `.env` - Active environment file (copy from development or production)
### Backend Environment Variables
| Variable | Description | Default/Example |
|----------|-------------|-----------------|
| `PROJECT_ID` | Google Cloud project ID | `optical-414516` |
| `REGION` | Google Cloud region | `us-central1` |
| `MODEL_ID` | Veo model identifier | `veo-3.0-generate-preview` |
| `MODEL_FAST_ID` | Veo Fast model identifier | `veo-3.0-fast-generate-preview` |
| `OUTPUT_GCS_BUCKET_NAME` | GCS bucket for temporary storage | `optical-veo3-test` |
| `SERVICE_ACCOUNT_KEY_PATH` | Path to service account JSON | `../service-account.json` |
| `PORT` | Backend server port | `7394` |
| `FLASK_ENV` | Environment mode | `development` or `production` |
| `FLASK_DEBUG` | Debug mode | `True` or `False` |
| `FRONTEND_URL` | Frontend URL for CORS | `http://localhost:3000` or production URL |
| `WEBHOOK_URL` | Usage tracking webhook URL | Optional |
| `WEBHOOK_ENABLED` | Enable usage tracking | `true` or `false` |
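
As an illustration of how the variables in this table might be consumed, here is a minimal sketch of a loader. It is not the project's actual `config.py`: the `BackendConfig` class, the `load_config` function, and the choice to treat the table's example values for `REGION`, the model IDs, and `PORT` as fallbacks are all assumptions made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BackendConfig:
    """Hypothetical container for the backend settings listed in the table."""
    project_id: str
    region: str
    model_id: str
    model_fast_id: str
    bucket_name: str
    service_account_key_path: str
    port: int
    debug: bool
    frontend_url: str
    webhook_url: Optional[str]
    webhook_enabled: bool


def _as_bool(value: str) -> bool:
    """Accept common truthy spellings such as 'True', 'true', or '1'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env: Mapping[str, str] = os.environ) -> BackendConfig:
    """Build a BackendConfig from a mapping of environment variables.

    PROJECT_ID and OUTPUT_GCS_BUCKET_NAME are treated as required here
    (an assumption), so a missing value raises KeyError.
    """
    return BackendConfig(
        project_id=env["PROJECT_ID"],
        region=env.get("REGION", "us-central1"),
        model_id=env.get("MODEL_ID", "veo-3.0-generate-preview"),
        model_fast_id=env.get("MODEL_FAST_ID", "veo-3.0-fast-generate-preview"),
        bucket_name=env["OUTPUT_GCS_BUCKET_NAME"],
        service_account_key_path=env.get(
            "SERVICE_ACCOUNT_KEY_PATH", "../service-account.json"
        ),
        port=int(env.get("PORT", "7394")),
        debug=_as_bool(env.get("FLASK_DEBUG", "False")),
        frontend_url=env.get("FRONTEND_URL", "http://localhost:3000"),
        webhook_url=env.get("WEBHOOK_URL") or None,
        webhook_enabled=_as_bool(env.get("WEBHOOK_ENABLED", "false")),
    )
```

Passing the mapping explicitly (rather than always reading `os.environ`) keeps the loader easy to exercise in tests without touching the real environment.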
### Frontend Environment Variables
| Variable | Description | Example |
|----------|-------------|---------|
| `VITE_API_BASE_URL` | Backend API URL | `http://localhost:7394` |
| `VITE_APP_TITLE` | Application title | `Veo Video Generator (Dev)` |
| `VITE_DEV_MODE` | Development mode flag | `true` or `false` |
| `VITE_MSAL_CLIENT_ID` | Azure AD client ID | `dd434534-...` |
| `VITE_MSAL_AUTHORITY` | Azure AD authority URL | `https://login.microsoftonline.com/...` |
| `VITE_MSAL_REDIRECT_URI` | Authentication redirect URI | `http://localhost:3000` |
## Key Dependencies
### Backend
- `flask==3.0.0` - Web framework
- `flask-cors==4.0.0` - Cross-origin resource sharing
- `google-genai==1.17.0` - Google Gen AI SDK for Veo 3.0
- `google-cloud-storage==2.12.0` - GCS file operations
- `google-cloud-aiplatform==1.38.0` - Vertex AI platform
- `hypercorn==0.15.0` - ASGI server for production
- `python-dotenv==1.0.0` - Environment configuration
- `Pillow==10.1.0` - Image processing
### Frontend
- `react==18.2.0` - UI framework
- `@mui/material==5.15.1` - Material-UI component library
- `@azure/msal-react==2.0.7` - Microsoft authentication
- `axios==1.6.2` - HTTP client
- `vite==5.0.8` - Build tool and dev server
- `@fontsource/montserrat==5.0.16` - Typography
## API Endpoints
### Main API Routes (`/api`)
| Method | Endpoint | Description | Request Body |
|--------|----------|-------------|--------------|
| `POST` | `/api/generate` | Start video generation | `{ prompt, model_name, video_length_sec, aspect_ratio, person_generation, image }` |
| `GET` | `/api/status/<job_id>` | Check generation status | - |
| `GET` | `/api/download/<job_id>` | Download completed video | - |
| `DELETE` | `/api/cleanup/<job_id>` | Manual cleanup of temp files | - |
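
For scripting against the API, a request to `/api/generate` could be assembled roughly as follows. This is a sketch under several assumptions: that the endpoint accepts a JSON body, that `image` (when present) is passed as a string in whatever encoding the backend expects, and that the accepted values for `model_name` and `person_generation` are known to the caller. The helper names are hypothetical; only the field names, the 1-60 second range, and the two aspect ratios come from this README.

```python
import json
import urllib.request
from typing import Optional

API_BASE_URL = "http://localhost:7394"  # development backend URL from this README
ASPECT_RATIOS = {"16:9", "9:16"}


def build_generate_payload(
    prompt: str,
    model_name: str,
    video_length_sec: int,
    aspect_ratio: str,
    person_generation: str,
    image: Optional[str] = None,
) -> dict:
    """Assemble the request body using the field names from the table above."""
    if not 1 <= video_length_sec <= 60:
        raise ValueError("video_length_sec must be between 1 and 60")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    payload = {
        "prompt": prompt,
        "model_name": model_name,
        "video_length_sec": video_length_sec,
        "aspect_ratio": aspect_ratio,
        "person_generation": person_generation,
    }
    if image is not None:
        # Assumption: the image encoding is whatever the backend expects.
        payload["image"] = image
    return payload


def build_generate_request(
    payload: dict, base_url: str = API_BASE_URL
) -> urllib.request.Request:
    """Wrap the payload in a POST request (assumes a JSON body is accepted)."""
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request)`. In production the backend sits behind Azure AD SSO, so additional authentication that is not shown here may be required.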
### Health Check Routes
| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/health` | Detailed health check with configuration info |
| `GET` | `/ping` | Simple ping response |
## Video Generation Flow
1. **User Input**: User provides prompt, optional image, and generation parameters
2. **Job Creation**: Backend creates unique job ID and initializes status tracking
3. **Image Processing** (if provided): Image validated, converted to JPEG, uploaded to GCS
4. **API Call**: Request sent to Google Gen AI SDK with Veo 3.0 model
5. **Polling**: Backend polls long-running operation every 30 seconds
6. **Progress Updates**: Frontend fetches status updates every 2 seconds
7. **Video Download**: Completed video downloaded from GCS to local temp storage
8. **User Download**: Video served to user and temporary files cleaned up
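
The polling in steps 5 and 6 can be sketched as a small loop. This is not the project's code: `wait_for_job`, the `"status"` key, and the terminal values `"completed"` and `"failed"` are assumptions about the shape of the `/api/status/<job_id>` response. The default 2-second interval mirrors the frontend; the same loop with `interval_sec=30` would approximate the backend's polling of the long-running operation.

```python
import time
from typing import Callable, Dict

# Assumption: these status values mark a job that will not change again.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], Dict],
    interval_sec: float = 2.0,
    timeout_sec: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Call fetch_status repeatedly until the job reaches a terminal state.

    fetch_status is any callable returning the latest status dict, for
    example a function that issues GET /api/status/<job_id>.
    Raises TimeoutError if no terminal state is seen within timeout_sec.
    """
    deadline = clock() + timeout_sec
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_sec)
```

Injecting `sleep` and `clock` is a design choice for testability: the loop can be exercised with fake timers instead of waiting in real time.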
## Security
- CORS configured for specific frontend domain(s)
- Azure AD SSO authentication in production (bypassed in dev mode)
- Automatic cleanup of temporary files after download
- Service account with minimal required GCS permissions
- Secure headers in Apache configuration
- Backend service runs as non-root user in production
## Monitoring and Logging
### Backend Logs
```bash
# View systemd service logs (production)
sudo journalctl -u veo-video-generator -f
# View Flask app logs (development)
# Logs printed to terminal running app.py
```
### Frontend Logs
- Browser console for React errors
- Network tab for API request/response debugging
- Apache access logs: `/var/log/apache2/access.log`
### Usage Tracking
- Webhook integration sends generation requests to configured endpoint
- Tracks: user email, prompt, model, timestamp
- Can be disabled via `WEBHOOK_ENABLED=false`
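
A usage event of this kind might be built and posted as in the sketch below. The payload key names, the helper functions, and the decision to swallow delivery errors are assumptions for illustration; only the tracked fields (user email, prompt, model, timestamp) and the `WEBHOOK_URL` / `WEBHOOK_ENABLED` settings come from this README.

```python
import json
import urllib.request
from datetime import datetime, timezone
from typing import Optional


def build_usage_event(
    user_email: str, prompt: str, model: str, now: Optional[datetime] = None
) -> dict:
    """Collect the tracked fields; the key names are hypothetical."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "user_email": user_email,
        "prompt": prompt,
        "model": model,
        "timestamp": timestamp,
    }


def send_usage_event(
    event: dict,
    webhook_url: Optional[str],
    enabled: bool,
    timeout_sec: float = 5.0,
) -> bool:
    """POST the event as JSON; return True only on a 2xx response.

    Returns False without any network call when tracking is disabled
    or no URL is configured.
    """
    if not enabled or not webhook_url:
        return False
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout_sec) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        # Assumption: a tracking failure should never break video generation.
        return False
```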
## Troubleshooting
### Common Issues
| Issue | Possible Cause | Solution |
|-------|----------------|----------|
| **Authentication fails** | Azure AD misconfiguration | Verify `VITE_MSAL_CLIENT_ID`, `VITE_MSAL_AUTHORITY`, and redirect URIs match Azure AD app |
| **Backend connection error** | Service not running or CORS issue | Check `systemctl status veo-video-generator` and `FRONTEND_URL` in backend `.env` |
| **Video generation fails** | Invalid credentials or API access | Verify service account permissions and Veo 3.0 API is enabled in GCP |
| **Image upload rejected** | Invalid format or size | Ensure image is <10MB and meets minimum 720x720 resolution |
| **Download hangs** | GCS permission issue | Check service account has `storage.objects.get` permission on bucket |
| **Model not found** | Wrong region or model ID | Verify Veo 3.0 is available in specified `REGION` |
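
To diagnose a rejected image before uploading, the limits from the table can be checked locally. The sketch below is not the backend's validation code: it assumes that "10MB" means 10 × 1024 × 1024 bytes and that "720x720" means both width and height must be at least 720 pixels. The function name is hypothetical, and the caller is expected to supply the file size and pixel dimensions (for example, read with Pillow).

```python
from typing import List

MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB
MIN_SIDE_PX = 720  # assumption: applies to both width and height


def image_problems(size_bytes: int, width: int, height: int) -> List[str]:
    """Return human-readable reasons the image would likely be rejected."""
    problems = []
    if size_bytes >= MAX_IMAGE_BYTES:
        problems.append(f"file is {size_bytes} bytes; must be under {MAX_IMAGE_BYTES}")
    if width < MIN_SIDE_PX or height < MIN_SIDE_PX:
        problems.append(
            f"resolution is {width}x{height}; minimum is {MIN_SIDE_PX}x{MIN_SIDE_PX}"
        )
    return problems
```

An empty list means the image passes both checks under these assumptions.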
### Debug Mode
Enable detailed logging in development:
```bash
# Backend: set this line in backend/.env, then restart app.py
FLASK_DEBUG=True
# Frontend: open the browser console (React DevTools helps inspect components)
```
## Development
### Local Development Setup
For local testing without authentication:
1. **Quick Start** (runs both backend and frontend):
```bash
./run-dev.sh
```
2. **Manual Start**:
**Backend** (Terminal 1):
```bash
cd backend
source venv/bin/activate # Linux/Mac; see Backend Setup for Windows
cp .env.development .env
python app.py
```
**Frontend** (Terminal 2):
```bash
cd frontend
npm run dev
```
### Development Features
- **Authentication Bypass**: MSAL/SSO automatically bypassed when `VITE_DEV_MODE=true`
- **CORS**: Configured for `localhost:3000` and `127.0.0.1:3000`
- **Hot Reload**: Vite dev server auto-reloads frontend on file changes
- **Debug Mode**: Flask runs with detailed error pages and auto-reload
- **Mock User**: Shows "Dev User" in the interface header
### Development URLs
- Backend API: `http://localhost:7394`
- Frontend: `http://localhost:3000`
- No authentication required in dev mode
## Additional Files
- **`user_docs.md`**: Comprehensive user documentation and feature guide
- **`CLAUDE.md`**: AI assistant guidance for working with this codebase
- **`extract_usage_logs.sh`**: Script for extracting usage data from webhook logs
- **`veo3.zip`**: Archive of production deployment artifacts
- **`.gitignore`**: Git exclusions (includes `.env`, `node_modules`, `temp_downloads`, etc.)
## Notes
- The original `veo.py` standalone script has been replaced by the full-stack application
- Number of videos per request is hard-coded to 1 (UI option removed)
- Generated videos are automatically cleaned up after download
- Image uploads are automatically converted to JPEG format regardless of input format
- The application uses in-memory job status tracking (consider Redis for production scaling)
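
To make the last note concrete, in-memory job tracking can be as simple as a dictionary guarded by a lock. The sketch below is illustrative only and does not reproduce the application's implementation; the `JobStore` class, its method names, and the `"queued"` initial status are assumptions.

```python
import threading
import uuid
from typing import Dict, Optional


class JobStore:
    """Thread-safe, process-local map from job ID to status fields."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs: Dict[str, dict] = {}

    def create(self) -> str:
        """Register a new job and return its unique ID."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "queued", "message": ""}
        return job_id

    def update(self, job_id: str, **fields) -> None:
        """Merge new fields into an existing job; unknown IDs raise KeyError."""
        with self._lock:
            if job_id not in self._jobs:
                raise KeyError(job_id)
            self._jobs[job_id].update(fields)

    def get(self, job_id: str) -> Optional[dict]:
        """Return a copy of the job's fields, or None if the ID is unknown."""
        with self._lock:
            job = self._jobs.get(job_id)
            return dict(job) if job is not None else None

    def remove(self, job_id: str) -> None:
        """Forget a job, for example after its files have been cleaned up."""
        with self._lock:
            self._jobs.pop(job_id, None)
```

State held this way is lost when the process restarts and is not shared between worker processes, which is the limitation an external store such as Redis would address.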