veo3.1 features readme update
parent 146f032f1d
commit 496ebfdf1b
1 changed file with 75 additions and 26 deletions
README.md | 101
@@ -1,6 +1,6 @@
-# Veo 3.0 Video Generator
+# Veo 3.0 & 3.1 Video Generator

-A full-stack web application for generating AI videos using Google's Veo 3.0 model. Generate videos from text prompts or reference images with customizable parameters including video length, aspect ratio, and person generation settings.
+A full-stack web application for generating AI videos using Google's Veo 3.0 and Veo 3.1 models. Generate videos from text prompts with advanced features including frame interpolation, reference images, and customizable parameters.

 ## Quick Start

@@ -21,26 +21,41 @@ cd veo3_poc
 ## Architecture

 - **Frontend**: React 18 + Vite + Material-UI 5 + Montserrat typography
-- **Backend**: Flask 3.0 + Google Gen AI SDK 1.17.0
+- **Backend**: Flask 3.0 + Google Gen AI SDK 1.47.0
 - **Authentication**: Microsoft Azure AD SSO (MSAL 2.0)
-- **Storage**: Google Cloud Storage for temporary video files
+- **Storage**: Google Cloud Storage for temporary video and image files
 - **Deployment**: Systemd service + Apache reverse proxy

 ## Features

+### Core Video Generation
 - **Text-to-Video Generation**: Create videos from descriptive text prompts
-- **Image-to-Video Generation**: Upload reference images to guide video generation
-- **Dual Model Support**: Choose between Veo 3.0 (high-quality) and Veo 3.0 Fast (optimized for speed)
+- **Image-to-Video Generation**: Upload first frame images to guide video generation
+- **Quad Model Support**: Choose between four models:
+  - **Veo 3.1** (Standard): High-quality with advanced features - $0.40/sec
+  - **Veo 3.1 Fast**: Optimized speed with frame interpolation - $0.15/sec
+  - **Veo 3.0** (Standard): Proven high-quality generation - $0.40/sec
+  - **Veo 3.0 Fast**: Optimized for speed and cost - $0.15/sec
+
+### Veo 3.1 Advanced Features
+- **Frame Interpolation**: Upload both first and last frames to generate smooth transitions between them (8-second videos only)
+- **Reference Images**: Guide video content with up to 3 reference images for consistent characters, objects, or styles (16:9 aspect ratio, 8-second videos, Standard model only)
+- **Conditional UI**: Advanced features automatically appear/disappear based on selected model capabilities
+
+### Job Management
 - **Multi-Video Generation**: Generate 1-4 videos per request with batch processing
 - **Unlimited Job Queue**: Submit unlimited video generation jobs with FIFO processing
 - **Advanced Job Management**: Cancel, retry, and delete jobs with complete cleanup
 - **Real-time Queue Visualization**: Live status updates with three-section queue display
-- **Customizable Parameters**:
-  - Video length (4, 6, or 8 seconds)
-  - Aspect ratio (16:9 landscape or 9:16 portrait)
-  - Person generation policy (allow/don't allow)
-  - Custom seed values for reproducible results
-  - Audio generation toggle
+
+### Customizable Parameters
+- Video length (4, 6, or 8 seconds)
+- Aspect ratio (16:9 landscape or 9:16 portrait)
+- Person generation policy (allow/don't allow)
+- Custom seed values for reproducible results
+- Audio generation toggle
+
+### Additional Features
 - **Intelligent File Management**: Auto-cleanup after download, comprehensive GCS cleanup
 - **Usage Tracking**: Webhook integration for monitoring generation requests
 - **Development Mode**: Local development with authentication bypass
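Reviewer note: the per-second prices added to the model list above can be turned into a rough per-request estimate. The sketch below is not part of the commit. `estimate_cost` is a hypothetical helper, and it assumes that each video in a multi-video request is billed separately at the listed rate, which the README does not state.

```python
# Per-second prices in USD, copied from the model list in the README diff.
PRICE_PER_SECOND = {
    "Veo 3.1": 0.40,
    "Veo 3.1 Fast": 0.15,
    "Veo 3.0": 0.40,
    "Veo 3.0 Fast": 0.15,
}

ALLOWED_LENGTHS_SEC = (4, 6, 8)  # video lengths listed under "Customizable Parameters"


def estimate_cost(model: str, video_length_sec: int, sample_count: int = 1) -> float:
    """Hypothetical helper: estimated USD cost of one generation request.

    Assumption: every video in the batch is billed at the per-second rate.
    """
    if model not in PRICE_PER_SECOND:
        raise ValueError(f"unknown model: {model!r}")
    if video_length_sec not in ALLOWED_LENGTHS_SEC:
        raise ValueError("video length must be 4, 6, or 8 seconds")
    if not 1 <= sample_count <= 4:
        raise ValueError("the README allows 1-4 videos per request")
    return round(PRICE_PER_SECOND[model] * video_length_sec * sample_count, 2)
```

Under that billing assumption, four 8-second Veo 3.1 Standard videos in one request would come to `estimate_cost("Veo 3.1", 8, 4)`.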
@@ -49,7 +64,7 @@ cd veo3_poc

 - Python 3.13+ (or 3.8+)
 - Node.js 16+
-- Google Cloud Project with Veo 3.0 API access
+- Google Cloud Project with Veo 3.0 and Veo 3.1 API access
 - Google Cloud Storage bucket
 - Service account JSON key with appropriate permissions
 - Microsoft Azure AD application configured (for production SSO)
@@ -232,10 +247,10 @@ The application uses environment-specific configuration files:
 |----------|-------------|-----------------|
 | `PROJECT_ID` | Google Cloud project ID | `optical-414516` |
 | `REGION` | Google Cloud region | `us-central1` |
-| `MODEL_ID` | Veo model identifier | `veo-3.0-generate-preview` |
-| `MODEL_FAST_ID` | Veo Fast model identifier | `veo-3.0-fast-generate-preview` |
+| `MODEL_ID` | Default Veo model identifier | `veo-3.0-generate-preview` |
+| `MODEL_FAST_ID` | Default Veo Fast model identifier | `veo-3.0-fast-generate-preview` |
 | `OUTPUT_GCS_BUCKET_NAME` | GCS bucket for temporary storage | `optical-veo3-test` |
-| `SERVICE_ACCOUNT_KEY_PATH` | Path to service account JSON | `../service-account.json` |
+| `SERVICE_ACCOUNT_KEY_PATH` | Path to service account JSON | `./service-account.json` |
 | `PORT` | Backend server port | `7394` |
 | `FLASK_ENV` | Environment mode | `development` or `production` |
 | `FLASK_DEBUG` | Debug mode | `True` or `False` |
@@ -243,6 +258,12 @@ The application uses environment-specific configuration files:
 | `WEBHOOK_URL` | Usage tracking webhook URL | Optional |
 | `WEBHOOK_ENABLED` | Enable usage tracking | `true` or `false` |

+**Available Models:**
+- `veo-3.0-generate-preview` - Veo 3.0 Standard
+- `veo-3.0-fast-generate-preview` - Veo 3.0 Fast
+- `veo-3.1-generate-preview` - Veo 3.1 Standard (with advanced features)
+- `veo-3.1-fast-generate-preview` - Veo 3.1 Fast (frame interpolation only)
+
 ### Frontend Environment Variables

 | Variable | Description | Example |
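Reviewer note: the `MODEL_ID` and `MODEL_FAST_ID` defaults and the "Available Models" list above suggest a simple startup check. The following is a minimal sketch, not the backend's actual code, which this diff does not show. `load_model_config` and `KNOWN_MODELS` are hypothetical names; only the variable names, default values, and model identifiers come from the README.

```python
import os

# Model identifiers and labels from the "Available Models" list in the diff.
KNOWN_MODELS = {
    "veo-3.0-generate-preview": "Veo 3.0 Standard",
    "veo-3.0-fast-generate-preview": "Veo 3.0 Fast",
    "veo-3.1-generate-preview": "Veo 3.1 Standard",
    "veo-3.1-fast-generate-preview": "Veo 3.1 Fast",
}


def load_model_config(env=None) -> dict:
    """Hypothetical helper: read the default model IDs and reject unknown ones."""
    env = os.environ if env is None else env
    config = {
        # Defaults mirror the example values in the environment variable table.
        "MODEL_ID": env.get("MODEL_ID", "veo-3.0-generate-preview"),
        "MODEL_FAST_ID": env.get("MODEL_FAST_ID", "veo-3.0-fast-generate-preview"),
    }
    for name, value in config.items():
        if value not in KNOWN_MODELS:
            raise ValueError(f"{name}={value!r} is not one of the documented models")
    return config
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to exercise without touching the real environment.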
@@ -259,12 +280,12 @@ The application uses environment-specific configuration files:
 ### Backend
 - `flask==3.0.0` - Web framework
 - `flask-cors==4.0.0` - Cross-origin resource sharing
-- `google-genai==1.17.0` - Google Gen AI SDK for Veo 3.0
+- `google-genai==1.47.0` - Google Gen AI SDK for Veo 3.0 & 3.1 (with advanced features support)
 - `google-cloud-storage==2.12.0` - GCS file operations
 - `google-cloud-aiplatform==1.38.0` - Vertex AI platform
 - `hypercorn==0.15.0` - ASGI server for production
 - `python-dotenv==1.0.0` - Environment configuration
-- `Pillow==10.1.0` - Image processing
+- `Pillow==10.1.0` - Image processing and format conversion

 ### Frontend
 - `react==18.2.0` - UI framework
@@ -280,7 +301,7 @@ The application uses environment-specific configuration files:

 | Method | Endpoint | Description | Request Body |
 |--------|----------|-------------|--------------|
-| `POST` | `/api/generate` | Start video generation | `{ prompt, model_name, video_length_sec, aspect_ratio, person_generation, sampleCount, seed, generate_audio, image }` |
+| `POST` | `/api/generate` | Start video generation | `{ prompt, model_name, video_length_sec, aspect_ratio, person_generation, sampleCount, seed, generate_audio, image, lastFrame, referenceImage1, referenceImage2, referenceImage3 }` |
 | `GET` | `/api/status/<job_id>` | Check generation status | - |
 | `GET` | `/api/download/<job_id>` | Download completed content (auto-deletes job) | - |
 | `GET` | `/api/download/<job_id>/video/<index>` | Download individual video | - |
@@ -291,6 +312,11 @@ The application uses environment-specific configuration files:
 | `DELETE` | `/api/delete/<job_id>` | Delete job completely | - |
 | `DELETE` | `/api/cleanup/<job_id>` | Manual cleanup of temp files | - |

+**Veo 3.1 Image Parameters:**
+- `image` - First frame image (optional, all models)
+- `lastFrame` - Last frame image for interpolation (optional, Veo 3.1 only, requires 8-second duration)
+- `referenceImage1`, `referenceImage2`, `referenceImage3` - Reference images for content guidance (optional, Veo 3.1 Standard only, requires 16:9 aspect ratio and 8-second duration)
+
 ### Health Check Routes

 | Method | Endpoint | Description |
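Reviewer note: the extended `/api/generate` request body above can be illustrated with a small payload builder. This is a sketch under stated assumptions, not the application's client code. The field names come from the README. The JSON body with base64-encoded image bytes, the `"allow"` value for `person_generation`, and the helper name `build_generate_payload` are all assumptions, since the diff does not say how images or policy values are encoded.

```python
import base64
import json


def _b64(data: bytes) -> str:
    """Encode raw image bytes as ASCII base64 (assumed transport encoding)."""
    return base64.b64encode(data).decode("ascii")


def build_generate_payload(
    prompt,
    model_name,
    video_length_sec=8,
    aspect_ratio="16:9",
    person_generation="allow",  # assumed value; the README only says allow/don't allow
    sample_count=1,
    seed=None,
    generate_audio=True,
    image=None,
    last_frame=None,
    reference_images=(),
):
    """Hypothetical helper: assemble the documented request fields into a dict."""
    if len(reference_images) > 3:
        raise ValueError("at most 3 reference images are documented")
    payload = {
        "prompt": prompt,
        "model_name": model_name,
        "video_length_sec": video_length_sec,
        "aspect_ratio": aspect_ratio,
        "person_generation": person_generation,
        "sampleCount": sample_count,
        "generate_audio": generate_audio,
    }
    # Optional fields are only included when the caller supplies them.
    if seed is not None:
        payload["seed"] = seed
    if image is not None:
        payload["image"] = _b64(image)
    if last_frame is not None:
        payload["lastFrame"] = _b64(last_frame)
    for index, ref in enumerate(reference_images, start=1):
        payload[f"referenceImage{index}"] = _b64(ref)
    return payload
```

The resulting dictionary can be serialized with `json.dumps(payload)` before being sent with whatever HTTP client the caller prefers.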
@@ -301,16 +327,19 @@ The application uses environment-specific configuration files:
 ## Video Generation Lifecycle

 ### Job Submission & Queuing
-1. **User Input**: User provides prompt, optional image, and generation parameters (1-4 videos)
-2. **Job Creation**: Backend creates unique job ID and validates parameters
+1. **User Input**: User provides prompt, optional images (first frame, last frame, reference images), and generation parameters (1-4 videos)
+2. **Job Creation**: Backend creates unique job ID and validates parameters including Veo 3.1 feature constraints
 3. **Queue Management**: Job added to global FIFO queue (unlimited per user)
 4. **Queue Position**: Job displayed in "In Queue" section with position indicator

 ### Processing Pipeline
 5. **Queue Processing**: Background thread picks next job when processing slot available (max 2 concurrent)
 6. **Status Transition**: Job moves from "In Queue" to "Currently Processing" section
-7. **Image Processing** (if provided): Image validated, converted to JPEG, uploaded to GCS
-8. **API Calls**: Multiple requests sent to Google Gen AI SDK for requested video count
+7. **Image Processing** (if provided):
+   - First frame: Validated, converted to JPEG, uploaded to GCS (all models)
+   - Last frame: Processed for frame interpolation (Veo 3.1 only)
+   - Reference images: Up to 3 images processed for content guidance (Veo 3.1 Standard only)
+8. **API Calls**: Multiple requests sent to Google Gen AI SDK with appropriate parameters for selected model
 9. **Backend Polling**: Long-running operations polled every 30 seconds with retry logic
 10. **Progress Updates**: Frontend polls status every 2 seconds for real-time updates

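Reviewer note: step 9 above ("polled every 30 seconds with retry logic") is described but not shown. One way such a loop could look is sketched below. It does not call the Google Gen AI SDK. `fetch_status` is a hypothetical callable standing in for whatever the backend uses to refresh a long-running operation, and the `done` key and the retry limit of 3 are assumptions.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_sec: float = 30.0,  # polling interval stated in the README
    max_retries: int = 3,        # assumed limit; the README only says "retry logic"
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Hypothetical helper: poll until the status reports done, tolerating transient errors."""
    consecutive_errors = 0
    while True:
        try:
            status = fetch_status()
        except Exception:
            # Treat any failure as transient until the retry budget is exhausted.
            consecutive_errors += 1
            if consecutive_errors > max_retries:
                raise
        else:
            consecutive_errors = 0
            if status.get("done"):
                return status
        sleep(interval_sec)
```

Injecting `sleep` keeps the loop testable without waiting 30 seconds per iteration. A production version would likely also enforce an overall timeout.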
@@ -364,10 +393,27 @@ sudo journalctl -u veo-video-generator -f
 |-------|----------------|----------|
 | **Authentication fails** | Azure AD misconfiguration | Verify `VITE_MSAL_CLIENT_ID`, `VITE_MSAL_AUTHORITY`, and redirect URIs match Azure AD app |
 | **Backend connection error** | Service not running or CORS issue | Check `systemctl status veo-video-generator` and `FRONTEND_URL` in backend `.env` |
-| **Video generation fails** | Invalid credentials or API access | Verify service account permissions and Veo 3.0 API is enabled in GCP |
+| **Video generation fails** | Invalid credentials or API access | Verify service account permissions and Veo 3.0/3.1 APIs are enabled in GCP |
 | **Image upload rejected** | Invalid format or size | Ensure image is <10MB and meets minimum 720x720 resolution |
 | **Download hangs** | GCS permission issue | Check service account has `storage.objects.get` permission on bucket |
-| **Model not found** | Wrong region or model ID | Verify Veo 3.0 is available in specified `REGION` |
+| **Model not found** | Wrong region or model ID | Verify Veo 3.0/3.1 is available in specified `REGION` |
+| **Reference images fail** | Wrong model or constraints | Reference images require Veo 3.1 Standard model, 16:9 aspect ratio, and 8-second duration |
+| **Last frame fails** | Wrong constraints | Last frame interpolation requires Veo 3.1 model (Standard or Fast) and 8-second duration |
+| **SDK parameter error** | Outdated SDK version | Ensure `google-genai>=1.47.0` is installed for Veo 3.1 features |

+### Veo 3.1 Feature Requirements
+
+**Frame Interpolation (Last Frame):**
+- ✅ Supported models: `veo-3.1-generate-preview`, `veo-3.1-fast-generate-preview`
+- ✅ Required duration: 8 seconds
+- ✅ Supported aspect ratios: 16:9, 9:16
+
+**Reference Images:**
+- ✅ Supported model: `veo-3.1-generate-preview` (Standard only, NOT Fast)
+- ✅ Required duration: 8 seconds
+- ✅ Required aspect ratio: 16:9 only
+- ✅ Maximum images: 3 reference images
+- ❌ Not supported in: Veo 3.1 Fast, Veo 3.0 models
+
 ### Debug Mode

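Reviewer note: the feature requirements added above map directly onto a validation routine. The sketch below shows one way the documented rules could be checked before a job is queued. It is not taken from the backend. `validate_veo31_features` is a hypothetical name, and only the model identifiers and constraints come from the README.

```python
VEO_31_STANDARD = "veo-3.1-generate-preview"
VEO_31_FAST = "veo-3.1-fast-generate-preview"


def validate_veo31_features(
    model_name: str,
    video_length_sec: int,
    aspect_ratio: str,
    has_last_frame: bool = False,
    reference_image_count: int = 0,
) -> list[str]:
    """Hypothetical helper: return a list of constraint violations (empty if valid)."""
    errors = []
    if has_last_frame:
        # Frame interpolation: Veo 3.1 Standard or Fast, 8 seconds, either aspect ratio.
        if model_name not in (VEO_31_STANDARD, VEO_31_FAST):
            errors.append("last frame requires a Veo 3.1 model")
        if video_length_sec != 8:
            errors.append("last frame requires an 8-second duration")
    if reference_image_count > 0:
        # Reference images: Veo 3.1 Standard only, 8 seconds, 16:9, at most 3 images.
        if model_name != VEO_31_STANDARD:
            errors.append("reference images require Veo 3.1 Standard")
        if video_length_sec != 8:
            errors.append("reference images require an 8-second duration")
        if aspect_ratio != "16:9":
            errors.append("reference images require a 16:9 aspect ratio")
        if reference_image_count > 3:
            errors.append("at most 3 reference images are supported")
    return errors
```

Returning every violation at once, rather than raising on the first, lets the frontend show all problems in a single response.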
@@ -457,9 +503,12 @@ For local testing without authentication:
 ## Notes

 - The original `veo.py` standalone script has been replaced by the full-stack application
+- **Quad model support**: Veo 3.0 (Standard & Fast) and Veo 3.1 (Standard & Fast)
+- **Veo 3.1 advanced features**: Frame interpolation and reference images with conditional UI
 - Multi-video generation support (1-4 videos per request)
 - Unlimited job submissions with intelligent queue management
 - Complete job lifecycle management with cancel/retry/delete functionality
 - Generated videos are automatically cleaned up after download
 - Image uploads are automatically converted to JPEG format regardless of input format
-- The application uses in-memory job status tracking (consider Redis for production scaling)
+- The application uses in-memory job status tracking (consider Redis for production scaling)
+- SDK upgraded to `google-genai==1.47.0` for Veo 3.1 feature support