Repository files (latest commit first)
Filename	Latest commit message	Latest commit date
Manish Tanwar 618273610b Merge branch 'feature-branch'		2025-12-01 12:50:44 +05:30
backend	pip packages update	2025-12-01 12:44:45 +05:30
frontend	making single veo 3.1 model default	2025-11-15 06:00:07 +05:30
.gitignore	veo3.1 features	2025-11-04 02:31:40 +05:30
apache-htaccess.txt	initial commit	2025-09-30 09:49:55 -05:00
apache.conf	initial commit	2025-09-30 09:49:55 -05:00
CLAUDE.md	Latest Update	2025-12-01 12:29:22 +05:30
extract_usage_logs.sh	initial commit	2025-09-30 09:49:55 -05:00
README.md	Readme update for veo 3.1 model default	2025-11-15 06:07:15 +05:30
run-dev.sh	initial commit	2025-09-30 09:49:55 -05:00
video-generation-lifecycle.md	Unlimited Jobs + Job life-cycle	2025-10-11 00:07:33 +05:30

README.md

Veo 3.1 Video Generator

A full-stack web application for generating AI videos using Google's Veo 3.1 models. Generate videos from text prompts with advanced features including frame interpolation, reference images, and customizable parameters.

Quick Start

```
# Navigate to the cloned project
cd veo3_poc

# Ensure service-account.json is in the root directory

# Run development servers (both frontend and backend)
./run-dev.sh

# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:7394
```

Architecture

Frontend: React 18 + Vite + Material-UI 5 + Montserrat typography
Backend: Flask 3.0 + Google Gen AI SDK 1.47.0
Authentication: Microsoft Azure AD SSO (MSAL 2.0)
Storage: Google Cloud Storage for temporary video and image files
Deployment: Systemd service + Apache reverse proxy

Features

Core Video Generation

Text-to-Video Generation: Create videos from descriptive text prompts
Image-to-Video Generation: Upload first frame images to guide video generation
Dual Model Support: Choose between two models:
- Veo 3.1 (Standard): High-quality with advanced features - $0.40/sec
- Veo 3.1 Fast: Optimized speed with frame interpolation - $0.15/sec
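
The per-second rates above suggest a simple cost estimate. The sketch below assumes each video is billed per generated second and that a multi-video request costs the sum of its videos; the README only lists the rates, so treat that billing model as an assumption. The function name and the `"standard"`/`"fast"` labels are hypothetical.

```python
# Per-second rates quoted in this README (USD). The keys are hypothetical labels.
RATE_PER_SECOND = {
    "standard": 0.40,  # Veo 3.1 (Standard)
    "fast": 0.15,      # Veo 3.1 Fast
}


def estimate_cost(model: str, video_length_sec: int, sample_count: int = 1) -> float:
    """Estimate the cost of one request, assuming per-second, per-video billing."""
    if model not in RATE_PER_SECOND:
        raise ValueError(f"unknown model label: {model!r}")
    return round(RATE_PER_SECOND[model] * video_length_sec * sample_count, 2)


if __name__ == "__main__":
    # Four 8-second videos on each model.
    print(estimate_cost("standard", 8, 4))
    print(estimate_cost("fast", 8, 4))
```

Under these assumptions, four 8-second Standard videos cost 32 billable seconds at $0.40 each.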

Veo 3.1 Advanced Features

Frame Interpolation: Upload both first and last frames to generate smooth transitions between them (8-second videos only)
Reference Images: Guide video content with up to 3 reference images for consistent characters, objects, or styles (16:9 aspect ratio, 8-second videos, Standard model only)
Conditional UI: Advanced features automatically appear/disappear based on selected model capabilities

Job Management

Multi-Video Generation: Generate 1-4 videos per request with batch processing
Unlimited Job Queue: Submit unlimited video generation jobs with FIFO processing
Advanced Job Management: Cancel, retry, and delete jobs with complete cleanup
Real-time Queue Visualization: Live status updates with three-section queue display

Customizable Parameters

Video length (4, 6, or 8 seconds)
Aspect ratio (16:9 landscape or 9:16 portrait)
Person generation policy (allow/don't allow)
Custom seed values for reproducible results
Audio generation toggle

Additional Features

Intelligent File Management: Auto-cleanup after download, comprehensive GCS cleanup
Usage Tracking: Webhook integration for monitoring generation requests
Development Mode: Local development with authentication bypass

Prerequisites

Python 3.8+ (3.13 recommended)
Node.js 16+
Google Cloud Project with Veo 3.1 API access
Google Cloud Storage bucket
Service account JSON key with appropriate permissions
Microsoft Azure AD application configured (for production SSO)

Setup Instructions

Backend Setup

Navigate to the backend directory:
```
cd backend
```

Create and activate virtual environment:

```
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate  # Windows
```

Install dependencies:
```
pip install -r requirements.txt
```

Configure environment variables:

```
# For development
cp .env.development .env

# For production
cp .env.production .env
# Edit .env with your specific configuration if needed
```

Run in development:
```
python app.py
```

Frontend Setup

Navigate to the frontend directory:
```
cd frontend
```
Install dependencies:
```
npm install
```

Configure environment variables:

```
# For development
cp .env.development .env

# For production
cp .env.production .env
# Edit .env with your specific configuration if needed
```

Run in development:
```
npm run dev
```
Build for production:
```
npm run build
```

Production Deployment

Backend Deployment (systemd service)

Copy the backend files to your server
Update paths in veo-video-generator.service

Install the service file, then enable and start the service:
```
sudo cp veo-video-generator.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable veo-video-generator
sudo systemctl start veo-video-generator
```

Frontend Deployment

Build the frontend:
```
cd frontend
npm run build
```
Copy dist/ contents to your web server directory:
```
cp -r dist/* /path/to/your/web/server/veo/
```

Apache Configuration

Add the Apache configuration to your virtual host. Update paths as needed.

Required Apache Modules

Ensure these modules are enabled:

```
sudo a2enmod proxy
sudo a2enmod proxy_http
sudo a2enmod rewrite
sudo a2enmod headers
sudo a2enmod expires
sudo systemctl restart apache2
```

Configuration Files

Main Apache Config: Use apache.conf for virtual host configuration
Frontend .htaccess: Copy apache-htaccess.txt to /path/to/your/web/server/veo/.htaccess

Project Structure

```
veo3_poc/
├── backend/                    # Flask backend application
│   ├── routes/                # API and health check endpoints
│   │   ├── api.py            # Main API routes (generate, status, download, cleanup)
│   │   └── health.py         # Health check endpoints
│   ├── utils/                 # Utility modules
│   │   ├── auth.py           # Google Cloud authentication
│   │   └── storage.py        # GCS operations and image processing
│   ├── app.py                # Flask app initialization and CORS config
│   ├── config.py             # Configuration management
│   ├── video_generator.py   # Core Veo 3.1 integration logic
│   ├── requirements.txt      # Python dependencies
│   ├── .env.development      # Development environment config
│   ├── .env.production       # Production environment config
│   └── temp_downloads/       # Temporary video storage
├── frontend/                  # React frontend application
│   ├── src/
│   │   ├── components/       # React components
│   │   │   ├── VideoForm.jsx        # Main video generation form
│   │   │   ├── VideoGenerator.jsx   # Top-level container
│   │   │   ├── ProgressIndicator.jsx # Status display
│   │   │   ├── Layout.jsx           # App layout wrapper
│   │   │   ├── AuthGuard.jsx        # Authentication wrapper
│   │   │   └── DevAuthWrapper.jsx   # Dev mode auth bypass
│   │   ├── config/           # MSAL configuration
│   │   ├── services/         # API service layer
│   │   ├── hooks/            # Custom React hooks
│   │   └── App.jsx           # Main app component
│   ├── .env.development      # Development environment config
│   ├── .env.production       # Production environment config
│   └── package.json          # Node.js dependencies
├── service-account.json       # Google Cloud service account key
├── run-dev.sh                # Development startup script
├── apache.conf               # Apache virtual host configuration
├── apache-htaccess.txt       # Frontend .htaccess rules
└── veo-video-generator.service # Systemd service definition
```

Configuration

Environment File Structure

The application uses environment-specific configuration files:

Backend:

.env.development - Debug mode, localhost CORS, development settings
.env.production - Production mode, strict CORS, optimized for deployment
.env - Active environment file (copy from development or production)

Frontend:

.env.development - Localhost API, authentication bypass (VITE_DEV_MODE=true)
.env.production - Production API, MSAL authentication enabled (VITE_DEV_MODE=false)
.env - Active environment file (copy from development or production)

Backend Environment Variables

| Variable | Description | Default/Example |
| --- | --- | --- |
| `PROJECT_ID` | Google Cloud project ID | `optical-414516` |
| `REGION` | Google Cloud region | `us-central1` |
| `MODEL_ID` | Default Veo model identifier | `veo-3.0-generate-preview` |
| `MODEL_FAST_ID` | Default Veo Fast model identifier | `veo-3.0-fast-generate-preview` |
| `OUTPUT_GCS_BUCKET_NAME` | GCS bucket for temporary storage | `optical-veo3-test` |
| `SERVICE_ACCOUNT_KEY_PATH` | Path to service account JSON | `./service-account.json` |
| `PORT` | Backend server port | `7394` |
| `FLASK_ENV` | Environment mode | `development` or `production` |
| `FLASK_DEBUG` | Debug mode | `True` or `False` |
| `FRONTEND_URL` | Frontend URL for CORS | `http://localhost:3000` or production URL |
| `WEBHOOK_URL` | Usage tracking webhook URL | Optional |
| `WEBHOOK_ENABLED` | Enable usage tracking | `true` or `false` |

Available Models:

veo-3.1-generate-preview - Veo 3.1 Standard (with advanced features)
veo-3.1-fast-generate-preview - Veo 3.1 Fast (frame interpolation only)
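
As an illustration of how these variables could be read at startup, here is a minimal sketch using only the standard library. It is not the contents of `config.py`: the `BackendConfig` class and `load_config` function are hypothetical, and which variables are required versus defaulted is an assumption (only `REGION` and `PORT` fall back to the values shown in the table).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BackendConfig:
    """Hypothetical container for the backend environment variables."""
    project_id: str
    region: str
    model_id: str
    model_fast_id: str
    bucket: str
    port: int
    webhook_enabled: bool
    webhook_url: Optional[str]


def load_config(env: Optional[Mapping[str, str]] = None) -> BackendConfig:
    """Build a config from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return BackendConfig(
        project_id=env["PROJECT_ID"],            # required (assumption)
        region=env.get("REGION", "us-central1"),
        model_id=env["MODEL_ID"],                # required (assumption)
        model_fast_id=env["MODEL_FAST_ID"],      # required (assumption)
        bucket=env["OUTPUT_GCS_BUCKET_NAME"],    # required (assumption)
        port=int(env.get("PORT", "7394")),
        # Accept "true"/"True"/"TRUE"; anything else disables tracking.
        webhook_enabled=env.get("WEBHOOK_ENABLED", "false").strip().lower() == "true",
        webhook_url=env.get("WEBHOOK_URL") or None,
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching the process environment.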

Frontend Environment Variables

| Variable | Description | Example |
| --- | --- | --- |
| `VITE_API_BASE_URL` | Backend API URL | `http://localhost:7394` |
| `VITE_APP_TITLE` | Application title | `Veo Video Generator (Dev)` |
| `VITE_DEV_MODE` | Development mode flag | `true` or `false` |
| `VITE_MSAL_CLIENT_ID` | Azure AD client ID | `dd434534-...` |
| `VITE_MSAL_AUTHORITY` | Azure AD authority URL | `https://login.microsoftonline.com/...` |
| `VITE_MSAL_REDIRECT_URI` | Authentication redirect URI | `http://localhost:3000` |

Key Dependencies

Backend

flask==3.0.0 - Web framework
flask-cors==4.0.0 - Cross-origin resource sharing
google-genai==1.47.0 - Google Gen AI SDK for Veo 3.1 (with advanced features support)
google-cloud-storage==2.12.0 - GCS file operations
google-cloud-aiplatform==1.38.0 - Vertex AI platform
hypercorn==0.15.0 - ASGI server for production
python-dotenv==1.0.0 - Environment configuration
Pillow==10.1.0 - Image processing and format conversion

Frontend

react==18.2.0 - UI framework
@mui/material==5.15.1 - Material-UI component library
@azure/msal-react==2.0.7 - Microsoft authentication
axios==1.6.2 - HTTP client
vite==5.0.8 - Build tool and dev server
@fontsource/montserrat==5.0.16 - Typography

API Endpoints

Main API Routes (`/api`)

| Method | Endpoint | Description | Request Body |
| --- | --- | --- | --- |
| `POST` | `/api/generate` | Start video generation | `{ prompt, model_name, video_length_sec, aspect_ratio, person_generation, sampleCount, seed, generate_audio, image, lastFrame, referenceImage1, referenceImage2, referenceImage3 }` |
| `GET` | `/api/status/<job_id>` | Check generation status | - |
| `GET` | `/api/download/<job_id>` | Download completed content (auto-deletes job) | - |
| `GET` | `/api/download/<job_id>/video/<index>` | Download individual video | - |
| `GET` | `/api/user-jobs` | Get all jobs for user | Query: `user_email` |
| `GET` | `/api/queue-status` | Get overall queue status | - |
| `POST` | `/api/cancel/<job_id>` | Cancel queued/processing job | - |
| `POST` | `/api/retry/<job_id>` | Retry failed/cancelled job | - |
| `DELETE` | `/api/delete/<job_id>` | Delete job completely | - |
| `DELETE` | `/api/cleanup/<job_id>` | Manual cleanup of temp files | - |

Veo 3.1 Image Parameters:

image - First frame image (optional, all models)
lastFrame - Last frame image for interpolation (optional, Veo 3.1 only, requires 8-second duration)
referenceImage1, referenceImage2, referenceImage3 - Reference images for content guidance (optional, Veo 3.1 Standard only, requires 16:9 aspect ratio and 8-second duration)
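
A text-only request to `/api/generate` could be assembled as in the sketch below. The field names come from the table above; everything else is an assumption, in particular that the endpoint accepts a JSON body (image uploads may well use a different encoding) and that the response is JSON. The helper names are hypothetical, and `person_generation` is omitted because its accepted values are not documented here.

```python
import json
import urllib.request

API_BASE_URL = "http://localhost:7394"  # development backend URL from this README


def build_generate_payload(prompt, model_name, video_length_sec=8,
                           aspect_ratio="16:9", sample_count=1,
                           seed=None, generate_audio=True):
    """Validate the documented parameter ranges and return the request body."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if video_length_sec not in (4, 6, 8):
        raise ValueError("video_length_sec must be 4, 6, or 8")
    if aspect_ratio not in ("16:9", "9:16"):
        raise ValueError("aspect_ratio must be '16:9' or '9:16'")
    if not 1 <= sample_count <= 4:
        raise ValueError("sampleCount must be between 1 and 4")
    payload = {
        "prompt": prompt,
        "model_name": model_name,
        "video_length_sec": video_length_sec,
        "aspect_ratio": aspect_ratio,
        "sampleCount": sample_count,
        "generate_audio": generate_audio,
    }
    if seed is not None:
        payload["seed"] = seed
    return payload


def submit(payload):
    """POST the payload as JSON (assumed encoding) and decode the JSON reply."""
    request = urllib.request.Request(
        f"{API_BASE_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Validating on the client mirrors the documented limits (4/6/8 seconds, two aspect ratios, 1-4 videos) so obviously invalid requests never reach the queue.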

Health Check Routes

| Method | Endpoint | Description |
| --- | --- | --- |
| `GET` | `/health` | Detailed health check with configuration info |
| `GET` | `/ping` | Simple ping response |

Video Generation Lifecycle

Job Submission & Queuing

User Input: User provides prompt, optional images (first frame, last frame, reference images), and generation parameters (1-4 videos)
Job Creation: Backend creates unique job ID and validates parameters including Veo 3.1 feature constraints
Queue Management: Job added to global FIFO queue (unlimited per user)
Queue Position: Job displayed in "In Queue" section with position indicator

Processing Pipeline

Queue Processing: Background thread picks next job when processing slot available (max 2 concurrent)
Status Transition: Job moves from "In Queue" to "Currently Processing" section
Image Processing (if provided):
- First frame: Validated, converted to JPEG, uploaded to GCS (all models)
- Last frame: Processed for frame interpolation (Veo 3.1 only)
- Reference images: Up to 3 images processed for content guidance (Veo 3.1 Standard only)
API Calls: Multiple requests sent to Google Gen AI SDK with appropriate parameters for selected model
Backend Polling: Long-running operations polled every 30 seconds with retry logic
Progress Updates: Frontend polls status every 2 seconds for real-time updates
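
Both polling loops described above (the backend at 30-second intervals, the frontend at 2-second intervals) follow the same pattern: ask for the status, stop on a terminal state, otherwise wait and repeat. A minimal sketch follows; the status strings, the timeout, and the function name are assumptions rather than the application's actual values.

```python
import time

# Assumed terminal status strings; the real API may use different names.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def poll_until_done(fetch_status, interval_sec=2.0, timeout_sec=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal state or time runs out.

    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_sec
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_sec} seconds")
        sleep(interval_sec)
```

For the backend case the same loop would be called with `interval_sec=30.0` and a `fetch_status` that checks the long-running operation.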

Completion & Cleanup

Video Download: Completed videos downloaded from GCS to local temp storage
File Packaging: Multiple videos and images packaged into downloadable zip
User Download: Videos served to user with multiple download options
Auto-cleanup: Job automatically deleted 5 seconds after successful download
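
The packaging and delayed-cleanup steps can be sketched with the standard library alone. This is an illustration under assumptions, not the backend's code: the function names are hypothetical, the archive is written into the job directory itself, and a `threading.Timer` stands in for whatever mechanism the application uses to wait 5 seconds.

```python
import shutil
import threading
import zipfile
from pathlib import Path


def package_job(job_dir: Path) -> Path:
    """Zip every file in job_dir into job_dir/<dirname>.zip and return its path."""
    archive = job_dir / f"{job_dir.name}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(job_dir.iterdir()):
            # Skip subdirectories and the archive currently being written.
            if path.is_file() and path != archive:
                zf.write(path, arcname=path.name)
    return archive


def schedule_cleanup(job_dir: Path, delay_sec: float = 5.0) -> threading.Timer:
    """Delete job_dir after delay_sec seconds on a background timer thread."""
    timer = threading.Timer(
        delay_sec, shutil.rmtree, args=(job_dir,), kwargs={"ignore_errors": True}
    )
    timer.daemon = True  # do not keep the process alive just for cleanup
    timer.start()
    return timer
```

Returning the timer lets a caller cancel the cleanup (for example, if the download fails) with `timer.cancel()`.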

Job Management Actions

Cancel: Remove from queue or stop active processing
Retry: Re-queue failed/cancelled jobs with original parameters
Delete: Complete removal of job data, local files, and GCS resources
Download Options: Individual videos or complete zip package

Security

CORS configured for specific frontend domain(s)
Azure AD SSO authentication in production (bypassed in dev mode)
Automatic cleanup of temporary files after download
Service account with minimal required GCS permissions
Secure headers in Apache configuration
Backend service runs as non-root user in production

Monitoring and Logging

Backend Logs

```
# View systemd service logs (production)
sudo journalctl -u veo-video-generator -f

# View Flask app logs (development)
# Logs printed to terminal running app.py
```

Frontend Logs

Browser console for React errors
Network tab for API request/response debugging
Apache access logs: /var/log/apache2/access.log

Usage Tracking

Webhook integration sends generation requests to configured endpoint
Tracks: user email, prompt, model, timestamp
Can be disabled via WEBHOOK_ENABLED=false
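
A usage event carrying the four tracked fields might be built and sent as follows. The JSON key names, the ISO 8601 timestamp format, and the choice to swallow delivery errors are all assumptions made for this sketch; the function names are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_usage_event(user_email, prompt, model):
    """Collect the tracked fields; key names are illustrative."""
    return {
        "user_email": user_email,
        "prompt": prompt,
        "model": model,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def send_usage_event(event, webhook_url, enabled, timeout_sec=5.0):
    """POST the event as JSON. Returns True on success, False otherwise."""
    if not enabled or not webhook_url:
        return False  # tracking disabled or not configured
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout_sec):
            return True
    except OSError:
        # Design assumption: a tracking failure should not fail the generation.
        return False
```

`urllib.error.URLError` is a subclass of `OSError`, so the single `except` clause also covers connection and HTTP errors.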

Troubleshooting

Common Issues

| Issue | Possible Cause | Solution |
| --- | --- | --- |
| Authentication fails | Azure AD misconfiguration | Verify `VITE_MSAL_CLIENT_ID`, `VITE_MSAL_AUTHORITY`, and redirect URIs match Azure AD app |
| Backend connection error | Service not running or CORS issue | Check `systemctl status veo-video-generator` and `FRONTEND_URL` in backend `.env` |
| Video generation fails | Invalid credentials or API access | Verify service account permissions and Veo 3.1 APIs are enabled in GCP |
| Image upload rejected | Invalid format or size | Ensure image is <10MB and meets minimum 720x720 resolution |
| Download hangs | GCS permission issue | Check service account has `storage.objects.get` permission on bucket |
| Model not found | Wrong region or model ID | Verify Veo 3.1 is available in specified `REGION` |
| Reference images fail | Wrong model or constraints | Reference images require Veo 3.1 Standard model, 16:9 aspect ratio, and 8-second duration |
| Last frame fails | Wrong constraints | Last frame interpolation requires Veo 3.1 model (Standard or Fast) and 8-second duration |
| SDK parameter error | Outdated SDK version | Ensure `google-genai>=1.47.0` is installed for Veo 3.1 features |

Veo 3.1 Feature Requirements

Frame Interpolation (Last Frame):

✅ Supported models: veo-3.1-generate-preview, veo-3.1-fast-generate-preview
✅ Required duration: 8 seconds
✅ Supported aspect ratios: 16:9, 9:16

Reference Images:

✅ Supported model: veo-3.1-generate-preview (Standard only, NOT Fast)
✅ Required duration: 8 seconds
✅ Required aspect ratio: 16:9 only
✅ Maximum images: 3 reference images
❌ Not supported in: Veo 3.1 Fast
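
The requirements above translate directly into a validation routine. The sketch below encodes exactly the listed rules; the function name, parameter names, and error messages are hypothetical, and the backend's own validation may be structured differently.

```python
from typing import List

STANDARD = "veo-3.1-generate-preview"
FAST = "veo-3.1-fast-generate-preview"


def validate_features(model: str, duration_sec: int, aspect_ratio: str,
                      has_last_frame: bool = False,
                      reference_image_count: int = 0) -> List[str]:
    """Return a list of constraint violations (empty when the request is valid)."""
    errors = []
    if has_last_frame:
        if model not in (STANDARD, FAST):
            errors.append("last frame requires a Veo 3.1 model")
        if duration_sec != 8:
            errors.append("last frame requires an 8-second duration")
    if reference_image_count > 0:
        if model != STANDARD:
            errors.append("reference images require Veo 3.1 Standard")
        if duration_sec != 8:
            errors.append("reference images require an 8-second duration")
        if aspect_ratio != "16:9":
            errors.append("reference images require the 16:9 aspect ratio")
        if reference_image_count > 3:
            errors.append("at most 3 reference images are allowed")
    return errors
```

Returning every violation at once, rather than stopping at the first, lets a form show all problems in a single round trip.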

Debug Mode

Enable detailed logging in development:

```
# Backend: set in .env
FLASK_DEBUG=True

# Frontend: check the browser console with React DevTools
```

Development

Local Development Setup

For local testing without authentication:

Quick Start (runs both backend and frontend):
```
./run-dev.sh
```

Manual Start:

Backend (Terminal 1):

```
cd backend
cp .env.development .env
python app.py
```

Frontend (Terminal 2):

```
cd frontend
npm run dev
```

Development Features

Authentication Bypass: MSAL/SSO automatically bypassed when VITE_DEV_MODE=true
CORS: Configured for localhost:3000 and 127.0.0.1:3000
Hot Reload: Vite dev server auto-reloads frontend on file changes
Debug Mode: Flask runs with detailed error pages and auto-reload
Mock User: Shows "Dev User" in the interface header

Development URLs

Backend API: http://localhost:7394
Frontend: http://localhost:3000
No authentication required in dev mode

Additional Files

user_docs.md: Comprehensive user documentation and feature guide
CLAUDE.md: AI assistant guidance for working with this codebase
extract_usage_logs.sh: Script for extracting usage data from webhook logs
veo3.zip: Archive of production deployment artifacts
.gitignore: Git exclusions (includes .env, node_modules, temp_downloads, etc.)

Video Generation Architecture

Job Queue System

Global Queue: FIFO processing with unlimited submissions per user
Concurrent Processing: Maximum 2 jobs processing simultaneously
Status Tracking: In-memory job status dictionary (consider Redis for scaling)
User Limits: No queue limits, but 1-4 videos per individual request
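
The queue behaviour described above (unbounded FIFO, at most 2 jobs in flight, in-memory status dictionary) can be approximated with `queue.Queue` and a fixed pool of worker threads. This is a swapped-in technique for illustration: the application describes a background thread that dispatches jobs when a slot is free, whereas this sketch simply runs two workers. The `JobQueue` class and the status strings are hypothetical.

```python
import queue
import threading

MAX_CONCURRENT = 2  # concurrency limit stated in this README


class JobQueue:
    """Unbounded FIFO job queue with a fixed number of worker threads."""

    def __init__(self, process_job, max_concurrent=MAX_CONCURRENT):
        self._jobs = queue.Queue()      # FIFO, no size limit
        self._lock = threading.Lock()
        self._process_job = process_job
        self.status = {}                # in-memory job status dictionary
        for _ in range(max_concurrent):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, job_id, params):
        self._set_status(job_id, "queued")
        self._jobs.put((job_id, params))

    def wait(self):
        """Block until every submitted job has finished."""
        self._jobs.join()

    def _set_status(self, job_id, state):
        with self._lock:
            self.status[job_id] = state

    def _worker(self):
        while True:
            job_id, params = self._jobs.get()
            self._set_status(job_id, "processing")
            try:
                self._process_job(job_id, params)
                self._set_status(job_id, "completed")
            except Exception:
                self._set_status(job_id, "failed")
            finally:
                self._jobs.task_done()
```

Because the status dictionary lives in process memory, it is lost on restart and cannot be shared between processes, which is why the note above suggests Redis for scaling.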

Queue Display Sections

Currently Processing: Jobs actively generating videos (highlighted in blue)
In Queue: Jobs waiting for processing slots (highlighted in orange)
History: Completed, failed, or cancelled jobs (standard styling)

File Management

Local Storage: temp_downloads/job_{job_id}/ for each job
GCS Integration: Temporary images uploaded to temp_images/ bucket path
Auto-cleanup: Jobs deleted 5 seconds after successful download
Manual Cleanup: Complete job deletion via delete button
Download Formats: Individual MP4s or complete ZIP packages

Job Actions by Status

Queued: Cancel, Delete
Processing: Cancel, Delete
Failed/Cancelled: Retry, Delete
Completed: Download All, Download Individual Videos (auto-deletes after download)
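
The status-to-action rules above amount to a lookup table. A minimal sketch, with hypothetical status and action identifiers:

```python
# Allowed actions per job status, mirroring the list above.
ALLOWED_ACTIONS = {
    "queued": {"cancel", "delete"},
    "processing": {"cancel", "delete"},
    "failed": {"retry", "delete"},
    "cancelled": {"retry", "delete"},
    "completed": {"download_all", "download_video"},
}


def can_perform(status: str, action: str) -> bool:
    """Return True if the action is offered for a job in the given status."""
    return action in ALLOWED_ACTIONS.get(status, set())
```

Keeping the rules in one table means the UI (which buttons to show) and the API (which requests to accept) can be checked against the same source.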

Notes

The original veo.py standalone script has been replaced by the full-stack application
Dual model support: Veo 3.1 (Standard & Fast)
Veo 3.1 advanced features: Frame interpolation and reference images with conditional UI
Multi-video generation support (1-4 videos per request)
Unlimited job submissions with intelligent queue management
Complete job lifecycle management with cancel/retry/delete functionality
Generated videos are automatically cleaned up after download
Image uploads are automatically converted to JPEG format regardless of input format
The application uses in-memory job status tracking (consider Redis for production scaling)
SDK upgraded to google-genai==1.47.0 for Veo 3.1 feature support