Veo 3.1 Video Generator

A full-stack web application for generating AI videos using Google's Veo 3.1 models. Generate videos from text prompts with advanced features including frame interpolation, reference images, and customizable parameters.

Quick Start

# Navigate to the project directory (after cloning the repository)
cd veo3_poc

# Ensure service-account.json is in the root directory

# Run development servers (both frontend and backend)
./run-dev.sh

# Access the application
# Frontend: http://localhost:3000
# Backend API: http://localhost:7394

Architecture

  • Frontend: React 18 + Vite + Material-UI 5 + Montserrat typography
  • Backend: Flask 3.0 + Google Gen AI SDK 1.47.0
  • Authentication: Microsoft Azure AD SSO (MSAL 2.0)
  • Storage: Google Cloud Storage for temporary video and image files
  • Deployment: Systemd service + Apache reverse proxy

Features

Core Video Generation

  • Text-to-Video Generation: Create videos from descriptive text prompts
  • Image-to-Video Generation: Upload first frame images to guide video generation
  • Dual Model Support: Choose between two models:
    • Veo 3.1 (Standard): High-quality with advanced features - $0.40/sec
    • Veo 3.1 Fast: Optimized speed with frame interpolation - $0.15/sec
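
As a rough illustration of the per-second pricing above, the cost of a request can be estimated as price × duration × number of videos. This is only a sketch: the function and dictionary names are hypothetical, and it assumes every second of every video in a request is billed at the listed rate.

```python
from decimal import Decimal

# Per-second prices in USD, taken from the model list above.
PRICE_PER_SECOND = {
    "Veo 3.1 (Standard)": Decimal("0.40"),
    "Veo 3.1 Fast": Decimal("0.15"),
}


def estimate_cost(model: str, seconds: int, video_count: int = 1) -> Decimal:
    """Return the estimated cost in USD for one generation request."""
    return PRICE_PER_SECOND[model] * seconds * video_count


if __name__ == "__main__":
    # Four 8-second videos on the Standard model.
    print(estimate_cost("Veo 3.1 (Standard)", 8, 4))
    # One 6-second video on the Fast model.
    print(estimate_cost("Veo 3.1 Fast", 6))
```

Decimal is used instead of float so that the currency arithmetic stays exact.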

Veo 3.1 Advanced Features

  • Frame Interpolation: Upload both first and last frames to generate smooth transitions between them (8-second videos only)
  • Reference Images: Guide video content with up to 3 reference images for consistent characters, objects, or styles (16:9 aspect ratio, 8-second videos, Standard model only)
  • Conditional UI: Advanced features automatically appear/disappear based on selected model capabilities

Job Management

  • Multi-Video Generation: Generate 1-4 videos per request with batch processing
  • Unlimited Job Queue: Submit unlimited video generation jobs with FIFO processing
  • Advanced Job Management: Cancel, retry, and delete jobs with complete cleanup
  • Real-time Queue Visualization: Live status updates with three-section queue display

Customizable Parameters

  • Video length (4, 6, or 8 seconds)
  • Aspect ratio (16:9 landscape or 9:16 portrait)
  • Person generation policy (allow/don't allow)
  • Custom seed values for reproducible results
  • Audio generation toggle

Additional Features

  • Intelligent File Management: Auto-cleanup after download, comprehensive GCS cleanup
  • Usage Tracking: Webhook integration for monitoring generation requests
  • Development Mode: Local development with authentication bypass

Prerequisites

  • Python 3.8+ (3.13+ recommended)
  • Node.js 16+
  • Google Cloud Project with Veo 3.1 API access
  • Google Cloud Storage bucket
  • Service account JSON key with appropriate permissions
  • Microsoft Azure AD application configured (for production SSO)

Setup Instructions

Backend Setup

  1. Navigate to the backend directory:

    cd backend
    
  2. Create and activate virtual environment:

    python -m venv venv
    source venv/bin/activate  # Linux/Mac
    # or
    venv\Scripts\activate  # Windows
    
  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Configure environment variables:

    # For development
    cp .env.development .env
    
    # For production
    cp .env.production .env
    # Edit .env with your specific configuration if needed
    
  5. Run in development:

    python app.py
    

Frontend Setup

  1. Navigate to the frontend directory:

    cd frontend
    
  2. Install dependencies:

    npm install
    
  3. Configure environment variables:

    # For development
    cp .env.development .env
    
    # For production
    cp .env.production .env
    # Edit .env with your specific configuration if needed
    
  4. Run in development:

    npm run dev
    
  5. Build for production:

    npm run build
    

Production Deployment

Backend Deployment (systemd service)

  1. Copy the backend files to your server
  2. Update paths in veo-video-generator.service
  3. Install, enable, and start the service:
    sudo cp veo-video-generator.service /etc/systemd/system/
    sudo systemctl daemon-reload
    sudo systemctl enable veo-video-generator
    sudo systemctl start veo-video-generator
    

Frontend Deployment

  1. Build the frontend:

    cd frontend
    npm run build
    
  2. Copy dist/ contents to your web server directory:

    cp -r dist/* /path/to/your/web/server/veo/
    

Apache Configuration

Add the Apache configuration to your virtual host. Update paths as needed.

Required Apache Modules

Ensure these modules are enabled:

sudo a2enmod proxy
sudo a2enmod proxy_http
sudo a2enmod rewrite
sudo a2enmod headers
sudo a2enmod expires
sudo systemctl restart apache2

Configuration Files

  1. Main Apache Config: Use apache.conf for virtual host configuration
  2. Frontend .htaccess: Copy apache-htaccess.txt to /path/to/your/web/server/veo/.htaccess

Project Structure

veo3_poc/
├── backend/                    # Flask backend application
│   ├── routes/                # API and health check endpoints
│   │   ├── api.py            # Main API routes (generate, status, download, cleanup)
│   │   └── health.py         # Health check endpoints
│   ├── utils/                 # Utility modules
│   │   ├── auth.py           # Google Cloud authentication
│   │   └── storage.py        # GCS operations and image processing
│   ├── app.py                # Flask app initialization and CORS config
│   ├── config.py             # Configuration management
│   ├── video_generator.py   # Core Veo 3.1 integration logic
│   ├── requirements.txt      # Python dependencies
│   ├── .env.development      # Development environment config
│   ├── .env.production       # Production environment config
│   └── temp_downloads/       # Temporary video storage
├── frontend/                  # React frontend application
│   ├── src/
│   │   ├── components/       # React components
│   │   │   ├── VideoForm.jsx        # Main video generation form
│   │   │   ├── VideoGenerator.jsx   # Top-level container
│   │   │   ├── ProgressIndicator.jsx # Status display
│   │   │   ├── Layout.jsx           # App layout wrapper
│   │   │   ├── AuthGuard.jsx        # Authentication wrapper
│   │   │   └── DevAuthWrapper.jsx   # Dev mode auth bypass
│   │   ├── config/           # MSAL configuration
│   │   ├── services/         # API service layer
│   │   ├── hooks/            # Custom React hooks
│   │   └── App.jsx           # Main app component
│   ├── .env.development      # Development environment config
│   ├── .env.production       # Production environment config
│   └── package.json          # Node.js dependencies
├── service-account.json       # Google Cloud service account key
├── run-dev.sh                # Development startup script
├── apache.conf               # Apache virtual host configuration
├── apache-htaccess.txt       # Frontend .htaccess rules
└── veo-video-generator.service # Systemd service definition

Configuration

Environment File Structure

The application uses environment-specific configuration files:

Backend:

  • .env.development - Debug mode, localhost CORS, development settings
  • .env.production - Production mode, strict CORS, optimized for deployment
  • .env - Active environment file (copy from development or production)

Frontend:

  • .env.development - Localhost API, authentication bypass (VITE_DEV_MODE=true)
  • .env.production - Production API, MSAL authentication enabled (VITE_DEV_MODE=false)
  • .env - Active environment file (copy from development or production)

Backend Environment Variables

| Variable | Description | Default/Example |
| --- | --- | --- |
| PROJECT_ID | Google Cloud project ID | optical-414516 |
| REGION | Google Cloud region | us-central1 |
| MODEL_ID | Default Veo model identifier | veo-3.1-generate-preview |
| MODEL_FAST_ID | Default Veo Fast model identifier | veo-3.1-fast-generate-preview |
| OUTPUT_GCS_BUCKET_NAME | GCS bucket for temporary storage | optical-veo3-test |
| SERVICE_ACCOUNT_KEY_PATH | Path to service account JSON | ./service-account.json |
| PORT | Backend server port | 7394 |
| FLASK_ENV | Environment mode | development or production |
| FLASK_DEBUG | Debug mode | True or False |
| FRONTEND_URL | Frontend URL for CORS | http://localhost:3000 or production URL |
| WEBHOOK_URL | Usage tracking webhook URL | Optional |
| WEBHOOK_ENABLED | Enable usage tracking | true or false |

Available Models:

  • veo-3.1-generate-preview - Veo 3.1 Standard (with advanced features)
  • veo-3.1-fast-generate-preview - Veo 3.1 Fast (frame interpolation only)
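
The following is a minimal sketch of how a subset of these backend variables could be read, not the project's actual config.py. The function names and the returned dictionary keys are hypothetical, and the model fallbacks assume the two IDs from the Available Models list.

```python
import os


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings such as 'True' or 'true'."""
    return value.strip().lower() in {"1", "true", "yes"}


def load_config(env=None) -> dict:
    """Read backend settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return {
        "region": env.get("REGION", "us-central1"),
        # Assumption: fall back to the model IDs listed under Available Models.
        "model_id": env.get("MODEL_ID", "veo-3.1-generate-preview"),
        "model_fast_id": env.get("MODEL_FAST_ID", "veo-3.1-fast-generate-preview"),
        "port": int(env.get("PORT", "7394")),
        "debug": _as_bool(env.get("FLASK_DEBUG", "False")),
        "webhook_enabled": _as_bool(env.get("WEBHOOK_ENABLED", "false")),
        "service_account_key_path": env.get(
            "SERVICE_ACCOUNT_KEY_PATH", "./service-account.json"
        ),
    }


if __name__ == "__main__":
    print(load_config({"PORT": "8000", "FLASK_DEBUG": "True"}))
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.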

Frontend Environment Variables

| Variable | Description | Example |
| --- | --- | --- |
| VITE_API_BASE_URL | Backend API URL | http://localhost:7394 |
| VITE_APP_TITLE | Application title | Veo Video Generator (Dev) |
| VITE_DEV_MODE | Development mode flag | true or false |
| VITE_MSAL_CLIENT_ID | Azure AD client ID | dd434534-... |
| VITE_MSAL_AUTHORITY | Azure AD authority URL | https://login.microsoftonline.com/... |
| VITE_MSAL_REDIRECT_URI | Authentication redirect URI | http://localhost:3000 |

Key Dependencies

Backend

  • flask==3.0.0 - Web framework
  • flask-cors==4.0.0 - Cross-origin resource sharing
  • google-genai==1.47.0 - Google Gen AI SDK for Veo 3.1 (with advanced features support)
  • google-cloud-storage==2.12.0 - GCS file operations
  • google-cloud-aiplatform==1.38.0 - Vertex AI platform
  • hypercorn==0.15.0 - ASGI server for production
  • python-dotenv==1.0.0 - Environment configuration
  • Pillow==10.1.0 - Image processing and format conversion

Frontend

  • react==18.2.0 - UI framework
  • @mui/material==5.15.1 - Material-UI component library
  • @azure/msal-react==2.0.7 - Microsoft authentication
  • axios==1.6.2 - HTTP client
  • vite==5.0.8 - Build tool and dev server
  • @fontsource/montserrat==5.0.16 - Typography

API Endpoints

Main API Routes (/api)

| Method | Endpoint | Description | Request Body |
| --- | --- | --- | --- |
| POST | /api/generate | Start video generation | { prompt, model_name, video_length_sec, aspect_ratio, person_generation, sampleCount, seed, generate_audio, image, lastFrame, referenceImage1, referenceImage2, referenceImage3 } |
| GET | /api/status/<job_id> | Check generation status | - |
| GET | /api/download/<job_id> | Download completed content (auto-deletes job) | - |
| GET | /api/download/<job_id>/video/<index> | Download individual video | - |
| GET | /api/user-jobs | Get all jobs for user | Query: user_email |
| GET | /api/queue-status | Get overall queue status | - |
| POST | /api/cancel/<job_id> | Cancel queued/processing job | - |
| POST | /api/retry/<job_id> | Retry failed/cancelled job | - |
| DELETE | /api/delete/<job_id> | Delete job completely | - |
| DELETE | /api/cleanup/<job_id> | Manual cleanup of temp files | - |

Veo 3.1 Image Parameters:

  • image - First frame image (optional, all models)
  • lastFrame - Last frame image for interpolation (optional, Veo 3.1 only, requires 8-second duration)
  • referenceImage1, referenceImage2, referenceImage3 - Reference images for content guidance (optional, Veo 3.1 Standard only, requires 16:9 aspect ratio and 8-second duration)
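
For a text-only request, a client call to POST /api/generate might be assembled as sketched below. This README does not state whether the endpoint expects JSON or multipart form data (image uploads may well need the latter), so the JSON encoding here is an assumption, as are the helper name and the illustrative default values. The request is built but not sent.

```python
import json
from urllib.request import Request

API_BASE_URL = "http://localhost:7394"  # development backend URL from this README


def build_generate_request(prompt: str, **options) -> Request:
    """Build (but do not send) a POST /api/generate request."""
    body = {
        "prompt": prompt,
        "model_name": "veo-3.1-generate-preview",
        "video_length_sec": 8,
        "aspect_ratio": "16:9",
        "sampleCount": 1,
        "generate_audio": True,
    }
    body.update(options)  # e.g. seed=42 or person_generation=...
    return Request(
        f"{API_BASE_URL}/api/generate",
        data=json.dumps(body).encode("utf-8"),  # assumption: JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_generate_request("A sunrise over the ocean", seed=42)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would send it to a running backend.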

Health Check Routes

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /health | Detailed health check with configuration info |
| GET | /ping | Simple ping response |

Video Generation Lifecycle

Job Submission & Queuing

  1. User Input: User provides prompt, optional images (first frame, last frame, reference images), and generation parameters (1-4 videos)
  2. Job Creation: Backend creates unique job ID and validates parameters including Veo 3.1 feature constraints
  3. Queue Management: Job added to global FIFO queue (unlimited per user)
  4. Queue Position: Job displayed in "In Queue" section with position indicator

Processing Pipeline

  1. Queue Processing: Background thread picks next job when processing slot available (max 2 concurrent)
  2. Status Transition: Job moves from "In Queue" to "Currently Processing" section
  3. Image Processing (if provided):
    • First frame: Validated, converted to JPEG, uploaded to GCS (all models)
    • Last frame: Processed for frame interpolation (Veo 3.1 only)
    • Reference images: Up to 3 images processed for content guidance (Veo 3.1 Standard only)
  4. API Calls: Multiple requests sent to Google Gen AI SDK with appropriate parameters for selected model
  5. Backend Polling: Long-running operations polled every 30 seconds with retry logic
  6. Progress Updates: Frontend polls status every 2 seconds for real-time updates
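
The status polling in the last step can be sketched as a simple loop. The frontend itself is React, so this Python version only illustrates the logic. The `status` field name, the lowercase status values, and the `max_polls` budget are assumptions, not details confirmed by this README.

```python
import time
from typing import Callable

# Assumed terminal status values; the real API may spell them differently.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval_sec: float = 2.0,
    max_polls: int = 900,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status or the budget runs out."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("status") in TERMINAL_STATUSES:
            return status
        sleep(interval_sec)  # wait before asking again
    raise TimeoutError("job did not finish within the polling budget")


if __name__ == "__main__":
    fake = iter([{"status": "queued"}, {"status": "processing"}, {"status": "completed"}])
    print(wait_for_job(lambda: next(fake), interval_sec=0.01))
```

In practice `fetch_status` would wrap a GET request to /api/status/<job_id>; injecting it keeps the loop testable without a running backend.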

Completion & Cleanup

  1. Video Download: Completed videos downloaded from GCS to local temp storage
  2. File Packaging: Multiple videos and images packaged into downloadable zip
  3. User Download: Videos served to user with multiple download options
  4. Auto-cleanup: Job automatically deleted 5 seconds after successful download
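
The packaging step could look roughly like the sketch below, which bundles every file in a job directory into one zip archive. The function name, the archive name, and the example file names are hypothetical; the backend's actual packaging code is not shown in this README.

```python
import zipfile
from pathlib import Path


def package_job(job_directory: Path, archive_name: str = "videos.zip") -> Path:
    """Bundle all files in a job directory into a single zip archive."""
    archive_path = job_directory / archive_name
    # Collect files first so the archive never tries to include itself.
    files = sorted(
        p for p in job_directory.iterdir() if p.is_file() and p != archive_path
    )
    # ZIP_STORED: MP4 data is already compressed, so skip recompression.
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED) as archive:
        for path in files:
            archive.write(path, arcname=path.name)
    return archive_path


if __name__ == "__main__":
    import tempfile

    job_dir = Path(tempfile.mkdtemp())
    (job_dir / "video_0.mp4").write_bytes(b"demo")
    print(package_job(job_dir))
```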

Job Management Actions

  • Cancel: Remove from queue or stop active processing
  • Retry: Re-queue failed/cancelled jobs with original parameters
  • Delete: Complete removal of job data, local files, and GCS resources
  • Download Options: Individual videos or complete zip package

Security

  • CORS configured for specific frontend domain(s)
  • Azure AD SSO authentication in production (bypassed in dev mode)
  • Automatic cleanup of temporary files after download
  • Service account with minimal required GCS permissions
  • Secure headers in Apache configuration
  • Backend service runs as non-root user in production

Monitoring and Logging

Backend Logs

# View systemd service logs (production)
sudo journalctl -u veo-video-generator -f

# View Flask app logs (development)
# Logs printed to terminal running app.py

Frontend Logs

  • Browser console for React errors
  • Network tab for API request/response debugging
  • Apache access logs: /var/log/apache2/access.log

Usage Tracking

  • Webhook integration sends generation requests to configured endpoint
  • Tracks: user email, prompt, model, timestamp
  • Can be disabled via WEBHOOK_ENABLED=false
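
A usage event with the tracked fields might be built and dispatched as in this sketch. The JSON key names, the function names, and the injected `post` callable are assumptions made for illustration; the real webhook payload format is not documented here.

```python
import json
from datetime import datetime, timezone


def build_usage_event(user_email, prompt, model, now=None):
    """Assemble the tracked fields into a JSON-serializable dictionary."""
    now = now or datetime.now(timezone.utc)
    return {
        "user_email": user_email,
        "prompt": prompt,
        "model": model,
        "timestamp": now.isoformat(),
    }


def send_usage_event(event, enabled, post):
    """Send the event via `post` unless tracking is disabled."""
    if not enabled:  # mirrors WEBHOOK_ENABLED=false
        return False
    post(json.dumps(event))
    return True


if __name__ == "__main__":
    event = build_usage_event("user@example.com", "A red kite", "veo-3.1-generate-preview")
    send_usage_event(event, enabled=True, post=print)
```

In a real deployment `post` would issue an HTTP POST to WEBHOOK_URL; it is injected here so the sketch runs without network access.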

Troubleshooting

Common Issues

| Issue | Possible Cause | Solution |
| --- | --- | --- |
| Authentication fails | Azure AD misconfiguration | Verify VITE_MSAL_CLIENT_ID, VITE_MSAL_AUTHORITY, and redirect URIs match Azure AD app |
| Backend connection error | Service not running or CORS issue | Check systemctl status veo-video-generator and FRONTEND_URL in backend .env |
| Video generation fails | Invalid credentials or API access | Verify service account permissions and Veo 3.1 APIs are enabled in GCP |
| Image upload rejected | Invalid format or size | Ensure image is <10MB and meets minimum 720x720 resolution |
| Download hangs | GCS permission issue | Check service account has storage.objects.get permission on bucket |
| Model not found | Wrong region or model ID | Verify Veo 3.1 is available in specified REGION |
| Reference images fail | Wrong model or constraints | Reference images require Veo 3.1 Standard model, 16:9 aspect ratio, and 8-second duration |
| Last frame fails | Wrong constraints | Last frame interpolation requires Veo 3.1 model (Standard or Fast) and 8-second duration |
| SDK parameter error | Outdated SDK version | Ensure google-genai>=1.47.0 is installed for Veo 3.1 features |

Veo 3.1 Feature Requirements

Frame Interpolation (Last Frame):

  • Supported models: veo-3.1-generate-preview, veo-3.1-fast-generate-preview
  • Required duration: 8 seconds
  • Supported aspect ratios: 16:9, 9:16

Reference Images:

  • Supported model: veo-3.1-generate-preview (Standard only, NOT Fast)
  • Required duration: 8 seconds
  • Required aspect ratio: 16:9 only
  • Maximum images: 3 reference images
  • Not supported in: Veo 3.1 Fast
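
The requirements above can be expressed as a small validation routine. This is a sketch of the rules as listed, with a hypothetical function name and error messages; the backend's own validation may be structured differently.

```python
STANDARD = "veo-3.1-generate-preview"
FAST = "veo-3.1-fast-generate-preview"


def validate_features(model, duration_sec, aspect_ratio,
                      has_last_frame=False, reference_image_count=0):
    """Return a list of constraint violations (empty when the request is valid)."""
    errors = []
    if has_last_frame:
        # Frame interpolation: either Veo 3.1 model, 8 seconds only.
        if model not in (STANDARD, FAST):
            errors.append("last frame requires a Veo 3.1 model")
        if duration_sec != 8:
            errors.append("last frame requires an 8-second duration")
    if reference_image_count:
        # Reference images: Standard model, 8 seconds, 16:9, at most 3 images.
        if model != STANDARD:
            errors.append("reference images require the Standard model")
        if duration_sec != 8:
            errors.append("reference images require an 8-second duration")
        if aspect_ratio != "16:9":
            errors.append("reference images require a 16:9 aspect ratio")
        if reference_image_count > 3:
            errors.append("at most 3 reference images are allowed")
    return errors


if __name__ == "__main__":
    print(validate_features(FAST, 8, "16:9", reference_image_count=2))
```

Returning every violation at once, rather than stopping at the first, lets a client show the user all problems in one pass.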

Debug Mode

Enable detailed logging in development:

# Backend: set this in backend/.env
FLASK_DEBUG=True

# Frontend: inspect the browser console (React DevTools helps)

Development

Local Development Setup

For local testing without authentication:

  1. Quick Start (runs both backend and frontend):

    ./run-dev.sh
    
  2. Manual Start:

    Backend (Terminal 1):

    cd backend
    cp .env.development .env
    python app.py
    

    Frontend (Terminal 2):

    cd frontend
    npm run dev
    

Development Features

  • Authentication Bypass: MSAL/SSO automatically bypassed when VITE_DEV_MODE=true
  • CORS: Configured for localhost:3000 and 127.0.0.1:3000
  • Hot Reload: Vite dev server auto-reloads frontend on file changes
  • Debug Mode: Flask runs with detailed error pages and auto-reload
  • Mock User: Shows "Dev User" in the interface header

Development URLs

  • Backend API: http://localhost:7394
  • Frontend: http://localhost:3000
  • No authentication required in dev mode

Additional Files

  • user_docs.md: Comprehensive user documentation and feature guide
  • CLAUDE.md: AI assistant guidance for working with this codebase
  • extract_usage_logs.sh: Script for extracting usage data from webhook logs
  • veo3.zip: Archive of production deployment artifacts
  • .gitignore: Git exclusions (includes .env, node_modules, temp_downloads, etc.)

Video Generation Architecture

Job Queue System

  • Global Queue: FIFO processing with unlimited submissions per user
  • Concurrent Processing: Maximum 2 jobs processing simultaneously
  • Status Tracking: In-memory job status dictionary (consider Redis for scaling)
  • User Limits: No queue limits, but 1-4 videos per individual request
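
The queue behavior described above (unbounded FIFO, at most two jobs running at once) can be sketched with the standard library. This simplified version drains a fixed batch of jobs and then exits, whereas a real backend would keep its workers alive and accept new submissions; the names are hypothetical.

```python
import queue
import threading

MAX_CONCURRENT = 2  # concurrency limit stated above


def run_jobs(jobs, handler):
    """Process jobs in FIFO order with at most MAX_CONCURRENT running at once."""
    pending = queue.Queue()  # unbounded FIFO queue
    for job in jobs:
        pending.put(job)

    def worker():
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return  # queue drained; this worker exits
            handler(job)

    workers = [threading.Thread(target=worker) for _ in range(MAX_CONCURRENT)]
    for thread in workers:
        thread.start()
    for thread in workers:
        thread.join()


if __name__ == "__main__":
    run_jobs(["job-a", "job-b", "job-c"], lambda job: print("processed", job))
```

Jobs are started in submission order, but with two workers they may finish out of order.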

Queue Display Sections

  1. Currently Processing: Jobs actively generating videos (highlighted in blue)
  2. In Queue: Jobs waiting for processing slots (highlighted in orange)
  3. History: Completed, failed, or cancelled jobs (standard styling)

File Management

  • Local Storage: temp_downloads/job_{job_id}/ for each job
  • GCS Integration: Temporary images uploaded to temp_images/ bucket path
  • Auto-cleanup: Jobs deleted 5 seconds after successful download
  • Manual Cleanup: Complete job deletion via delete button
  • Download Formats: Individual MP4s or complete ZIP packages
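
The delayed auto-cleanup could be implemented with a timer, as in this sketch. It covers only the local job directory (GCS cleanup is out of scope here), and the helper names are hypothetical.

```python
import shutil
import threading
from pathlib import Path


def job_dir(base: Path, job_id: str) -> Path:
    """Return the per-job directory, following the job_{job_id} layout above."""
    return base / f"job_{job_id}"


def schedule_cleanup(directory: Path, delay_sec: float = 5.0) -> threading.Timer:
    """Delete a job directory after a delay; returns the running timer."""
    timer = threading.Timer(
        delay_sec, shutil.rmtree, args=(directory,), kwargs={"ignore_errors": True}
    )
    timer.daemon = True  # do not keep the process alive just for cleanup
    timer.start()
    return timer


if __name__ == "__main__":
    import tempfile

    target = job_dir(Path(tempfile.mkdtemp()), "demo")
    target.mkdir()
    schedule_cleanup(target, delay_sec=0.1).join()
    print(target.exists())
```

Returning the timer lets the caller cancel the cleanup, for example if the download is retried.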

Job Actions by Status

  • Queued: Cancel, Delete
  • Processing: Cancel, Delete
  • Failed/Cancelled: Retry, Delete
  • Completed: Download All, Download Individual Videos (auto-deletes after download)
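
The status-to-action rules above map naturally onto a lookup table. The lowercase status and action identifiers below are hypothetical spellings chosen for this sketch.

```python
# Allowed actions per job status, mirroring the list above.
ACTIONS_BY_STATUS = {
    "queued": {"cancel", "delete"},
    "processing": {"cancel", "delete"},
    "failed": {"retry", "delete"},
    "cancelled": {"retry", "delete"},
    "completed": {"download_all", "download_video"},
}


def is_allowed(status: str, action: str) -> bool:
    """Return True when the action is permitted for a job in this status."""
    return action in ACTIONS_BY_STATUS.get(status, set())


if __name__ == "__main__":
    print(is_allowed("queued", "cancel"))
    print(is_allowed("completed", "retry"))
```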

Notes

  • The original veo.py standalone script has been replaced by the full-stack application
  • Dual model support: Veo 3.1 Standard and Veo 3.1 Fast
  • Veo 3.1 advanced features: Frame interpolation and reference images with conditional UI
  • Multi-video generation support (1-4 videos per request)
  • Unlimited job submissions with intelligent queue management
  • Complete job lifecycle management with cancel/retry/delete functionality
  • Generated videos are automatically cleaned up after download
  • Image uploads are automatically converted to JPEG format regardless of input format
  • The application uses in-memory job status tracking (consider Redis for production scaling)
  • SDK upgraded to google-genai==1.47.0 for Veo 3.1 feature support