veo3/user_docs.md
2025-10-11 00:07:33 +05:30

Veo 3.0 Video Generator - User Documentation

Overview

The Veo 3.0 Video Generator is a web application that allows users to create AI-generated videos using Google's Veo 3.0 model. The application features a modern React frontend with Material-UI components and a Flask backend that interfaces with Google's GenerativeAI SDK.

Features

  • Text-to-Video Generation: Create videos from text prompts using Google's Veo 3.0 model
  • Image-to-Video Generation: Upload reference images to guide video generation
  • Multi-Video Generation: Generate 1-4 videos per request with batch processing
  • Unlimited Job Queue: Submit unlimited video generation jobs with FIFO processing
  • Advanced Job Management: Cancel, retry, and delete jobs with complete file cleanup
  • Real-time Queue Visualization: Three-section display (Processing, Queued, History)
  • Customizable Settings: Control video length (4, 6, or 8 seconds), aspect ratio, person generation, seed values, and audio
  • Real-time Progress Tracking: Monitor generation progress with live status updates and queue positions
  • Secure Authentication: Microsoft Azure AD integration for user authentication
  • Intelligent Downloads: Multiple download options with automatic cleanup
  • Comprehensive File Management: Auto-cleanup after download, complete GCS resource management

System Requirements

Prerequisites

  • Google Cloud Project with Veo 3.0 API access
  • Google Cloud Storage bucket for temporary file storage
  • Service account with appropriate permissions
  • Python 3.8+ and Node.js 16+ for development
  • Microsoft Azure AD app registration (for authentication)

Supported Formats

  • Input Images: JPEG, PNG, GIF, BMP, TIFF, WebP, ICO (auto-converted to JPEG)
  • Output Videos: MP4 format (individual files or ZIP packages)
  • Image Size: Minimum 720x720 pixels; maximum file size 10 MB
  • Video Length: 4, 6, or 8 seconds
  • Video Count: 1-4 videos per request
  • Aspect Ratios: 16:9 (landscape) or 9:16 (portrait)
  • Seeds: Optional numeric seeds (0-4294967295) for reproducible results
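
The limits above can be checked before a request is submitted. The following is a minimal sketch, not the application's own validation code; the function name validate_settings and the wording of the messages are hypothetical.

```python
from typing import List, Optional

ALLOWED_LENGTHS_SEC = {4, 6, 8}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}
MAX_SEED = 4294967295  # 2**32 - 1, the upper bound listed above


def validate_settings(video_length_sec: int, video_count: int,
                      aspect_ratio: str, seed: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the settings are valid."""
    problems = []
    if video_length_sec not in ALLOWED_LENGTHS_SEC:
        problems.append("video length must be 4, 6, or 8 seconds")
    if not 1 <= video_count <= 4:
        problems.append("video count must be between 1 and 4")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspect ratio must be 16:9 or 9:16")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        problems.append("seed must be between 0 and 4294967295")
    return problems
```

Returning a list rather than raising on the first failure lets a caller show every problem at once.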

Getting Started

1. Environment Setup

Create a .env file in the project root with the following configuration:

# Google Cloud Configuration
PROJECT_ID=your-google-cloud-project-id
REGION=us-central1
MODEL_ID=veo-3.0-generate-preview
OUTPUT_GCS_BUCKET_NAME=your-storage-bucket-name
SERVICE_ACCOUNT_KEY_PATH=./service-account.json

# Flask Configuration
SECRET_KEY=your-secret-key-here
FLASK_ENV=development
FLASK_DEBUG=True
PORT=7394

# Frontend Configuration
FRONTEND_URL=http://localhost:3000

# Webhook Configuration (optional)
WEBHOOK_URL=your-webhook-url
WEBHOOK_ENABLED=false
WEBHOOK_TIMEOUT=10
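
As an illustration of how these variables might be read on the backend, here is a small sketch using only the standard library. It is not the project's config.py. The function name load_backend_settings is hypothetical, the choice of which variables are required is an assumption, and the WEBHOOK_TIMEOUT default of 10 is taken from the example above rather than from a documented default.

```python
import os
from typing import Mapping, Optional

# Assumption: these have no documented default, so they are treated as required.
REQUIRED = ("PROJECT_ID", "OUTPUT_GCS_BUCKET_NAME", "SECRET_KEY")


def load_backend_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read backend settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return {
        "PROJECT_ID": env["PROJECT_ID"],
        "REGION": env.get("REGION", "us-central1"),
        "MODEL_ID": env.get("MODEL_ID", "veo-3.0-generate-preview"),
        "OUTPUT_GCS_BUCKET_NAME": env["OUTPUT_GCS_BUCKET_NAME"],
        "SECRET_KEY": env["SECRET_KEY"],
        "PORT": int(env.get("PORT", "7394")),
        # Only the literal string "true" (any case) enables the webhook.
        "WEBHOOK_ENABLED": env.get("WEBHOOK_ENABLED", "true").lower() == "true",
        "WEBHOOK_TIMEOUT": int(env.get("WEBHOOK_TIMEOUT", "10")),
    }
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests.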

2. Google Cloud Setup

  1. Enable APIs:

    • Vertex AI API
    • Cloud Storage API
    • GenerativeAI API
  2. Create Service Account:

    gcloud iam service-accounts create veo-video-generator \
      --display-name="Veo Video Generator"
    
  3. Grant Permissions:

    gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
      --member="serviceAccount:veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
      --role="roles/aiplatform.user"
    
    gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
      --member="serviceAccount:veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
      --role="roles/storage.admin"
    
  4. Download Service Account Key:

    gcloud iam service-accounts keys create service-account.json \
      --iam-account=veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com
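
After downloading the key, a quick sanity check can catch a missing or truncated file before the backend starts. The sketch below is an illustrative helper (check_service_account_key is a hypothetical name). It only checks a few fields that appear in Google service account key files and does not contact Google Cloud.

```python
import json
from pathlib import Path
from typing import List

# Fields present in a service account key file; this list is not exhaustive.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path: str) -> List[str]:
    """Return a list of problems found in the key file; empty means it looks usable."""
    key_file = Path(path)
    if not key_file.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(key_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if "type" in data and data["type"] != "service_account":
        problems.append('"type" should be "service_account"')
    return problems
```

For example, check_service_account_key("service-account.json") returns an empty list when the file exists and contains the expected fields.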
    

3. Installation

Backend Setup

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Frontend Setup

cd frontend
npm install

4. Running the Application

Development Mode

Use the provided development script:

./run-dev.sh

This starts both frontend (port 3000) and backend (port 7394) servers.

Manual Startup

Backend:

cd backend
source venv/bin/activate
python app.py

Frontend:

cd frontend
npm run dev

Production Mode

# Backend (set FLASK_ENV=production and FLASK_DEBUG=False in .env first)
cd backend
python app.py

# Frontend (build and serve)
cd frontend
npm run build
# Serve the dist/ folder with your web server

Using the Application

1. Authentication

  • In development mode, authentication is bypassed with a mock user
  • In production, users authenticate via Microsoft Azure AD
  • Authenticated users' email addresses are tracked for usage analytics

2. Creating Videos

Text-to-Video Generation

  1. Enter your video prompt in the text area (minimum 10 characters)
  2. Adjust settings:
    • Video Length: 4, 6, or 8 seconds (default: 8 seconds)
    • Number of Videos: 1-4 videos per request (default: 1)
    • Model: Veo 3.0 (high-quality) or Veo 3.0 Fast (optimized speed)
    • Aspect Ratio: 16:9 or 9:16 (default: 16:9)
    • Person Generation: Allow/don't allow people in videos
    • Seed: Optional numeric value for reproducible results
    • Audio: Toggle audio generation on/off
  3. Click "Generate Video" or "Add to Queue"
  4. Monitor progress in the queue visualization
  5. Download when complete (the job is removed from the queue automatically 5 seconds after a successful download)

Image-to-Video Generation

  1. Upload a reference image (JPEG, PNG, etc.)
  2. The system automatically:
    • Validates image format and size
    • Converts to JPEG format
    • Detects aspect ratio and adjusts settings
    • Crops if necessary to fit aspect ratio
  3. Enter your video prompt
  4. Adjust additional settings as needed
  5. Generate and download
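
The aspect-ratio detection and cropping in step 2 can be pictured with a little arithmetic. The steps above do not say which rule the application uses, so this sketch assumes one: landscape or square images map to 16:9, portrait images map to 9:16, and the crop is centered. The function name center_crop_box is hypothetical.

```python
from typing import Tuple


def center_crop_box(width: int, height: int) -> Tuple[str, Tuple[int, int, int, int]]:
    """Pick a target aspect ratio and return (ratio, (left, top, right, bottom))."""
    if width >= height:
        label, ratio_w, ratio_h = "16:9", 16, 9
    else:
        label, ratio_w, ratio_h = "9:16", 9, 16

    if width * ratio_h > height * ratio_w:
        # Image is wider than the target ratio: keep full height, trim the sides.
        crop_w, crop_h = height * ratio_w // ratio_h, height
    else:
        # Image is taller than (or equal to) the target: keep full width, trim top and bottom.
        crop_w, crop_h = width, width * ratio_h // ratio_w

    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return label, (left, top, left + crop_w, top + crop_h)
```

The returned box uses the (left, upper, right, lower) order that Pillow's Image.crop accepts, so it could be passed to that method directly if Pillow is used for image handling.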

Queue Management

  • Unlimited Submissions: No limit on number of jobs per user
  • FIFO Processing: Jobs processed in order of submission
  • Concurrent Limit: Maximum 2 jobs process simultaneously
  • Real-time Updates: Queue status updates every 2 seconds
  • Job Actions: Cancel, retry, or delete jobs based on status
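
The FIFO ordering and the two-job concurrency limit can be modeled in a few lines. This is a simplified in-memory sketch for illustration only, not the backend's scheduler; the class and method names are hypothetical.

```python
from collections import deque


class JobQueue:
    """FIFO queue that moves jobs into a fixed number of processing slots."""

    def __init__(self, concurrent_limit: int = 2):
        self.concurrent_limit = concurrent_limit
        self.queued = deque()   # waiting jobs, oldest first
        self.processing = []    # jobs currently holding a slot

    def submit(self, job_id: str) -> None:
        self.queued.append(job_id)
        self._fill_slots()

    def finish(self, job_id: str) -> None:
        """Release a slot when a job completes, fails, or is cancelled."""
        self.processing.remove(job_id)
        self._fill_slots()

    def queue_position(self, job_id: str) -> int:
        """1-based position among waiting jobs."""
        return list(self.queued).index(job_id) + 1

    def _fill_slots(self) -> None:
        while self.queued and len(self.processing) < self.concurrent_limit:
            self.processing.append(self.queued.popleft())
```

Submitting jobs a, b, c, d in that order leaves a and b processing and d at queue position 2; when a finishes, c takes the free slot and d moves to position 1.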

3. Queue Visualization & Progress Monitoring

The application displays jobs in three distinct sections:

Currently Processing (Blue Highlight)

  • Jobs actively generating videos (max 2 concurrent)
  • Real-time progress percentage and status updates
  • Available actions: Cancel, Delete

In Queue (Orange Highlight)

  • Jobs waiting for processing slots
  • Queue position indicator (e.g., "Queue Position: 3")
  • Available actions: Cancel, Delete

History (Standard Styling)

  • Completed, failed, or cancelled jobs
  • Available actions based on status:
    • Completed: Download All, Download Individual Videos
    • Failed/Cancelled: Retry, Delete
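
The three sections and their actions amount to a lookup table keyed by job status. The sketch below assumes lowercase status strings such as "queued" and "cancelled"; only "processing" appears in the sample API responses, so the other values and the action names are hypothetical.

```python
from typing import List

SECTION_BY_STATUS = {
    "processing": "Currently Processing",
    "queued": "In Queue",
    "completed": "History",
    "failed": "History",
    "cancelled": "History",
}

ACTIONS_BY_STATUS = {
    "processing": ["cancel", "delete"],
    "queued": ["cancel", "delete"],
    "completed": ["download_all", "download_individual"],
    "failed": ["retry", "delete"],
    "cancelled": ["retry", "delete"],
}


def section_for(status: str) -> str:
    """Section of the queue display in which a job with this status appears."""
    return SECTION_BY_STATUS.get(status, "History")


def available_actions(status: str) -> List[str]:
    """Actions offered for a job; unknown statuses get no actions."""
    return list(ACTIONS_BY_STATUS.get(status, []))
```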

Progress Stages

  • Queued: Waiting in queue for processing slot
  • Starting (0%): Initializing request
  • Uploading Image (5%): Processing reference image
  • Generating (10%): Submitting to Veo 3.0 API
  • Processing (20-80%): Video generation in progress
  • Downloading (90%): Retrieving completed videos
  • Completed (100%): Ready for download
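
A client that only receives a progress percentage can map it back to a stage name with a threshold table. This is a sketch under the assumption that values between the listed percentages (for example 85) belong to the preceding stage; stage_for is a hypothetical helper. Queued jobs are not covered because they have no percentage.

```python
# (minimum progress, stage name), in ascending order, taken from the list above.
STAGES = [
    (0, "Starting"),
    (5, "Uploading Image"),
    (10, "Generating"),
    (20, "Processing"),
    (90, "Downloading"),
    (100, "Completed"),
]


def stage_for(progress: int) -> str:
    """Return the last stage whose threshold the progress value has reached."""
    if not 0 <= progress <= 100:
        raise ValueError("progress must be between 0 and 100")
    name = STAGES[0][1]
    for threshold, label in STAGES:
        if progress >= threshold:
            name = label
    return name
```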

4. Error Handling

Common error scenarios and solutions:

  • Content Filtered: Prompt violates safety policies - try rewording
  • Image Too Large: Resize image to under 10MB
  • Invalid Format: Use supported image formats only
  • Authentication Failed: Check service account configuration
  • Quota Exceeded: Check Google Cloud quotas and billing

API Reference

Endpoints

POST /api/generate

Start video generation process.

Request Body:

{
  "prompt": "A cat playing in a garden",
  "video_length_sec": 8,
  "aspect_ratio": "16:9",
  "person_generation": "dont_allow",
  "model_name": "veo-3.0-generate-preview",
  "sampleCount": 2,
  "seed": 12345,
  "generate_audio": true,
  "user_email": "user@example.com"
}

With Image (multipart/form-data):

  • image: Image file
  • Other parameters as form fields

Response:

{
  "job_id": "uuid-string",
  "status": "started"
}
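
A client-side sketch of a text-to-video request, using only the Python standard library. The helper name build_generate_request is hypothetical, and the sketch assumes the endpoint accepts a JSON body as shown above and that no extra authentication headers are needed (as in development mode).

```python
import json
import urllib.request


def build_generate_request(base_url: str, prompt: str, user_email: str,
                           **settings) -> urllib.request.Request:
    """Build (but do not send) a POST /api/generate request."""
    if len(prompt) < 10:
        raise ValueError("prompt must be at least 10 characters")
    payload = {
        "prompt": prompt,
        "video_length_sec": 8,   # documented default
        "aspect_ratio": "16:9",  # documented default
        "sampleCount": 1,        # documented default number of videos
        "user_email": user_email,
    }
    payload.update(settings)     # e.g. seed=12345, generate_audio=True
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(req), and the job_id read from the JSON response.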

Job Management Endpoints

GET /api/user-jobs?user_email=user@example.com

Get all jobs for a user.

GET /api/queue-status

Get overall queue status.

POST /api/cancel/{job_id}

Cancel a queued or processing job.

POST /api/retry/{job_id}

Retry a failed or cancelled job.

DELETE /api/delete/{job_id}

Completely delete a job and all associated files.

GET /api/status/{job_id}

Check generation status.

Response:

{
  "status": "processing",
  "progress": 45,
  "message": "Video generation in progress...",
  "video_count": 2,
  "videos_requested": 2,
  "video_path": null,
  "individual_video_paths": [],
  "is_zip": true,
  "created_at": "2024-01-01T12:00:00Z",
  "error": null
}
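
A polling loop around this endpoint might look like the sketch below. The status fetch is injected as a callable so that the loop itself needs no network access. The terminal status names "completed", "failed", and "cancelled" are assumptions based on the job states described in this document, and the 2-second interval mirrors the frontend's polling interval.

```python
import time
from typing import Callable

# Assumed terminal values of the "status" field.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status: Callable[[], dict], interval_sec: float = 2.0,
                 max_polls: int = 900, sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the job reaches a terminal status, then return it."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("status") in TERMINAL_STATUSES:
            return status
        sleep(interval_sec)
    raise TimeoutError("job did not finish within the polling budget")
```

In practice fetch_status would perform GET /api/status/{job_id} and decode the JSON body; the max_polls cap keeps a stuck job from being polled forever.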

Download Endpoints

GET /api/download/{job_id}

Download completed content (ZIP for multiple videos, MP4 for a single video). The job is deleted automatically 5 seconds after a successful download.

GET /api/download/{job_id}/video/{index}

Download an individual video (index 1-4).

Response: MP4 video file, ZIP package, or error message.
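
The two download routes can be wrapped in a small URL builder. This is an illustrative sketch; download_url and expected_extension are hypothetical helpers, and the is_zip lookup assumes the field from the status response shown earlier.

```python
from typing import Optional


def download_url(base_url: str, job_id: str, index: Optional[int] = None) -> str:
    """URL for the whole job (index=None) or for one video (index 1-4)."""
    root = base_url.rstrip("/")
    if index is None:
        return f"{root}/api/download/{job_id}"
    if not 1 <= index <= 4:
        raise ValueError("video index must be between 1 and 4")
    return f"{root}/api/download/{job_id}/video/{index}"


def expected_extension(status: dict, index: Optional[int] = None) -> str:
    """Guess the file type: a ZIP only for a whole-job download flagged is_zip."""
    if index is None and status.get("is_zip"):
        return ".zip"
    return ".mp4"
```

Because the whole-job download triggers automatic deletion shortly afterwards, a client that wants individual files would need to fetch them before requesting the combined download.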

DELETE /api/cleanup/{job_id}

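Manually clean up a job's local files. This endpoint removes local files only; use DELETE /api/delete/{job_id} to delete the job and all associated files.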

Response:

{
  "message": "Files cleaned up successfully"
}

Configuration Reference

Backend Configuration (config.py)

| Variable | Description | Default |
|---|---|---|
| PROJECT_ID | Google Cloud project ID | - |
| REGION | Google Cloud region | us-central1 |
| MODEL_ID | Veo model identifier | veo-3.0-generate-preview |
| MODEL_FAST_ID | Veo Fast model identifier | veo-3.0-fast-generate-preview |
| OUTPUT_GCS_BUCKET_NAME | GCS bucket for storage | - |
| SERVICE_ACCOUNT_KEY_PATH | Path to service account JSON | ../service-account.json |
| SECRET_KEY | Flask secret key | - |
| FLASK_ENV | Environment mode | production |
| PORT | Backend server port | 7394 |
| FRONTEND_URL | Frontend URL for CORS | - |
| MAX_IMAGE_SIZE | Maximum image size in bytes | 10 MB |
| CONCURRENT_JOB_LIMIT | Max concurrent processing jobs | 2 |
| MAX_RETRIES | Max retry attempts per job | 3 |
| WEBHOOK_URL | Usage tracking webhook URL | - |
| WEBHOOK_ENABLED | Enable usage tracking | true |

Frontend Configuration

Set environment variables in .env.local:

VITE_API_BASE_URL=http://localhost:7394
VITE_DEV_MODE=true
VITE_APP_TITLE=Veo Video Generator (Dev)
VITE_MSAL_CLIENT_ID=your-azure-client-id
VITE_MSAL_AUTHORITY=https://login.microsoftonline.com/your-tenant-id
VITE_MSAL_REDIRECT_URI=http://localhost:3000

Queue System Configuration

  • Unlimited Queue: No per-user job limits
  • Video Limits: 1-4 videos per individual request
  • Processing Slots: Maximum 2 concurrent jobs
  • Polling Intervals: Frontend polls every 2 seconds, backend polls Google API every 30 seconds
  • Auto-cleanup: Jobs deleted 5 seconds after successful download
  • Retry Logic: Up to 3 retry attempts with exponential backoff
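
The retry behavior can be sketched as follows. The limit of 3 retries comes from the list above, but the 1-second base delay and the doubling factor are assumptions, since the actual backoff parameters are not documented here. Both function names are hypothetical.

```python
import time
from typing import Callable, List


def retry_delays(max_retries: int = 3, base_delay_sec: float = 1.0,
                 factor: float = 2.0) -> List[float]:
    """Delay before each retry: base, base*factor, base*factor**2, ..."""
    return [base_delay_sec * factor ** attempt for attempt in range(max_retries)]


def run_with_retries(task: Callable[[], object], max_retries: int = 3,
                     base_delay_sec: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep):
    """Run task once, then retry up to max_retries times with growing delays."""
    delays = retry_delays(max_retries, base_delay_sec)
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])
```

With the defaults, a task that keeps failing is called four times in total (one initial attempt plus three retries), with waits of 1, 2, and 4 seconds between calls.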

Troubleshooting

Common Issues

  1. "Service account not found"

    • Verify service-account.json exists and is valid
    • Check file permissions
    • Ensure service account has required roles
  2. "Quota exceeded"

    • Check Google Cloud quotas for Vertex AI
    • Verify billing is enabled
    • Request quota increases if needed
  3. "Content filtered"

    • Modify prompt to avoid restricted content
    • Review Google's content policies
    • Try different wording or concepts
  4. "Image validation failed"

    • Check the image format (supported: JPEG, PNG, GIF, BMP, TIFF, WebP, ICO)
    • Ensure image is at least 720x720 pixels
    • Verify file size is under 10MB
  5. CORS errors

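    • Check that FRONTEND_URL matches the origin the frontend is served from (for example, http://localhost:3000)
    • Verify that VITE_API_BASE_URL points at the running backend
    • Ensure the development or production settings match the environment in use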

Debug Mode

Enable debug logging by setting:

FLASK_DEBUG=True

Debug information includes:

  • Request/response details
  • Image processing steps
  • API call parameters
  • File paths and operations

Log Files

Check application logs for detailed error information:

  • Backend logs: Console output from Flask application
  • Frontend logs: Browser developer console
  • Google Cloud logs: Cloud Logging for API errors

Security Considerations

  • Keep service account keys secure and never commit them to version control
  • Use environment variables for sensitive configuration
  • Implement proper authentication in production
  • Rotate service account keys regularly
  • Monitor usage and costs through Google Cloud Console
  • Use HTTPS for webhook URLs in production
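
The webhook recommendation can be enforced with a startup check. The following is a minimal sketch using the standard library's URL parser; webhook_url_problems is a hypothetical helper, not part of the application.

```python
from typing import List
from urllib.parse import urlparse


def webhook_url_problems(url: str, production: bool = True) -> List[str]:
    """Return a list of problems with a webhook URL; empty means acceptable."""
    parsed = urlparse(url)
    problems = []
    if not parsed.netloc:
        problems.append("URL has no host")
    if parsed.scheme not in ("http", "https"):
        problems.append("URL scheme must be http or https")
    elif production and parsed.scheme != "https":
        problems.append("use https for webhooks in production")
    return problems
```

A placeholder value such as your-webhook-url fails both the host and the scheme checks, which helps catch a .env file that was never filled in.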

Performance Tips

  • Use appropriate video lengths (shorter = faster generation)
  • Optimize image sizes before upload
  • Monitor Google Cloud quotas and billing
  • Implement caching for repeated requests
  • Use CDN for frontend assets in production
  • Clean up temporary files regularly

Support and Resources

For technical issues, check the debug logs and refer to Google Cloud support resources.