veo3/user_docs.md
2025-09-30 09:49:55 -05:00

Veo 3.0 Video Generator - User Documentation

Overview

The Veo 3.0 Video Generator is a web application that allows users to create AI-generated videos using Google's Veo 3.0 model. The application features a modern React frontend with Material-UI components and a Flask backend that interfaces with Google's GenerativeAI SDK.

Features

  • Text-to-Video Generation: Create videos from text prompts using Google's Veo 3.0 model
  • Image-to-Video Generation: Upload reference images to guide video generation
  • Customizable Settings: Control video length, aspect ratio, and person generation policies
  • Real-time Progress Tracking: Monitor generation progress with live status updates
  • Secure Authentication: Microsoft Azure AD integration for user authentication
  • Automatic Downloads: Direct download of generated videos after completion
  • File Management: Automatic cleanup of temporary files and cloud storage

System Requirements

Prerequisites

  • Google Cloud Project with Veo 3.0 API access
  • Google Cloud Storage bucket for temporary file storage
  • Service account with appropriate permissions
  • Python 3.8+ and Node.js 16+ for development
  • Microsoft Azure AD app registration (for authentication)

Supported Formats

  • Input Images: JPEG, PNG, GIF, BMP, TIFF, WebP, ICO (all uploads are auto-converted to JPEG)
  • Output Videos: MP4 format
  • Image Size: Minimum 720x720 pixels; maximum file size 10 MB
  • Video Length: 1-60 seconds
  • Aspect Ratios: 16:9 (landscape) or 9:16 (portrait)
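
The limits above can be checked before an upload is sent. The following is a minimal sketch, not the application's actual validator: the function name `check_image` and the constant names are hypothetical, and it assumes "10 MB" means 10 × 1024 × 1024 bytes and that the file extension reflects the real format.

```python
from typing import List

# Limits copied from the "Supported Formats" list; the names are illustrative.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif", "bmp", "tiff", "webp", "ico"}
MIN_SIDE_PX = 720
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" is treated as 10 MiB


def check_image(filename: str, width: int, height: int, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the image passes."""
    problems = []
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'none'}")
    if width < MIN_SIDE_PX or height < MIN_SIDE_PX:
        problems.append(
            f"image is {width}x{height}, minimum is {MIN_SIDE_PX}x{MIN_SIDE_PX}"
        )
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file exceeds 10 MB")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets a client show the user everything that needs fixing in a single pass.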

Getting Started

1. Environment Setup

Create a .env file in the project root with the following configuration:

# Google Cloud Configuration
PROJECT_ID=your-google-cloud-project-id
REGION=us-central1
MODEL_ID=veo-3.0-generate-preview
OUTPUT_GCS_BUCKET_NAME=your-storage-bucket-name
SERVICE_ACCOUNT_KEY_PATH=./service-account.json

# Flask Configuration
SECRET_KEY=your-secret-key-here
FLASK_ENV=development
FLASK_DEBUG=True
PORT=7394

# Frontend Configuration
FRONTEND_URL=http://localhost:3000

# Webhook Configuration (optional)
WEBHOOK_URL=your-webhook-url
WEBHOOK_ENABLED=false
WEBHOOK_TIMEOUT=10

2. Google Cloud Setup

  1. Enable APIs:

    • Vertex AI API
    • Cloud Storage API
    • GenerativeAI API
  2. Create Service Account:

    gcloud iam service-accounts create veo-video-generator \
      --display-name="Veo Video Generator"
    
  3. Grant Permissions:

    gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
      --member="serviceAccount:veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
      --role="roles/aiplatform.user"
    
    gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
      --member="serviceAccount:veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
      --role="roles/storage.admin"
    
  4. Download Service Account Key:

    gcloud iam service-accounts keys create service-account.json \
      --iam-account=veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com
    

3. Installation

Backend Setup

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Frontend Setup

cd frontend
npm install

4. Running the Application

Development Mode

Use the provided development script:

./run-dev.sh

This starts both the frontend (port 3000) and the backend (port 7394) servers.

Manual Startup

Backend:

cd backend
source venv/bin/activate
python app.py

Frontend:

cd frontend
npm run dev

Production Mode

# Backend
cd backend
python app.py

# Frontend (build and serve)
cd frontend
npm run build
# Serve the dist/ folder with your web server

Using the Application

1. Authentication

  • In development mode, authentication is bypassed with a mock user
  • In production, users authenticate via Microsoft Azure AD
  • Authenticated users' email addresses are tracked for usage analytics

2. Creating Videos

Text-to-Video Generation

  1. Enter your video prompt in the text area
  2. Adjust settings:
    • Video Length: 1-60 seconds (default: 8 seconds)
    • Aspect Ratio: 16:9 or 9:16 (default: 16:9)
    • Person Generation: Allow/don't allow people in videos
  3. Click "Generate Video"
  4. Monitor progress in real-time
  5. Download when complete

Image-to-Video Generation

  1. Upload a reference image (JPEG, PNG, etc.)
  2. The system automatically:
    • Validates image format and size
    • Converts to JPEG format
    • Detects aspect ratio and adjusts settings
    • Crops if necessary to fit aspect ratio
  3. Enter your video prompt
  4. Adjust additional settings as needed
  5. Generate and download
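
The aspect-ratio detection and cropping in step 2 can be sketched as plain arithmetic. This is an illustration under stated assumptions, not the backend's code: the helper names are hypothetical, square images are assumed to map to 16:9, and the crop is assumed to be centered.

```python
from typing import Tuple


def pick_aspect_ratio(width: int, height: int) -> str:
    """Map landscape and square images to 16:9 and portrait images to 9:16."""
    return "16:9" if width >= height else "9:16"


def center_crop_box(width: int, height: int, ratio: str) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered box with the ratio."""
    rw, rh = (int(part) for part in ratio.split(":"))
    # Compare width/height against rw/rh using integer cross-multiplication.
    if width * rh > height * rw:
        # Image is too wide: keep the full height and trim the sides.
        new_w, new_h = height * rw // rh, height
    else:
        # Image is too tall or already matches: keep the full width.
        new_w, new_h = width, width * rh // rw
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)
```

The returned tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it could be passed to that method directly if Pillow is the imaging library in use.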

3. Progress Monitoring

The application provides real-time feedback during generation:

  • Starting (0%): Initializing request
  • Uploading Image (5%): Processing reference image
  • Generating (10%): Submitting to Veo 3.0 API
  • Processing (20-80%): Video generation in progress
  • Downloading (90%): Retrieving completed video
  • Completed (100%): Ready for download
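
One way to model these stages is a lookup table for the fixed percentages plus a linear scale for the 20-80% processing band. The sketch below is hypothetical: the stage keys and the idea of deriving progress from a poll count are assumptions, and the application may compute the processing percentage differently.

```python
# Fixed percentages taken from the stage list; the key names are assumptions.
STAGE_PROGRESS = {
    "starting": 0,
    "uploading_image": 5,
    "generating": 10,
    "downloading": 90,
    "completed": 100,
}


def processing_progress(polls_done: int, expected_polls: int) -> int:
    """Scale the processing stage linearly into the 20-80% band."""
    if expected_polls <= 0:
        return 20
    # Clamp the fraction so extra polls never push progress past 80%.
    fraction = min(max(polls_done / expected_polls, 0.0), 1.0)
    return 20 + round(fraction * 60)
```

Capping at 80% keeps the bar from reaching the download stage before the video is actually ready.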

4. Error Handling

Common error scenarios and solutions:

  • Content Filtered: Prompt violates safety policies - try rewording
  • Image Too Large: Resize image to under 10MB
  • Invalid Format: Use supported image formats only
  • Authentication Failed: Check service account configuration
  • Quota Exceeded: Check Google Cloud quotas and billing

API Reference

Endpoints

POST /api/generate

Start video generation process.

Request Body:

{
  "prompt": "A cat playing in a garden",
  "video_length_sec": 8,
  "aspect_ratio": "16:9",
  "person_generation": "dont_allow",
  "user_email": "user@example.com"
}

With Image (multipart/form-data):

  • image: Image file
  • Other parameters as form fields

Response:

{
  "job_id": "uuid-string",
  "status": "started"
}
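
A client might validate the documented fields before calling this endpoint. The sketch below uses only the Python standard library; the function names are hypothetical, the defaults for length and aspect ratio come from the settings described earlier, the `dont_allow` default is taken from the example request, and it is an assumption that the backend accepts a request without `user_email`.

```python
import json
import urllib.request
from typing import Optional

ALLOWED_ASPECT_RATIOS = ("16:9", "9:16")


def build_generate_payload(prompt: str, video_length_sec: int = 8,
                           aspect_ratio: str = "16:9",
                           person_generation: str = "dont_allow",
                           user_email: Optional[str] = None) -> dict:
    """Validate the documented fields and return a JSON-serializable dict."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 1 <= video_length_sec <= 60:
        raise ValueError("video_length_sec must be between 1 and 60")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError("aspect_ratio must be 16:9 or 9:16")
    payload = {
        "prompt": prompt.strip(),
        "video_length_sec": video_length_sec,
        "aspect_ratio": aspect_ratio,
        "person_generation": person_generation,
    }
    if user_email:
        payload["user_email"] = user_email
    return payload


def start_generation(base_url: str, payload: dict) -> dict:
    """POST the payload to /api/generate and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, `start_generation("http://localhost:7394", build_generate_payload("A cat playing in a garden"))` would submit a text-only job to a locally running backend.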

GET /api/status/{job_id}

Check generation status.

Response:

{
  "status": "processing",
  "progress": 45,
  "message": "Video generation in progress...",
  "video_path": null,
  "error": null
}
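
Because generation is asynchronous, a client typically polls this endpoint until the job finishes. The following is a minimal polling sketch, assuming that a finished job reports a `status` of `"completed"` and that a failed job sets a non-null `error`; neither value is confirmed by the response example above. The `fetch_status` callable is a hypothetical hook that performs the actual HTTP GET.

```python
import time
from typing import Callable, Dict


def wait_for_video(job_id: str,
                   fetch_status: Callable[[str], Dict],
                   interval_sec: float = 5.0,
                   timeout_sec: float = 600.0,
                   sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll until the job completes, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout_sec
    while True:
        status = fetch_status(job_id)
        if status.get("error"):
            raise RuntimeError(f"generation failed: {status['error']}")
        if status.get("status") == "completed":  # assumed terminal status value
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout_sec} seconds")
        sleep(interval_sec)
```

Passing the fetch and sleep functions in as arguments keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.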

GET /api/download/{job_id}

Download completed video.

Response: MP4 video file or error message.

DELETE /api/cleanup/{job_id}

Manually clean up job files.

Response:

{
  "message": "Files cleaned up successfully"
}

Configuration Reference

Backend Configuration (config.py)

| Variable | Description | Default |
|---|---|---|
| PROJECT_ID | Google Cloud project ID | - |
| REGION | Google Cloud region | us-central1 |
| MODEL_ID | Veo model identifier | veo-3.0-generate-preview |
| OUTPUT_GCS_BUCKET_NAME | GCS bucket for storage | - |
| SERVICE_ACCOUNT_KEY_PATH | Path to service account JSON | ../service-account.json |
| SECRET_KEY | Flask secret key | - |
| FLASK_ENV | Environment mode | production |
| PORT | Backend server port | 7394 |
| FRONTEND_URL | Frontend URL for CORS | - |
| MAX_IMAGE_SIZE | Maximum image size in bytes | 10 MB |
| WEBHOOK_URL | Usage tracking webhook URL | - |
| WEBHOOK_ENABLED | Enable usage tracking | true |
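
The defaults in this table could be applied when reading the environment roughly as follows. This is a sketch only and does not reproduce the real config.py: the `Settings` class and `load_settings` function are hypothetical, only a subset of the variables is shown, and treating PROJECT_ID and OUTPUT_GCS_BUCKET_NAME as required is an assumption based on their missing defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings such as "true" or "1"."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    project_id: str
    bucket: str
    region: str
    model_id: str
    port: int
    webhook_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    required = ("PROJECT_ID", "OUTPUT_GCS_BUCKET_NAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        project_id=env["PROJECT_ID"],
        bucket=env["OUTPUT_GCS_BUCKET_NAME"],
        region=env.get("REGION", "us-central1"),
        model_id=env.get("MODEL_ID", "veo-3.0-generate-preview"),
        port=int(env.get("PORT", "7394")),
        webhook_enabled=_as_bool(env.get("WEBHOOK_ENABLED", "true")),
    )
```

Failing at startup when a required value is absent surfaces configuration mistakes before the first generation request is made.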

Frontend Configuration

Set environment variables in .env.local:

VITE_API_BASE_URL=http://localhost:7394
VITE_DEV_MODE=true

Troubleshooting

Common Issues

  1. "Service account not found"

    • Verify service-account.json exists and is valid
    • Check file permissions
    • Ensure service account has required roles
  2. "Quota exceeded"

    • Check Google Cloud quotas for Vertex AI
    • Verify billing is enabled
    • Request quota increases if needed
  3. "Content filtered"

    • Modify prompt to avoid restricted content
    • Review Google's content policies
    • Try different wording or concepts
  4. "Image validation failed"

    • Check that the image uses one of the formats listed under Supported Formats
    • Ensure image is at least 720x720 pixels
    • Verify file size is under 10MB
  5. CORS errors

    • Check FRONTEND_URL configuration
    • Verify both frontend and backend URLs
    • Ensure proper development/production settings

Debug Mode

Enable debug logging by setting:

FLASK_DEBUG=True

Debug information includes:

  • Request/response details
  • Image processing steps
  • API call parameters
  • File paths and operations

Log Files

Check application logs for detailed error information:

  • Backend logs: Console output from Flask application
  • Frontend logs: Browser developer console
  • Google Cloud logs: Cloud Logging for API errors

Security Considerations

  • Keep service account keys secure and never commit them to version control
  • Use environment variables for sensitive configuration
  • Implement proper authentication in production
  • Rotate service account keys regularly
  • Monitor usage and costs through Google Cloud Console
  • Use HTTPS for webhook URLs in production
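
The HTTPS recommendation for webhooks can be enforced with a small startup check. This is a hypothetical helper, not part of the application; it assumes that plain HTTP is acceptable only outside production.

```python
from urllib.parse import urlparse


def webhook_url_is_acceptable(url: str, production: bool) -> bool:
    """Require HTTPS in production; allow HTTP or HTTPS elsewhere."""
    parsed = urlparse(url)
    if not parsed.netloc:
        # Reject values that are not absolute URLs, such as placeholders.
        return False
    if production:
        return parsed.scheme == "https"
    return parsed.scheme in ("http", "https")
```

A check like this also catches an unedited placeholder such as `your-webhook-url`, which has no scheme or host.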

Performance Tips

  • Keep videos as short as the use case allows (shorter videos generate faster)
  • Optimize image sizes before upload
  • Monitor Google Cloud quotas and billing
  • Implement caching for repeated requests
  • Use CDN for frontend assets in production
  • Clean up temporary files regularly

Support and Resources

For technical issues, check the debug logs first, then consult Google Cloud support resources.