veo3/CLAUDE.md
2025-12-01 12:29:22 +05:30

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Full-stack web application built exclusively on Google's Veo 3.1 video generation models. The system provides text-to-video and image-to-video generation, plus advanced features such as frame interpolation and reference images.

Stack:

  • Backend: Flask 3.0 + Google Gen AI SDK 1.47.0
  • Frontend: React 18 + Vite + Material-UI 5
  • Storage: Google Cloud Storage
  • Auth: Azure AD SSO (MSAL) with dev mode bypass
  • Deployment: systemd service + Apache reverse proxy

Note: This application exclusively uses Veo 3.1 models (Standard and Fast variants). Veo 3.0 models are not supported.

Common Commands

Development

# Start both frontend and backend
./run-dev.sh

# Backend only (from backend/)
python app.py

# Frontend only (from frontend/)
npm run dev

# Build frontend for production
cd frontend && npm run build

Backend Setup

cd backend
python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows
pip install -r requirements.txt
cp .env.development .env
python app.py

Frontend Setup

cd frontend
npm install
cp .env.development .env
npm run dev

Production Service Management

# systemd service control
sudo systemctl start veo-video-generator
sudo systemctl stop veo-video-generator
sudo systemctl restart veo-video-generator
sudo systemctl status veo-video-generator

# View logs
sudo journalctl -u veo-video-generator -f

Architecture

Job Queue System

The core of the application is a multi-threaded job queue system implemented in backend/video_generator.py. All queue state lives in module-level, in-memory structures shared by the worker threads:

Key Data Structures:

  • job_status: In-memory dict tracking all job states (consider Redis for production scaling)
  • job_queue: Global FIFO list of pending job IDs
  • processing_jobs: List of currently active jobs (max 2 concurrent via CONCURRENT_JOB_LIMIT)
  • user_job_counts: Per-user job count tracking (no limits, but tracked for monitoring)

Queue Processing Flow:

  1. Jobs are created with UUID and added to job_queue via add_to_queue()
  2. Background thread in start_next_job() picks jobs when len(processing_jobs) < CONCURRENT_JOB_LIMIT
  3. process_video_generation() handles the actual generation in a separate thread
  4. complete_job() removes from processing_jobs and triggers next job
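
The flow above can be sketched roughly as follows. This is a simplified illustration, not the code in backend/video_generator.py: the function and variable names come from this document, but the lock, the `worker` callable, the `_run` helper, and all function bodies are assumptions.

```python
import threading
import uuid

CONCURRENT_JOB_LIMIT = 2

job_status = {}        # job_id -> {"status": ..., "worker": ...}
job_queue = []         # FIFO list of pending job IDs
processing_jobs = []   # job IDs currently being processed
_lock = threading.Lock()  # assumption: a lock guards the shared structures


def add_to_queue(worker):
    """Register a job and try to start it. `worker` is a hypothetical callable taking job_id."""
    job_id = str(uuid.uuid4())
    with _lock:
        job_status[job_id] = {"status": "queued", "worker": worker}
        job_queue.append(job_id)
    start_next_job()
    return job_id


def start_next_job():
    """Start queued jobs for as long as a processing slot is free."""
    while True:
        with _lock:
            if not job_queue or len(processing_jobs) >= CONCURRENT_JOB_LIMIT:
                return
            job_id = job_queue.pop(0)
            processing_jobs.append(job_id)
            job_status[job_id]["status"] = "processing"
            worker = job_status[job_id]["worker"]
        threading.Thread(target=_run, args=(job_id, worker), daemon=True).start()


def _run(job_id, worker):
    """Run the worker in its own thread and record the outcome."""
    try:
        worker(job_id)
        final_status = "completed"
    except Exception:
        final_status = "failed"
    complete_job(job_id, final_status)


def complete_job(job_id, final_status):
    """Free the processing slot, then let the next queued job start."""
    with _lock:
        processing_jobs.remove(job_id)
        job_status[job_id]["status"] = final_status
    start_next_job()
```

With three submissions, the first two start immediately and the third stays in job_queue until a slot is released by complete_job().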

Job Lifecycle States:

  • queued → processing → generating → downloading → completed
  • OR: failed, cancelled, retry_1_of_3, etc.

Video Generation Pipeline

Request Flow (backend/routes/api.py → video_generator.py):

  1. POST /api/generate: Multipart form data with prompt, images, parameters
  2. Job creation: generate_video_async() creates UUID, validates inputs
  3. Image processing: If images provided, validate → convert to JPEG → upload to GCS temp_images/
  4. API calls: Multiple threaded calls to Google Gen AI SDK (1 call per requested video)
  5. Polling: Backend polls operations every 30s, frontend polls status every 2s
  6. Download: Videos downloaded from GCS to temp_downloads/job_{job_id}/
  7. Packaging: Multiple videos + images packaged into zip if needed
  8. Cleanup: Auto-delete 5 seconds after successful download
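
Step 5 (backend polling) might look like the sketch below. It assumes an operation object exposing a `done` attribute and a `refresh` callable that returns the updated operation; both are stand-ins for the SDK calls, and `max_polls` is a hypothetical safety limit not mentioned in this document.

```python
import time

POLL_INTERVAL_SECONDS = 30  # backend polling interval from the flow above


def wait_for_operation(operation, refresh, interval=POLL_INTERVAL_SECONDS,
                       max_polls=120, sleep=time.sleep):
    """Poll until operation.done is true, sleeping `interval` seconds between polls.

    `refresh` is a hypothetical callable that fetches the latest operation state.
    `sleep` is injectable so the loop can be exercised without real waiting.
    """
    polls = 0
    while not operation.done:
        if polls >= max_polls:
            raise TimeoutError(f"operation still pending after {polls} polls")
        sleep(interval)
        operation = refresh(operation)
        polls += 1
    return operation
```

Passing `sleep` as a parameter keeps the real 30-second interval in production while letting a test substitute a no-op.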

Key Functions:

  • generate_video_async(): Queue job, return job ID immediately
  • process_video_generation(): Main processing function (runs in thread)
  • start_next_job(): Queue processor that maintains concurrent limit
  • complete_job(): Cleanup and start next queued job

Model Support (Veo 3.1 Only)

Two models defined in backend/config.py → SUPPORTED_MODELS:

  • veo-3.1-generate-preview (Veo 3.1 Standard): High-quality with reference images + frame interpolation ($0.40/sec)
  • veo-3.1-fast-generate-preview (Veo 3.1 Fast): Optimized speed with frame interpolation only ($0.15/sec)

Veo 3.1 Feature Constraints (enforced in process_video_generation()):

  • Frame interpolation (last frame): Requires 8-second duration, available on BOTH models
  • Reference images: Requires 8-second duration + 16:9 aspect ratio, Standard model ONLY (NOT available in Fast)
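
The constraints above can be expressed as a small validation helper. This is a sketch only: the function name, its parameters, and the error wording are hypothetical, and the actual checks in process_video_generation() may be structured differently.

```python
STANDARD_MODEL = "veo-3.1-generate-preview"
FAST_MODEL = "veo-3.1-fast-generate-preview"


def validate_veo31_features(model, duration_seconds, aspect_ratio,
                            has_last_frame=False, has_reference_images=False):
    """Return a list of constraint violations (empty if the request is valid)."""
    errors = []
    if model not in (STANDARD_MODEL, FAST_MODEL):
        errors.append(f"Unsupported model: {model}")
    # Frame interpolation works on both models but needs an 8-second clip.
    if has_last_frame and duration_seconds != 8:
        errors.append("Frame interpolation requires an 8-second duration")
    # Reference images are Standard-only and need 8 seconds at 16:9.
    if has_reference_images:
        if model != STANDARD_MODEL:
            errors.append("Reference images are only available on the Standard model")
        if duration_seconds != 8:
            errors.append("Reference images require an 8-second duration")
        if aspect_ratio != "16:9":
            errors.append("Reference images require a 16:9 aspect ratio")
    return errors
```

Returning a list rather than raising on the first problem lets the caller report every violation in a single response.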

Authentication Flow

Production: Azure AD SSO via @azure/msal-react

  • Frontend wraps app in <AuthGuard> component
  • Backend has NO auth validation (frontend-only)

Development: Auth bypass when VITE_DEV_MODE=true

  • <DevAuthWrapper> component bypasses MSAL
  • Shows "Dev User" in UI header

Storage Architecture

Local Storage:

  • backend/temp_downloads/job_{job_id}/: Per-job folder for videos and images
  • Files deleted automatically 5 seconds after download
  • Entire job folder removed with shutil.rmtree() for efficiency

GCS Storage:

  • Temp images: gs://{bucket}/temp_images/ (cleaned up after processing)
  • Generated videos: gs://{bucket}/veo_outputs/{prompt}_{timestamp}/ (cleaned up after download)
  • Cleanup happens in cleanup_image_files() and delete_blob() from utils/storage.py

Download Strategy (backend/routes/api.py:download_video):

  • Single video + no image → Direct MP4 download
  • Multiple videos OR image included → ZIP file download
  • ZIP created in process_video_generation() and stored in job folder
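
A minimal sketch of the download decision and ZIP packaging described above. The helper names and the default archive name `videos.zip` are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def choose_download_type(video_count, has_image):
    """Single video with no image is served as MP4; everything else as ZIP."""
    return "mp4" if video_count == 1 and not has_image else "zip"


def package_job_folder(job_folder, zip_name="videos.zip"):
    """Zip every file in the job folder (except the archive itself) and return its path."""
    job_folder = Path(job_folder)
    zip_path = job_folder / zip_name
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(job_folder.iterdir()):
            if path.is_file() and path != zip_path:
                archive.write(path, arcname=path.name)
    return zip_path
```

Storing the archive inside the job folder means a single shutil.rmtree() call later removes the videos, images, and ZIP together.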

Project Structure

backend/
├── routes/
│   ├── api.py              # Main API routes (generate, status, download, cancel, retry, delete)
│   └── health.py           # Health check endpoints
├── utils/
│   ├── auth.py             # Google Cloud credentials
│   └── storage.py          # GCS operations, image validation/conversion
├── app.py                  # Flask app initialization, CORS setup
├── config.py               # Config class with model definitions, env vars
├── video_generator.py      # Core logic: queue system, job processing, Gen AI SDK calls
├── requirements.txt
├── .env.development        # Dev config (localhost CORS, debug mode)
└── .env.production         # Prod config (strict CORS, optimized)

frontend/
├── src/
│   ├── components/
│   │   ├── VideoForm.jsx           # Main form with conditional Veo 3.1 features
│   │   ├── VideoGenerator.jsx      # Top-level container
│   │   ├── ProgressIndicator.jsx   # Job status display
│   │   ├── QueueManager.jsx        # Queue visualization (3 sections)
│   │   ├── Layout.jsx              # App layout wrapper
│   │   ├── AuthGuard.jsx           # MSAL authentication wrapper
│   │   └── DevAuthWrapper.jsx      # Dev mode auth bypass
│   ├── services/
│   │   └── api.js          # Axios API client
│   ├── hooks/
│   │   └── useVideoGeneration.js   # Custom hook for video generation logic
│   ├── config/
│   │   └── msalConfig.js   # MSAL configuration
│   └── App.jsx             # Main app component
├── .env.development        # Dev config (localhost API, VITE_DEV_MODE=true)
└── .env.production         # Prod config (production API, MSAL enabled)

Configuration

Backend Environment Variables (.env.development vs .env.production)

Critical differences:

  • FLASK_ENV: development vs production
  • FLASK_DEBUG: True vs False
  • FRONTEND_URL: http://localhost:3000 vs production URL
  • CORS: Development allows localhost:3000, 127.0.0.1:3000 + configured URL

Key variables (backend/config.py):

  • PROJECT_ID, REGION, OUTPUT_GCS_BUCKET_NAME: GCP configuration
  • SERVICE_ACCOUNT_KEY_PATH: Path to service account JSON (default: ../service-account.json)
  • MODEL_ID, MODEL_FAST_ID: Default model identifiers (overridable per request)
  • WEBHOOK_URL, WEBHOOK_ENABLED: Usage tracking webhook integration
  • TEMP_DOWNLOAD_PATH: Local storage path (./temp_downloads/)
  • MAX_IMAGE_SIZE: 10MB upload limit
  • MIN_IMAGE_RESOLUTION: (720, 720) minimum
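
For orientation, the variables above could be loaded roughly as shown below. This is a sketch, not the Config class in backend/config.py: the `load_config` function is hypothetical, and the defaults for MODEL_ID, MODEL_FAST_ID, and WEBHOOK_ENABLED, as well as reading "10MB" as 10 * 1024 * 1024 bytes, are assumptions.

```python
import os

MB = 1024 * 1024  # assumption: "10MB" means 10 * 2**20 bytes


def load_config(env=None):
    """Build a config dict from environment variables (defaults partly assumed)."""
    env = os.environ if env is None else env
    return {
        "PROJECT_ID": env.get("PROJECT_ID", ""),
        "REGION": env.get("REGION", ""),
        "OUTPUT_GCS_BUCKET_NAME": env.get("OUTPUT_GCS_BUCKET_NAME", ""),
        "SERVICE_ACCOUNT_KEY_PATH": env.get("SERVICE_ACCOUNT_KEY_PATH",
                                            "../service-account.json"),
        # Assumed defaults: the two Veo 3.1 model IDs listed in this document.
        "MODEL_ID": env.get("MODEL_ID", "veo-3.1-generate-preview"),
        "MODEL_FAST_ID": env.get("MODEL_FAST_ID", "veo-3.1-fast-generate-preview"),
        "WEBHOOK_URL": env.get("WEBHOOK_URL", ""),
        # Assumed default: webhook disabled unless explicitly set to "true".
        "WEBHOOK_ENABLED": env.get("WEBHOOK_ENABLED", "false").strip().lower() == "true",
        "TEMP_DOWNLOAD_PATH": env.get("TEMP_DOWNLOAD_PATH", "./temp_downloads/"),
        "MAX_IMAGE_SIZE": 10 * MB,
        "MIN_IMAGE_RESOLUTION": (720, 720),
    }
```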

Frontend Environment Variables

Critical differences:

  • VITE_DEV_MODE: true (bypasses auth) vs false (enables MSAL)
  • VITE_API_BASE_URL: http://localhost:7394 vs production API URL

Key variables:

  • VITE_MSAL_CLIENT_ID, VITE_MSAL_AUTHORITY, VITE_MSAL_REDIRECT_URI: Azure AD config

API Endpoints

Main Routes (backend/routes/api.py)

Job Management:

  • POST /api/generate: Start video generation (multipart form data)
  • GET /api/status/<job_id>: Get job status with queue position
  • GET /api/download/<job_id>: Download video/zip (auto-deletes job after 5s)
  • GET /api/download/<job_id>/video/<index>: Download individual video
  • GET /api/user-jobs?user_email=<email>: Get all user jobs
  • GET /api/queue-status: Get queue status (length, processing jobs, concurrent limit)

Job Actions:

  • POST /api/cancel/<job_id>: Cancel queued/processing job
  • POST /api/retry/<job_id>: Retry failed/cancelled job
  • DELETE /api/delete/<job_id>: Completely delete job and files
  • DELETE /api/cleanup/<job_id>: Manual cleanup (full GCS cleanup)
  • DELETE /api/cleanup/<job_id>/local: Fast local-only cleanup

Health:

  • GET /health: Detailed health check with config info
  • GET /ping: Simple ping response

Key Implementation Details

Multi-Video Generation

  • Frontend sends sampleCount (1-4 videos per request)
  • Backend makes N API calls to ensure N videos (1 call per video)
  • Operations polled in parallel, results collected in all_generated_videos
  • Exact count selected: generated_videos = all_generated_videos[:sample_count]
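
The one-call-per-video pattern above can be sketched with a thread pool. `request_one_video` is a hypothetical callable standing in for a single Gen AI SDK call plus its polling; it is assumed to return a list of generated videos.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_videos(request_one_video, sample_count):
    """Issue `sample_count` parallel calls and return exactly that many videos."""
    if not 1 <= sample_count <= 4:
        raise ValueError("sampleCount must be between 1 and 4")
    all_generated_videos = []
    with ThreadPoolExecutor(max_workers=sample_count) as pool:
        futures = [pool.submit(request_one_video, i) for i in range(sample_count)]
        # Collect in submission order so the output order is deterministic.
        for future in futures:
            all_generated_videos.extend(future.result())
    # Trim in case any call returned more than one video.
    return all_generated_videos[:sample_count]
```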

Image Processing Pipeline (utils/storage.py)

  1. Validation: Check format, resolution, file size
  2. Conversion: All formats → JPEG (Pillow supported formats accepted)
  3. Aspect ratio detection: Auto-detect 16:9 or 9:16
  4. Cropping: Auto-crop to target aspect ratio if needed
  5. Upload: To GCS temp_images/ with unique blob name
  6. Cleanup: Delete from GCS after video generation completes
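
Steps 3 and 4 reduce to simple arithmetic, sketched below without any imaging library. Both function names are hypothetical, and the rules used here (landscape or square maps to 16:9, portrait to 9:16, crop centered) are assumptions about how utils/storage.py behaves. The returned box uses the (left, upper, right, lower) order that Pillow's Image.crop() expects.

```python
def detect_target_aspect(width, height):
    """Assumed rule: landscape or square images target 16:9, portrait images 9:16."""
    return "16:9" if width >= height else "9:16"


def center_crop_box(width, height, target):
    """Return a centered (left, upper, right, lower) box with the target aspect ratio."""
    ratio_w, ratio_h = (16, 9) if target == "16:9" else (9, 16)
    if width * ratio_h > height * ratio_w:
        # Image is too wide for the target: keep full height, trim the sides.
        new_width = height * ratio_w // ratio_h
        left = (width - new_width) // 2
        return (left, 0, left + new_width, height)
    # Image is too tall (or already matches): keep full width, trim top and bottom.
    new_height = width * ratio_h // ratio_w
    top = (height - new_height) // 2
    return (0, top, width, top + new_height)
```

For example, a 2000x1000 image targeting 16:9 keeps its full height and loses 111 pixels on the left, giving a 1777-pixel-wide crop.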

Error Handling & Retries

  • Automatic retry for transient errors: HTTP 500 and 503 responses, timeouts, and connection errors
  • Max 3 retries with exponential backoff (30s, 60s, 120s)
  • Retry count tracked in job_status[job_id]['retry_count']
  • Job status shows retry_1_of_3, retry_2_of_3, etc.
  • Final failure after max retries updates status to failed
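
The retry policy above, sketched as a standalone helper. `run_with_retries`, `is_retryable`, and the injectable `sleep` are hypothetical names; `job` stands in for the job_status[job_id] entry.

```python
import time

MAX_RETRIES = 3
BASE_DELAY_SECONDS = 30


def backoff_delay(attempt):
    """Delay before retry number `attempt` (1-based): 30s, 60s, 120s."""
    return BASE_DELAY_SECONDS * 2 ** (attempt - 1)


def run_with_retries(task, job, is_retryable, sleep=time.sleep):
    """Run `task`, retrying retryable errors up to MAX_RETRIES times.

    `job` is a dict whose 'retry_count' and 'status' keys are updated in place.
    """
    job.setdefault("retry_count", 0)
    while True:
        try:
            return task()
        except Exception as exc:
            if not is_retryable(exc) or job["retry_count"] >= MAX_RETRIES:
                job["status"] = "failed"
                raise
            job["retry_count"] += 1
            job["status"] = f"retry_{job['retry_count']}_of_{MAX_RETRIES}"
            sleep(backoff_delay(job["retry_count"]))
```

A task that fails on every attempt therefore runs four times in total (one initial attempt plus three retries) before the job is marked failed.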

File Cleanup Strategy

  • Auto-cleanup: Triggered 5 seconds after successful download
  • Complete deletion: delete_job_completely() removes from queue, job_status, local files, GCS files
  • Efficient folder cleanup: Uses shutil.rmtree() instead of individual file deletion
  • GCS cleanup: Optional in cleanup_job_files(cleanup_gcs=True/False) for performance
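
One way to implement the delayed folder removal is a timer thread, as sketched below. Whether the backend uses threading.Timer or another mechanism is not stated in this document, so treat `schedule_job_cleanup` as a hypothetical helper.

```python
import shutil
import threading

CLEANUP_DELAY_SECONDS = 5


def schedule_job_cleanup(job_folder, delay=CLEANUP_DELAY_SECONDS):
    """Remove the whole job folder after `delay` seconds without blocking the caller."""
    timer = threading.Timer(
        delay,
        shutil.rmtree,
        args=(job_folder,),
        kwargs={"ignore_errors": True},  # a folder that is already gone is not an error
    )
    timer.daemon = True  # do not keep the process alive just for cleanup
    timer.start()
    return timer
```

The download response can be returned right away; the folder disappears once the timer fires.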

Content Safety Filtering

  • Handled in process_video_generation() after operation completion
  • Checks response.rai_media_filtered_count and rai_media_filtered_reasons
  • Returns user-friendly error message with filter reason
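
A sketch of the filter check described above. The attribute names rai_media_filtered_count and rai_media_filtered_reasons come from this document; the helper name and the message wording are assumptions.

```python
def content_filter_message(response):
    """Return a user-facing message if any videos were filtered, otherwise None."""
    count = getattr(response, "rai_media_filtered_count", 0) or 0
    if count <= 0:
        return None
    reasons = getattr(response, "rai_media_filtered_reasons", None) or []
    detail = "; ".join(str(reason) for reason in reasons) or "no reason provided"
    return f"{count} video(s) were blocked by content safety filters: {detail}"
```

Using getattr with defaults keeps the check safe when a response omits either field.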

Usage Tracking

  • Webhook integration sends data to configured endpoint
  • Payload includes: user email, prompt, model, timestamp
  • Non-blocking: Failures don't interrupt main flow
  • Disable with WEBHOOK_ENABLED=false
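
The non-blocking behaviour above could be achieved with a fire-and-forget thread, as in this sketch. The payload field names, both helper names, and the use of urllib are assumptions; the actual webhook code may use a different HTTP client and schema.

```python
import json
import threading
import urllib.request
from datetime import datetime, timezone


def build_usage_payload(user_email, prompt, model):
    """Assemble the tracked fields (key names are hypothetical)."""
    return {
        "user_email": user_email,
        "prompt": prompt,
        "model": model,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def send_usage_webhook(url, payload, enabled=True, timeout=5):
    """POST the payload in a background thread; errors never reach the caller."""
    if not enabled or not url:
        return None

    def _post():
        try:
            request = urllib.request.Request(
                url,
                data=json.dumps(payload).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="POST",
            )
            with urllib.request.urlopen(request, timeout=timeout):
                pass
        except Exception:
            # Usage tracking is best-effort: swallow failures so the job continues.
            pass

    thread = threading.Thread(target=_post, daemon=True)
    thread.start()
    return thread
```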

Development Workflow

Quick start: Run ./run-dev.sh from project root

Manual start:

  1. Backend: cd backend && python app.py (port 7394)
  2. Frontend: cd frontend && npm run dev (port 3000)

Development features:

  • Hot reload on both frontend and backend
  • Auth bypass enabled (no Azure AD needed)
  • Verbose logging with DEBUG statements
  • CORS configured for localhost

Environment setup:

  • Script automatically copies .env.development to .env for both frontend and backend
  • Virtual environment required for backend (must be created manually first)
  • Frontend dependencies installed automatically if missing

Production Deployment

Backend:

  1. Copy veo-video-generator.service to /etc/systemd/system/
  2. Update paths in service file
  3. Enable and start: sudo systemctl enable veo-video-generator && sudo systemctl start veo-video-generator

Frontend:

  1. Build: cd frontend && npm run build
  2. Copy dist/* to Apache web root
  3. Copy apache-htaccess.txt to web root as .htaccess

Apache configuration:

  • Main config in apache.conf (reverse proxy to backend)
  • Required modules: proxy, proxy_http, rewrite, headers, expires
  • Backend proxied at /api endpoint
  • Frontend served as static files with SPA routing support

Important Notes

  • Veo 3.1 Only: Application exclusively uses Veo 3.1 models. Veo 3.0 is not supported.

  • In-memory job tracking: job_status dict is NOT persisted. Consider Redis for production scaling.

  • No queue limits: Unlimited submissions allowed per user (removed MAX_QUEUE_SIZE_PER_USER)

  • Concurrent processing: Hard limit of 2 jobs (CONCURRENT_JOB_LIMIT)

  • Service account required: Must have GCS permissions for bucket operations

  • SDK version: Requires google-genai>=1.47.0 for Veo 3.1 features

  • Image formats: All Pillow formats accepted, automatically converted to JPEG

  • Auto-deletion: Jobs automatically deleted 5 seconds after download (no user action needed)

  • Frontend-only auth: The backend performs no auth validation, so access control relies entirely on the frontend MSAL flow

  • Latest commit: 12/01/2025, addressing the last-frame image error.