Manish Tanwar 0528aec6fb Latest Update

2025-12-01 12:29:22 +05:30

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Full-stack web application built exclusively on Google's Veo 3.1 video generation models. It provides text-to-video and image-to-video generation, plus advanced features such as frame interpolation and reference images.

Stack:

Backend: Flask 3.0 + Google Gen AI SDK 1.47.0
Frontend: React 18 + Vite + Material-UI 5
Storage: Google Cloud Storage
Auth: Azure AD SSO (MSAL) with dev mode bypass
Deployment: systemd service + Apache reverse proxy

Note: This application exclusively uses Veo 3.1 models (Standard and Fast variants). Veo 3.0 models are not supported.

Common Commands

Development

# Start both frontend and backend
./run-dev.sh

# Backend only (from backend/)
python app.py

# Frontend only (from frontend/)
npm run dev

# Build frontend for production
cd frontend && npm run build

Backend Setup

cd backend
python -m venv venv
source venv/bin/activate    # Linux/macOS
# venv\Scripts\activate     # Windows (run this instead of the line above)
pip install -r requirements.txt
cp .env.development .env
python app.py

Frontend Setup

cd frontend
npm install
cp .env.development .env
npm run dev

Production Service Management

# systemd service control
sudo systemctl start veo-video-generator
sudo systemctl stop veo-video-generator
sudo systemctl restart veo-video-generator
sudo systemctl status veo-video-generator

# View logs
sudo journalctl -u veo-video-generator -f

Architecture

Job Queue System

The core of the application is a multi-threaded job queue system implemented in backend/video_generator.py:

Key Data Structures:

job_status: In-memory dict tracking all job states (consider Redis for production scaling)
job_queue: Global FIFO list of pending job IDs
processing_jobs: List of currently active jobs (max 2 concurrent via CONCURRENT_JOB_LIMIT)
user_job_counts: Per-user job count tracking (no limits, but tracked for monitoring)

Queue Processing Flow:

Jobs are created with UUID and added to job_queue via add_to_queue()
Background thread in start_next_job() picks jobs when len(processing_jobs) < CONCURRENT_JOB_LIMIT
process_video_generation() handles the actual generation in a separate thread
complete_job() removes from processing_jobs and triggers next job
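
The flow above can be sketched as follows. This is a minimal illustration, not the code in backend/video_generator.py: the `_lock`, the `_worker` helper, the `run_job` callable, and the use of `deque` are assumptions made to keep the example self-contained and thread-safe.

```python
import threading
import uuid
from collections import deque

CONCURRENT_JOB_LIMIT = 2   # value stated in this document

job_status = {}            # job_id -> {"status": ..., "run": callable}
job_queue = deque()        # FIFO of pending job IDs
processing_jobs = []       # IDs of jobs currently running
_lock = threading.Lock()   # hypothetical: guards the shared structures


def add_to_queue(run_job):
    """Register a job under a new UUID, queue it, and try to start it."""
    job_id = str(uuid.uuid4())
    with _lock:
        job_status[job_id] = {"status": "queued", "run": run_job}
        job_queue.append(job_id)
    start_next_job()
    return job_id


def start_next_job():
    """Start queued jobs for as long as a processing slot is free."""
    while True:
        with _lock:
            if not job_queue or len(processing_jobs) >= CONCURRENT_JOB_LIMIT:
                return
            job_id = job_queue.popleft()
            processing_jobs.append(job_id)
            job_status[job_id]["status"] = "processing"
        threading.Thread(target=_worker, args=(job_id,), daemon=True).start()


def _worker(job_id):
    """Hypothetical stand-in for process_video_generation()."""
    try:
        job_status[job_id]["run"]()
        final_status = "completed"
    except Exception:
        final_status = "failed"
    complete_job(job_id, final_status)


def complete_job(job_id, final_status):
    """Free the slot, record the outcome, and trigger the next queued job."""
    with _lock:
        processing_jobs.remove(job_id)
        job_status[job_id]["status"] = final_status
    start_next_job()
```

Submitting five jobs leaves two in processing_jobs and three in job_queue until a running job calls complete_job().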

Job Lifecycle States:

queued → processing → generating → downloading → completed
Alternative states: failed, cancelled, and retry states such as retry_1_of_3

Video Generation Pipeline

Request Flow (backend/routes/api.py → video_generator.py):

POST /api/generate: Multipart form data with prompt, images, parameters
Job creation: generate_video_async() creates UUID, validates inputs
Image processing: If images provided, validate → convert to JPEG → upload to GCS temp_images/
API calls: Multiple threaded calls to Google Gen AI SDK (1 call per requested video)
Polling: Backend polls operations every 30s, frontend polls status every 2s
Download: Videos downloaded from GCS to temp_downloads/job_{job_id}/
Packaging: Multiple videos + images packaged into zip if needed
Cleanup: Auto-delete 5 seconds after successful download
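
The backend polling step could look roughly like the helper below. It is a sketch under assumptions: `poll_until_done`, the `check` callable, and the 30-minute timeout are hypothetical, and only the 30-second interval comes from this document. The injectable `sleep` argument exists so the loop can be exercised without waiting.

```python
import time


def poll_until_done(check, interval_s=30.0, timeout_s=1800.0, sleep=time.sleep):
    """Call check() every interval_s seconds until it returns a non-None result.

    check is a hypothetical callable that returns None while the operation
    is still running and the finished result once it is done.
    """
    waited = 0.0
    while True:
        result = check()
        if result is not None:
            return result
        if waited >= timeout_s:
            raise TimeoutError(f"operation not done after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```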

Key Functions:

generate_video_async(): Queue job, return job ID immediately
process_video_generation(): Main processing function (runs in thread)
start_next_job(): Queue processor that maintains concurrent limit
complete_job(): Cleanup and start next queued job

Model Support (Veo 3.1 Only)

Two models defined in backend/config.py → SUPPORTED_MODELS:

veo-3.1-generate-preview (Veo 3.1 Standard): High-quality with reference images + frame interpolation ($0.40/sec)
veo-3.1-fast-generate-preview (Veo 3.1 Fast): Optimized speed with frame interpolation only ($0.15/sec)
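
As a worked example of the per-second prices above, the sketch below estimates the cost of a request. The `estimate_cost` helper is hypothetical, and it assumes that every requested video is billed for its full duration; check the current Google pricing before relying on the numbers.

```python
from decimal import Decimal

# Prices per generated second, as listed in this document.
PRICE_PER_SECOND = {
    "veo-3.1-generate-preview": Decimal("0.40"),
    "veo-3.1-fast-generate-preview": Decimal("0.15"),
}


def estimate_cost(model_id, duration_seconds, sample_count=1):
    """Return the estimated cost in USD for sample_count videos."""
    if model_id not in PRICE_PER_SECOND:
        raise ValueError(f"unsupported model: {model_id}")
    return PRICE_PER_SECOND[model_id] * duration_seconds * sample_count
```

Under these assumptions, two 8-second videos cost 0.40 × 8 × 2 = $6.40 on Standard and 0.15 × 8 × 2 = $2.40 on Fast.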

Veo 3.1 Feature Constraints (enforced in process_video_generation()):

Frame interpolation (last frame): Requires 8-second duration, available on BOTH models
Reference images: Requires 8-second duration + 16:9 aspect ratio, Standard model ONLY (NOT available in Fast)
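
The constraints above can be expressed as a small validation function. This is an illustrative sketch, not the checks in process_video_generation(): the function name, its parameters, and the error wording are assumptions.

```python
STANDARD_MODEL = "veo-3.1-generate-preview"
FAST_MODEL = "veo-3.1-fast-generate-preview"


def check_veo_constraints(model_id, duration_seconds, aspect_ratio,
                          has_last_frame=False, has_reference_images=False):
    """Return a list of constraint violations (empty when the request is valid)."""
    errors = []
    if model_id not in (STANDARD_MODEL, FAST_MODEL):
        errors.append(f"unsupported model: {model_id}")
    # Frame interpolation works on both models but needs an 8-second clip.
    if has_last_frame and duration_seconds != 8:
        errors.append("frame interpolation requires an 8-second duration")
    # Reference images are Standard-only and need 8 seconds at 16:9.
    if has_reference_images:
        if model_id != STANDARD_MODEL:
            errors.append("reference images require the Standard model")
        if duration_seconds != 8:
            errors.append("reference images require an 8-second duration")
        if aspect_ratio != "16:9":
            errors.append("reference images require a 16:9 aspect ratio")
    return errors
```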

Authentication Flow

Production: Azure AD SSO via @azure/msal-react

Frontend wraps app in <AuthGuard> component
Backend performs NO auth validation (authentication is enforced in the frontend only)

Development: Auth bypass when VITE_DEV_MODE=true

<DevAuthWrapper> component bypasses MSAL
Shows "Dev User" in UI header

Storage Architecture

Local Storage:

backend/temp_downloads/job_{job_id}/: Per-job folder for videos and images
Files deleted automatically 5 seconds after download
Entire job folder removed with shutil.rmtree() for efficiency

GCS Storage:

Temp images: gs://{bucket}/temp_images/ (cleaned up after processing)
Generated videos: gs://{bucket}/veo_outputs/{prompt}_{timestamp}/ (cleaned up after download)
Cleanup happens in cleanup_image_files() and delete_blob() from utils/storage.py

Download Strategy (backend/routes/api.py:download_video):

Single video + no image → Direct MP4 download
Multiple videos OR image included → ZIP file download
ZIP created in process_video_generation() and stored in job folder
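
The download decision can be sketched like this. The `prepare_download` helper and the `videos.zip` archive name are hypothetical. The sketch also builds the archive on demand to stay self-contained, whereas the application creates it during process_video_generation().

```python
import zipfile
from pathlib import Path


def prepare_download(job_dir, video_names, image_names=()):
    """Return the path to serve: the lone MP4, or a ZIP of all job files."""
    job_dir = Path(job_dir)
    # Single video and no image: serve the MP4 directly.
    if len(video_names) == 1 and not image_names:
        return job_dir / video_names[0]
    # Multiple videos or an included image: package everything in one ZIP.
    zip_path = job_dir / "videos.zip"  # hypothetical archive name
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in [*video_names, *image_names]:
            archive.write(job_dir / name, arcname=name)
    return zip_path
```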

Project Structure

backend/
├── routes/
│   ├── api.py              # Main API routes (generate, status, download, cancel, retry, delete)
│   └── health.py           # Health check endpoints
├── utils/
│   ├── auth.py             # Google Cloud credentials
│   └── storage.py          # GCS operations, image validation/conversion
├── app.py                  # Flask app initialization, CORS setup
├── config.py               # Config class with model definitions, env vars
├── video_generator.py      # Core logic: queue system, job processing, Gen AI SDK calls
├── requirements.txt
├── .env.development        # Dev config (localhost CORS, debug mode)
└── .env.production         # Prod config (strict CORS, optimized)

frontend/
├── src/
│   ├── components/
│   │   ├── VideoForm.jsx           # Main form with conditional Veo 3.1 features
│   │   ├── VideoGenerator.jsx      # Top-level container
│   │   ├── ProgressIndicator.jsx   # Job status display
│   │   ├── QueueManager.jsx        # Queue visualization (3 sections)
│   │   ├── Layout.jsx              # App layout wrapper
│   │   ├── AuthGuard.jsx           # MSAL authentication wrapper
│   │   └── DevAuthWrapper.jsx      # Dev mode auth bypass
│   ├── services/
│   │   └── api.js          # Axios API client
│   ├── hooks/
│   │   └── useVideoGeneration.js   # Custom hook for video generation logic
│   ├── config/
│   │   └── msalConfig.js   # MSAL configuration
│   └── App.jsx             # Main app component
├── .env.development        # Dev config (localhost API, VITE_DEV_MODE=true)
└── .env.production         # Prod config (production API, MSAL enabled)

Configuration

Backend Environment Variables (.env.development vs .env.production)

Critical differences:

FLASK_ENV: development vs production
FLASK_DEBUG: True vs False
FRONTEND_URL: http://localhost:3000 vs production URL
CORS: Development allows localhost:3000, 127.0.0.1:3000 + configured URL

Key variables (backend/config.py):

PROJECT_ID, REGION, OUTPUT_GCS_BUCKET_NAME: GCP configuration
SERVICE_ACCOUNT_KEY_PATH: Path to service account JSON (default: ../service-account.json)
MODEL_ID, MODEL_FAST_ID: Default model identifiers (overridable per request)
WEBHOOK_URL, WEBHOOK_ENABLED: Usage tracking webhook integration
TEMP_DOWNLOAD_PATH: Local storage path (./temp_downloads/)
MAX_IMAGE_SIZE: 10MB upload limit
MIN_IMAGE_RESOLUTION: (720, 720) minimum
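
A minimal sketch of how these variables might be loaded is shown below. It is not the Config class from backend/config.py: the `load_config` function, the empty-string fallbacks, the default model IDs, and the `WEBHOOK_ENABLED` default of false are assumptions. The key path, download path, image size, and resolution defaults come from the list above.

```python
import os


def load_config(env=None):
    """Build a settings dict from environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    return {
        "PROJECT_ID": env.get("PROJECT_ID", ""),
        "REGION": env.get("REGION", ""),
        "OUTPUT_GCS_BUCKET_NAME": env.get("OUTPUT_GCS_BUCKET_NAME", ""),
        "SERVICE_ACCOUNT_KEY_PATH": env.get(
            "SERVICE_ACCOUNT_KEY_PATH", "../service-account.json"
        ),
        # Assumed defaults: the two model IDs named in this document.
        "MODEL_ID": env.get("MODEL_ID", "veo-3.1-generate-preview"),
        "MODEL_FAST_ID": env.get("MODEL_FAST_ID", "veo-3.1-fast-generate-preview"),
        "WEBHOOK_URL": env.get("WEBHOOK_URL", ""),
        # Assumed parsing rule: only the string "true" enables the webhook.
        "WEBHOOK_ENABLED": env.get("WEBHOOK_ENABLED", "false").strip().lower() == "true",
        "TEMP_DOWNLOAD_PATH": env.get("TEMP_DOWNLOAD_PATH", "./temp_downloads/"),
        "MAX_IMAGE_SIZE": 10 * 1024 * 1024,   # 10MB upload limit, in bytes
        "MIN_IMAGE_RESOLUTION": (720, 720),   # minimum (width, height)
    }
```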

Frontend Environment Variables

Critical differences:

VITE_DEV_MODE: true (bypasses auth) vs false (enables MSAL)
VITE_API_BASE_URL: http://localhost:7394 vs production API URL

Key variables:

VITE_MSAL_CLIENT_ID, VITE_MSAL_AUTHORITY, VITE_MSAL_REDIRECT_URI: Azure AD config

API Endpoints

Main Routes (backend/routes/api.py)

Job Management:

POST /api/generate: Start video generation (multipart form data)
GET /api/status/<job_id>: Get job status with queue position
GET /api/download/<job_id>: Download video/zip (auto-deletes job after 5s)
GET /api/download/<job_id>/video/<index>: Download individual video
GET /api/user-jobs?user_email=<email>: Get all user jobs
GET /api/queue-status: Get queue status (length, processing jobs, concurrent limit)

Job Actions:

POST /api/cancel/<job_id>: Cancel queued/processing job
POST /api/retry/<job_id>: Retry failed/cancelled job
DELETE /api/delete/<job_id>: Completely delete job and files
DELETE /api/cleanup/<job_id>: Manual cleanup (full GCS cleanup)
DELETE /api/cleanup/<job_id>/local: Fast local-only cleanup

Health:

GET /health: Detailed health check with config info
GET /ping: Simple ping response

Key Implementation Details

Multi-Video Generation

Frontend sends sampleCount (1-4 videos per request)
Backend makes N API calls to ensure N videos (1 call per video)
Operations polled in parallel, results collected in all_generated_videos
Exact count selected: generated_videos = all_generated_videos[:sample_count]
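
The one-call-per-video pattern can be sketched with a thread pool. The `generate_one` callable is a hypothetical stand-in for a single Gen AI SDK request plus its polling; the real code may structure its threads differently.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_videos(generate_one, sample_count):
    """Run sample_count generation calls in parallel and keep exactly that many videos.

    generate_one(index) is a hypothetical callable returning a list of videos.
    """
    if not 1 <= sample_count <= 4:
        raise ValueError("sample_count must be between 1 and 4")
    all_generated_videos = []
    with ThreadPoolExecutor(max_workers=sample_count) as pool:
        # pool.map yields results in call order, regardless of finish order.
        for videos in pool.map(generate_one, range(sample_count)):
            all_generated_videos.extend(videos)
    # A call may return more than one video, so trim to the requested count.
    return all_generated_videos[:sample_count]
```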

Image Processing Pipeline (utils/storage.py)

Validation: Check format, resolution, file size
Conversion: All formats → JPEG (any Pillow-supported format is accepted)
Aspect ratio detection: Auto-detect 16:9 or 9:16
Cropping: Auto-crop to target aspect ratio if needed
Upload: To GCS temp_images/ with unique blob name
Cleanup: Delete from GCS after video generation completes
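
The aspect ratio detection and cropping steps reduce to integer arithmetic, sketched below without Pillow. Both function names are hypothetical, and two behaviors are assumptions: square images are treated as 16:9, and the crop is centered. The returned box uses the (left, upper, right, lower) order that Pillow's Image.crop() expects.

```python
def detect_target_ratio(width, height):
    """Pick 16:9 for landscape (or square) images and 9:16 for portrait ones."""
    return (16, 9) if width >= height else (9, 16)


def center_crop_box(width, height, ratio):
    """Return (left, upper, right, lower) of the largest centered box with this ratio."""
    ratio_w, ratio_h = ratio
    if width * ratio_h > height * ratio_w:
        # Image is too wide for the target ratio: trim the sides.
        new_w, new_h = height * ratio_w // ratio_h, height
    else:
        # Image is too tall (or already exact): trim the top and bottom.
        new_w, new_h = width, width * ratio_h // ratio_w
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)
```

For a 2000×1000 input, the target is 16:9 and the box is (111, 0, 1888, 1000), which gives a 1777×1000 crop.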

Error Handling & Retries

Automatic retry for transient errors: HTTP 500 and 503 responses, timeouts, and connection errors
Max 3 retries with exponential backoff (30s, 60s, 120s)
Retry count tracked in job_status[job_id]['retry_count']
Job status shows retry_1_of_3, retry_2_of_3, etc.
Final failure after max retries updates status to failed
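
The retry behavior can be sketched as the loop below. `RetryableError` and `run_with_retries` are hypothetical names, and how the real code classifies an error as retryable is not shown here. The delays double from a 30-second base, which gives the 30s, 60s, 120s sequence.

```python
import time

MAX_RETRIES = 3
BASE_DELAY_S = 30  # doubles on each retry: 30s, 60s, 120s


class RetryableError(Exception):
    """Hypothetical marker for 500/503 responses, timeouts, and connection errors."""


def run_with_retries(operation, job, sleep=time.sleep):
    """Run operation(), retrying retryable failures and recording progress in job."""
    job["retry_count"] = 0
    while True:
        try:
            return operation()
        except RetryableError:
            if job["retry_count"] >= MAX_RETRIES:
                job["status"] = "failed"  # final failure after max retries
                raise
            job["retry_count"] += 1
            job["status"] = f"retry_{job['retry_count']}_of_{MAX_RETRIES}"
            sleep(BASE_DELAY_S * 2 ** (job["retry_count"] - 1))
```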

File Cleanup Strategy

Auto-cleanup: Triggered 5 seconds after successful download
Complete deletion: delete_job_completely() removes from queue, job_status, local files, GCS files
Efficient folder cleanup: Uses shutil.rmtree() instead of individual file deletion
GCS cleanup: Optional in cleanup_job_files(cleanup_gcs=True/False) for performance
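
A delayed folder cleanup like the one described above could be built on threading.Timer. This is a sketch, not the project's implementation: `schedule_cleanup` and `cleanup_job_folder` are hypothetical names, and GCS cleanup is left out.

```python
import shutil
import threading
from pathlib import Path

CLEANUP_DELAY_S = 5.0  # delay after a successful download, per this document


def cleanup_job_folder(job_dir):
    """Remove the whole per-job folder in one call; ignore an already-missing folder."""
    shutil.rmtree(Path(job_dir), ignore_errors=True)


def schedule_cleanup(job_dir, delay_s=CLEANUP_DELAY_S):
    """Delete job_dir after delay_s seconds without blocking the caller."""
    timer = threading.Timer(delay_s, cleanup_job_folder, args=(job_dir,))
    timer.daemon = True  # do not keep the process alive just for cleanup
    timer.start()
    return timer
```

Returning the timer lets the caller cancel the cleanup with timer.cancel(), for example if the download is retried.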

Content Safety Filtering

Handled in process_video_generation() after operation completion
Checks response.rai_media_filtered_count and rai_media_filtered_reasons
Returns user-friendly error message with filter reason
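
The check can be sketched as a function over the two response fields named above. The `content_filter_error` name and the message wording are assumptions, and `getattr` is used defensively in case a response lacks the fields.

```python
def content_filter_error(response):
    """Return a user-facing message if videos were filtered, otherwise None."""
    count = getattr(response, "rai_media_filtered_count", 0) or 0
    if count <= 0:
        return None
    reasons = getattr(response, "rai_media_filtered_reasons", None) or []
    detail = "; ".join(str(reason) for reason in reasons) or "no reason provided"
    return f"{count} video(s) blocked by content safety filters: {detail}"
```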

Usage Tracking

Webhook integration sends data to configured endpoint
Payload includes: user email, prompt, model, timestamp
Non-blocking: Failures don't interrupt main flow
Disable with WEBHOOK_ENABLED=false
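
A non-blocking webhook call could be sketched as below, using only the standard library. The payload field names, the function names, and the 10-second timeout are assumptions. The application's actual payload format and HTTP client are not shown in this document.

```python
import json
import threading
import urllib.request
from datetime import datetime, timezone


def build_usage_payload(user_email, prompt, model):
    """Assemble the usage record (field names are hypothetical)."""
    return {
        "user_email": user_email,
        "prompt": prompt,
        "model": model,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def post_json(url, payload, timeout_s=10):
    """POST payload as JSON; raises on network or HTTP errors."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s):
        pass


def send_usage_event(url, payload, enabled=True, send=post_json):
    """Send the payload on a background thread; failures are swallowed."""
    if not enabled or not url:
        return None

    def _run():
        try:
            send(url, payload)
        except Exception:
            pass  # usage tracking must never interrupt the main flow

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()
    return thread
```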

Development Workflow

Quick start: Run ./run-dev.sh from project root

Manual start:

Backend: cd backend && python app.py (port 7394)
Frontend: cd frontend && npm run dev (port 3000)

Development features:

Hot reload on both frontend and backend
Auth bypass enabled (no Azure AD needed)
Verbose logging with DEBUG statements
CORS configured for localhost

Environment setup:

Script automatically copies .env.development to .env for both frontend and backend
Virtual environment required for backend (must be created manually first)
Frontend dependencies installed automatically if missing

Production Deployment

Backend:

Copy veo-video-generator.service to /etc/systemd/system/
Update paths in service file
Enable and start: sudo systemctl enable veo-video-generator && sudo systemctl start veo-video-generator

Frontend:

Build: cd frontend && npm run build
Copy dist/* to Apache web root
Copy apache-htaccess.txt to web root as .htaccess

Apache configuration:

Main config in apache.conf (reverse proxy to backend)
Required modules: proxy, proxy_http, rewrite, headers, expires
Backend proxied at /api endpoint
Frontend served as static files with SPA routing support

Important Notes

Veo 3.1 Only: Application exclusively uses Veo 3.1 models. Veo 3.0 is not supported.
In-memory job tracking: job_status dict is NOT persisted. Consider Redis for production scaling.
No queue limits: Unlimited submissions allowed per user (the former MAX_QUEUE_SIZE_PER_USER limit was removed)
Concurrent processing: Hard limit of 2 jobs (CONCURRENT_JOB_LIMIT)
Service account required: Must have GCS permissions for bucket operations
SDK version: Requires google-genai>=1.47.0 for Veo 3.1 features
Image formats: All Pillow formats accepted, automatically converted to JPEG
Auto-deletion: Jobs automatically deleted 5 seconds after download (no user action needed)
Frontend validation: No backend auth validation - security relies on frontend MSAL
Latest commit (2025-12-01): update for the last-frame image error.
