
# Veo 3.0 Video Generator - User Documentation

## Overview

The Veo 3.0 Video Generator is a web application for creating AI-generated videos with Google's Veo 3.0 model. It pairs a React frontend built on Material-UI components with a Flask backend that calls the model through the Google GenAI SDK.

## Features

- **Text-to-Video Generation**: Create videos from text prompts using Google's Veo 3.0 model
- **Image-to-Video Generation**: Upload reference images to guide video generation
- **Multi-Video Generation**: Generate 1-4 videos per request with batch processing
- **Unlimited Job Queue**: Submit unlimited video generation jobs with FIFO processing
- **Advanced Job Management**: Cancel, retry, and delete jobs with complete file cleanup
- **Real-time Queue Visualization**: Three-section display (Processing, Queued, History)
- **Customizable Settings**: Control video length (4, 6, or 8 seconds), aspect ratio, person generation, seed values, and audio
- **Real-time Progress Tracking**: Monitor generation progress with live status updates and queue positions
- **Secure Authentication**: Microsoft Azure AD integration for user authentication
- **Intelligent Downloads**: Multiple download options with automatic cleanup
- **Comprehensive File Management**: Auto-cleanup after download, complete GCS resource management

## System Requirements

### Prerequisites
- Google Cloud Project with Veo 3.0 API access
- Google Cloud Storage bucket for temporary file storage
- Service account with appropriate permissions
- Python 3.8+ and Node.js 16+ for development
- Microsoft Azure AD app registration (for authentication)

### Supported Formats
- **Input Images**: JPEG, PNG, GIF, BMP, TIFF, WebP, ICO (auto-converted to JPEG)
- **Output Videos**: MP4 format (individual files or ZIP packages)
- **Image Size**: Minimum 720x720 pixels, maximum 10MB
- **Video Length**: 4, 6, or 8 seconds
- **Video Count**: 1-4 videos per request
- **Aspect Ratios**: 16:9 (landscape) or 9:16 (portrait)
- **Seeds**: Optional numeric seeds (0-4294967295) for reproducible results
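
The limits above can be checked before a request is submitted. The following is a minimal sketch, not the application's actual validation code; the function name `validate_settings` and the error wording are hypothetical, and only the numeric limits come from this document.

```python
from typing import List, Optional

ALLOWED_LENGTHS = {4, 6, 8}               # seconds, per "Video Length"
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16"}  # per "Aspect Ratios"
MAX_SEED = 4294967295                     # 2**32 - 1, per "Seeds"


def validate_settings(video_length_sec: int, video_count: int,
                      aspect_ratio: str, seed: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the settings are valid."""
    problems = []
    if video_length_sec not in ALLOWED_LENGTHS:
        problems.append("video length must be 4, 6, or 8 seconds")
    if not 1 <= video_count <= 4:
        problems.append("video count must be between 1 and 4")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspect ratio must be 16:9 or 9:16")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        problems.append("seed must be between 0 and 4294967295")
    return problems
```

Returning a list rather than raising on the first problem lets a form show every invalid field at once.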

## Getting Started

### 1. Environment Setup

Create a `.env` file in the project root with the following configuration:

```env
# Google Cloud Configuration
PROJECT_ID=your-google-cloud-project-id
REGION=us-central1
MODEL_ID=veo-3.0-generate-preview
OUTPUT_GCS_BUCKET_NAME=your-storage-bucket-name
SERVICE_ACCOUNT_KEY_PATH=./service-account.json

# Flask Configuration
SECRET_KEY=your-secret-key-here
FLASK_ENV=development
FLASK_DEBUG=True
PORT=7394

# Frontend Configuration
FRONTEND_URL=http://localhost:3000

# Webhook Configuration (optional)
WEBHOOK_URL=your-webhook-url
WEBHOOK_ENABLED=false
WEBHOOK_TIMEOUT=10
```

### 2. Google Cloud Setup

1. **Enable APIs**:
   - Vertex AI API
   - Cloud Storage API
   - GenerativeAI API

2. **Create Service Account**:
   ```bash
   gcloud iam service-accounts create veo-video-generator \
     --display-name="Veo Video Generator"
   ```

3. **Grant Permissions**:
   ```bash
   gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
     --member="serviceAccount:veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
     --role="roles/aiplatform.user"

   gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
     --member="serviceAccount:veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
     --role="roles/storage.admin"
   ```

4. **Download Service Account Key**:
   ```bash
   gcloud iam service-accounts keys create service-account.json \
     --iam-account=veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com
   ```

### 3. Installation

#### Backend Setup
```bash
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

#### Frontend Setup
```bash
cd frontend
npm install
```

### 4. Running the Application

#### Development Mode
Use the provided development script:
```bash
./run-dev.sh
```

This starts both the frontend (port 3000) and the backend (port 7394).

#### Manual Startup

**Backend**:
```bash
cd backend
source venv/bin/activate
python app.py
```

**Frontend**:
```bash
cd frontend
npm run dev
```

#### Production Mode
```bash
# Backend
cd backend
source venv/bin/activate
python app.py

# Frontend (build and serve; run from the project root in a separate shell)
cd frontend
npm run build
# Serve the dist/ folder with your web server
```

## Using the Application

### 1. Authentication
- In development mode, authentication is bypassed with a mock user
- In production, users authenticate via Microsoft Azure AD
- Authenticated users' email addresses are tracked for usage analytics

### 2. Creating Videos

#### Text-to-Video Generation
1. Enter your video prompt in the text area (minimum 10 characters)
2. Adjust settings:
   - **Video Length**: 4, 6, or 8 seconds (default: 8 seconds)
   - **Number of Videos**: 1-4 videos per request (default: 1)
   - **Model**: Veo 3.0 (high-quality) or Veo 3.0 Fast (optimized speed)
   - **Aspect Ratio**: 16:9 or 9:16 (default: 16:9)
   - **Person Generation**: Allow/don't allow people in videos
   - **Seed**: Optional numeric value for reproducible results
   - **Audio**: Toggle audio generation on/off
3. Click "Generate Video" or "Add to Queue"
4. Monitor progress in the queue visualization
5. Download when complete (auto-removes from queue)

#### Image-to-Video Generation
1. Upload a reference image (JPEG, PNG, etc.)
2. The system automatically:
   - Validates image format and size
   - Converts to JPEG format
   - Detects aspect ratio and adjusts settings
   - Crops if necessary to fit aspect ratio
3. Enter your video prompt
4. Adjust additional settings as needed
5. Generate and download
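
The aspect-ratio detection and cropping in step 2 come down to simple arithmetic. The sketch below is an illustration under assumptions: this document does not state how the ratio is chosen or where the crop is taken, so the rule "landscape or square maps to 16:9, portrait to 9:16" and the centered crop are guesses, and both function names are hypothetical.

```python
from typing import Tuple


def pick_aspect_ratio(width: int, height: int) -> str:
    """Assumed rule: landscape or square images use 16:9, portrait images 9:16."""
    return "16:9" if width >= height else "9:16"


def center_crop_box(width: int, height: int, aspect_ratio: str) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered box with the target ratio."""
    ratio_w, ratio_h = (int(part) for part in aspect_ratio.split(":"))
    # The crop is limited either by the image width or by the image height.
    crop_w = min(width, height * ratio_w // ratio_h)
    crop_h = min(height, width * ratio_h // ratio_w)
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)
```

The returned tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it could be passed to it directly if the backend uses Pillow.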

#### Queue Management
- **Unlimited Submissions**: No limit on number of jobs per user
- **FIFO Processing**: Jobs processed in order of submission
- **Concurrent Limit**: Maximum 2 jobs process simultaneously
- **Real-time Updates**: Queue status updates every 2 seconds
- **Job Actions**: Cancel, retry, or delete jobs based on status
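
The queue behaviour described above (unbounded FIFO intake, at most 2 jobs processing) can be modelled in a few lines. This is a simplified in-memory sketch, not the backend's implementation; the `JobQueue` class and its method names are hypothetical.

```python
from collections import deque


class JobQueue:
    """FIFO queue that lets at most `limit` jobs process at once."""

    def __init__(self, limit: int = 2):
        self.limit = limit
        self.waiting = deque()   # job IDs in submission order
        self.processing = []     # job IDs currently holding a slot

    def submit(self, job_id: str) -> None:
        self.waiting.append(job_id)
        self._fill_slots()

    def finish(self, job_id: str) -> None:
        """Release the slot held by a processing job and start the next one."""
        self.processing.remove(job_id)
        self._fill_slots()

    def position(self, job_id: str) -> int:
        """1-based queue position, or 0 if the job is not waiting."""
        return self.waiting.index(job_id) + 1 if job_id in self.waiting else 0

    def _fill_slots(self) -> None:
        # Promote the oldest waiting jobs while a processing slot is free.
        while self.waiting and len(self.processing) < self.limit:
            self.processing.append(self.waiting.popleft())
```

Submitting four jobs to a fresh queue leaves the first two processing and the last two waiting at positions 1 and 2.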

### 3. Queue Visualization & Progress Monitoring

The application displays jobs in three distinct sections:

#### Currently Processing (Blue Highlight)
- Jobs actively generating videos (max 2 concurrent)
- Real-time progress percentage and status updates
- Available actions: Cancel, Delete

#### In Queue (Orange Highlight)
- Jobs waiting for processing slots
- Queue position indicator (e.g., "Queue Position: 3")
- Available actions: Cancel, Delete

#### History (Standard Styling)
- Completed, failed, or cancelled jobs
- Available actions based on status:
  - **Completed**: Download All, Download Individual Videos
  - **Failed/Cancelled**: Retry, Delete
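
The per-section actions listed above amount to a lookup from job status to permitted actions. The table below is a sketch; the status and action strings are hypothetical labels chosen for illustration and may not match the identifiers the application uses.

```python
from typing import Dict, List

# Assumed status and action names; only the status-to-action pairing comes from this document.
ACTIONS_BY_STATUS: Dict[str, List[str]] = {
    "processing": ["cancel", "delete"],
    "queued": ["cancel", "delete"],
    "completed": ["download_all", "download_individual"],
    "failed": ["retry", "delete"],
    "cancelled": ["retry", "delete"],
}


def available_actions(status: str) -> List[str]:
    """Return the actions offered for a job in the given status (empty if unknown)."""
    return ACTIONS_BY_STATUS.get(status, [])
```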

#### Progress Stages
- **Queued**: Waiting in queue for processing slot
- **Starting** (0%): Initializing request
- **Uploading Image** (5%): Processing reference image
- **Generating** (10%): Submitting to Veo 3.0 API
- **Processing** (20-80%): Video generation in progress
- **Downloading** (90%): Retrieving completed videos
- **Completed** (100%): Ready for download
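
The fixed stages map directly to percentages, while the Processing stage spans a 20-80% band. This document does not say how progress inside that band is computed; the sketch below assumes a linear mapping from completed backend polls to the band, and the names `STAGE_PROGRESS` and `processing_progress` are hypothetical.

```python
# Fixed percentages taken from the stage list; stage keys are assumed labels.
STAGE_PROGRESS = {
    "starting": 0,
    "uploading_image": 5,
    "generating": 10,
    "downloading": 90,
    "completed": 100,
}


def processing_progress(polls_done: int, polls_expected: int) -> int:
    """Map poll count linearly onto the 20-80% band (assumed behaviour)."""
    if polls_expected <= 0:
        return 20
    # Clamp so that extra polls never push the value past 80%.
    fraction = min(max(polls_done / polls_expected, 0.0), 1.0)
    return 20 + round(fraction * 60)
```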

### 4. Error Handling
Common error scenarios and solutions:

- **Content Filtered**: Prompt violates safety policies - try rewording
- **Image Too Large**: Resize image to under 10MB
- **Invalid Format**: Use supported image formats only
- **Authentication Failed**: Check service account configuration
- **Quota Exceeded**: Check Google Cloud quotas and billing

## API Reference

### Endpoints

#### `POST /api/generate`
Start video generation process.

**Request Body**:
```json
{
  "prompt": "A cat playing in a garden",
  "video_length_sec": 8,
  "aspect_ratio": "16:9",
  "person_generation": "dont_allow",
  "model_name": "veo-3.0-generate-preview",
  "sampleCount": 2,
  "seed": 12345,
  "generate_audio": true,
  "user_email": "user@example.com"
}
```

**With Image** (multipart/form-data):
- `image`: Image file
- Other parameters as form fields

**Response**:
```json
{
  "job_id": "uuid-string",
  "status": "started"
}
```
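
A JSON request to this endpoint can be assembled with the Python standard library. The helper below is a sketch: `build_generate_request` is a hypothetical name, the base URL assumes the default development port, and the 10-character check mirrors the UI rule, which the API may or may not enforce.

```python
import json
import urllib.request


def build_generate_request(base_url: str, prompt: str, **settings) -> urllib.request.Request:
    """Build (but do not send) a POST /api/generate request with a JSON body."""
    if len(prompt) < 10:
        raise ValueError("prompt must be at least 10 characters")
    payload = {"prompt": prompt, **settings}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; the response body should then contain the `job_id` shown above.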

#### Job Management Endpoints

**`GET /api/user-jobs?user_email=user@example.com`**
Get all jobs for user.

**`GET /api/queue-status`**
Get overall queue status.

**`POST /api/cancel/{job_id}`**
Cancel queued or processing job.

**`POST /api/retry/{job_id}`**
Retry failed or cancelled job.

**`DELETE /api/delete/{job_id}`**
Completely delete job and all associated files.

#### `GET /api/status/{job_id}`
Check generation status.

**Response**:
```json
{
  "status": "processing",
  "progress": 45,
  "message": "Video generation in progress...",
  "video_count": 2,
  "videos_requested": 2,
  "video_path": null,
  "individual_video_paths": [],
  "is_zip": true,
  "created_at": "2024-01-01T12:00:00Z",
  "error": null
}
```
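
A client typically polls this endpoint until the job reaches a final state. The loop below is a sketch that takes the fetch function as a parameter so it can run without a server; `wait_for_job` is a hypothetical name, the terminal status strings are assumptions, and the 2-second default matches the frontend polling interval described in this document.

```python
import time
from typing import Callable, Dict

# Assumed names for the final states; check them against real responses.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_status: Callable[[], Dict], interval_sec: float = 2.0,
                 max_polls: int = 900, sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Call fetch_status until the job reaches a terminal status, then return that payload."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("status") in TERMINAL_STATUSES:
            return status
        sleep(interval_sec)
    raise TimeoutError("job did not finish within the polling budget")
```

In real use, `fetch_status` would issue `GET /api/status/{job_id}` and return the decoded JSON body.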

#### Download Endpoints

**`GET /api/download/{job_id}`**
Download completed content (ZIP for multiple videos, MP4 for single).
Automatically deletes job 5 seconds after successful download.

**`GET /api/download/{job_id}/video/{index}`**
Download individual video (index 1-4).

**Response**: MP4 video file, ZIP package, or error message.

#### `DELETE /api/cleanup/{job_id}`
Manually clean up job files (local only).

**Response**:
```json
{
  "message": "Files cleaned up successfully"
}
```

## Configuration Reference

### Backend Configuration (`config.py`)

| Variable | Description | Default |
|----------|-------------|---------|
| `PROJECT_ID` | Google Cloud project ID | - |
| `REGION` | Google Cloud region | `us-central1` |
| `MODEL_ID` | Veo model identifier | `veo-3.0-generate-preview` |
| `MODEL_FAST_ID` | Veo Fast model identifier | `veo-3.0-fast-generate-preview` |
| `OUTPUT_GCS_BUCKET_NAME` | GCS bucket for storage | - |
| `SERVICE_ACCOUNT_KEY_PATH` | Path to service account JSON | `../service-account.json` |
| `SECRET_KEY` | Flask secret key | - |
| `FLASK_ENV` | Environment mode | `production` |
| `PORT` | Backend server port | `7394` |
| `FRONTEND_URL` | Frontend URL for CORS | - |
| `MAX_IMAGE_SIZE` | Maximum image size (stored in bytes) | 10 MB |
| `CONCURRENT_JOB_LIMIT` | Max concurrent processing jobs | `2` |
| `MAX_RETRIES` | Max retry attempts per job | `3` |
| `WEBHOOK_URL` | Usage tracking webhook URL | - |
| `WEBHOOK_ENABLED` | Enable usage tracking | `true` |
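
As an illustration of how such variables and defaults are usually read, the sketch below loads a subset of the table from the environment. It is not the project's `config.py`; the `Settings` class and `load_settings` function are hypothetical, and only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    project_id: str
    region: str
    model_id: str
    port: int
    concurrent_job_limit: int
    webhook_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, applying the documented defaults."""
    return Settings(
        project_id=env["PROJECT_ID"],  # no default: raises KeyError if missing
        region=env.get("REGION", "us-central1"),
        model_id=env.get("MODEL_ID", "veo-3.0-generate-preview"),
        port=int(env.get("PORT", "7394")),
        concurrent_job_limit=int(env.get("CONCURRENT_JOB_LIMIT", "2")),
        webhook_enabled=env.get("WEBHOOK_ENABLED", "true").lower() == "true",
    )
```

Accepting the mapping as a parameter keeps the loader testable with a plain dictionary instead of the real process environment.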

### Frontend Configuration

Set environment variables in `.env.local`:

```env
VITE_API_BASE_URL=http://localhost:7394
VITE_DEV_MODE=true
VITE_APP_TITLE=Veo Video Generator (Dev)
VITE_MSAL_CLIENT_ID=your-azure-client-id
VITE_MSAL_AUTHORITY=https://login.microsoftonline.com/your-tenant-id
VITE_MSAL_REDIRECT_URI=http://localhost:3000
```

### Queue System Configuration

- **Unlimited Queue**: No per-user job limits
- **Video Limits**: 1-4 videos per individual request
- **Processing Slots**: Maximum 2 concurrent jobs
- **Polling Intervals**: Frontend polls every 2 seconds, backend polls Google API every 30 seconds
- **Auto-cleanup**: Jobs deleted 5 seconds after successful download
- **Retry Logic**: Up to 3 retry attempts with exponential backoff
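
Retrying with exponential backoff means waiting longer after each failed attempt. The sketch below shows the general technique; the 1-second base delay and doubling factor are assumptions, since this document specifies only the limit of 3 retries, and the `retry` function is hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], max_retries: int = 3, base_sec: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation; on failure retry up to max_retries times, doubling the delay each time."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_sec * 2 ** attempt)  # 1s, 2s, 4s with the assumed defaults
```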

## Troubleshooting

### Common Issues

1. **"Service account not found"**
   - Verify `service-account.json` exists and is valid
   - Check file permissions
   - Ensure service account has required roles

2. **"Quota exceeded"**
   - Check Google Cloud quotas for Vertex AI
   - Verify billing is enabled
   - Request quota increases if needed

3. **"Content filtered"**
   - Modify prompt to avoid restricted content
   - Review Google's content policies
   - Try different wording or concepts

4. **"Image validation failed"**
   - Check image format (use common formats)
   - Ensure image is at least 720x720 pixels
   - Verify file size is under 10MB

5. **CORS errors**
   - Check `FRONTEND_URL` configuration
   - Verify both frontend and backend URLs
   - Ensure proper development/production settings

### Debug Mode

Enable debug logging by setting:
```env
FLASK_DEBUG=True
```

Debug information includes:
- Request/response details
- Image processing steps
- API call parameters
- File paths and operations

### Log Files

Check application logs for detailed error information:
- Backend logs: Console output from Flask application
- Frontend logs: Browser developer console
- Google Cloud logs: Cloud Logging for API errors

## Security Considerations

- Service account keys should be kept secure and not committed to version control
- Use environment variables for sensitive configuration
- Implement proper authentication in production
- Rotate service account keys regularly
- Monitor usage and costs through Google Cloud Console
- Webhook URLs should use HTTPS in production
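
The HTTPS recommendation for webhooks can be enforced with a startup check. This is a minimal sketch using the standard library; `is_secure_webhook` is a hypothetical helper and is not part of the application.

```python
from urllib.parse import urlparse


def is_secure_webhook(url: str) -> bool:
    """Return True only for absolute URLs that use the https scheme."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.netloc)
```

A placeholder value such as `your-webhook-url` fails this check, which also catches an unedited `.env` file.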

## Performance Tips

- Use appropriate video lengths (shorter = faster generation)
- Optimize image sizes before upload
- Monitor Google Cloud quotas and billing
- Implement caching for repeated requests
- Use CDN for frontend assets in production
- Clean up temporary files regularly

## Support and Resources

- [Google Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs)
- [Veo 3.0 Model Documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/veo)
- [Google GenAI SDK Reference](https://googleapis.dev/python/google-genai/latest/)
- [Flask Documentation](https://flask.palletsprojects.com/)
- [React Documentation](https://reactjs.org/docs/)
- [Material-UI Documentation](https://mui.com/)

For technical issues, check the debug logs and refer to Google Cloud support resources.