veo3/user_docs.md
2025-09-30 09:49:55 -05:00

# Veo 3.0 Video Generator - User Documentation
## Overview
The Veo 3.0 Video Generator is a web application for creating AI-generated videos with Google's Veo 3.0 model. It pairs a React frontend built on Material-UI components with a Flask backend that calls the model through the Google GenAI SDK.
## Features
- **Text-to-Video Generation**: Create videos from text prompts using Google's Veo 3.0 model
- **Image-to-Video Generation**: Upload reference images to guide video generation
- **Customizable Settings**: Control video length, aspect ratio, and person generation policies
- **Real-time Progress Tracking**: Monitor generation progress with live status updates
- **Secure Authentication**: Microsoft Azure AD integration for user authentication
- **Automatic Downloads**: Direct download of generated videos after completion
- **File Management**: Automatic cleanup of temporary files and cloud storage
## System Requirements
### Prerequisites
- Google Cloud Project with Veo 3.0 API access
- Google Cloud Storage bucket for temporary file storage
- Service account with appropriate permissions
- Python 3.8+ and Node.js 16+ for development
- Microsoft Azure AD app registration (for authentication)
### Supported Formats
- **Input Images**: JPEG, PNG, GIF, BMP, TIFF, WebP, ICO (all uploads are auto-converted to JPEG)
- **Output Videos**: MP4 format
- **Image Size**: Minimum 720x720 pixels, maximum 10MB
- **Video Length**: 1-60 seconds
- **Aspect Ratios**: 16:9 (landscape) or 9:16 (portrait)
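
The image limits above could be checked on the client before upload. The following is a minimal sketch, not the application's own validator: the function name `validate_image`, the extension set, and the reading of "10MB" as 10 * 1024 * 1024 bytes are all assumptions.

```python
from pathlib import Path

# Extensions for the formats listed above (assumed mapping, not taken from the app).
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff", ".webp", ".ico"}
MIN_SIDE_PX = 720
MAX_SIZE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" is read as 10 MiB


def validate_image(filename, width, height, size_bytes):
    """Return a list of problems; an empty list means the image passes."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if width < MIN_SIDE_PX or height < MIN_SIDE_PX:
        problems.append("image smaller than 720x720")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file larger than 10MB")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem with an upload at once.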
## Getting Started
### 1. Environment Setup
Create a `.env` file in the project root with the following configuration:
```env
# Google Cloud Configuration
PROJECT_ID=your-google-cloud-project-id
REGION=us-central1
MODEL_ID=veo-3.0-generate-preview
OUTPUT_GCS_BUCKET_NAME=your-storage-bucket-name
SERVICE_ACCOUNT_KEY_PATH=./service-account.json
# Flask Configuration
SECRET_KEY=your-secret-key-here
FLASK_ENV=development
FLASK_DEBUG=True
PORT=7394
# Frontend Configuration
FRONTEND_URL=http://localhost:3000
# Webhook Configuration (optional)
WEBHOOK_URL=your-webhook-url
WEBHOOK_ENABLED=false
WEBHOOK_TIMEOUT=10
```
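To sanity-check a `.env` file before starting the backend, a small standard-library parser can be enough. This is a sketch under assumptions: the application presumably loads the file with its own mechanism, and the `REQUIRED` tuple is a guess based on the variables for which the Configuration Reference lists no default. Inline comments and quoted values are not handled.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comment lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Assumed to be mandatory; adjust to match the real backend.
REQUIRED = ("PROJECT_ID", "OUTPUT_GCS_BUCKET_NAME", "SECRET_KEY")


def missing_keys(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED if not values.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()))` lists the required variables that still need a value.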
### 2. Google Cloud Setup
1. **Enable APIs**:
   - Vertex AI API
   - Cloud Storage API
   - GenerativeAI API
2. **Create Service Account**:
   ```bash
   gcloud iam service-accounts create veo-video-generator \
     --display-name="Veo Video Generator"
   ```
3. **Grant Permissions**:
   ```bash
   gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
     --member="serviceAccount:veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
     --role="roles/aiplatform.user"
   gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
     --member="serviceAccount:veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
     --role="roles/storage.admin"
   ```
4. **Download Service Account Key**:
   ```bash
   gcloud iam service-accounts keys create service-account.json \
     --iam-account=veo-video-generator@YOUR_PROJECT_ID.iam.gserviceaccount.com
   ```
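
After downloading the key, it can help to confirm that the file is a service account key for the intended project. The sketch below relies only on fields that service account key files contain (`type`, `project_id`, `private_key`, `client_email`); the helper names `load_key` and `check_key` are illustrative and not part of the application.

```python
import json

EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_key(path):
    """Read a service account key file into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def check_key(key, expected_project_id):
    """Return a list of problems found in a parsed key file."""
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in key]
    if key.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    if key.get("project_id") != expected_project_id:
        problems.append("project_id does not match PROJECT_ID")
    return problems
```

A call such as `check_key(load_key("service-account.json"), "your-google-cloud-project-id")` returns an empty list when the file looks consistent. It does not verify that the key is still valid or that the roles were granted.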
### 3. Installation
#### Backend Setup
```bash
cd backend
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
#### Frontend Setup
```bash
cd frontend
npm install
```
### 4. Running the Application
#### Development Mode
Use the provided development script:
```bash
./run-dev.sh
```
This starts both the frontend (port 3000) and the backend (port 7394) servers.
#### Manual Startup
**Backend**:
```bash
cd backend
source venv/bin/activate
python app.py
```
**Frontend**:
```bash
cd frontend
npm run dev
```
#### Production Mode
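```bash
# Backend (run from the project root)
cd backend
source venv/bin/activate
python app.py
# Frontend (build and serve; run from the project root in a second terminal)
cd frontend
npm run build
# Serve the dist/ folder with your web server
```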
## Using the Application
### 1. Authentication
- In development mode, authentication is bypassed with a mock user
- In production, users authenticate via Microsoft Azure AD
- Authenticated users' email addresses are tracked for usage analytics
### 2. Creating Videos
#### Text-to-Video Generation
1. Enter your video prompt in the text area
2. Adjust settings:
   - **Video Length**: 1-60 seconds (default: 8 seconds)
   - **Aspect Ratio**: 16:9 or 9:16 (default: 16:9)
   - **Person Generation**: Allow/don't allow people in videos
3. Click "Generate Video"
4. Monitor progress in real-time
5. Download when complete
#### Image-to-Video Generation
1. Upload a reference image (JPEG, PNG, etc.)
2. The system automatically:
   - Validates image format and size
   - Converts to JPEG format
   - Detects aspect ratio and adjusts settings
   - Crops if necessary to fit aspect ratio
3. Enter your video prompt
4. Adjust additional settings as needed
5. Generate and download
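
The aspect ratio detection and cropping in step 2 can be illustrated with plain arithmetic. This is a sketch of one reasonable approach (pick the closer orientation, then take the largest centered crop), not necessarily what the backend does; both function names are hypothetical.

```python
def choose_aspect_ratio(width, height):
    """Pick 16:9 for landscape or square images, 9:16 for portrait ones."""
    return "16:9" if width >= height else "9:16"


def center_crop_box(width, height, aspect_ratio):
    """Return (left, top, right, bottom) of the largest centered crop with the given ratio."""
    ratio_w, ratio_h = (int(part) for part in aspect_ratio.split(":"))
    # Compare width/height with ratio_w/ratio_h using integers to avoid float error.
    if width * ratio_h > height * ratio_w:
        # Image is too wide: keep the full height and trim the sides.
        new_width = height * ratio_w // ratio_h
        left = (width - new_width) // 2
        return (left, 0, left + new_width, height)
    # Image is too tall or already matches: keep the full width and trim top and bottom.
    new_height = width * ratio_h // ratio_w
    top = (height - new_height) // 2
    return (0, top, width, top + new_height)
```

The returned tuple uses the same `(left, upper, right, lower)` order that Pillow's `Image.crop` expects, so it could be passed to that method directly if Pillow is used for image handling.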
### 3. Progress Monitoring
The application provides real-time feedback during generation:
- **Starting** (0%): Initializing request
- **Uploading Image** (5%): Processing reference image
- **Generating** (10%): Submitting to Veo 3.0 API
- **Processing** (20-80%): Video generation in progress
- **Downloading** (90%): Retrieving completed video
- **Completed** (100%): Ready for download
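
The stage percentages above can be modeled as a lookup table, with the processing stage interpolated between 20% and 80%. This is only an illustration: the stage keys and the `fraction` parameter are assumptions, and the real backend may compute progress differently.

```python
# Stage name -> (lowest, highest) percentage, taken from the list above.
STAGES = {
    "starting": (0, 0),
    "uploading_image": (5, 5),
    "generating": (10, 10),
    "processing": (20, 80),
    "downloading": (90, 90),
    "completed": (100, 100),
}


def progress_for(stage, fraction=0.0):
    """Map a stage, and how far through it the job is (0.0 to 1.0), to a percentage."""
    low, high = STAGES[stage]
    fraction = min(max(fraction, 0.0), 1.0)  # clamp out-of-range input
    return round(low + (high - low) * fraction)
```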
### 4. Error Handling
Common error scenarios and solutions:
- **Content Filtered**: Prompt violates safety policies - try rewording
- **Image Too Large**: Resize image to under 10MB
- **Invalid Format**: Use supported image formats only
- **Authentication Failed**: Check service account configuration
- **Quota Exceeded**: Check Google Cloud quotas and billing
## API Reference
### Endpoints
#### `POST /api/generate`
Start video generation process.
**Request Body**:
```json
{
"prompt": "A cat playing in a garden",
"video_length_sec": 8,
"aspect_ratio": "16:9",
"person_generation": "dont_allow",
"user_email": "user@example.com"
}
```
**With Image** (multipart/form-data):
- `image`: Image file
- Other parameters as form fields
**Response**:
```json
{
"job_id": "uuid-string",
"status": "started"
}
```
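A JSON request to this endpoint could be built with Python's standard library as sketched below. The base URL assumes the default backend port from the setup section, and the helper names `build_generate_request` and `start_job` are illustrative. Authentication headers, which a production deployment would likely require, are not shown.

```python
import json
import urllib.request

API_BASE_URL = "http://localhost:7394"  # assumption: local backend on the default port


def build_generate_request(prompt, video_length_sec=8, aspect_ratio="16:9",
                           person_generation="dont_allow", user_email=None):
    """Build (but do not send) a POST request for /api/generate."""
    payload = {
        "prompt": prompt,
        "video_length_sec": video_length_sec,
        "aspect_ratio": aspect_ratio,
        "person_generation": person_generation,
    }
    if user_email:
        payload["user_email"] = user_email
    return urllib.request.Request(
        f"{API_BASE_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_job(request):
    """Send the request and return the job_id from the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["job_id"]
```

Separating request construction from sending makes the payload easy to inspect or log before any network call is made.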
#### `GET /api/status/{job_id}`
Check generation status.
**Response**:
```json
{
"status": "processing",
"progress": 45,
"message": "Video generation in progress...",
"video_path": null,
"error": null
}
```
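A client would typically poll this endpoint until the job finishes. The sketch below takes the status lookup as a function argument so that it works with any HTTP library; the terminal status value `"completed"` and the helper name `wait_for_job` are assumptions, not confirmed behavior of the API.

```python
import time


def wait_for_job(fetch_status, job_id, poll_interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job completes, reports an error, or the timeout elapses.

    fetch_status(job_id) must return the parsed JSON status as a dict.
    """
    waited = 0.0
    while True:
        status = fetch_status(job_id)
        if status.get("error"):
            raise RuntimeError(status["error"])
        if status.get("status") == "completed":  # assumed terminal status value
            return status
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(poll_interval)
        waited += poll_interval
```

Here `fetch_status` could be a small wrapper that issues `GET /api/status/{job_id}` and parses the JSON body. Passing `sleep` as a parameter keeps the loop easy to test without real delays.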
#### `GET /api/download/{job_id}`
Download completed video.
**Response**: MP4 video file or error message.
#### `DELETE /api/cleanup/{job_id}`
Manually clean up job files.
**Response**:
```json
{
"message": "Files cleaned up successfully"
}
```
## Configuration Reference
### Backend Configuration (`config.py`)
| Variable | Description | Default |
|----------|-------------|---------|
| `PROJECT_ID` | Google Cloud project ID | - |
| `REGION` | Google Cloud region | `us-central1` |
| `MODEL_ID` | Veo model identifier | `veo-3.0-generate-preview` |
| `OUTPUT_GCS_BUCKET_NAME` | GCS bucket for storage | - |
| `SERVICE_ACCOUNT_KEY_PATH` | Path to service account JSON | `../service-account.json` |
| `SECRET_KEY` | Flask secret key | - |
| `FLASK_ENV` | Environment mode | `production` |
| `PORT` | Backend server port | `7394` |
| `FRONTEND_URL` | Frontend URL for CORS | - |
| `MAX_IMAGE_SIZE` | Maximum image size (set in bytes) | 10 MB |
| `WEBHOOK_URL` | Usage tracking webhook URL | - |
| `WEBHOOK_ENABLED` | Enable usage tracking | `true` |
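
The defaults in the table could be applied when reading the environment roughly as follows. This is a sketch of the pattern only; the actual `config.py` is not shown in this document, and the helper names here are hypothetical.

```python
import os

# Defaults copied from the table above; variables without a default are omitted.
DEFAULTS = {
    "REGION": "us-central1",
    "MODEL_ID": "veo-3.0-generate-preview",
    "SERVICE_ACCOUNT_KEY_PATH": "../service-account.json",
    "FLASK_ENV": "production",
    "PORT": "7394",
    "WEBHOOK_ENABLED": "true",
}


def get_setting(name, env=None):
    """Return the environment value, the table default, or None."""
    env = os.environ if env is None else env
    return env.get(name, DEFAULTS.get(name))


def get_bool(name, env=None):
    """Interpret common truthy strings; anything else counts as False."""
    return str(get_setting(name, env)).strip().lower() in {"1", "true", "yes", "on"}


def get_port(env=None):
    """Return the backend port as an integer."""
    return int(get_setting("PORT", env))
```

Environment variables are always strings, so boolean and integer settings such as `WEBHOOK_ENABLED` and `PORT` need explicit conversion like this.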
### Frontend Configuration
Set environment variables in `.env.local`:
```env
VITE_API_BASE_URL=http://localhost:7394
VITE_DEV_MODE=true
```
## Troubleshooting
### Common Issues
1. **"Service account not found"**
   - Verify `service-account.json` exists and is valid
   - Check file permissions
   - Ensure service account has required roles
2. **"Quota exceeded"**
   - Check Google Cloud quotas for Vertex AI
   - Verify billing is enabled
   - Request quota increases if needed
3. **"Content filtered"**
   - Modify prompt to avoid restricted content
   - Review Google's content policies
   - Try different wording or concepts
4. **"Image validation failed"**
   - Check image format (use common formats)
   - Ensure image is at least 720x720 pixels
   - Verify file size is under 10MB
5. **CORS errors**
   - Check `FRONTEND_URL` configuration
   - Verify both frontend and backend URLs
   - Ensure proper development/production settings
### Debug Mode
Enable debug logging by setting:
```env
FLASK_DEBUG=True
```
Debug information includes:
- Request/response details
- Image processing steps
- API call parameters
- File paths and operations
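
How the backend connects `FLASK_DEBUG` to its log output is not shown here. One common pattern, given as a hedged sketch with a hypothetical `configure_logging` helper, is to derive the logging level from the flag with Python's standard `logging` module:

```python
import logging


def configure_logging(flask_debug):
    """Set the root logger to DEBUG when the flag is truthy, otherwise INFO."""
    debug = str(flask_debug).strip().lower() in {"1", "true", "yes"}
    level = logging.DEBUG if debug else logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace existing handlers (requires Python 3.8+)
    )
    return level
```

It could be called once at startup, for example with `configure_logging(os.environ.get("FLASK_DEBUG", "False"))`.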
### Log Files
Check application logs for detailed error information:
- Backend logs: Console output from Flask application
- Frontend logs: Browser developer console
- Google Cloud logs: Cloud Logging for API errors
## Security Considerations
- Keep service account keys secure and never commit them to version control
- Use environment variables for sensitive configuration
- Implement proper authentication in production
- Rotate service account keys regularly
- Monitor usage and costs through the Google Cloud Console
- Use HTTPS for webhook URLs in production
## Performance Tips
- Keep videos as short as the use case allows; shorter videos generate faster
- Optimize image sizes before upload
- Monitor Google Cloud quotas and billing
- Implement caching for repeated requests
- Use CDN for frontend assets in production
- Clean up temporary files regularly
## Support and Resources
- [Google Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs)
- [Veo 3.0 Model Documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/veo)
- [Google GenAI SDK Reference](https://googleapis.dev/python/google-genai/latest/)
- [Flask Documentation](https://flask.palletsprojects.com/)
- [React Documentation](https://reactjs.org/docs/)
- [Material-UI Documentation](https://mui.com/)

For technical issues, check the debug logs and refer to Google Cloud support resources.