diff --git a/README.md b/README.md
index 7b3b8dc..60261f1 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,18 @@
 # Video Query Tool

-A full-stack web application that processes videos using Google's Gemini AI model, allowing users to upload multiple videos simultaneously and receive AI-generated content based on customizable prompts. Features parallel processing, automatic video splitting, Azure AD B2C authentication, chunked file uploads, PDF generation with Mermaid diagrams, and comprehensive usage tracking.
+A full-stack web application that processes videos using Google's Gemini AI model with dual processing modes (Single Video and Batch). Features intelligent two-stage synthesis, automatic video splitting, Azure AD B2C authentication, chunked file uploads up to 5GB, PDF generation with merged Mermaid diagrams, and comprehensive usage tracking.

 ## Features

 ### Core Functionality
-- **Video Processing**: Upload and analyze videos using Google Gemini 2.0 Flash Exp AI model
-- **Multiple Processing Modes**:
+- **Dual Processing Modes**:
+  - **Single Video Mode**: Process videos individually with per-video control
+  - **Batch Mode**: Combine multiple related videos (up to 10) for unified analysis
+- **Intelligent AI Synthesis**: Two-stage processing ensures seamless results
+  - Stage 1: Each video/chunk → concise summary
+  - Stage 2: All summaries → unified cohesive result
+- **Video Processing**: Upload and analyze using Google Gemini 2.0 Flash Exp AI model
+- **Prompt Templates**:
   - Meeting Summary
   - Process/Tool Documentation
   - Process Documentation with Mermaid Charts
@@ -14,19 +20,22 @@ A full-stack web application that processes videos using Google's Gemini AI mode
 - **Large File Support**: Chunked upload system supporting files up to 5GB per file
 - **PDF Generation**: Convert results to PDF with embedded Mermaid diagrams
 - **Authentication**: Azure AD B2C integration (optional, controlled via .env)
-- **Parallel Processing**: Process up to 2 videos simultaneously
-- **Multiple File Upload**: Upload and queue multiple videos at once
+- **Parallel Processing**: Process up to 2 videos simultaneously (single mode)
 - **Long Video Support**: Automatic splitting and parallel chunk processing for videos > 54 minutes

 ### Technical Features
+- **Explicit User Control**: No auto-processing - all videos require explicit "Process" button click
+- **Batch Video Management**: Reorder, arrange, and remove videos before processing
+- **Smart Diagram Merging**: Multiple Mermaid diagrams intelligently combined into one
+- **Persistent Mode Selection**: Processing mode and batch queue persist across page refreshes
 - **Multiple File Queue**: Upload multiple videos, manage queue (Stop, Retry, Remove)
 - **Drag & Drop Upload**: Modern file upload interface with progress tracking
-- **Real-time Processing**: Live status updates with parallel processing indicators
+- **Real-time Status**: Live status updates (uploading → uploaded → processing → completed)
 - **Queue Management**: Stop, retry, or remove videos from processing queue anytime
 - **Automatic Video Splitting**: Videos > 54 minutes automatically split into 54-min chunks
 - **Rate Limiting**: Built-in API rate limiting (2-second delay) to prevent quota errors
 - **Error Handling**: Comprehensive error handling with retry capability
-- **Processing Time Display**: Shows processing duration for each completed video
+- **Processing Time Display**: Shows processing duration for each completed video/batch
 - **Usage Analytics**: Automated tracking via webhook integration
 - **Production Ready**: Systemd service configuration and deployment scripts

@@ -36,7 +45,8 @@ A full-stack web application that processes videos using Google's Gemini AI mode
 - **Single Chunk Limit**: Individual chunks must be under 55 minutes (handled automatically)
 - **File Size**: Application supports uploads up to 5GB per file
 - **Supported Formats**: MP4, AVI, MOV, WMV, MKV, WEBM
-- **Parallel Processing**: Max 2 videos simultaneously (rate limit protection)
+- **Parallel Processing**: Max 2 videos simultaneously in single mode (rate limit protection)
+- **Batch Size**: Maximum 10 videos per batch processing session
 - **API Rate Limits**: Gemini free tier: 5 RPM (built-in 2s delay between calls)

 ## Project Structure
@@ -257,9 +267,13 @@ video_query/
 ## API Reference

 ### Video Processing Endpoints
-- **POST /api/process**: Main video processing endpoint
+- **POST /api/process**: Single video processing endpoint
   - Accepts JSON: `file_path`, `filename`, `prompt` (for chunked uploads)
   - Returns: Processing result with content, processing time, chunks info
+- **POST /api/process-batch**: Batch video processing endpoint
+  - Accepts JSON: `videos` (array of {file_path, filename, order}), `prompt`, `batch_id`
+  - Returns: Unified result for all videos, total chunks processed
+  - Maximum 10 videos per batch

 ### Chunked Upload Endpoints
 - **POST /api/init-upload**: Initialize chunked upload session
@@ -304,14 +318,40 @@ video_query/
 3. Open: http://localhost:3000

 ### Processing Videos
-1. **Upload**: Drag & drop multiple videos or click to select
-2. **Queue**: Videos appear in "Processing Queue" section
-3. **Select Prompt**: Choose processing mode or write custom prompt
-4. **Process**: Click "Process N Videos" button
-5. **Monitor**: Watch real-time progress (2 videos process in parallel)
-6. **Manage**: Use Stop (⏸️), Retry (🔄), or Remove (🗑️) buttons
-7. **View Results**: Check "Processed Videos" section for completed results
-8. **Download**: Click "Download PDF" or "Copy Formatted" for any completed video
+
+The application supports two processing modes, selected via a toggle at the top:
+
+#### Single Video Mode (Default)
+Process videos individually with per-video control:
+1. **Select Mode**: Click "Single Video Mode" button at the top
+2. **Upload**: Drag & drop videos or click to select files
+3. **Choose Prompt**: Select a prompt template or write custom prompt
+4. **Process Each Video**: Click individual "Process Video" button for each uploaded video
+   - Videos show status: uploading → uploaded → processing → completed
+   - Up to 2 videos process in parallel automatically
+5. **Monitor Progress**: Watch real-time status updates and processing indicators
+6. **Manage Queue**: Use Stop (⏸️), Retry (🔄), or Remove (🗑️) per video
+7. **View Results**: Completed videos appear in "Processed Videos" section
+8. **Download**: Click "Download PDF" or "Copy Formatted" for any result
+
+#### Batch Mode
+Process multiple related videos as one unified analysis:
+1. **Select Mode**: Click "Batch Mode" button at the top
+2. **Upload Videos**: Add multiple related videos (max 10 per batch)
+3. **Arrange Videos**: Use Up/Down arrows to reorder videos in logical sequence
+4. **Remove Unwanted**: Click Remove button to exclude videos from batch
+5. **Choose Prompt**: Select or customize the analysis prompt
+6. **Process Batch**: Click single "Process Batch" button to analyze all videos together
+   - Backend automatically handles video splitting and chunking
+   - Two-stage synthesis creates unified result across all videos
+   - Multiple Mermaid diagrams merged into one comprehensive diagram
+7. **View Results**: Single unified result appears for entire batch
+8. **Download**: Generate PDF with combined analysis
+
+**Key Differences**:
+- **Single Mode**: Each video = separate result, manual per-video processing
+- **Batch Mode**: All videos = one unified result, single batch processing
+- **Explicit Control**: No auto-processing - all require button clicks

 ### Processing Long Videos
 - Videos > 54 minutes automatically split into chunks
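
Note on the splitting rule referenced in the last hunk: the README states that videos longer than 54 minutes are split into 54-minute chunks. The sketch below shows one way such chunk boundaries could be computed. It is an illustration only, not the project's implementation. The function name `plan_chunks`, the constant `CHUNK_MINUTES`, and the choice of equal-length chunks with a shorter final chunk are assumptions; the real backend may cut at different points.

```python
from typing import List, Tuple

# Chunk length taken from the README text ("54-min chunks").
CHUNK_MINUTES = 54


def plan_chunks(duration_seconds: float,
                chunk_minutes: int = CHUNK_MINUTES) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds for each chunk of a video.

    Hypothetical helper: assumes fixed-length chunks, with the last chunk
    holding whatever remains.
    """
    duration = float(duration_seconds)
    if duration <= 0:
        raise ValueError("duration_seconds must be positive")

    chunk_seconds = float(chunk_minutes * 60)

    # Videos at or under the threshold are handled as a single chunk.
    if duration <= chunk_seconds:
        return [(0.0, duration)]

    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_seconds, duration)
        chunks.append((start, end))
        start = end
    return chunks
```

Under these assumptions, a 120-minute video (`plan_chunks(120 * 60)`) yields three chunks: two of 54 minutes and a final one of 12 minutes. A 30-minute video comes back as a single chunk.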