# Video Query Tool

A full-stack web application that processes videos using Google's Gemini AI model with dual processing modes (Single Video and Batch). Features intelligent two-stage synthesis, automatic video splitting, Azure AD B2C authentication, chunked file uploads up to 5GB, PDF generation with merged Mermaid diagrams, and comprehensive usage tracking.

## Features

### Core Functionality
- **Dual Processing Modes**:
  - **Single Video Mode**: Process videos individually with per-video control
  - **Batch Mode**: Combine multiple related videos (up to 10) for unified analysis
- **Intelligent AI Synthesis**: Two-stage processing ensures seamless results
  - Stage 1: Each video/chunk → concise summary
  - Stage 2: All summaries → unified cohesive result
- **Video Processing**: Upload and analyze using Google Gemini 2.0 Flash Exp AI model
- **Prompt Templates**:
  - Meeting Summary
  - Process/Tool Documentation
  - Process Documentation with Mermaid Charts
  - Custom Prompts
- **Large File Support**: Chunked upload system supporting files up to 5GB per file
- **PDF Generation**: Convert results to PDF with embedded Mermaid diagrams
- **Authentication**: Azure AD B2C integration (optional, controlled via .env)
- **Parallel Processing**: Process up to 2 videos simultaneously (single mode)
- **Long Video Support**: Automatic splitting and parallel chunk processing for videos > 54 minutes

### Technical Features
- **Explicit User Control**: No auto-processing - all videos require explicit "Process" button click
- **Batch Video Management**: Reorder, arrange, and remove videos before processing
- **Smart Diagram Merging**: Multiple Mermaid diagrams intelligently combined into one
- **Persistent Mode Selection**: Processing mode and batch queue persist across page refreshes
- **Multiple File Queue**: Upload multiple videos, manage queue (Stop, Retry, Remove)
- **Drag & Drop Upload**: Modern file upload interface with progress tracking
- **Real-time Status**: Live status updates (uploading → uploaded → processing → completed)
- **Queue Management**: Stop, retry, or remove videos from processing queue anytime
- **Automatic Video Splitting**: Videos > 54 minutes automatically split into 54-min chunks
- **Rate Limiting**: Built-in API rate limiting (2-second delay) to prevent quota errors
- **Error Handling**: Comprehensive error handling with retry capability
- **Processing Time Display**: Shows processing duration for each completed video/batch
- **Usage Analytics**: Automated tracking via webhook integration
- **Production Ready**: Systemd service configuration and deployment scripts

## Limitations

- **Video Length**: No limit - videos automatically split into 54-minute chunks
- **Single Chunk Limit**: Individual chunks must be under 55 minutes (handled automatically)
- **File Size**: Application supports uploads up to 5GB per file
- **Supported Formats**: MP4, AVI, MOV, WMV, MKV, WEBM
- **Parallel Processing**: Max 2 videos simultaneously in single mode (rate limit protection)
- **Batch Size**: Maximum 10 videos per batch processing session
- **API Rate Limits**: Gemini free tier: 5 RPM (built-in 2s delay between calls)

## Project Structure

```
video_query/
├── backend/                          # Flask/Hypercorn API server
│   ├── app.py                        # Main Flask application with PDF generation
│   ├── video_processor.py            # Gemini API integration, parallel processing, rate limiting
│   ├── video_splitter.py             # Video splitting for long videos (54-min chunks)
│   ├── auth.py                       # Azure AD B2C authentication handlers
│   ├── chunked_upload.py             # Chunked file upload Blueprint
│   ├── run.py                        # Hypercorn production server
│   ├── requirements.txt              # Python dependencies
│   ├── .env                          # Environment variables (GOOGLE_API_KEY)
│   └── test_*.py                     # API testing utilities
├── frontend/                         # React SPA
│   ├── src/
│   │   ├── components/               # React components
│   │   │   ├── VideoUpload.js        # Multi-file drag & drop upload
│   │   │   ├── PromptSelector.js     # Mode selection and prompt editing
│   │   │   ├── ResultDisplay.js      # Results with PDF generation
│   │   │   ├── AuthenticatedContent.js  # Queue management, processed list
│   │   │   └── Login.js              # Authentication interface
│   │   ├── auth/                     # Authentication utilities
│   │   │   ├── authConfig.js         # Azure AD B2C configuration
│   │   │   ├── AuthProvider.js       # MSAL React provider
│   │   │   └── authApiClient.js      # Authenticated API client
│   │   └── utils/
│   │       ├── chunkedUploader.js    # Large file upload handler
│   │       ├── configLoader.js       # Dynamic config loading
│   │       └── pathUtils.js          # Path utilities
│   ├── public/
│   │   ├── config.js                 # Production config (committed)
│   │   ├── config.local.js           # Local dev config (not committed)
│   │   └── index.html                # Loads both configs
│   ├── package.json                  # Node.js dependencies
│   ├── .env                          # Frontend environment variables
│   └── build/                        # Production build output
├── DEPLOYMENT.md                     # Production deployment instructions
├── LOG_EXTRACTION_README.md          # Usage analytics documentation
├── CLAUDE.md                         # Development guidelines and build commands
├── restart.sh                        # Development restart script
├── quick_extract.sh                  # Log extraction utility
└── extract_user_logs*.sh             # Advanced log processing
```

## Dependencies

### Backend Dependencies
- **Flask 3.1.0**: Web framework
- **google-genai 1.45.0**: Gemini AI SDK (updated API)
- **Hypercorn 0.17.3**: ASGI production server
- **python-jose**: JWT token validation for Azure AD
- **flask-cors 5.0.1**: Cross-origin resource sharing
- **pdfkit 1.0.0**: PDF generation from HTML
- **cairosvg 2.8.0**: SVG to PNG conversion for diagrams
- **Pillow 11.2.1**: Image processing
- **python-dotenv 1.1.0**: Environment variable management
- **ffmpeg-python**: Video splitting functionality

### Frontend Dependencies
- **React 18.2.0**: UI framework
- **@azure/msal-react 3.0.12**: Microsoft Authentication Library
- **axios 1.6.0**: HTTP client with abort signal support
- **bootstrap 5.3.2**: UI components and styling
- **mermaid 11.6.0**: Diagram generation
- **react-dropzone 14.2.3**: Multi-file upload interface
- **showdown 2.1.0**: Markdown to HTML conversion
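
The automatic splitting described under Features and Limitations (54-minute chunks) comes down to simple arithmetic on the video duration. The sketch below only illustrates that arithmetic; `plan_chunks` and `CHUNK_SECONDS` are hypothetical names and are not taken from `video_splitter.py`, which may pick boundaries differently.

```python
from typing import List, Tuple

# 54-minute chunk length, as stated in this README.
CHUNK_SECONDS = 54 * 60


def plan_chunks(duration_seconds: float,
                chunk_seconds: int = CHUNK_SECONDS) -> List[Tuple[float, float]]:
    """Return (start, length) pairs in seconds that cover the whole video.

    Hypothetical helper for illustration; not the project's actual splitter.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    chunks = []
    start = 0
    while start < duration_seconds:
        # The last chunk is shorter when the duration is not a multiple of 54 minutes.
        length = min(chunk_seconds, duration_seconds - start)
        chunks.append((start, length))
        start += length
    return chunks


if __name__ == "__main__":
    # A 2-hour video: two 54-minute chunks plus a 12-minute remainder.
    for start, length in plan_chunks(2 * 60 * 60):
        print(f"start={start}s length={length}s")
```

Under these assumptions, a video of 54 minutes or less yields a single chunk, and every chunk stays under the 55-minute single-chunk limit listed above.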

## Setup Instructions

### Prerequisites
- Python 3.8+
- Node.js 16+
- Google Cloud API key with Gemini access
- Azure AD B2C tenant (optional, for authentication)
- wkhtmltopdf (for PDF generation)
- ffmpeg/ffprobe (for video splitting)

### Backend Setup

1. **Create and activate virtual environment**:
   ```bash
   cd backend
   python3 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

3. **Set up environment variables** (create `backend/.env`):
   ```bash
   GOOGLE_API_KEY=your_gemini_api_key_here
   ```

4. **Install system dependencies**:
   ```bash
   # Ubuntu/Debian:
   sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev ffmpeg

   # macOS:
   brew install cairo wkhtmltopdf ffmpeg
   ```

5. **Start development server**:
   ```bash
   python3 run.py
   # Server runs on http://0.0.0.0:5010
   ```

### Frontend Setup

1. **Install Node.js dependencies**:
   ```bash
   cd frontend
   npm install
   ```

2. **Configure authentication** (optional):
   - Edit `frontend/.env`:
     ```
     REACT_APP_DISABLE_AUTH=true  # Disable auth for local dev
     ```
   - For production, update `src/auth/authConfig.js` with Azure AD B2C details

3. **Configure backend URL for local development**:
   - File `frontend/public/config.local.js` already configured for localhost:5010
   - This file is not committed (in .gitignore)

4. **Start development server**:
   ```bash
   npm start
   # Server runs on http://localhost:3000
   ```

## Production Deployment

### System Requirements
- Ubuntu/CentOS server
- Apache/Nginx web server
- Python 3.8+ with virtual environment
- wkhtmltopdf system package
- ffmpeg/ffprobe for video processing
- Node.js for building frontend

### Backend Deployment

1. **Install system packages**:
   ```bash
   sudo apt-get update
   sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ffmpeg
   ```

2. **Set up virtual environment and install dependencies**:
   ```bash
   cd backend
   python3 -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt
   ```

3. **Create production .env file**:
   ```bash
   echo "GOOGLE_API_KEY=your_production_api_key" > .env
   ```

4. **Create systemd service** (see `backend/video-query.service`):
   ```bash
   sudo cp backend/video-query.service /etc/systemd/system/
   sudo systemctl daemon-reload
   sudo systemctl enable video-query
   sudo systemctl start video-query
   ```

### Frontend Deployment

1. **Update production config** (`frontend/public/config.js`):
   ```javascript
   window.__APP_CONFIG__ = {
     "basePath": "/video-query",
     "domain": "https://your-domain.com",
     "api": {
       "videoProcessingEndpoint": "https://your-domain.com/video_query_back/api/process",
       "chunkedUploadEndpoint": "https://your-domain.com/video_query_back"
     }
   };
   ```

2. **Build for production**:
   ```bash
   cd frontend
   npm run build
   ```

3. **Deploy to web server**:
   ```bash
   sudo cp -r build/* /var/www/html/video-query/
   ```

4. **Configure web server** (Apache example):
   ```apache
   DocumentRoot /var/www/html

   # Frontend
   Alias /video-query /var/www/html/video-query

   # Backend proxy
   ProxyPass /video_query_back http://localhost:5010
   ProxyPassReverse /video_query_back http://localhost:5010
   ```

## API Reference

### Video Processing Endpoints
- **POST /api/process**: Single video processing endpoint
  - Accepts JSON: `file_path`, `filename`, `prompt` (for chunked uploads)
  - Returns: Processing result with content, processing time, chunks info
- **POST /api/process-batch**: Batch video processing endpoint
  - Accepts JSON: `videos` (array of {file_path, filename, order}), `prompt`, `batch_id`
  - Returns: Unified result for all videos, total chunks processed
  - Maximum 10 videos per batch

### Chunked Upload Endpoints
- **POST /api/init-upload**: Initialize chunked upload session
- **POST /api/upload-chunk/**: Upload file chunk
- **POST /api/complete-upload/**: Mark upload complete
- **POST /api/cancel-upload/**: Cancel upload

### PDF Generation Endpoints
- **POST /api/generate-pdf**: Generate PDF from HTML with Mermaid diagrams
  - JSON data: `html`, `textDiagrams`, `diagramPngs`, `videoFileName`

### Authentication Endpoints (if enabled)
- **GET /api/auth-test**: Verify authentication status

## Configuration Files

### Backend Configuration
- **backend/.env**: Environment variables
  ```
  GOOGLE_API_KEY=your_api_key
  ```

### Frontend Configuration
- **frontend/.env**: React environment variables
  ```
  REACT_APP_DISABLE_AUTH=true  # Optional: disable auth for local dev
  ```
- **frontend/public/config.js**: Production configuration (committed to git)
- **frontend/public/config.local.js**: Local development override (not committed)

### Key Configuration Details
- **Parallel Processing**: Max 2 concurrent videos (App.js:245)
- **Rate Limiting**: 2-second delay between API calls (video_processor.py:224)
- **File Size Threshold**: 10MB for inline vs upload API (video_processor.py:167)
- **Video Chunk Duration**: 54 minutes (video_splitter.py)

## Usage

### Local Development
1. Start backend: `cd backend && source venv/bin/activate && python3 run.py`
2. Start frontend: `cd frontend && npm start`
3. Open: http://localhost:3000

### Processing Videos

The application supports two processing modes, selected via a toggle at the top:

#### Single Video Mode (Default)
Process videos individually with per-video control:

1. **Select Mode**: Click "Single Video Mode" button at the top
2. **Upload**: Drag & drop videos or click to select files
3. **Choose Prompt**: Select a prompt template or write custom prompt
4. **Process Each Video**: Click individual "Process Video" button for each uploaded video
   - Videos show status: uploading → uploaded → processing → completed
   - Up to 2 videos process in parallel automatically
5. **Monitor Progress**: Watch real-time status updates and processing indicators
6. **Manage Queue**: Use Stop (⏸️), Retry (🔄), or Remove (🗑️) per video
7. **View Results**: Completed videos appear in "Processed Videos" section
8. **Download**: Click "Download PDF" or "Copy Formatted" for any result

#### Batch Mode
Process multiple related videos as one unified analysis:

1. **Select Mode**: Click "Batch Mode" button at the top
2. **Upload Videos**: Add multiple related videos (max 10 per batch)
3. **Arrange Videos**: Use Up/Down arrows to reorder videos in logical sequence
4. **Remove Unwanted**: Click Remove button to exclude videos from batch
5. **Choose Prompt**: Select or customize the analysis prompt
6. **Process Batch**: Click single "Process Batch" button to analyze all videos together
   - Backend automatically handles video splitting and chunking
   - Two-stage synthesis creates unified result across all videos
   - Multiple Mermaid diagrams merged into one comprehensive diagram
7. **View Results**: Single unified result appears for entire batch
8. **Download**: Generate PDF with combined analysis

**Key Differences**:
- **Single Mode**: Each video = separate result, manual per-video processing
- **Batch Mode**: All videos = one unified result, single batch processing
- **Explicit Control**: No auto-processing - all require button clicks

### Processing Long Videos
- Videos > 54 minutes automatically split into chunks
- Each chunk processed in parallel (backend handles this)
- Results intelligently combined
- Processing time displayed for transparency

## Development Utilities

- **restart.sh**: Quick development environment restart
- **backend/test_*.py**: API testing and validation scripts
- **backend/run.py**: Production server with optimized settings for large uploads
- **extract_user_logs*.sh**: Usage analytics extraction

## Security Features

- Azure AD B2C integration with JWT validation (optional)
- CORS protection with specific origin allowlisting
- Secure file upload validation
- Temporary file cleanup
- Token expiration handling
- Rate limiting to prevent API abuse
- Abort signal support for cancellation

## Troubleshooting

### Backend Issues
- **400 INVALID_ARGUMENT**: Usually rate limiting - check logs for details
- **File upload errors**: Verify ffmpeg installed (`which ffprobe`)
- **PDF generation fails**: Ensure wkhtmltopdf installed

### Frontend Issues
- **CORS errors**: Check backend CORS settings in app.py
- **Changes not visible**: Clear browser cache (Ctrl+Shift+R)
- **Config not loading**: Verify config.js and config.local.js exist in public/

### Rate Limiting
- Backend: 2-second delay between API calls (automatic)
- Frontend: Max 2 parallel videos
- Free tier: 5 RPM limit enforced by Gemini API

## License

This project is proprietary and confidential.