Video Query Tool
A full-stack web application that processes videos using Google's Gemini AI model, allowing users to upload multiple videos simultaneously and receive AI-generated content based on customizable prompts. Features parallel processing, automatic video splitting, Azure AD B2C authentication, chunked file uploads, PDF generation with Mermaid diagrams, and comprehensive usage tracking.
Features
Core Functionality
- Video Processing: Upload and analyze videos using Google Gemini 2.0 Flash Exp AI model
- Multiple Processing Modes:
  - Meeting Summary
  - Process/Tool Documentation
  - Process Documentation with Mermaid Charts
  - Custom Prompts
- Large File Support: Chunked upload system supporting files up to 5GB per file
- PDF Generation: Convert results to PDF with embedded Mermaid diagrams
- Authentication: Azure AD B2C integration (optional, controlled via .env)
- Parallel Processing: Process up to 2 videos simultaneously
- Multiple File Upload: Upload and queue multiple videos at once
- Long Video Support: Automatic splitting and parallel chunk processing for videos > 54 minutes
Technical Features
- Multiple File Queue: Upload multiple videos, manage queue (Stop, Retry, Remove)
- Drag & Drop Upload: Modern file upload interface with progress tracking
- Real-time Processing: Live status updates with parallel processing indicators
- Queue Management: Stop, retry, or remove videos from processing queue anytime
- Automatic Video Splitting: Videos > 54 minutes automatically split into 54-min chunks
- Rate Limiting: Built-in API rate limiting (2-second delay) to prevent quota errors
- Error Handling: Comprehensive error handling with retry capability
- Processing Time Display: Shows processing duration for each completed video
- Usage Analytics: Automated tracking via webhook integration
- Production Ready: Systemd service configuration and deployment scripts
Limitations
- Video Length: No limit - videos automatically split into 54-minute chunks
- Single Chunk Limit: Individual chunks must be under 55 minutes (handled automatically)
- File Size: Application supports uploads up to 5GB per file
- Supported Formats: MP4, AVI, MOV, WMV, MKV, WEBM
- Parallel Processing: Max 2 videos simultaneously (rate limit protection)
- API Rate Limits: Gemini free tier: 5 RPM (built-in 2s delay between calls)
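The 54-minute splitting rule can be illustrated with a short sketch. This is not the code in video_splitter.py; the function name `plan_chunks` and the (start, end) offset representation are assumptions made for illustration only.

```python
from typing import List, Tuple

CHUNK_SECONDS = 54 * 60  # 54-minute chunk length stated in this README


def plan_chunks(duration_seconds: float,
                chunk_seconds: int = CHUNK_SECONDS) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds that cover the whole video.

    Hypothetical helper: a video no longer than one chunk yields a single
    span, and longer videos yield full chunks plus a shorter remainder.
    """
    duration = float(duration_seconds)
    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_seconds, duration)
        chunks.append((start, end))
        start = end
    return chunks


# A 2-hour video gives two full 54-minute chunks and a 12-minute remainder.
print(plan_chunks(2 * 60 * 60))
```

Each resulting span stays under the 55-minute single-chunk limit, which is why no manual trimming should be needed before upload.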
Project Structure
video_query/
├── backend/ # Flask/Hypercorn API server
│ ├── app.py # Main Flask application with PDF generation
│ ├── video_processor.py # Gemini API integration, parallel processing, rate limiting
│ ├── video_splitter.py # Video splitting for long videos (54-min chunks)
│ ├── auth.py # Azure AD B2C authentication handlers
│ ├── chunked_upload.py # Chunked file upload Blueprint
│ ├── run.py # Hypercorn production server
│ ├── requirements.txt # Python dependencies
│ ├── .env # Environment variables (GOOGLE_API_KEY)
│ └── test_*.py # API testing utilities
├── frontend/ # React SPA
│ ├── src/
│ │ ├── components/ # React components
│ │ │ ├── VideoUpload.js # Multi-file drag & drop upload
│ │ │ ├── PromptSelector.js # Mode selection and prompt editing
│ │ │ ├── ResultDisplay.js # Results with PDF generation
│ │ │ ├── AuthenticatedContent.js # Queue management, processed list
│ │ │ └── Login.js # Authentication interface
│ │ ├── auth/ # Authentication utilities
│ │ │ ├── authConfig.js # Azure AD B2C configuration
│ │ │ ├── AuthProvider.js # MSAL React provider
│ │ │ └── authApiClient.js # Authenticated API client
│ │ └── utils/
│ │ ├── chunkedUploader.js # Large file upload handler
│ │ ├── configLoader.js # Dynamic config loading
│ │ └── pathUtils.js # Path utilities
│ ├── public/
│ │ ├── config.js # Production config (committed)
│ │ ├── config.local.js # Local dev config (not committed)
│ │ └── index.html # Loads both configs
│ ├── package.json # Node.js dependencies
│ ├── .env # Frontend environment variables
│ └── build/ # Production build output
├── DEPLOYMENT.md # Production deployment instructions
├── LOG_EXTRACTION_README.md # Usage analytics documentation
├── CLAUDE.md # Development guidelines and build commands
├── restart.sh # Development restart script
├── quick_extract.sh # Log extraction utility
└── extract_user_logs*.sh # Advanced log processing
Dependencies
Backend Dependencies
- Flask 3.1.0: Web framework
- google-genai 1.45.0: Gemini AI SDK (updated API)
- Hypercorn 0.17.3: ASGI production server
- python-jose: JWT token validation for Azure AD
- flask-cors 5.0.1: Cross-origin resource sharing
- pdfkit 1.0.0: PDF generation from HTML
- cairosvg 2.8.0: SVG to PNG conversion for diagrams
- Pillow 11.2.1: Image processing
- python-dotenv 1.1.0: Environment variable management
- ffmpeg-python: Video splitting functionality
Frontend Dependencies
- React 18.2.0: UI framework
- @azure/msal-react 3.0.12: Microsoft Authentication Library
- axios 1.6.0: HTTP client with abort signal support
- bootstrap 5.3.2: UI components and styling
- mermaid 11.6.0: Diagram generation
- react-dropzone 14.2.3: Multi-file upload interface
- showdown 2.1.0: Markdown to HTML conversion
Setup Instructions
Prerequisites
- Python 3.8+
- Node.js 16+
- Google Cloud API key with Gemini access
- Azure AD B2C tenant (optional, for authentication)
- wkhtmltopdf (for PDF generation)
- ffmpeg/ffprobe (for video splitting)
Backend Setup
1. Create and activate a virtual environment:
       cd backend
       python3 -m venv venv
       source venv/bin/activate  # On Windows: venv\Scripts\activate
2. Install dependencies:
       pip install -r requirements.txt
3. Set up environment variables (create backend/.env):
       GOOGLE_API_KEY=your_gemini_api_key_here
4. Install system dependencies:
       # Ubuntu/Debian:
       sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev ffmpeg
       # macOS:
       brew install cairo wkhtmltopdf ffmpeg
5. Start the development server:
       python3 run.py  # Server runs on http://0.0.0.0:5010
Frontend Setup
1. Install Node.js dependencies:
       cd frontend
       npm install
2. Configure authentication (optional):
   - Edit frontend/.env:
         REACT_APP_DISABLE_AUTH=true  # Disable auth for local dev
   - For production, update src/auth/authConfig.js with Azure AD B2C details
3. Configure the backend URL for local development:
   - The file frontend/public/config.local.js is already configured for localhost:5010
   - This file is not committed (it is listed in .gitignore)
4. Start the development server:
       npm start  # Server runs on http://localhost:3000
Production Deployment
System Requirements
- Ubuntu/CentOS server
- Apache/Nginx web server
- Python 3.8+ with virtual environment
- wkhtmltopdf system package
- ffmpeg/ffprobe for video processing
- Node.js for building frontend
Backend Deployment
1. Install system packages:
       sudo apt-get update
       sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ffmpeg
2. Set up the virtual environment and install dependencies:
       cd backend
       python3 -m venv venv
       source venv/bin/activate
       pip install -r requirements.txt
3. Create the production .env file:
       echo "GOOGLE_API_KEY=your_production_api_key" > .env
4. Create the systemd service (see backend/video-query.service):
       sudo cp backend/video-query.service /etc/systemd/system/
       sudo systemctl daemon-reload
       sudo systemctl enable video-query
       sudo systemctl start video-query
Frontend Deployment
1. Update the production config (frontend/public/config.js):
       window.__APP_CONFIG__ = {
         "basePath": "/video-query",
         "domain": "https://your-domain.com",
         "api": {
           "videoProcessingEndpoint": "https://your-domain.com/video_query_back/api/process",
           "chunkedUploadEndpoint": "https://your-domain.com/video_query_back"
         }
       };
2. Build for production:
       cd frontend
       npm run build
3. Deploy to the web server:
       sudo cp -r build/* /var/www/html/video-query/
4. Configure the web server (Apache example):
       <VirtualHost *:443>
           DocumentRoot /var/www/html
           # Frontend
           Alias /video-query /var/www/html/video-query
           # Backend proxy
           ProxyPass /video_query_back http://localhost:5010
           ProxyPassReverse /video_query_back http://localhost:5010
       </VirtualHost>
API Reference
Video Processing Endpoints
- POST /api/process: Main video processing endpoint
  - Accepts JSON: file_path, filename, prompt (for chunked uploads)
  - Returns: Processing result with content, processing time, chunks info
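As a rough illustration of the request shape, the sketch below builds (but does not send) a POST to /api/process using only the standard library. The three JSON keys come from the description above; the helper name `build_process_request`, the base URL, and the sample values are assumptions. If authentication is enabled, the real client presumably also attaches a token, which is not shown here.

```python
import json
import urllib.request


def build_process_request(api_base: str, file_path: str,
                          filename: str, prompt: str) -> urllib.request.Request:
    """Build a JSON POST for the processing endpoint (hypothetical helper)."""
    body = json.dumps({
        "file_path": file_path,  # server-side path, assumed to come from a completed chunked upload
        "filename": filename,
        "prompt": prompt,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_base}/api/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sample values only; nothing is sent until urllib.request.urlopen(req) is called.
req = build_process_request(
    "http://localhost:5010", "/tmp/uploads/demo.mp4", "demo.mp4", "Summarize this meeting"
)
print(req.get_method(), req.full_url)
```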
Chunked Upload Endpoints
- POST /api/init-upload: Initialize chunked upload session
- POST /api/upload-chunk/<upload_id>: Upload file chunk
- POST /api/complete-upload/<upload_id>: Mark upload complete
- POST /api/cancel-upload/<upload_id>: Cancel upload
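A client would typically call these endpoints in order: initialize, upload each chunk, then complete. The sketch below shows that flow without touching the network. The chunk size, the helper names, and the assumption that the upload ID is returned by init-upload are all illustrative; the request and response fields of the real endpoints are not documented here.

```python
from typing import Iterator, List, Tuple

CHUNK_SIZE = 5 * 1024 * 1024  # hypothetical 5 MiB; the real chunk size is not stated in this README


def iter_file_chunks(path: str, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (index, data) pairs so a large file never has to fit in memory."""
    with open(path, "rb") as fh:
        index = 0
        while True:
            data = fh.read(chunk_size)
            if not data:
                break
            yield index, data
            index += 1


def upload_call_sequence(base_url: str, upload_id: str, chunk_count: int) -> List[Tuple[str, str]]:
    """List the (method, URL) calls for one upload, in order.

    upload_id is assumed to be returned by the init-upload call.
    """
    calls = [("POST", f"{base_url}/api/init-upload")]
    calls += [("POST", f"{base_url}/api/upload-chunk/{upload_id}")] * chunk_count
    calls.append(("POST", f"{base_url}/api/complete-upload/{upload_id}"))
    return calls
```

On failure or user cancellation, the client would call /api/cancel-upload/<upload_id> instead of the completion endpoint.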
PDF Generation Endpoints
- POST /api/generate-pdf: Generate PDF from HTML with Mermaid diagrams
  - JSON data: html, textDiagrams, diagramPngs, videoFileName
Authentication Endpoints (if enabled)
- GET /api/auth-test: Verify authentication status
Configuration Files
Backend Configuration
- backend/.env: Environment variables
      GOOGLE_API_KEY=your_api_key
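A missing key is easier to diagnose at startup than on the first API call. The sketch below is one way to fail fast; it assumes the .env file has already been loaded into the process environment (python-dotenv, listed in the dependencies, provides `load_dotenv` for this). The function name `load_api_key` is hypothetical and not taken from the backend code.

```python
import os
from typing import Mapping


def load_api_key(environ: Mapping[str, str] = os.environ) -> str:
    """Return the Gemini API key or raise a clear error if it is absent."""
    key = environ.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to backend/.env")
    return key
```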
Frontend Configuration
- frontend/.env: React environment variables
      REACT_APP_DISABLE_AUTH=true  # Optional: disable auth for local dev
- frontend/public/config.js: Production configuration (committed to git)
- frontend/public/config.local.js: Local development override (not committed)
Key Configuration Details
- Parallel Processing: Max 2 concurrent videos (App.js:245)
- Rate Limiting: 2-second delay between API calls (video_processor.py:224)
- File Size Threshold: 10MB for inline vs upload API (video_processor.py:167)
- Video Chunk Duration: 54 minutes (video_splitter.py)
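The four values above can be read as a single set of limits. The sketch below collects them in one place for illustration; the class and method names are hypothetical, "10MB" is assumed to mean 10 MiB, and the assumption that files above the threshold go through the upload API (with smaller ones sent inline) is a guess from the wording rather than a confirmed detail of video_processor.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingLimits:
    """Illustrative container for the documented limits (not the real config)."""
    max_parallel_videos: int = 2
    api_call_delay_seconds: float = 2.0
    inline_threshold_bytes: int = 10 * 1024 * 1024  # assumes "10MB" means 10 MiB
    chunk_duration_seconds: int = 54 * 60

    def uses_upload_api(self, size_bytes: int) -> bool:
        # Assumption: files above the threshold use the upload API, others go inline.
        return size_bytes > self.inline_threshold_bytes

    def needs_splitting(self, duration_seconds: float) -> bool:
        # Videos longer than one chunk are split, per the 54-minute rule.
        return duration_seconds > self.chunk_duration_seconds


limits = ProcessingLimits()
print(limits.uses_upload_api(50 * 1024 * 1024), limits.needs_splitting(30 * 60))
```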
Usage
Local Development
- Start backend:
      cd backend && source venv/bin/activate && python3 run.py
- Start frontend:
      cd frontend && npm start
- Open: http://localhost:3000
Processing Videos
- Upload: Drag & drop multiple videos or click to select
- Queue: Videos appear in "Processing Queue" section
- Select Prompt: Choose processing mode or write custom prompt
- Process: Click "Process N Videos" button
- Monitor: Watch real-time progress (2 videos process in parallel)
- Manage: Use Stop (⏸️), Retry (🔄), or Remove (🗑️) buttons
- View Results: Check "Processed Videos" section for completed results
- Download: Click "Download PDF" or "Copy Formatted" for any completed video
Processing Long Videos
- Videos > 54 minutes automatically split into chunks
- Each chunk processed in parallel (backend handles this)
- Results intelligently combined
- Processing time displayed for transparency
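The split, process, and combine flow can be sketched as follows. This is a simplified stand-in, not the logic in video_processor.py: the `analyze` callable is a placeholder for the Gemini call, the worker count of 2 is borrowed from the documented video-level limit, and plain ordered concatenation replaces whatever combining strategy the backend actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def process_chunks_in_parallel(chunks: List[str],
                               analyze: Callable[[str], str],
                               max_workers: int = 2) -> str:
    """Run analyze() on each chunk concurrently and join results in chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order, even if later chunks finish first.
        results = list(pool.map(analyze, chunks))
    return "\n\n".join(
        f"Part {number}:\n{text}" for number, text in enumerate(results, start=1)
    )


# Placeholder analyzer for demonstration; a real one would call the AI model.
print(process_chunks_in_parallel(["a.mp4", "b.mp4"], lambda path: f"summary of {path}"))
```

Keeping results in chunk order matters because the combined output should follow the timeline of the original video.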
Development Utilities
- restart.sh: Quick development environment restart
- backend/test_*.py: API testing and validation scripts
- backend/run.py: Production server with optimized settings for large uploads
- extract_user_logs*.sh: Usage analytics extraction
Security Features
- Azure AD B2C integration with JWT validation (optional)
- CORS protection with specific origin allowlisting
- Secure file upload validation
- Temporary file cleanup
- Token expiration handling
- Rate limiting to prevent API abuse
- Abort signal support for cancellation
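Upload validation can be as simple as checking the extension and size before accepting a file. The sketch below uses the formats and 5GB limit listed under Limitations; the function name is hypothetical, "5GB" is assumed to mean 5 GiB, and the real backend may apply additional checks (for example on file content) that are not shown.

```python
import os
from typing import List

ALLOWED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}
MAX_UPLOAD_BYTES = 5 * 1024 ** 3  # assumes "5GB" means 5 GiB


def validate_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the upload looks acceptable."""
    problems: List[str] = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds the 5GB limit")
    return problems
```

An extension check alone does not prove the file is a valid video, so it works best as a first filter in front of ffprobe-based inspection.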
Troubleshooting
Backend Issues
- 400 INVALID_ARGUMENT: Usually rate limiting - check logs for details
- File upload errors: Verify ffmpeg installed (which ffprobe)
- PDF generation fails: Ensure wkhtmltopdf installed
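The two dependency checks above can be run together from Python. This is a small diagnostic sketch using the standard library's `shutil.which`; the function name and the tool list are assumptions based on the prerequisites named in this README.

```python
import shutil
from typing import Iterable, List

REQUIRED_TOOLS = ("ffmpeg", "ffprobe", "wkhtmltopdf")


def find_missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Missing system tools:", ", ".join(missing))
else:
    print("All required system tools were found on PATH.")
```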
Frontend Issues
- CORS errors: Check backend CORS settings in app.py
- Changes not visible: Clear browser cache (Ctrl+Shift+R)
- Config not loading: Verify config.js and config.local.js exist in public/
Rate Limiting
- Backend: 2-second delay between API calls (automatic)
- Frontend: Max 2 parallel videos
- Free tier: 5 RPM limit enforced by Gemini API
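A fixed delay between calls can be sketched as a minimum-interval limiter. The class below is an illustration, not the code in video_processor.py; its name and the injectable `clock` and `sleep` parameters are assumptions that make it testable. Note that a 2-second spacing on its own still permits up to 30 calls per minute, so if every call counts as one request, staying under 5 RPM would need roughly 12 seconds between calls or a lower request volume.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Block so that consecutive calls are at least min_interval seconds apart.

    Not thread-safe as written; concurrent workers would need to share one
    instance guarded by a lock.
    """

    def __init__(self, min_interval: float = 2.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Sleep if needed and return the number of seconds waited."""
        waited = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_call = self._clock()
        return waited
```

Calling `limiter.wait()` immediately before each API request is enough to enforce the spacing for a single worker.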
License
This project is proprietary and confidential.