Manish Tanwar 5d652fefc7 Video-Batch and Long Video Working		2025-11-06 18:10:09 +05:30
.claude	video error fix	2025-10-22 16:21:02 +05:30
backend	Video-Batch and Long Video Working	2025-11-06 18:06:43 +05:30
frontend	Video-Batch and Long Video Working	2025-11-06 18:06:43 +05:30
.gitignore	Video-Batch and Long Video Working	2025-11-06 18:10:09 +05:30
CLAUDE.md	video queue process error fix	2025-10-22 14:59:39 +05:30
CORS_FIX_SUMMARY.md	new features with video job cycles and video lenght changes	2025-10-22 14:24:13 +05:30
extract_user_logs.sh	initial commit	2025-09-18 14:25:24 -05:00
extract_user_logs_robust.sh	initial commit	2025-09-18 14:25:24 -05:00
MSAL_CORS_CONFIGURATION_FIX.md	MSAL for brandtechsandbox.oliver.solutions as the config file was pointing to that	2025-10-22 17:00:32 +05:30
quick_extract.sh	initial commit	2025-09-18 14:25:24 -05:00
README.md	video queue process error fix	2025-10-22 14:59:39 +05:30
requirements.txt	Add python-jose dependency for JWT authentication	2025-10-24 17:10:06 +05:30
restart.sh	initial commit	2025-09-18 14:25:24 -05:00
video_query.py	fixed SDK to newer version and added download file name feature for pdfs	2025-10-16 07:46:56 -05:00

README.md

Video Query Tool

A full-stack web application that processes videos using Google's Gemini AI model, allowing users to upload multiple videos simultaneously and receive AI-generated content based on customizable prompts. Features parallel processing, automatic video splitting, Azure AD B2C authentication, chunked file uploads, PDF generation with Mermaid diagrams, and comprehensive usage tracking.

Features

Core Functionality

Video Processing: Upload and analyze videos using Google Gemini 2.0 Flash Exp AI model
Multiple Processing Modes:
- Meeting Summary
- Process/Tool Documentation
- Process Documentation with Mermaid Charts
- Custom Prompts
Large File Support: Chunked upload system supporting files up to 5GB per file
PDF Generation: Convert results to PDF with embedded Mermaid diagrams
Authentication: Azure AD B2C integration (optional, controlled via .env)
Parallel Processing: Process up to 2 videos simultaneously
Multiple File Upload: Upload and queue multiple videos at once
Long Video Support: Automatic splitting and parallel chunk processing for videos > 54 minutes

Technical Features

Multiple File Queue: Upload multiple videos, manage queue (Stop, Retry, Remove)
Drag & Drop Upload: Modern file upload interface with progress tracking
Real-time Processing: Live status updates with parallel processing indicators
Queue Management: Stop, retry, or remove videos from processing queue anytime
Automatic Video Splitting: Videos > 54 minutes automatically split into 54-min chunks
Rate Limiting: Built-in API rate limiting (2-second delay) to prevent quota errors
Error Handling: Comprehensive error handling with retry capability
Processing Time Display: Shows processing duration for each completed video
Usage Analytics: Automated tracking via webhook integration
Production Ready: Systemd service configuration and deployment scripts
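
The automatic splitting listed above comes down to a boundary calculation. The sketch below is not the project's video_splitter.py: the helper name `plan_chunks` is hypothetical, and the only value taken from this README is the 54-minute chunk length.

```python
from typing import List, Tuple

CHUNK_SECONDS = 54 * 60  # 54-minute chunk length, as stated in this README


def plan_chunks(duration_s: float, chunk_s: float = CHUNK_SECONDS) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds that cover the whole video.

    Hypothetical helper: a video at or under the chunk length yields one span.
    """
    if duration_s <= 0 or chunk_s <= 0:
        raise ValueError("duration and chunk length must be positive")
    spans = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        spans.append((start, end))
        start = end
    return spans
```

For a 2-hour video, `plan_chunks(7200)` gives three spans, the last one 12 minutes long.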

Limitations

Video Length: No limit - videos automatically split into 54-minute chunks
Single Chunk Limit: Individual chunks must be under 55 minutes (handled automatically)
File Size: Application supports uploads up to 5GB per file
Supported Formats: MP4, AVI, MOV, WMV, MKV, WEBM
Parallel Processing: Max 2 videos simultaneously (rate limit protection)
API Rate Limits: Gemini free tier: 5 RPM (built-in 2s delay between calls)
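
The relationship between a requests-per-minute quota and a fixed delay is plain arithmetic. The sketch below uses hypothetical helper names and assumes each call returns instantly, which is the worst case for quota usage; it is not taken from video_processor.py.

```python
def min_interval_seconds(rpm: int) -> float:
    """Smallest spacing between calls that never exceeds `rpm` calls per minute."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


def max_calls_per_minute(delay_s: float) -> float:
    """Upper bound on calls per minute when only a fixed delay separates calls."""
    if delay_s <= 0:
        raise ValueError("delay must be positive")
    return 60.0 / delay_s
```

A 5 RPM quota corresponds to 12 seconds between calls, while a 2-second delay on its own permits up to 30 calls per minute if calls return instantly. Whether the delay alone stays within the free tier therefore depends on how long each Gemini call takes.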

Project Structure

video_query/
├── backend/                    # Flask/Hypercorn API server
│   ├── app.py                 # Main Flask application with PDF generation
│   ├── video_processor.py     # Gemini API integration, parallel processing, rate limiting
│   ├── video_splitter.py      # Video splitting for long videos (54-min chunks)
│   ├── auth.py                # Azure AD B2C authentication handlers
│   ├── chunked_upload.py      # Chunked file upload Blueprint
│   ├── run.py                 # Hypercorn production server
│   ├── requirements.txt       # Python dependencies
│   ├── .env                   # Environment variables (GOOGLE_API_KEY)
│   └── test_*.py              # API testing utilities
├── frontend/                   # React SPA
│   ├── src/
│   │   ├── components/        # React components
│   │   │   ├── VideoUpload.js    # Multi-file drag & drop upload
│   │   │   ├── PromptSelector.js # Mode selection and prompt editing
│   │   │   ├── ResultDisplay.js  # Results with PDF generation
│   │   │   ├── AuthenticatedContent.js # Queue management, processed list
│   │   │   └── Login.js         # Authentication interface
│   │   ├── auth/              # Authentication utilities
│   │   │   ├── authConfig.js     # Azure AD B2C configuration
│   │   │   ├── AuthProvider.js   # MSAL React provider
│   │   │   └── authApiClient.js  # Authenticated API client
│   │   └── utils/
│   │       ├── chunkedUploader.js # Large file upload handler
│   │       ├── configLoader.js    # Dynamic config loading
│   │       └── pathUtils.js       # Path utilities
│   ├── public/
│   │   ├── config.js              # Production config (committed)
│   │   ├── config.local.js        # Local dev config (not committed)
│   │   └── index.html             # Loads both configs
│   ├── package.json           # Node.js dependencies
│   ├── .env                   # Frontend environment variables
│   └── build/                 # Production build output
├── DEPLOYMENT.md              # Production deployment instructions
├── LOG_EXTRACTION_README.md   # Usage analytics documentation
├── CLAUDE.md                  # Development guidelines and build commands
├── restart.sh                 # Development restart script
├── quick_extract.sh           # Log extraction utility
└── extract_user_logs*.sh      # Advanced log processing

Dependencies

Backend Dependencies

Flask 3.1.0: Web framework
google-genai 1.45.0: Gemini AI SDK (updated API)
Hypercorn 0.17.3: ASGI production server
python-jose: JWT token validation for Azure AD
flask-cors 5.0.1: Cross-origin resource sharing
pdfkit 1.0.0: PDF generation from HTML
cairosvg 2.8.0: SVG to PNG conversion for diagrams
Pillow 11.2.1: Image processing
python-dotenv 1.1.0: Environment variable management
ffmpeg-python: Video splitting functionality

Frontend Dependencies

React 18.2.0: UI framework
@azure/msal-react 3.0.12: Microsoft Authentication Library
axios 1.6.0: HTTP client with abort signal support
bootstrap 5.3.2: UI components and styling
mermaid 11.6.0: Diagram generation
react-dropzone 14.2.3: Multi-file upload interface
showdown 2.1.0: Markdown to HTML conversion

Setup Instructions

Prerequisites

Python 3.8+
Node.js 16+
Google Cloud API key with Gemini access
Azure AD B2C tenant (optional, for authentication)
wkhtmltopdf (for PDF generation)
ffmpeg/ffprobe (for video splitting)

Backend Setup

Create and activate virtual environment:

```
cd backend
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install dependencies:
```
pip install -r requirements.txt
```
Set up environment variables (create backend/.env):
```
GOOGLE_API_KEY=your_gemini_api_key_here
```

Install system dependencies:

```
# Ubuntu/Debian:
sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev ffmpeg

# macOS:
brew install cairo wkhtmltopdf ffmpeg
```

Start development server:

```
python3 run.py
# Server runs on http://0.0.0.0:5010
```

Frontend Setup

Install Node.js dependencies:
```
cd frontend
npm install
```
Configure authentication (optional):
- Edit frontend/.env:
```
REACT_APP_DISABLE_AUTH=true  # Disable auth for local dev
```
- For production, update src/auth/authConfig.js with Azure AD B2C details
Configure backend URL for local development:
- frontend/public/config.local.js is already configured for localhost:5010
- This file is not committed (it is listed in .gitignore)

Start development server:

```
npm start
# Server runs on http://localhost:3000
```

Production Deployment

System Requirements

Ubuntu/CentOS server
Apache/Nginx web server
Python 3.8+ with virtual environment
wkhtmltopdf system package
ffmpeg/ffprobe for video processing
Node.js for building frontend

Backend Deployment

Install system packages:

```
sudo apt-get update
sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ffmpeg
```

Set up virtual environment and install dependencies:

```
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Create production .env file:

```
echo "GOOGLE_API_KEY=your_production_api_key" > .env
```

Create systemd service (see backend/video-query.service):

```
# Run from the repository root (the path below is relative to it)
sudo cp backend/video-query.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable video-query
sudo systemctl start video-query
```

Frontend Deployment

Update production config (frontend/public/config.js):

```
window.__APP_CONFIG__ = {
  "basePath": "/video-query",
  "domain": "https://your-domain.com",
  "api": {
    "videoProcessingEndpoint": "https://your-domain.com/video_query_back/api/process",
    "chunkedUploadEndpoint": "https://your-domain.com/video_query_back"
  }
};
```

Build for production:
```
cd frontend
npm run build
```

Deploy to web server:

```
sudo cp -r build/* /var/www/html/video-query/
```

Configure web server (Apache example):

```
<VirtualHost *:443>
    DocumentRoot /var/www/html

    # Frontend
    Alias /video-query /var/www/html/video-query

    # Backend proxy
    ProxyPass /video_query_back http://localhost:5010
    ProxyPassReverse /video_query_back http://localhost:5010
</VirtualHost>
```

API Reference

Video Processing Endpoints

POST /api/process: Main video processing endpoint
- Accepts JSON: file_path, filename, prompt (for chunked uploads)
- Returns: Processing result with content, processing time, chunks info
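
A request to this endpoint could be assembled as in the sketch below. It uses only the standard library and the three JSON fields named above; the base URL is an assumption taken from the local setup instructions, and `build_process_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: local backend address from the setup section of this README.
BASE_URL = "http://localhost:5010"


def build_process_request(file_path: str, filename: str, prompt: str,
                          base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST for /api/process."""
    payload = {"file_path": file_path, "filename": filename, "prompt": prompt}
    return urllib.request.Request(
        url=f"{base_url}/api/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. The sketch does not parse the response because the exact response keys are not listed in this section, and with authentication enabled the request would presumably also need a token header.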

Chunked Upload Endpoints

POST /api/init-upload: Initialize chunked upload session
POST /api/upload-chunk/<upload_id>: Upload file chunk
POST /api/complete-upload/<upload_id>: Mark upload complete
POST /api/cancel-upload/<upload_id>: Cancel upload
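
The order in which these endpoints are called can be sketched as follows. This is an illustration only: the chunk size is a placeholder, the helper names are hypothetical, and the sketch assumes the init call is what provides the `upload_id`. Request and response bodies are not documented in this section, so none are shown.

```python
import math
from typing import Iterator, List, Tuple

# Assumption: illustrative chunk size; the real value lives in the uploader code.
CHUNK_BYTES = 8 * 1024 * 1024


def byte_ranges(total_bytes: int, chunk_bytes: int = CHUNK_BYTES) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) byte ranges that cover the file exactly once."""
    if total_bytes <= 0 or chunk_bytes <= 0:
        raise ValueError("sizes must be positive")
    for start in range(0, total_bytes, chunk_bytes):
        yield start, min(start + chunk_bytes, total_bytes)


def upload_plan(upload_id: str, total_bytes: int,
                chunk_bytes: int = CHUNK_BYTES) -> List[Tuple[str, str]]:
    """List the (method, path) calls for one successful upload, in order."""
    n_chunks = math.ceil(total_bytes / chunk_bytes)
    calls = [("POST", "/api/init-upload")]
    calls += [("POST", f"/api/upload-chunk/{upload_id}")] * n_chunks
    calls.append(("POST", f"/api/complete-upload/{upload_id}"))
    return calls
```

An aborted upload would call the cancel endpoint in place of the final complete call.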

PDF Generation Endpoints

POST /api/generate-pdf: Generate PDF from HTML with Mermaid diagrams
- JSON data: html, textDiagrams, diagramPngs, videoFileName

Authentication Endpoints (if enabled)

GET /api/auth-test: Verify authentication status

Configuration Files

Backend Configuration

backend/.env: Environment variables
```
GOOGLE_API_KEY=your_api_key
```
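
A startup check for this variable might look like the sketch below. `require_env` is a hypothetical helper, not part of the project. It reads from `os.environ`, which python-dotenv's `load_dotenv()` would populate from backend/.env beforehand; the sketch itself stays within the standard library.

```python
import os


def require_env(name: str, environ=None) -> str:
    """Return a non-empty environment value or raise a clear error."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to backend/.env")
    return value
```

Failing at startup gives a clearer message than a later authentication error from the API.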

Frontend Configuration

frontend/.env: React environment variables

```
REACT_APP_DISABLE_AUTH=true  # Optional: disable auth for local dev
```

frontend/public/config.js: Production configuration (committed to git)
frontend/public/config.local.js: Local development override (not committed)

Key Configuration Details

Parallel Processing: Max 2 concurrent videos (App.js:245)
Rate Limiting: 2-second delay between API calls (video_processor.py:224)
File Size Threshold: 10MB for inline vs upload API (video_processor.py:167)
Video Chunk Duration: 54 minutes (video_splitter.py)
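
The file size threshold amounts to a single comparison. In the sketch below the names are hypothetical, and both the use of binary megabytes and the inclusive boundary are assumptions rather than details confirmed by video_processor.py.

```python
# Assumption: 10MB is interpreted as binary megabytes.
INLINE_LIMIT_BYTES = 10 * 1024 * 1024


def choose_transport(size_bytes: int, limit: int = INLINE_LIMIT_BYTES) -> str:
    """Return "inline" for files at or under the limit, "upload_api" otherwise."""
    if size_bytes < 0:
        raise ValueError("size must not be negative")
    return "inline" if size_bytes <= limit else "upload_api"
```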

Usage

Local Development

Start backend: cd backend && source venv/bin/activate && python3 run.py
Start frontend: cd frontend && npm start
Open: http://localhost:3000

Processing Videos

Upload: Drag & drop multiple videos or click to select
Queue: Videos appear in "Processing Queue" section
Select Prompt: Choose processing mode or write custom prompt
Process: Click "Process N Videos" button
Monitor: Watch real-time progress (2 videos process in parallel)
Manage: Use Stop (⏸️), Retry (🔄), or Remove (🗑️) buttons
View Results: Check "Processed Videos" section for completed results
Download: Click "Download PDF" or "Copy Formatted" for any completed video
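
The "2 videos in parallel" behaviour in the steps above is a bounded-concurrency queue. The actual queue lives in the React frontend; the Python sketch below only illustrates the same idea with hypothetical names, using a semaphore to cap in-flight work.

```python
import asyncio
from typing import Awaitable, Callable, List

MAX_PARALLEL = 2  # matches the "max 2 videos" limit described in this README


async def run_queue(names: List[str],
                    worker: Callable[[str], Awaitable[str]],
                    limit: int = MAX_PARALLEL) -> List[str]:
    """Run `worker` over every queued name with at most `limit` in flight."""
    gate = asyncio.Semaphore(limit)

    async def guarded(name: str) -> str:
        async with gate:  # waits here while `limit` workers are already running
            return await worker(name)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(n) for n in names))
```

Calling `asyncio.run(run_queue(["a.mp4", "b.mp4", "c.mp4"], worker))` would start the third video only after one of the first two finishes.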

Processing Long Videos

Videos > 54 minutes automatically split into chunks
Each chunk processed in parallel (backend handles this)
Results intelligently combined
Processing time displayed for transparency
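
The flow above (process chunks in parallel, then combine in order) can be sketched as follows. This is a minimal illustration with hypothetical names: the `analyze` callable stands in for the Gemini call, the worker count is an assumption, and the simple "Part N" join is a placeholder for however the backend actually merges results.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def process_chunks(chunks: Sequence[str],
                   analyze: Callable[[str], str],
                   max_workers: int = 2) -> str:
    """Analyze chunks concurrently and join the results in original order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even if a later chunk finishes first
        results: List[str] = list(pool.map(analyze, chunks))
    sections = [f"Part {i}\n{text}" for i, text in enumerate(results, start=1)]
    return "\n\n".join(sections)
```

Keeping the results in input order matters here because each chunk covers a consecutive stretch of the video.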

Development Utilities

restart.sh: Quick development environment restart
backend/test_*.py: API testing and validation scripts
backend/run.py: Production server with optimized settings for large uploads
extract_user_logs*.sh: Usage analytics extraction

Security Features

Azure AD B2C integration with JWT validation (optional)
CORS protection with specific origin allowlisting
Secure file upload validation
Temporary file cleanup
Token expiration handling
Rate limiting to prevent API abuse
Abort signal support for cancellation
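
Upload validation against the documented limits could be sketched as below. The helper name is hypothetical, the interpretation of 5GB as binary gigabytes is an assumption, and the check covers only the file name and size; it does not inspect file contents.

```python
from pathlib import PurePath

# Formats listed under Limitations in this README
ALLOWED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}
MAX_UPLOAD_BYTES = 5 * 1024 ** 3  # 5GB limit; binary gigabytes are an assumption


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError when the type or size falls outside the documented limits."""
    ext = PurePath(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported format: {ext or 'none'}")
    if not 0 < size_bytes <= MAX_UPLOAD_BYTES:
        raise ValueError("file size must be between 1 byte and 5GB")
```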

Troubleshooting

Backend Issues

400 INVALID_ARGUMENT: Usually rate limiting - check logs for details
File upload errors: Verify that ffmpeg and ffprobe are installed (which ffmpeg ffprobe)
PDF generation fails: Ensure wkhtmltopdf is installed (which wkhtmltopdf)
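
The checks above can be run in one pass from Python. This is a small diagnostic sketch with hypothetical helper names, using only `shutil.which` from the standard library; the tool list comes from the prerequisites in this README.

```python
import shutil
from typing import Dict, List, Optional, Sequence

REQUIRED_TOOLS = ("ffmpeg", "ffprobe", "wkhtmltopdf")


def find_tools(names: Sequence[str] = REQUIRED_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None when it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names: Sequence[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]
```

An empty list from `missing_tools()` means all three binaries are reachable from the environment the backend runs in.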

Frontend Issues

CORS errors: Check backend CORS settings in app.py
Changes not visible: Clear browser cache (Ctrl+Shift+R)
Config not loading: Verify config.js and config.local.js exist in public/

Rate Limiting

Backend: 2-second delay between API calls (automatic)
Frontend: Max 2 parallel videos
Free tier: 5 RPM limit enforced by Gemini API

License

This project is proprietary and confidential.