Manish Tanwar ec9f7426e4 Update in Bitrate and Small Large Size Logic		2025-11-27 02:48:15 +05:30

Video Query Tool

A full-stack web application that processes videos using Google's Gemini AI model with dual processing modes (Single Video and Batch). Features intelligent two-stage synthesis, automatic video splitting, Azure AD B2C authentication, chunked file uploads up to 5GB, PDF generation with merged Mermaid diagrams, and comprehensive usage tracking.

Features

Core Functionality

Dual Processing Modes:
- Single Video Mode: Process videos individually with per-video control
- Batch Mode: Combine multiple related videos (up to 10) for unified analysis
Two-Stage AI Synthesis: Results are built in two passes so that multi-part input yields one coherent output
- Stage 1: Each video/chunk → concise summary
- Stage 2: All summaries → unified cohesive result
Video Processing: Upload and analyze videos using Google's Gemini 2.0 Flash Exp model
Prompt Templates:
- Meeting Summary
- Process/Tool Documentation
- Process Documentation with Mermaid Charts
- Custom Prompts
Large File Support: Chunked upload system supporting files up to 5GB per file
PDF Generation: Convert results to PDF with embedded Mermaid diagrams
Authentication: Azure AD B2C integration (optional, controlled via .env)
Parallel Processing: Process up to 2 videos simultaneously (single mode)
Long Video Support: Automatic splitting and parallel chunk processing for videos > 54 minutes
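
The two-stage synthesis listed above can be sketched roughly as follows. This is a minimal illustration and not the project's code: `two_stage_synthesis`, `summarize`, and `combine` are hypothetical names, and the stand-in lambdas take the place of what would be Gemini calls in the real backend.

```python
from typing import Callable, List, Sequence


def two_stage_synthesis(
    chunks: Sequence[str],
    summarize: Callable[[str], str],
    combine: Callable[[List[str]], str],
) -> str:
    """Stage 1 summarizes each video/chunk; Stage 2 merges the summaries."""
    if not chunks:
        raise ValueError("at least one video or chunk is required")
    summaries = [summarize(chunk) for chunk in chunks]  # Stage 1
    return combine(summaries)                           # Stage 2


# Stand-in callables so the sketch runs without any API access.
result = two_stage_synthesis(
    ["intro.mp4", "demo.mp4"],
    summarize=lambda name: f"summary of {name}",
    combine=lambda parts: " | ".join(parts),
)
```

Passing the two stages in as callables keeps the control flow testable without network access; the actual prompts and model calls are not described in this README.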

Technical Features

Explicit User Control: No auto-processing - all videos require explicit "Process" button click
Batch Video Management: Reorder, arrange, and remove videos before processing
Smart Diagram Merging: Multiple Mermaid diagrams intelligently combined into one
Persistent Mode Selection: Processing mode and batch queue persist across page refreshes
Multiple File Queue: Upload multiple videos, manage queue (Stop, Retry, Remove)
Drag & Drop Upload: Modern file upload interface with progress tracking
Real-time Status: Live status updates (uploading → uploaded → processing → completed)
Queue Management: Stop, retry, or remove videos from processing queue anytime
Automatic Video Splitting: Videos > 54 minutes automatically split into 54-min chunks
Rate Limiting: Built-in API rate limiting (2-second delay between calls) to reduce quota errors
Error Handling: Comprehensive error handling with retry capability
Processing Time Display: Shows processing duration for each completed video/batch
Usage Analytics: Automated tracking via webhook integration
Production Ready: Systemd service configuration and deployment scripts

Limitations

Video Length: No limit; videos longer than 54 minutes are automatically split into 54-minute chunks
Single Chunk Limit: Individual chunks must be under 55 minutes (handled automatically)
File Size: Application supports uploads up to 5GB per file
Supported Formats: MP4, AVI, MOV, WMV, MKV, WEBM
Parallel Processing: Max 2 videos simultaneously in single mode (rate limit protection)
Batch Size: Maximum 10 videos per batch processing session
API Rate Limits: Gemini free tier: 5 RPM (built-in 2s delay between calls)
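
As a rough illustration, the limits above could be checked before upload along these lines. The function names are hypothetical, and treating "5GB" as 5 * 1024**3 bytes is an assumption; the application's own validation may differ.

```python
from pathlib import Path
from typing import List, Tuple

SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}
MAX_FILE_BYTES = 5 * 1024**3  # assumption: "5GB" interpreted as 5 GiB
MAX_BATCH_SIZE = 10


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems for one file (empty list means OK)."""
    problems = []
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {filename}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file too large: {filename}")
    return problems


def check_batch(files: List[Tuple[str, int]]) -> List[str]:
    """Check the batch-size limit, then every file in the batch."""
    problems = []
    if len(files) > MAX_BATCH_SIZE:
        problems.append(
            f"batch has {len(files)} videos; maximum is {MAX_BATCH_SIZE}"
        )
    for name, size in files:
        problems.extend(check_upload(name, size))
    return problems
```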

Project Structure

video_query/
├── backend/                    # Flask/Hypercorn API server
│   ├── app.py                 # Main Flask application with PDF generation
│   ├── video_processor.py     # Gemini API integration, parallel processing, rate limiting
│   ├── video_splitter.py      # Video splitting for long videos (54-min chunks)
│   ├── auth.py                # Azure AD B2C authentication handlers
│   ├── chunked_upload.py      # Chunked file upload Blueprint
│   ├── run.py                 # Hypercorn production server
│   ├── requirements.txt       # Python dependencies
│   ├── .env                   # Environment variables (GOOGLE_API_KEY)
│   └── test_*.py              # API testing utilities
├── frontend/                   # React SPA
│   ├── src/
│   │   ├── components/        # React components
│   │   │   ├── VideoUpload.js    # Multi-file drag & drop upload
│   │   │   ├── PromptSelector.js # Mode selection and prompt editing
│   │   │   ├── ResultDisplay.js  # Results with PDF generation
│   │   │   ├── AuthenticatedContent.js # Queue management, processed list
│   │   │   └── Login.js         # Authentication interface
│   │   ├── auth/              # Authentication utilities
│   │   │   ├── authConfig.js     # Azure AD B2C configuration
│   │   │   ├── AuthProvider.js   # MSAL React provider
│   │   │   └── authApiClient.js  # Authenticated API client
│   │   └── utils/
│   │       ├── chunkedUploader.js # Large file upload handler
│   │       ├── configLoader.js    # Dynamic config loading
│   │       └── pathUtils.js       # Path utilities
│   ├── public/
│   │   ├── config.js              # Production config (committed)
│   │   ├── config.local.js        # Local dev config (not committed)
│   │   └── index.html             # Loads both configs
│   ├── package.json           # Node.js dependencies
│   ├── .env                   # Frontend environment variables
│   └── build/                 # Production build output
├── DEPLOYMENT.md              # Production deployment instructions
├── LOG_EXTRACTION_README.md   # Usage analytics documentation
├── CLAUDE.md                  # Development guidelines and build commands
├── restart.sh                 # Development restart script
├── quick_extract.sh           # Log extraction utility
└── extract_user_logs*.sh      # Advanced log processing

Dependencies

Backend Dependencies

Flask 3.1.0: Web framework
google-genai 1.45.0: Gemini AI SDK (updated API)
Hypercorn 0.17.3: ASGI production server
python-jose: JWT token validation for Azure AD
flask-cors 5.0.1: Cross-origin resource sharing
pdfkit 1.0.0: PDF generation from HTML
cairosvg 2.8.0: SVG to PNG conversion for diagrams
Pillow 11.2.1: Image processing
python-dotenv 1.1.0: Environment variable management
ffmpeg-python: Video splitting functionality

Frontend Dependencies

React 18.2.0: UI framework
@azure/msal-react 3.0.12: Microsoft Authentication Library
axios 1.6.0: HTTP client with abort signal support
bootstrap 5.3.2: UI components and styling
mermaid 11.6.0: Diagram generation
react-dropzone 14.2.3: Multi-file upload interface
showdown 2.1.0: Markdown to HTML conversion

Setup Instructions

Prerequisites

Python 3.9+ (the pinned Flask 3.1.0 and Pillow 11.2.1 releases no longer support Python 3.8)
Node.js 16+
Google Cloud API key with Gemini access
Azure AD B2C tenant (optional, for authentication)
wkhtmltopdf (for PDF generation)
ffmpeg/ffprobe (for video splitting)

Backend Setup

Create and activate virtual environment:

```
cd backend
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install dependencies:
```
pip install -r requirements.txt
```
Set up environment variables (create backend/.env):
```
GOOGLE_API_KEY=your_gemini_api_key_here
```

Install system dependencies:

```
# Ubuntu/Debian:
sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev ffmpeg

# macOS:
brew install cairo wkhtmltopdf ffmpeg
```

Start development server:

```
python3 run.py
# Server runs on http://0.0.0.0:5010
```

Frontend Setup

Install Node.js dependencies:
```
cd frontend
npm install
```
Configure authentication (optional):
- Edit frontend/.env:
```
REACT_APP_DISABLE_AUTH=true  # Disable auth for local dev
```
- For production, update src/auth/authConfig.js with Azure AD B2C details
Configure backend URL for local development:
- The file frontend/public/config.local.js is already configured for localhost:5010
- This file is not committed (in .gitignore)

Start development server:

```
npm start
# Server runs on http://localhost:3000
```

Production Deployment

System Requirements

Ubuntu/CentOS server
Apache/Nginx web server
Python 3.9+ with virtual environment (the pinned Flask 3.1.0 and Pillow 11.2.1 releases no longer support Python 3.8)
wkhtmltopdf system package
ffmpeg/ffprobe for video processing
Node.js for building frontend

Backend Deployment

Install system packages:

```
sudo apt-get update
sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ffmpeg
```

Set up virtual environment and install dependencies:

```
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Create production .env file:

```
echo "GOOGLE_API_KEY=your_production_api_key" > .env
```

Create systemd service (see backend/video-query.service):

```
sudo cp backend/video-query.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable video-query
sudo systemctl start video-query
```

Frontend Deployment

Update production config (frontend/public/config.js):

```
window.__APP_CONFIG__ = {
  "basePath": "/video-query",
  "domain": "https://your-domain.com",
  "api": {
    "videoProcessingEndpoint": "https://your-domain.com/video_query_back/api/process",
    "chunkedUploadEndpoint": "https://your-domain.com/video_query_back"
  }
};
```

Build for production:
```
cd frontend
npm run build
```

Deploy to web server:

```
sudo cp -r build/* /var/www/html/video-query/
```

Configure web server (Apache example):

```
<VirtualHost *:443>
    DocumentRoot /var/www/html

    # Frontend
    Alias /video-query /var/www/html/video-query

    # Backend proxy
    ProxyPass /video_query_back http://localhost:5010
    ProxyPassReverse /video_query_back http://localhost:5010
</VirtualHost>
```

API Reference

Video Processing Endpoints

POST /api/process: Single video processing endpoint
- Accepts JSON: file_path, filename, prompt (for chunked uploads)
- Returns: Processing result with content, processing time, chunks info
POST /api/process-batch: Batch video processing endpoint
- Accepts JSON: videos (array of {file_path, filename, order}), prompt, batch_id
- Returns: Unified result for all videos, total chunks processed
- Maximum 10 videos per batch
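
A request body for /api/process-batch might be assembled as in the sketch below. Only the field names (videos, file_path, filename, order, prompt, batch_id) come from this README; the zero-based `order`, the UUID `batch_id`, and the helper name `build_batch_request` are assumptions made for illustration.

```python
import json
import posixpath
import uuid
from typing import List, Optional

MAX_BATCH_SIZE = 10


def build_batch_request(
    file_paths: List[str], prompt: str, batch_id: Optional[str] = None
) -> dict:
    """Build the JSON body for POST /api/process-batch (shape partly assumed)."""
    if not file_paths:
        raise ValueError("a batch needs at least one video")
    if len(file_paths) > MAX_BATCH_SIZE:
        raise ValueError(f"a batch may contain at most {MAX_BATCH_SIZE} videos")
    videos = [
        {
            "file_path": path,                      # server-side path from the upload step
            "filename": posixpath.basename(path),   # display name
            "order": index,                         # assumption: zero-based position
        }
        for index, path in enumerate(file_paths)
    ]
    return {
        "videos": videos,
        "prompt": prompt,
        "batch_id": batch_id or str(uuid.uuid4()),  # assumption: any unique string
    }


payload = build_batch_request(
    ["/uploads/part1.mp4", "/uploads/part2.mp4"], "Summarize the meeting", "demo"
)
body = json.dumps(payload)  # ready to send with Content-Type: application/json
```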

Chunked Upload Endpoints

POST /api/init-upload: Initialize chunked upload session
POST /api/upload-chunk/<upload_id>: Upload file chunk
POST /api/complete-upload/<upload_id>: Mark upload complete
POST /api/cancel-upload/<upload_id>: Cancel upload
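
On the client side, a chunked upload boils down to reading the file in fixed-size pieces and sending each piece to /api/upload-chunk/<upload_id> between the init and complete calls. The sketch below covers only the splitting step; the request and response fields of these endpoints are not documented here, so none are shown. The helper names and the chunk size used in the example are hypothetical.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (index, data) pairs until the stream is exhausted."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield index, data
        index += 1


def chunk_count(total_bytes: int, chunk_size: int) -> int:
    """Number of chunks needed for a file of total_bytes (integer ceiling)."""
    return (total_bytes + chunk_size - 1) // chunk_size


# Ten bytes split into 4-byte chunks: two full chunks and one partial chunk.
chunks = list(iter_chunks(io.BytesIO(b"abcdefghij"), 4))
```

Reading lazily from the stream keeps memory use at one chunk at a time, which matters for files approaching the 5GB limit.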

PDF Generation Endpoints

POST /api/generate-pdf: Generate PDF from HTML with Mermaid diagrams
- JSON data: html, textDiagrams, diagramPngs, videoFileName

Authentication Endpoints (if enabled)

GET /api/auth-test: Verify authentication status

Configuration Files

Backend Configuration

backend/.env: Environment variables
```
GOOGLE_API_KEY=your_api_key
```
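
A startup check along these lines can fail fast when the key is missing. This is only a sketch: `require_env` is a hypothetical helper, and it assumes the variables from backend/.env have already been loaded into the process environment (for example by python-dotenv, which is listed among the dependencies).

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return a non-empty environment value or raise a descriptive error."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to backend/.env")
    return value


# Example with an explicit mapping instead of the real environment.
api_key = require_env("GOOGLE_API_KEY", {"GOOGLE_API_KEY": " your_api_key "})
```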

Frontend Configuration

frontend/.env: React environment variables

```
REACT_APP_DISABLE_AUTH=true  # Optional: disable auth for local dev
```

frontend/public/config.js: Production configuration (committed to git)
frontend/public/config.local.js: Local development override (not committed)

Key Configuration Details

Parallel Processing: Max 2 concurrent videos (App.js:245)
Rate Limiting: 2-second delay between API calls (video_processor.py:224)
File Size Threshold: 10MB for inline vs upload API (video_processor.py:167)
Video Chunk Duration: 54 minutes (video_splitter.py)
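
The file size threshold amounts to a simple branch, sketched below. The names are hypothetical, and both the interpretation of "10MB" as 10 * 1024 * 1024 bytes and the handling of a file exactly at the threshold are assumptions; video_processor.py is the authority on the real behavior.

```python
INLINE_LIMIT_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB


def choose_transport(size_bytes: int) -> str:
    """Small files go inline with the request; larger ones use the upload API."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "inline" if size_bytes < INLINE_LIMIT_BYTES else "upload"
```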

Usage

Local Development

Start backend: cd backend && source venv/bin/activate && python3 run.py
Start frontend: cd frontend && npm start
Open: http://localhost:3000

Processing Videos

The application supports two processing modes, selected via a toggle at the top:

Single Video Mode (Default)

Process videos individually with per-video control:

Select Mode: Click "Single Video Mode" button at the top
Upload: Drag & drop videos or click to select files
Choose Prompt: Select a prompt template or write custom prompt
Process Each Video: Click individual "Process Video" button for each uploaded video
- Videos show status: uploading → uploaded → processing → completed
- Up to 2 videos process in parallel automatically
Monitor Progress: Watch real-time status updates and processing indicators
Manage Queue: Use Stop (⏸️), Retry (🔄), or Remove (🗑️) per video
View Results: Completed videos appear in "Processed Videos" section
Download: Click "Download PDF" or "Copy Formatted" for any result
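
The "up to 2 in parallel" behavior is a bounded-concurrency pattern. This README places that limit in the frontend, so the Python sketch below only illustrates the pattern and does not show the application's code; `process_all` and `process_one` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

MAX_PARALLEL = 2  # at most two videos are in flight at any time


def process_all(videos: Iterable[T], process_one: Callable[[T], R]) -> List[R]:
    """Run process_one over all videos, never more than MAX_PARALLEL at once.

    Results come back in input order, regardless of completion order.
    """
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        return list(pool.map(process_one, videos))


results = process_all(["a.mp4", "b.mp4", "c.mp4"], str.upper)
```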

Batch Mode

Process multiple related videos as one unified analysis:

Select Mode: Click "Batch Mode" button at the top
Upload Videos: Add multiple related videos (max 10 per batch)
Arrange Videos: Use Up/Down arrows to reorder videos in logical sequence
Remove Unwanted: Click Remove button to exclude videos from batch
Choose Prompt: Select or customize the analysis prompt
Process Batch: Click single "Process Batch" button to analyze all videos together
- Backend automatically handles video splitting and chunking
- Two-stage synthesis creates unified result across all videos
- Multiple Mermaid diagrams merged into one comprehensive diagram
View Results: Single unified result appears for entire batch
Download: Generate PDF with combined analysis

Key Differences:

Single Mode: Each video = separate result, manual per-video processing
Batch Mode: All videos = one unified result, single batch processing
Explicit Control: No auto-processing - all require button clicks

Processing Long Videos

Videos > 54 minutes automatically split into chunks
Each chunk processed in parallel (backend handles this)
Chunk results are combined through the two-stage synthesis (per-chunk summaries, then one unified result)
Processing time displayed for transparency
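
The chunk boundaries for a long video follow from simple arithmetic, sketched below. `plan_chunks` and `ffmpeg_args` are hypothetical helpers; the ffmpeg flags used (-ss, -i, -t, -c copy) are standard, but whether video_splitter.py uses stream copy or re-encodes is not stated in this README.

```python
from typing import List, Tuple

CHUNK_SECONDS = 54 * 60  # 54-minute chunks


def plan_chunks(
    duration_seconds: float, chunk_seconds: int = CHUNK_SECONDS
) -> List[Tuple[float, float]]:
    """Return (start, length) pairs, in seconds, covering the whole video."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    plan = []
    start = 0.0
    while start < duration_seconds:
        length = min(chunk_seconds, duration_seconds - start)
        plan.append((start, length))
        start += chunk_seconds
    return plan


def ffmpeg_args(source: str, start: float, length: float, output: str) -> List[str]:
    """Build (but do not run) an ffmpeg command that cuts one chunk."""
    return ["ffmpeg", "-ss", str(start), "-i", source,
            "-t", str(length), "-c", "copy", output]


# A 2-hour video (7200 s) needs two full chunks plus a 12-minute remainder.
two_hour_plan = plan_chunks(2 * 3600)
```

A video of exactly 54 minutes yields a single chunk, which matches the "> 54 minutes" rule for splitting.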

Development Utilities

restart.sh: Quick development environment restart
backend/test_*.py: API testing and validation scripts
backend/run.py: Production server with optimized settings for large uploads
extract_user_logs*.sh: Usage analytics extraction

Security Features

Azure AD B2C integration with JWT validation (optional)
CORS protection with specific origin allowlisting
Secure file upload validation
Temporary file cleanup
Token expiration handling
Rate limiting to prevent API abuse
Abort signal support for cancellation

Troubleshooting

Backend Issues

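400 INVALID_ARGUMENT: Usually associated with rate limiting in this app; check logs for details (the Gemini API normally reports quota exhaustion as 429 RESOURCE_EXHAUSTED, so a 400 can also indicate a malformed request)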
File upload errors: Verify that ffmpeg is installed (which ffprobe)
PDF generation fails: Ensure that wkhtmltopdf is installed

Frontend Issues

CORS errors: Check backend CORS settings in app.py
Changes not visible: Clear browser cache (Ctrl+Shift+R)
Config not loading: Verify config.js and config.local.js exist in public/

Rate Limiting

Backend: 2-second delay between API calls (automatic)
Frontend: Max 2 parallel videos
Free tier: 5 RPM limit enforced by Gemini API
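
A fixed delay between calls can be modeled as a minimum-interval limiter, as in the sketch below. The class and function names are hypothetical and this is not the code in video_processor.py. One point to keep in mind: a 2-second spacing on its own would permit up to 30 calls per minute if each call returned instantly, whereas spacing alone would need 60 / 5 = 12 seconds to stay under 5 RPM, so the delay works together with the frontend's parallelism cap and the duration of each call.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(
        self,
        min_interval: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> None:
        """Call immediately before each API request."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


def interval_for_rpm(requests_per_minute: float) -> float:
    """Spacing in seconds that keeps a steady stream at or under the given RPM."""
    return 60.0 / requests_per_minute
```

The clock and sleep functions are injectable so the limiter can be exercised in tests without real waiting.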

License

This project is proprietary and confidential.