
Video Query Tool

A full-stack web application that processes videos using Google's Gemini AI model with dual processing modes (Single Video and Batch). Features intelligent two-stage synthesis, automatic video splitting, Azure AD B2C authentication, chunked file uploads up to 5GB, PDF generation with merged Mermaid diagrams, and comprehensive usage tracking.

Features

Core Functionality

  • Dual Processing Modes:
    • Single Video Mode: Process videos individually with per-video control
    • Batch Mode: Combine multiple related videos (up to 10) for unified analysis
  • Intelligent AI Synthesis: Two-stage processing ensures seamless results
    • Stage 1: Each video/chunk → concise summary
    • Stage 2: All summaries → unified cohesive result
  • Video Processing: Upload and analyze using Google Gemini 2.0 Flash Exp AI model
  • Prompt Templates:
    • Meeting Summary
    • Process/Tool Documentation
    • Process Documentation with Mermaid Charts
    • Custom Prompts
  • Large File Support: Chunked upload system supporting files up to 5GB per file
  • PDF Generation: Convert results to PDF with embedded Mermaid diagrams
  • Authentication: Azure AD B2C integration (optional, controlled via .env)
  • Parallel Processing: Process up to 2 videos simultaneously (single mode)
  • Long Video Support: Automatic splitting and parallel chunk processing for videos > 54 minutes
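
The two-stage synthesis described above has a map-then-merge shape. The following is a minimal sketch of that shape, not the backend's actual code: `ask_model` is a hypothetical stand-in for the Gemini call, the merge prompt wording is invented for illustration, and skipping Stage 2 for a single chunk is an assumption.

```python
from typing import Callable, List


def two_stage_synthesis(
    chunks: List[str],
    prompt: str,
    ask_model: Callable[[str, str], str],
) -> str:
    """Stage 1: summarize each chunk. Stage 2: merge the summaries.

    `ask_model(prompt, content)` is a hypothetical stand-in for the real
    model call; it is not part of the project's API.
    """
    # Stage 1: one concise summary per video/chunk, in original order.
    summaries = [ask_model(prompt, chunk) for chunk in chunks]

    # Assumption: a single chunk needs no second pass.
    if len(summaries) == 1:
        return summaries[0]

    # Stage 2: pass all summaries to the model for one unified result.
    combined = "\n\n".join(
        f"Part {i}:\n{s}" for i, s in enumerate(summaries, start=1)
    )
    return ask_model("Merge these summaries into one cohesive result.", combined)
```

Keeping the summaries in input order matters for batch mode, where the user arranges videos in a logical sequence before processing.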

Technical Features

  • Explicit User Control: No auto-processing - all videos require explicit "Process" button click
  • Batch Video Management: Reorder, arrange, and remove videos before processing
  • Smart Diagram Merging: Multiple Mermaid diagrams intelligently combined into one
  • Persistent Mode Selection: Processing mode and batch queue persist across page refreshes
  • Multiple File Queue: Upload multiple videos, manage queue (Stop, Retry, Remove)
  • Drag & Drop Upload: Modern file upload interface with progress tracking
  • Real-time Status: Live status updates (uploading → uploaded → processing → completed)
  • Queue Management: Stop, retry, or remove videos from processing queue anytime
  • Automatic Video Splitting: Videos > 54 minutes automatically split into 54-min chunks
  • Rate Limiting: Built-in API rate limiting (2-second delay) to prevent quota errors
  • Error Handling: Comprehensive error handling with retry capability
  • Processing Time Display: Shows processing duration for each completed video/batch
  • Usage Analytics: Automated tracking via webhook integration
  • Production Ready: Systemd service configuration and deployment scripts
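
To illustrate the automatic splitting rule, here is a small sketch that computes chunk boundaries for a given duration. It only plans the offsets; the real splitting is done with ffmpeg in video_splitter.py, and the function name `plan_chunks` is hypothetical.

```python
from typing import List, Tuple

CHUNK_SECONDS = 54 * 60  # 54-minute chunks, as stated above


def plan_chunks(duration_seconds: float) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds covering the whole video."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    boundaries = []
    start = 0.0
    while start < duration_seconds:
        # The last chunk is shortened to end at the end of the video.
        end = min(start + CHUNK_SECONDS, duration_seconds)
        boundaries.append((start, end))
        start = end
    return boundaries
```

For a 2-hour video (7200 seconds) this yields three chunks: two of 54 minutes and a final one of 12 minutes. A video of exactly 54 minutes stays in one piece.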

Limitations

  • Video Length: No limit - videos automatically split into 54-minute chunks
  • Single Chunk Limit: Individual chunks must be under 55 minutes (handled automatically)
  • File Size: Application supports uploads up to 5GB per file
  • Supported Formats: MP4, AVI, MOV, WMV, MKV, WEBM
  • Parallel Processing: Max 2 videos simultaneously in single mode (rate limit protection)
  • Batch Size: Maximum 10 videos per batch processing session
  • API Rate Limits: Gemini free tier: 5 RPM (built-in 2s delay between calls)
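
The relationship between a fixed delay and a requests-per-minute limit can be checked with a short sketch. The class and function names here are hypothetical and the code is single-threaded; it is not how video_processor.py implements its delay.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Block so that consecutive calls are at least `interval` seconds apart."""

    def __init__(
        self,
        interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None  # time of the previous call

    def wait(self) -> float:
        """Sleep if needed; return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.interval - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        self._last_call = now + delay
        return delay


def interval_for_rpm(requests_per_minute: int) -> float:
    """Smallest spacing (seconds) that keeps one caller under the limit."""
    return 60.0 / requests_per_minute
```

`interval_for_rpm(5)` evaluates to 12.0 seconds. A fixed 2-second delay by itself permits up to 30 calls per minute, so on the 5 RPM tier it may not be enough if individual calls return quickly.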

Project Structure

video_query/
├── backend/                    # Flask/Hypercorn API server
│   ├── app.py                 # Main Flask application with PDF generation
│   ├── video_processor.py     # Gemini API integration, parallel processing, rate limiting
│   ├── video_splitter.py      # Video splitting for long videos (54-min chunks)
│   ├── auth.py                # Azure AD B2C authentication handlers
│   ├── chunked_upload.py      # Chunked file upload Blueprint
│   ├── run.py                 # Hypercorn production server
│   ├── requirements.txt       # Python dependencies
│   ├── .env                   # Environment variables (GOOGLE_API_KEY)
│   └── test_*.py              # API testing utilities
├── frontend/                   # React SPA
│   ├── src/
│   │   ├── components/        # React components
│   │   │   ├── VideoUpload.js    # Multi-file drag & drop upload
│   │   │   ├── PromptSelector.js # Mode selection and prompt editing
│   │   │   ├── ResultDisplay.js  # Results with PDF generation
│   │   │   ├── AuthenticatedContent.js # Queue management, processed list
│   │   │   └── Login.js         # Authentication interface
│   │   ├── auth/              # Authentication utilities
│   │   │   ├── authConfig.js     # Azure AD B2C configuration
│   │   │   ├── AuthProvider.js   # MSAL React provider
│   │   │   └── authApiClient.js  # Authenticated API client
│   │   └── utils/
│   │       ├── chunkedUploader.js # Large file upload handler
│   │       ├── configLoader.js    # Dynamic config loading
│   │       └── pathUtils.js       # Path utilities
│   ├── public/
│   │   ├── config.js              # Production config (committed)
│   │   ├── config.local.js        # Local dev config (not committed)
│   │   └── index.html             # Loads both configs
│   ├── package.json           # Node.js dependencies
│   ├── .env                   # Frontend environment variables
│   └── build/                 # Production build output
├── DEPLOYMENT.md              # Production deployment instructions
├── LOG_EXTRACTION_README.md   # Usage analytics documentation
├── CLAUDE.md                  # Development guidelines and build commands
├── restart.sh                 # Development restart script
├── quick_extract.sh           # Log extraction utility
└── extract_user_logs*.sh      # Advanced log processing

Dependencies

Backend Dependencies

  • Flask 3.1.0: Web framework
  • google-genai 1.45.0: Gemini AI SDK (updated API)
  • Hypercorn 0.17.3: ASGI production server
  • python-jose: JWT token validation for Azure AD
  • flask-cors 5.0.1: Cross-origin resource sharing
  • pdfkit 1.0.0: PDF generation from HTML
  • cairosvg 2.8.0: SVG to PNG conversion for diagrams
  • Pillow 11.2.1: Image processing
  • python-dotenv 1.1.0: Environment variable management
  • ffmpeg-python: Video splitting functionality

Frontend Dependencies

  • React 18.2.0: UI framework
  • @azure/msal-react 3.0.12: Microsoft Authentication Library
  • axios 1.6.0: HTTP client with abort signal support
  • bootstrap 5.3.2: UI components and styling
  • mermaid 11.6.0: Diagram generation
  • react-dropzone 14.2.3: Multi-file upload interface
  • showdown 2.1.0: Markdown to HTML conversion

Setup Instructions

Prerequisites

  • Python 3.8+
  • Node.js 16+
  • Google Cloud API key with Gemini access
  • Azure AD B2C tenant (optional, for authentication)
  • wkhtmltopdf (for PDF generation)
  • ffmpeg/ffprobe (for video splitting)

Backend Setup

  1. Create and activate virtual environment:

    cd backend
    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set up environment variables (create backend/.env):

    GOOGLE_API_KEY=your_gemini_api_key_here
    
  4. Install system dependencies:

    # Ubuntu/Debian:
    sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev ffmpeg
    
    # macOS:
    brew install cairo wkhtmltopdf ffmpeg
    
  5. Start development server:

    python3 run.py
    # Server runs on http://0.0.0.0:5010
    

Frontend Setup

  1. Install Node.js dependencies:

    cd frontend
    npm install
    
  2. Configure authentication (optional):

    • Edit frontend/.env:
      # Disable auth for local dev
      REACT_APP_DISABLE_AUTH=true
      
    • For production, update src/auth/authConfig.js with Azure AD B2C details
  3. Configure backend URL for local development:

    • frontend/public/config.local.js is already configured for localhost:5010
    • This file is not committed (in .gitignore)
  4. Start development server:

    npm start
    # Server runs on http://localhost:3000
    

Production Deployment

System Requirements

  • Ubuntu/CentOS server
  • Apache/Nginx web server
  • Python 3.8+ with virtual environment
  • wkhtmltopdf system package
  • ffmpeg/ffprobe for video processing
  • Node.js for building frontend

Backend Deployment

  1. Install system packages:

    sudo apt-get update
    sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ffmpeg
    
  2. Set up virtual environment and install dependencies:

    cd backend
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    
  3. Create production .env file:

    echo "GOOGLE_API_KEY=your_production_api_key" > .env
    
  4. Create systemd service (see backend/video-query.service):

    # Run from the repository root
    sudo cp backend/video-query.service /etc/systemd/system/
    sudo systemctl daemon-reload
    sudo systemctl enable video-query
    sudo systemctl start video-query
    

Frontend Deployment

  1. Update production config (frontend/public/config.js):

    window.__APP_CONFIG__ = {
      "basePath": "/video-query",
      "domain": "https://your-domain.com",
      "api": {
        "videoProcessingEndpoint": "https://your-domain.com/video_query_back/api/process",
        "chunkedUploadEndpoint": "https://your-domain.com/video_query_back"
      }
    };
    
  2. Build for production:

    cd frontend
    npm run build
    
  3. Deploy to web server:

    sudo cp -r build/* /var/www/html/video-query/
    
  4. Configure web server (Apache example):

    <VirtualHost *:443>
        DocumentRoot /var/www/html
    
        # Frontend
        Alias /video-query /var/www/html/video-query
    
        # Backend proxy
        ProxyPass /video_query_back http://localhost:5010
        ProxyPassReverse /video_query_back http://localhost:5010
    </VirtualHost>
    

API Reference

Video Processing Endpoints

  • POST /api/process: Single video processing endpoint
    • Accepts JSON: file_path, filename, prompt (for chunked uploads)
    • Returns: Processing result with content, processing time, chunks info
  • POST /api/process-batch: Batch video processing endpoint
    • Accepts JSON: videos (array of {file_path, filename, order}), prompt, batch_id
    • Returns: Unified result for all videos, total chunks processed
    • Maximum 10 videos per batch
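
A client-side sketch of building the request body for POST /api/process-batch, using the field names listed above. The helper name, the zero-based `order` values, and the example paths are assumptions; the backend may expect different ordering conventions.

```python
import json
from typing import Dict, List

MAX_BATCH_VIDEOS = 10  # limit stated above


def build_batch_request(
    videos: List[Dict[str, str]], prompt: str, batch_id: str
) -> str:
    """Build the JSON body for POST /api/process-batch.

    Each input dict needs `file_path` and `filename`; `order` is assigned
    from list position (starting at 0 is an assumption).
    """
    if not videos:
        raise ValueError("batch needs at least one video")
    if len(videos) > MAX_BATCH_VIDEOS:
        raise ValueError(f"batch is limited to {MAX_BATCH_VIDEOS} videos")
    payload = {
        "videos": [
            {"file_path": v["file_path"], "filename": v["filename"], "order": i}
            for i, v in enumerate(videos)
        ],
        "prompt": prompt,
        "batch_id": batch_id,
    }
    return json.dumps(payload)
```

Checking the 10-video limit before sending avoids a round trip that the server would reject anyway.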

Chunked Upload Endpoints

  • POST /api/init-upload: Initialize chunked upload session
  • POST /api/upload-chunk/<upload_id>: Upload file chunk
  • POST /api/complete-upload/<upload_id>: Mark upload complete
  • POST /api/cancel-upload/<upload_id>: Cancel upload
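
The chunked upload flow (init, upload each chunk, complete or cancel) can be sketched as follows. Only the routes come from the list above; the chunk size, helper names, and any request body fields are not documented here, so the sketch stops short of sending requests.

```python
import io
from typing import BinaryIO, Dict, Iterator, Tuple

# Hypothetical chunk size; the real value is not documented here.
CHUNK_BYTES = 5 * 1024 * 1024


def iter_chunks(
    stream: BinaryIO, chunk_bytes: int = CHUNK_BYTES
) -> Iterator[Tuple[int, bytes]]:
    """Yield (index, data) pairs until the stream is exhausted."""
    index = 0
    while True:
        data = stream.read(chunk_bytes)
        if not data:
            break
        yield index, data
        index += 1


def upload_urls(base_url: str, upload_id: str) -> Dict[str, str]:
    """Endpoint URLs for one upload session, following the routes above."""
    base = base_url.rstrip("/")
    return {
        "init": f"{base}/api/init-upload",
        "chunk": f"{base}/api/upload-chunk/{upload_id}",
        "complete": f"{base}/api/complete-upload/{upload_id}",
        "cancel": f"{base}/api/cancel-upload/{upload_id}",
    }


# Example: a 7-byte stream read in 3-byte chunks yields three pieces.
demo = list(iter_chunks(io.BytesIO(b"abcdefg"), chunk_bytes=3))
```

Reading the file in fixed-size pieces keeps memory use flat regardless of file size, which is what makes multi-gigabyte uploads practical.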

PDF Generation Endpoints

  • POST /api/generate-pdf: Generate PDF from HTML with Mermaid diagrams
    • JSON data: html, textDiagrams, diagramPngs, videoFileName

Authentication Endpoints (if enabled)

  • GET /api/auth-test: Verify authentication status

Configuration Files

Backend Configuration

  • backend/.env: Environment variables
    GOOGLE_API_KEY=your_api_key
    

Frontend Configuration

  • frontend/.env: React environment variables
    # Optional: disable auth for local dev
    REACT_APP_DISABLE_AUTH=true
    
  • frontend/public/config.js: Production configuration (committed to git)
  • frontend/public/config.local.js: Local development override (not committed)

Key Configuration Details

  • Parallel Processing: Max 2 concurrent videos (App.js:245)
  • Rate Limiting: 2-second delay between API calls (video_processor.py:224)
  • File Size Threshold: 10MB for inline vs upload API (video_processor.py:167)
  • Video Chunk Duration: 54 minutes (video_splitter.py)
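
As an illustration of the file size threshold, the sketch below picks a transfer path from a file's size. The function name is hypothetical, and both the binary interpretation of "10MB" and the treatment of a file exactly at the threshold are assumptions; video_processor.py is the authority.

```python
# 10MB threshold stated above; treating it as 10 * 1024 * 1024 bytes
# is an assumption.
INLINE_LIMIT_BYTES = 10 * 1024 * 1024


def choose_transfer(size_bytes: int) -> str:
    """Return "inline" for small files and "upload" for larger ones."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    # Assumption: a file exactly at the threshold goes through the upload API.
    return "inline" if size_bytes < INLINE_LIMIT_BYTES else "upload"
```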

Usage

Local Development

  1. Start backend: cd backend && source venv/bin/activate && python3 run.py
  2. Start frontend: cd frontend && npm start
  3. Open: http://localhost:3000

Processing Videos

The application supports two processing modes, selected via a toggle at the top:

Single Video Mode (Default)

Process videos individually with per-video control:

  1. Select Mode: Click "Single Video Mode" button at the top
  2. Upload: Drag & drop videos or click to select files
  3. Choose Prompt: Select a prompt template or write custom prompt
  4. Process Each Video: Click individual "Process Video" button for each uploaded video
    • Videos show status: uploading → uploaded → processing → completed
    • Up to 2 videos process in parallel automatically
  5. Monitor Progress: Watch real-time status updates and processing indicators
  6. Manage Queue: Use Stop (⏸️), Retry (🔄), or Remove (🗑️) per video
  7. View Results: Completed videos appear in "Processed Videos" section
  8. Download: Click "Download PDF" or "Copy Formatted" for any result
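
The status progression in single video mode can be modeled as a small state machine. This is a Python illustration only (the frontend is React). The `ERROR` state and the transitions used for Stop and Retry are assumptions that the text above does not spell out.

```python
from enum import Enum


class VideoStatus(Enum):
    UPLOADING = "uploading"
    UPLOADED = "uploaded"
    PROCESSING = "processing"
    COMPLETED = "completed"
    ERROR = "error"  # hypothetical failure state, not named in the text


# Allowed moves. Stop is modeled as processing -> uploaded and Retry as
# error -> processing; both are assumptions.
ALLOWED = {
    VideoStatus.UPLOADING: {VideoStatus.UPLOADED, VideoStatus.ERROR},
    VideoStatus.UPLOADED: {VideoStatus.PROCESSING},
    VideoStatus.PROCESSING: {
        VideoStatus.COMPLETED,
        VideoStatus.ERROR,
        VideoStatus.UPLOADED,
    },
    VideoStatus.COMPLETED: set(),
    VideoStatus.ERROR: {VideoStatus.PROCESSING},
}


def advance(current: VideoStatus, target: VideoStatus) -> VideoStatus:
    """Return `target` if the move is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

Note that `UPLOADED` never moves to `PROCESSING` on its own: that transition corresponds to the explicit "Process Video" click.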

Batch Mode

Process multiple related videos as one unified analysis:

  1. Select Mode: Click "Batch Mode" button at the top
  2. Upload Videos: Add multiple related videos (max 10 per batch)
  3. Arrange Videos: Use Up/Down arrows to reorder videos in logical sequence
  4. Remove Unwanted: Click Remove button to exclude videos from batch
  5. Choose Prompt: Select or customize the analysis prompt
  6. Process Batch: Click single "Process Batch" button to analyze all videos together
    • Backend automatically handles video splitting and chunking
    • Two-stage synthesis creates unified result across all videos
    • Multiple Mermaid diagrams merged into one comprehensive diagram
  7. View Results: Single unified result appears for entire batch
  8. Download: Generate PDF with combined analysis

Key Differences:

  • Single Mode: Each video = separate result, manual per-video processing
  • Batch Mode: All videos = one unified result, single batch processing
  • Explicit Control: No auto-processing - all require button clicks

Processing Long Videos

  • Videos > 54 minutes automatically split into chunks
  • Each chunk processed in parallel (backend handles this)
  • Results intelligently combined
  • Processing time displayed for transparency
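
One way to process chunks in parallel while still combining results in the right order is sketched below. `process_one` is a hypothetical callable standing in for the per-chunk model call, and the default of two workers is an assumption; the backend's actual concurrency is not stated here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def process_chunks_in_order(
    chunk_paths: Sequence[str],
    process_one: Callable[[str], str],
    max_workers: int = 2,
) -> List[str]:
    """Process chunks concurrently; return results in chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even if a later
        # chunk finishes first.
        return list(pool.map(process_one, chunk_paths))
```

Ordered results matter because the combining step needs summaries to follow the video's timeline.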

Development Utilities

  • restart.sh: Quick development environment restart
  • backend/test_*.py: API testing and validation scripts
  • backend/run.py: Production server with optimized settings for large uploads
  • extract_user_logs*.sh: Usage analytics extraction

Security Features

  • Azure AD B2C integration with JWT validation (optional)
  • CORS protection with specific origin allowlisting
  • Secure file upload validation
  • Temporary file cleanup
  • Token expiration handling
  • Rate limiting to prevent API abuse
  • Abort signal support for cancellation
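
As an illustration of upload validation, the sketch below checks a filename and size against the supported formats and the 5GB limit listed under Limitations. The function name is hypothetical and this is not the backend's validation code. An extension check alone says nothing about the file's actual contents.

```python
from pathlib import Path

# Formats listed under Limitations.
ALLOWED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}
# 5GB upload limit; reading it as 5 * 1024**3 bytes is an assumption.
MAX_FILE_BYTES = 5 * 1024 ** 3


def is_allowed_upload(filename: str, size_bytes: int) -> bool:
    """Accept only known video extensions and sizes within the limit."""
    name = Path(filename).name  # drop any directory components
    suffix = Path(name).suffix.lower()
    return suffix in ALLOWED_EXTENSIONS and 0 < size_bytes <= MAX_FILE_BYTES
```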

Troubleshooting

Backend Issues

  • 400 INVALID_ARGUMENT: Usually rate limiting - check logs for details
  • File upload errors: Verify ffmpeg is installed (which ffprobe)
  • PDF generation fails: Ensure wkhtmltopdf is installed
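
A quick way to check for the system tools mentioned above is to look them up on PATH. This is a small diagnostic sketch with hypothetical helper names; it only confirms that the executables can be found, not that they work.

```python
import shutil
from typing import Dict, Iterable, List, Optional

REQUIRED_TOOLS = ("ffmpeg", "ffprobe", "wkhtmltopdf")


def find_tools(names: Iterable[str] = REQUIRED_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the names of tools that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]
```

Calling `missing_tools()` on the server returns an empty list when all three executables are found.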

Frontend Issues

  • CORS errors: Check backend CORS settings in app.py
  • Changes not visible: Clear browser cache (Ctrl+Shift+R)
  • Config not loading: Verify config.js and config.local.js exist in public/

Rate Limiting

  • Backend: 2-second delay between API calls (automatic)
  • Frontend: Max 2 parallel videos
  • Free tier: 5 RPM limit enforced by Gemini API

License

This project is proprietary and confidential.