video-query/CLAUDE.md
2025-11-27 02:48:15 +05:30

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

A full-stack video query application that uses Google Gemini for video analysis (the model is configurable; the default is gemini-2.5-pro). It features parallel processing, automatic video splitting, optional Azure AD B2C authentication, chunked file uploads (up to 5GB), and PDF generation with Mermaid diagrams.

Tech Stack:

  • Backend: Flask 3.1.0, Hypercorn 0.17.3, google-genai 1.45.0, pdfkit, ffmpeg
  • Frontend: React 18.2.0, @azure/msal-react 3.0.12, Bootstrap 5.3.2, react-dropzone 14.2.3

Development Setup Commands

Backend Setup

cd backend
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# For Python 3.10 (use special requirements file)
pip install -r requirements-py310.txt
bash fix_jose.sh  # Fix jose module conflict if needed

# For other Python versions
pip install -r requirements.txt

# Install system dependencies (Ubuntu/Debian)
sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev ffmpeg

# macOS
brew install cairo wkhtmltopdf ffmpeg

# Create .env file with configuration options
cat > .env << EOF
GOOGLE_API_KEY=your_api_key_here

# Optional: Model Configuration (default: gemini-2.5-pro for both)
VIDEO_PROCESSOR_MODEL=gemini-2.5-pro
VIDEO_SYNTHESIS_MODEL=gemini-2.5-pro

# Optional: Enhanced Logging (default: false)
BATCH_PROCESSING_LOG_PROMPTS=false
BATCH_PROCESSING_LOG_SUMMARIES=false
EOF

# Run development server
python3 run.py
# Server runs on http://0.0.0.0:5010

Frontend Setup

cd frontend
npm install

# Configure for local development (optional auth disable)
echo "REACT_APP_DISABLE_AUTH=true" > .env

# Start development server
npm start
# Server runs on http://localhost:3000

# Build for production
npm run build

Quick Restart (Development)

# From project root
./restart.sh

Build/Test Commands

Backend

  • Run production server: cd backend && source venv/bin/activate && python3 run.py
  • Test API: python test_api.py (in backend directory)
  • Test webhook: python test_webhook.py
  • Manual test: python test_webhook_manual.py

Frontend

  • Development server: npm start (port 3000)
  • Production build: ./build.sh (recommended - sets PUBLIC_URL automatically)
  • Manual production build: PUBLIC_URL=/video_query npm run build (if not using script)

Video Processing

  • Standalone script: python video_query.py <video_path> [--prompt "Your custom prompt"]
  • Note: The standalone script is deprecated. Use the web application for full features.

Production Deployment

Backend Deployment (Ubuntu/CentOS)

  1. Install system packages:

    sudo apt-get update
    sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ffmpeg python3-venv
    
  2. Set up virtual environment:

    cd /path/to/video-query/backend
    python3 -m venv venv
    source venv/bin/activate
    
    # For Python 3.10
    pip install -r requirements-py310.txt
    bash fix_jose.sh
    
    # For other Python versions
    pip install -r requirements.txt
    
  3. Configure environment:

    cat > .env << EOF
    GOOGLE_API_KEY=your_production_api_key

    # Optional: Model Configuration
    VIDEO_PROCESSOR_MODEL=gemini-2.5-pro
    VIDEO_SYNTHESIS_MODEL=gemini-2.5-pro

    # Optional: Enable detailed logging (useful for debugging, but increases log volume)
    BATCH_PROCESSING_LOG_PROMPTS=false
    BATCH_PROCESSING_LOG_SUMMARIES=false
    EOF

  4. Set up systemd service:

    sudo cp video-query.service /etc/systemd/system/
    # Edit service file to match your paths
    sudo nano /etc/systemd/system/video-query.service

    sudo systemctl daemon-reload
    sudo systemctl enable video-query
    sudo systemctl start video-query
    sudo systemctl status video-query

  5. Verify service:

    curl http://localhost:5010/api/health  # If health endpoint exists
    journalctl -u video-query -f  # View logs

Frontend Deployment

  1. Update production configuration (frontend/public/config.js):

    window.__APP_CONFIG__ = {
      "basePath": "/video-query",
      "domain": "https://your-domain.com",
      "api": {
        "videoProcessingEndpoint": "https://your-domain.com/video_query_back/api/process",
        "chunkedUploadEndpoint": "https://your-domain.com/video_query_back"
      }
    };
    
  2. Build for production:

    cd frontend
    # Option 1: Use build script (recommended - sets PUBLIC_URL automatically)
    ./build.sh
    
    # Option 2: Manual build with PUBLIC_URL
    PUBLIC_URL=/video_query npm run build
    
  3. Deploy to web server:

    sudo cp -r build/* /var/www/html/video-query/
    sudo chown -R www-data:www-data /var/www/html/video-query/
    
  4. Configure Apache (example):

    <VirtualHost *:443>
        ServerName your-domain.com
        DocumentRoot /var/www/html
    
        # Frontend
        Alias /video-query /var/www/html/video-query
        <Directory /var/www/html/video-query>
            Options -Indexes +FollowSymLinks
            AllowOverride All
            Require all granted
        </Directory>
    
        # WebSocket support (if needed). Must come before the general rule:
        # Apache applies the first ProxyPass that matches, so longer paths go first.
        ProxyPass /video_query_back/ws ws://localhost:5010/ws
        ProxyPassReverse /video_query_back/ws ws://localhost:5010/ws
    
        # Backend proxy
        ProxyPass /video_query_back http://localhost:5010
        ProxyPassReverse /video_query_back http://localhost:5010
    
        SSLEngine on
        SSLCertificateFile /etc/ssl/certs/your-cert.crt
        SSLCertificateKeyFile /etc/ssl/private/your-key.key
    </VirtualHost>
    
  5. Configure Nginx (alternative):

    server {
        listen 443 ssl;
        server_name your-domain.com;
    
        ssl_certificate /etc/ssl/certs/your-cert.crt;
        ssl_certificate_key /etc/ssl/private/your-key.key;
    
        # Frontend
        location /video-query {
            alias /var/www/html/video-query;
            try_files $uri $uri/ /video-query/index.html;
        }
    
        # Backend proxy
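        # The trailing slashes strip the /video_query_back prefix before proxying,
        # matching the Apache ProxyPass behavior above
        location /video_query_back/ {
            proxy_pass http://localhost:5010/;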
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
    
            # Timeouts for long video processing
            proxy_read_timeout 3600s;
            proxy_connect_timeout 3600s;
            proxy_send_timeout 3600s;
        }
    }
    
  6. Restart web server:

    # Apache
    sudo systemctl restart apache2
    
    # Nginx
    sudo systemctl restart nginx
    

Azure AD B2C Authentication Setup (Optional)

  1. Configure Azure AD B2C tenant and register application
  2. Update frontend/src/auth/authConfig.js with your tenant details
  3. Set authentication in frontend/.env:
    # Enable auth for production
    REACT_APP_DISABLE_AUTH=false
    
    # Disable auth for local dev
    REACT_APP_DISABLE_AUTH=true
    

Key Architecture Components

Parallel Processing (App.js:244-268)

  • Max concurrent videos: 2 (MAX_PARALLEL constant)
  • Uses Promise.allSettled() for batch processing
  • Each video gets an AbortController for cancellation

Rate Limiting (video_processor.py:221-224)

  • Delay: 2 seconds between API calls
  • Uses threading.Lock for thread safety
  • Reduces Gemini API rate limit errors (free tier: 5 RPM)
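
A minimal sketch of a lock-protected delay of this kind, assuming a fixed minimum interval between calls. The class name `MinIntervalLimiter` and its `wait` method are hypothetical and are not taken from video_processor.py.

```python
import threading
import time


class MinIntervalLimiter:
    """Enforce a minimum gap between calls, shared safely across threads."""

    def __init__(self, min_interval_s: float = 2.0) -> None:
        self._min_interval_s = min_interval_s
        self._lock = threading.Lock()
        self._next_allowed = 0.0  # monotonic timestamp of the next permitted call

    def wait(self) -> float:
        """Block until a call is allowed; return the number of seconds slept."""
        with self._lock:
            now = time.monotonic()
            delay = max(0.0, self._next_allowed - now)
            # Reserve the next slot before sleeping so concurrent callers
            # are spaced out instead of being released together.
            self._next_allowed = max(now, self._next_allowed) + self._min_interval_s
        if delay > 0:
            time.sleep(delay)  # sleep outside the lock so other threads can queue
        return delay
```

A worker thread would call `limiter.wait()` immediately before each API request. Sleeping outside the lock keeps the critical section short while still handing out slots in order.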

Upload Strategy (video_processor.py:388-450)

  • Current approach: Base64 inline encoding for ALL videos
  • Reason: File Upload API has known issues in SDK 1.45.0-1.49.0
  • API limit: 1000MB (1GB) for base64-encoded requests
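  • Base64 overhead: budgeted at +37% (1.37x multiplier); plain Base64 adds about 33%, so the budget is conservative
  • Effective limit: ~730MB of raw video per chunk (1000MB ÷ 1.37), so that the encoded request stays under the API limit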
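
The arithmetic behind these limits can be checked with a short sketch. The constants repeat the figures above; the function names are illustrative only.

```python
import base64
import math

API_LIMIT_MB = 1000         # request limit quoted above
ENCODING_MULTIPLIER = 1.37  # overhead budget used in this document


def max_raw_mb(api_limit_mb: float = API_LIMIT_MB,
               multiplier: float = ENCODING_MULTIPLIER) -> float:
    """Largest raw payload (MB) whose encoded size stays within the API limit."""
    return api_limit_mb / multiplier


def encoded_len(raw_len: int) -> int:
    """Length of unwrapped Base64 output: 4 characters per 3 input bytes, padded."""
    return 4 * math.ceil(raw_len / 3)


print(round(max_raw_mb()))  # 730

# Measure the real expansion ratio of plain Base64 on a sample buffer.
sample = b"\x00" * 3_000_000
print(len(base64.b64encode(sample)) / len(sample))
```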

Video Splitting (video_splitter.py) - Robust Multi-Constraint Algorithm

  • Chunk duration limit: 53 minutes (Gemini API ~55 min limit)
  • Chunk size limit: ~560MB safe target (with 30% VBR margin)
  • Hard size limit: ~730MB maximum (1000MB ÷ 1.37 encoding overhead)
  • Automatic splitting: Based on BOTH duration AND size constraints
  • VBR handling: 30% safety margin for Variable Bitrate variance
  • Validation & re-splitting: Automatic re-split if chunks exceed hard limit
  • Algorithm: Uses max(chunks_by_size, chunks_by_duration) for safety
  • Processing: Chunks processed in parallel on backend

Examples:

  • 30min/1.5GB video → Split into 3 chunks of ~10min/500MB each
  • 50min/720MB video → Split into 2 chunks of ~25min/360MB each
  • 5min/300MB video → No split (under both limits)
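
The chunk-count rule can be sketched as follows. This is a simplified illustration under the limits listed above, not the code in video_splitter.py; `plan_chunks` and `needs_resplit` are hypothetical names.

```python
import math

BASE64_API_LIMIT_MB = 1000
ENCODING_OVERHEAD = 1.37
VBR_SAFETY_MARGIN = 1.30
CHUNK_DURATION_MINUTES = 53

HARD_LIMIT_MB = BASE64_API_LIMIT_MB / ENCODING_OVERHEAD  # about 730MB
SAFE_TARGET_MB = HARD_LIMIT_MB / VBR_SAFETY_MARGIN       # about 560MB


def plan_chunks(duration_min: float, size_mb: float) -> int:
    """Number of chunks needed to satisfy both the size and duration limits."""
    chunks_by_size = math.ceil(size_mb / SAFE_TARGET_MB)
    chunks_by_duration = math.ceil(duration_min / CHUNK_DURATION_MINUTES)
    # The stricter constraint wins; a video always yields at least one chunk.
    return max(1, chunks_by_size, chunks_by_duration)


def needs_resplit(chunk_size_mb: float) -> bool:
    """True if a produced chunk exceeds the hard limit and must be split again."""
    return chunk_size_mb > HARD_LIMIT_MB


print(plan_chunks(30, 1536))  # 3
print(plan_chunks(50, 720))   # 2
print(plan_chunks(5, 300))    # 1
```

The three calls reproduce the examples above, taking 1.5GB as 1536MB.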

Queue Management (App.js:63-336)

  • Queue states: queued, processing, completed, failed, cancelled
  • Operations: Stop (cancel), Retry, Remove
  • Abort signal: Support for canceling in-flight requests

Batch Processing Architecture (video_processor.py)

  • Two-Stage Synthesis: Individual summaries → unified result
  • Prompt Consistency: Same prompt used for all videos in batch
  • Intelligent Strategy: Detects prompt type (meeting, documentation, generic)
  • Specialized Synthesis: Different synthesis strategies for different content types
    • Meeting summaries: Consolidates discussion points and action items
    • Documentation: Sequential step-by-step guide format
    • With diagrams: Merges Mermaid diagrams intelligently
  • Model Consistency: Uses gemini-2.5-pro for both processing and synthesis by default (configurable via VIDEO_PROCESSOR_MODEL and VIDEO_SYNTHESIS_MODEL)
  • Enhanced Logging: Optional detailed logging for debugging

Batch Processing Flow

  1. Stage 1: Each video/chunk processed separately with context-aware prompts
  2. Intermediate: Summaries collected with metadata (video name, chunk info)
  3. Stage 2: AI synthesis combines all summaries into unified response
  4. Traceability: Clear mapping of video → chunk → summary → final result
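
The flow above can be sketched as a small pipeline. Everything here is an assumption made for illustration: the `ChunkSummary` fields, the keyword rule in `detect_prompt_type`, and the `summarize` and `synthesize` callables, which stand in for the model calls. The concatenation fallback mirrors the behavior noted under Troubleshooting.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ChunkSummary:
    video_name: str
    chunk_index: int  # 1-based position of the chunk within its video
    chunk_total: int
    text: str


def detect_prompt_type(prompt: str) -> str:
    """Classify a prompt by keyword (hypothetical rule, not the real detector)."""
    lowered = prompt.lower()
    if "meeting" in lowered:
        return "meeting"
    if "documentation" in lowered or "step-by-step" in lowered:
        return "documentation"
    return "generic"


def run_batch(
    prompt: str,
    chunks: List[Tuple[str, int, int]],  # (video_name, chunk_index, chunk_total)
    summarize: Callable[[str, str, int, int], str],
    synthesize: Callable[[str, List[ChunkSummary]], str],
) -> str:
    # Stage 1: summarize each chunk separately, keeping its metadata.
    summaries = [
        ChunkSummary(name, index, total, summarize(prompt, name, index, total))
        for name, index, total in chunks
    ]
    # Stage 2: combine the summaries; fall back to concatenation on failure.
    try:
        return synthesize(detect_prompt_type(prompt), summaries)
    except Exception:
        return "\n\n".join(
            f"[{s.video_name} {s.chunk_index}/{s.chunk_total}] {s.text}"
            for s in summaries
        )
```

Passing the model calls in as callables keeps the sketch runnable without network access and makes the stage boundary explicit.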

Logging Levels

  • INFO (default): High-level progress, timing, success/failure
  • DEBUG (with env vars): Detailed prompts, summary previews, synthesis details

Example: Enable detailed logging for troubleshooting

# In backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true
BATCH_PROCESSING_LOG_SUMMARIES=true

Configuration Files

Backend Configuration

  • backend/.env: Environment variables for API keys and processing options
    • GOOGLE_API_KEY: Your Gemini API key (required)
    • VIDEO_PROCESSOR_MODEL: Model for individual video processing (default: gemini-2.5-pro)
    • VIDEO_SYNTHESIS_MODEL: Model for batch synthesis (default: gemini-2.5-pro)
    • CHUNK_DURATION_MINUTES: Max chunk duration (default: 53 minutes)
    • VBR_SAFETY_MARGIN: Safety margin for variable bitrate videos (default: 1.30 = 30%)
    • BASE64_API_LIMIT_MB: API limit for base64 requests (default: 1000MB)
    • MAX_PARALLEL_CHUNKS: Concurrent chunk processing (default: 4)
    • BATCH_PROCESSING_LOG_PROMPTS: Enable detailed prompt logging (default: false)
    • BATCH_PROCESSING_LOG_SUMMARIES: Enable summary preview logging (default: false)
  • backend/run.py: Hypercorn server config (body size limits, timeouts)
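
One way to read these variables with their documented defaults is a typed settings object. This is a sketch only: the `Settings` class and `load_settings` function are hypothetical, and loading the .env file into the process environment is assumed to happen elsewhere.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _flag(value: str) -> bool:
    """Interpret common truthy spellings of a boolean environment value."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    video_processor_model: str
    video_synthesis_model: str
    chunk_duration_minutes: int
    vbr_safety_margin: float
    base64_api_limit_mb: int
    max_parallel_chunks: int
    log_prompts: bool
    log_summaries: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, applying the defaults listed above."""
    key = env.get("GOOGLE_API_KEY", "")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is required")
    return Settings(
        google_api_key=key,
        video_processor_model=env.get("VIDEO_PROCESSOR_MODEL", "gemini-2.5-pro"),
        video_synthesis_model=env.get("VIDEO_SYNTHESIS_MODEL", "gemini-2.5-pro"),
        chunk_duration_minutes=int(env.get("CHUNK_DURATION_MINUTES", "53")),
        vbr_safety_margin=float(env.get("VBR_SAFETY_MARGIN", "1.30")),
        base64_api_limit_mb=int(env.get("BASE64_API_LIMIT_MB", "1000")),
        max_parallel_chunks=int(env.get("MAX_PARALLEL_CHUNKS", "4")),
        log_prompts=_flag(env.get("BATCH_PROCESSING_LOG_PROMPTS", "false")),
        log_summaries=_flag(env.get("BATCH_PROCESSING_LOG_SUMMARIES", "false")),
    )
```

Accepting any mapping rather than reading `os.environ` directly makes the loader easy to exercise in tests, and failing fast on a missing API key surfaces misconfiguration at startup.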

Frontend Configuration

  • frontend/.env: REACT_APP_DISABLE_AUTH=true/false
  • frontend/public/config.js: Production config (committed)
  • frontend/public/config.local.js: Local dev override (not committed, in .gitignore)

Configuration Priority

  1. config.local.js (local development) - highest priority
  2. config.js (production) - fallback

Code Style Guidelines

Python (Backend)

  • Imports: Standard library → third-party → local imports
  • Formatting: PEP 8 compliant with 4-space indentation
  • Types: Use type hints for function parameters and return values
  • Naming: snake_case for variables/functions, PascalCase for classes
  • Error handling: Use try/except blocks with specific exception types
  • API Keys: Store in environment variables, never hardcode
  • Documentation: Use docstrings for functions and main modules
  • Max line length: 100 characters
  • Comments: Include helpful comments for complex operations

JavaScript/React (Frontend)

  • Formatting: 2-space indentation for React components
  • Naming: camelCase for variables/functions, PascalCase for components
  • State management: Use React hooks (useState, useEffect, useMsal)
  • Error handling: Try/catch with user-friendly error messages
  • API calls: Use authApiClient.js with abort signal support
  • Comments: JSDoc style for complex functions

Important Implementation Notes

Authentication

  • Controlled via .env: Do NOT remove MSAL code
  • Toggle: Set REACT_APP_DISABLE_AUTH=true to disable
  • Components: AuthProvider.js, authApiClient.js, authConfig.js
  • Session storage: Tokens stored in sessionStorage for implicit flow

File Upload

  • Chunked upload: All files use chunked upload (chunkedUploader.js)
  • Max file size: 5GB per file
  • Progress tracking: Real-time progress via callbacks
  • Supported formats: MP4, AVI, MOV, WMV, MKV, WEBM
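
A sketch of the checks and byte ranges a chunked upload involves, written in Python for illustration even though the real uploader is chunkedUploader.js. The function names are hypothetical, the 5GB limit is assumed to mean 5 × 1024³ bytes, and the chunk size is a caller-supplied assumption since it is not stated here.

```python
from pathlib import PurePath
from typing import Iterator, List, Tuple

MAX_FILE_BYTES = 5 * 1024**3  # 5GB, assuming binary gigabytes
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}


def validate_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file is acceptable."""
    problems = []
    extension = PurePath(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds the 5GB limit")
    return problems


def chunk_ranges(size_bytes: int, chunk_bytes: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) byte offsets, end exclusive, covering the whole file."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    for start in range(0, size_bytes, chunk_bytes):
        yield start, min(start + chunk_bytes, size_bytes)
```

Progress after each uploaded range can be reported as `end / size_bytes`, which is the quantity a progress callback would typically receive.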

PDF Generation

  • Dependencies: wkhtmltopdf, cairosvg, pdfkit
  • Mermaid support: Diagrams converted to PNG then embedded
  • Endpoint: POST /api/generate-pdf
  • Client: ResultDisplay.js handles download

Error Handling

  • Rate limiting: 400 INVALID_ARGUMENT → check logs, reduce parallel count
  • Large files: Use hybrid upload strategy (SIZE_THRESHOLD_MB)
  • Abort errors: Check for err.code === 'ERR_CANCELED' or abortSignal.aborted

Troubleshooting

Backend Issues

# Check service status
sudo systemctl status video-query

# View logs
journalctl -u video-query -f

# Restart service
sudo systemctl restart video-query

# Check ffmpeg installation
which ffprobe
ffprobe -version

# Python 3.10: Fix jose module SyntaxError
# If you see: "SyntaxError: Missing parentheses in call to 'print'"
cd backend
bash fix_jose.sh

Frontend Issues

# Clear browser cache
# Chrome/Firefox: Ctrl+Shift+R (force reload)

# Check config loading
# Browser console: window.__APP_CONFIG__

# Verify build
ls -la frontend/build/

# Check Apache/Nginx logs
tail -f /var/log/apache2/error.log
tail -f /var/log/nginx/error.log

Static Assets Not Loading (JS/CSS 404 errors):

  • Symptom: Loading failed for <script> with source "https://domain.com/static/js/main.xxx.js"
  • Cause: Application built without correct PUBLIC_URL
  • Solution: Rebuild with ./build.sh or PUBLIC_URL=/video_query npm run build

Rate Limiting Issues

  • Symptom: 400 INVALID_ARGUMENT after 3-4 videos
  • Check: Gemini API rate limits (5 RPM for free tier)
  • Solutions:
    • Increase delay in video_processor.py (line 222)
    • Reduce MAX_PARALLEL in App.js (line 245)
    • Lower SIZE_THRESHOLD_MB in video_processor.py (line 163)

CORS Issues

  • Local dev: Verify config.local.js exists and points to localhost:5010
  • Production: Check Apache/Nginx proxy configuration
  • Backend: Verify CORS settings in app.py

Batch Processing Issues

Problem: Inconsistent or poor quality batch summaries

Diagnosis:

# Enable detailed logging in backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true
BATCH_PROCESSING_LOG_SUMMARIES=true

# Restart backend
sudo systemctl restart video-query

# Monitor logs to see:
# - What prompts were sent for each video
# - What summaries were generated
# - How synthesis combined them
journalctl -u video-query -f | grep "Batch"

Common Causes:

  1. Wrong prompt type detected: Check logs for [Stage 2] Detected prompt type
    • If wrong type, adjust prompt keywords (meeting, documentation, diagram, etc.)
  2. Individual summaries too brief: Check [Stage 1] summary lengths
    • Should be substantial (500+ chars typically)
  3. Synthesis failure: Check for [Stage 2] Synthesis failed
    • May fall back to simple concatenation

Problem: Cannot see what prompt was used for each video

Solution: Enable prompt logging

# In backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true

# Logs will show:
# Batch xyz: [Stage 1] Prompt for video 1:
# You are analyzing segment 1 of 3 from video "meeting1.mp4"...

Problem: Want to verify video-to-result mapping

Solution: Check traceability logs (always enabled)

journalctl -u video-query -f | grep "Traceability"

# Shows:
# Batch xyz: [Traceability] Video-to-summary mapping:
# Batch xyz:   - Video 1: meeting1.mp4 → Summary 1
# Batch xyz:   - Video 2: meeting2.mp4 → Summary 2

Problem: Batch processing taking too long

Solution: Check performance metrics

journalctl -u video-query -f | grep "Metrics"

# Shows:
# Batch xyz: [Metrics] Stage 1: 120.5s, Stage 2: 25.3s, Total: 145.8s
# Batch xyz: [Metrics] Avg time per video: 40.2s

# If Stage 1 is slow: Consider upgrading Gemini API tier for higher RPM
# If Stage 2 is slow: Synthesis model may be overloaded
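
To compare stages across many batches, the metrics line can be parsed with a short script. This sketch assumes the log line keeps exactly the format shown above; `parse_metrics` and `slower_stage` are hypothetical helpers.

```python
import re
from typing import Dict, Optional

# Matches e.g. "[Metrics] Stage 1: 120.5s, Stage 2: 25.3s, Total: 145.8s"
METRICS_RE = re.compile(
    r"\[Metrics\] Stage 1: (?P<stage1>\d+(?:\.\d+)?)s, "
    r"Stage 2: (?P<stage2>\d+(?:\.\d+)?)s, "
    r"Total: (?P<total>\d+(?:\.\d+)?)s"
)


def parse_metrics(line: str) -> Optional[Dict[str, float]]:
    """Extract stage timings in seconds, or None if the line has no metrics."""
    match = METRICS_RE.search(line)
    if match is None:
        return None
    return {name: float(value) for name, value in match.groupdict().items()}


def slower_stage(metrics: Dict[str, float]) -> str:
    """Name the stage that took longer, to decide where to investigate first."""
    return "Stage 1" if metrics["stage1"] >= metrics["stage2"] else "Stage 2"
```

Lines that do not match, such as the average-time line, simply return `None` and can be skipped.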

Testing

Backend Testing

cd backend
source venv/bin/activate

# Test video processing
python test_api.py

# Test webhook
python test_webhook.py

# Manual API test
curl -X POST http://localhost:5010/api/process \
  -H "Content-Type: application/json" \
  -d '{"file_path": "/path/to/video.mp4", "filename": "video.mp4", "prompt": "Test prompt"}'
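
The same manual request can be issued from Python with only the standard library. This is a sketch that assumes the endpoint and JSON fields shown in the curl command; `build_process_request` and `send` are hypothetical helpers, and the 3600-second timeout simply mirrors the proxy timeouts used elsewhere in this file.

```python
import json
import urllib.request


def build_process_request(
    file_path: str,
    filename: str,
    prompt: str,
    base_url: str = "http://localhost:5010",
) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps(
        {"file_path": file_path, "filename": filename, "prompt": prompt}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = 3600.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read().decode("utf-8")
```

With the backend running, `send(build_process_request("/path/to/video.mp4", "video.mp4", "Test prompt"))` performs the same call as the curl command.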

Frontend Testing

cd frontend

# Development mode
npm start

# Production build test
npm run build
npx serve -s build -l 3000

Log Extraction & Analytics

Extract User Logs

# Quick extraction
./quick_extract.sh

# Robust extraction with error handling
./extract_user_logs_robust.sh

# See LOG_EXTRACTION_README.md for details

Useful File Locations

Backend

  • Main app: backend/app.py
  • Video processor: backend/video_processor.py (Gemini API integration)
  • Video splitter: backend/video_splitter.py (53-min chunks)
  • Chunked upload: backend/chunked_upload.py
  • Authentication: backend/auth.py

Frontend

  • Main app: frontend/src/App.js (queue management, parallel processing)
  • Queue UI: frontend/src/components/AuthenticatedContent.js
  • Upload: frontend/src/components/VideoUpload.js
  • Results: frontend/src/components/ResultDisplay.js
  • Chunked uploader: frontend/src/utils/chunkedUploader.js

Configuration

  • Backend env: backend/.env
  • Frontend env: frontend/.env
  • Production config: frontend/public/config.js
  • Local config: frontend/public/config.local.js
  • Systemd service: backend/video-query.service

Dependencies Management

Backend Updates

cd backend
source venv/bin/activate
pip install --upgrade google-genai flask flask-cors pdfkit
pip freeze > requirements.txt

Frontend Updates

cd frontend
npm update
npm audit fix

Security Considerations

  • API Keys: Never commit .env files
  • Authentication: Azure AD B2C tokens in sessionStorage (consider security implications)
  • CORS: Specific origin allowlisting in production
  • File validation: Size and type checks in VideoUpload.js
  • Temporary files: Automatic cleanup in backend
  • Rate limiting: Built-in to prevent abuse

Support & Documentation

  • Main docs: See README.md for comprehensive feature documentation
  • Deployment: See DEPLOYMENT.md for detailed deployment guide
  • Log extraction: See LOG_EXTRACTION_README.md for analytics
  • Parallel processing: See PARALLEL_PROCESSING.md (if exists)
  • CORS fixes: See CORS_FIX_SUMMARY.md (if exists)