Manish Tanwar ec9f7426e4 Update in Bitrate and Small Large Size Logic

2025-11-27 02:48:15 +05:30

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

A full-stack video query application that uses Google Gemini for video analysis (default model: gemini-2.5-pro; see Backend Configuration). Features include parallel processing, automatic video splitting, optional Azure AD B2C authentication, chunked file uploads (up to 5GB), and PDF generation with Mermaid diagrams.

Tech Stack:

Backend: Flask 3.1.0, Hypercorn 0.17.3, google-genai 1.45.0, pdfkit, ffmpeg
Frontend: React 18.2.0, @azure/msal-react 3.0.12, Bootstrap 5.3.2, react-dropzone 14.2.3

Development Setup Commands

Backend Setup

cd backend
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# For Python 3.10 (use special requirements file)
pip install -r requirements-py310.txt
bash fix_jose.sh  # Fix jose module conflict if needed

# For other Python versions
pip install -r requirements.txt

# Install system dependencies (Ubuntu/Debian)
sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev ffmpeg

# macOS
brew install cairo wkhtmltopdf ffmpeg

# Create .env file with configuration options
cat > .env << EOF
GOOGLE_API_KEY=your_api_key_here

# Optional: Model Configuration (default: gemini-2.5-pro for both)
VIDEO_PROCESSOR_MODEL=gemini-2.5-pro
VIDEO_SYNTHESIS_MODEL=gemini-2.5-pro

# Optional: Enhanced Logging (default: false)
BATCH_PROCESSING_LOG_PROMPTS=false
BATCH_PROCESSING_LOG_SUMMARIES=false
EOF

# Run development server
python3 run.py
# Server runs on http://0.0.0.0:5010

Frontend Setup

cd frontend
npm install

# Configure for local development (optional auth disable)
echo "REACT_APP_DISABLE_AUTH=true" > .env

# Start development server
npm start
# Server runs on http://localhost:3000

# Build for production
npm run build

Quick Restart (Development)

# From project root
./restart.sh

Build/Test Commands

Backend

Run production server: cd backend && source venv/bin/activate && python3 run.py
Test API: python test_api.py (in backend directory)
Test webhook: python test_webhook.py
Manual test: python test_webhook_manual.py

Frontend

Development server: npm start (port 3000)
Production build: ./build.sh (recommended - sets PUBLIC_URL automatically)
Manual production build: PUBLIC_URL=/video_query npm run build (if not using script)

Video Processing

Standalone script: python video_query.py <video_path> [--prompt "Your custom prompt"]
Note: The standalone script is deprecated. Use the web application for full features.

Production Deployment

Backend Deployment (Ubuntu/CentOS)

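Install system packages (the commands below use apt-get, so they apply to Ubuntu/Debian; on CentOS, install the equivalent packages with yum or dnf):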

sudo apt-get update
sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ffmpeg python3-venv

Set up virtual environment:

cd /path/to/video-query/backend
python3 -m venv venv
source venv/bin/activate

# For Python 3.10
pip install -r requirements-py310.txt
bash fix_jose.sh

# For other Python versions
pip install -r requirements.txt

Configure environment:

cat > .env << EOF
GOOGLE_API_KEY=your_production_api_key

# Optional: Model Configuration
VIDEO_PROCESSOR_MODEL=gemini-2.5-pro
VIDEO_SYNTHESIS_MODEL=gemini-2.5-pro

# Optional: Enable detailed logging (useful for debugging, but increases log volume)
BATCH_PROCESSING_LOG_PROMPTS=false
BATCH_PROCESSING_LOG_SUMMARIES=false
EOF

Set up systemd service:

sudo cp video-query.service /etc/systemd/system/
# Edit service file to match your paths
sudo nano /etc/systemd/system/video-query.service

sudo systemctl daemon-reload
sudo systemctl enable video-query
sudo systemctl start video-query
sudo systemctl status video-query

Verify service:

curl http://localhost:5010/api/health  # If health endpoint exists
journalctl -u video-query -f  # View logs

Frontend Deployment

Update production configuration (frontend/public/config.js):

window.__APP_CONFIG__ = {
  "basePath": "/video-query",
  "domain": "https://your-domain.com",
  "api": {
    "videoProcessingEndpoint": "https://your-domain.com/video_query_back/api/process",
    "chunkedUploadEndpoint": "https://your-domain.com/video_query_back"
  }
};

Build for production:

cd frontend
# Option 1: Use build script (recommended - sets PUBLIC_URL automatically)
./build.sh

# Option 2: Manual build with PUBLIC_URL
PUBLIC_URL=/video_query npm run build

Deploy to web server:

sudo cp -r build/* /var/www/html/video-query/
sudo chown -R www-data:www-data /var/www/html/video-query/

Configure Apache (example):

<VirtualHost *:443>
    ServerName your-domain.com
    DocumentRoot /var/www/html

    # Frontend
    Alias /video-query /var/www/html/video-query
    <Directory /var/www/html/video-query>
        Options -Indexes +FollowSymLinks
        AllowOverride All
        Require all granted
    </Directory>

    # Backend proxy
    ProxyPass /video_query_back http://localhost:5010
    ProxyPassReverse /video_query_back http://localhost:5010

    # WebSocket support (if needed)
    ProxyPass /video_query_back/ws ws://localhost:5010/ws
    ProxyPassReverse /video_query_back/ws ws://localhost:5010/ws

    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/your-cert.crt
    SSLCertificateKeyFile /etc/ssl/private/your-key.key
</VirtualHost>

Configure Nginx (alternative):

server {
    listen 443 ssl;
    server_name your-domain.com;

    ssl_certificate /etc/ssl/certs/your-cert.crt;
    ssl_certificate_key /etc/ssl/private/your-key.key;

    # Frontend
    location /video-query {
        alias /var/www/html/video-query;
        try_files $uri $uri/ /video-query/index.html;
    }

    # Backend proxy
    location /video_query_back {
        proxy_pass http://localhost:5010;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts for long video processing
        proxy_read_timeout 3600s;
        proxy_connect_timeout 3600s;
        proxy_send_timeout 3600s;
    }
}

Restart web server:

# Apache
sudo systemctl restart apache2

# Nginx
sudo systemctl restart nginx

Azure AD B2C Authentication Setup (Optional)

Configure Azure AD B2C tenant and register application
Update frontend/src/auth/authConfig.js with your tenant details

Set authentication in frontend/.env:

# Enable auth for production
REACT_APP_DISABLE_AUTH=false

# Disable auth for local dev
REACT_APP_DISABLE_AUTH=true

Key Architecture Components

Parallel Processing (App.js:244-268)

Max concurrent videos: 2 (MAX_PARALLEL constant)
Uses Promise.allSettled() for batch processing
Each video gets an AbortController for cancellation

Rate Limiting (video_processor.py:221-224)

Delay: 2 seconds between API calls
Uses threading.Lock for thread safety
Prevents Gemini API rate limit errors (5 RPM free tier)
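
The delay-plus-lock pattern described above might look roughly like the following. This is a minimal sketch, not the code in video_processor.py: the class name `MinIntervalLimiter` and its `wait()` method are hypothetical, and only the 2-second delay and the use of `threading.Lock` come from this document.

```python
import threading
import time


class MinIntervalLimiter:
    """Enforce a minimum gap between calls across threads (hypothetical helper)."""

    def __init__(self, min_interval_s: float = 2.0) -> None:
        self._min_interval_s = min_interval_s
        self._lock = threading.Lock()
        self._last_call = float("-inf")  # monotonic timestamp of the previous call

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        with self._lock:  # serialise callers so only one thread measures and sleeps at a time
            now = time.monotonic()
            remaining = self._min_interval_s - (now - self._last_call)
            slept = max(0.0, remaining)
            if slept > 0:
                time.sleep(slept)
            self._last_call = time.monotonic()
            return slept
```

Calling `limiter.wait()` immediately before each API request would space requests at least `min_interval_s` apart. Note that a 2-second gap still allows up to 30 requests per minute, so staying under a 5 RPM quota by delay alone would need an interval of about 12 seconds.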

Upload Strategy (video_processor.py:388-450)

Current approach: Base64 inline encoding for ALL videos
Reason: File Upload API has known issues in SDK 1.45.0-1.49.0
API limit: 1000MB (1GB) for base64-encoded requests
Base64 overhead: +37% size increase (1.37x multiplier)
Effective limit: ~730MB of raw video per chunk (about 1000MB once encoded)
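
The size arithmetic above can be checked with a few lines of Python. The constants mirror the `BASE64_API_LIMIT_MB` setting and the 1.37 multiplier quoted in this section; the helper name `max_raw_size_mb` is hypothetical. Standard base64 without line breaks expands data by a factor of 4/3 (about 1.33), so the 1.37 figure used here presumably leaves some extra headroom.

```python
import base64

BASE64_API_LIMIT_MB = 1000      # request limit quoted above
BASE64_OVERHEAD = 1.37          # multiplier quoted above


def max_raw_size_mb(limit_mb: float = BASE64_API_LIMIT_MB,
                    overhead: float = BASE64_OVERHEAD) -> float:
    """Largest raw payload (MB) whose encoded size stays within the limit."""
    return limit_mb / overhead


raw = bytes(3_000_000)                      # 3 MB of dummy data
encoded = base64.b64encode(raw)
measured_ratio = len(encoded) / len(raw)    # 4/3 when the input length is a multiple of 3

print(f"effective raw limit: {max_raw_size_mb():.0f} MB")
print(f"measured base64 ratio: {measured_ratio:.3f}")
```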

Video Splitting (video_splitter.py) - Robust Multi-Constraint Algorithm

Chunk duration limit: 53 minutes (Gemini API ~55 min limit)
Chunk size limit: ~560MB safe target (with 30% VBR margin)
Hard size limit: ~730MB maximum (1000MB ÷ 1.37 encoding overhead)
Automatic splitting: Based on BOTH duration AND size constraints
VBR handling: 30% safety margin for Variable Bitrate variance
Validation & re-splitting: Automatic re-split if chunks exceed hard limit
Algorithm: Uses max(chunks_by_size, chunks_by_duration) for safety
Processing: Chunks processed in parallel on backend

Example:

30min/1.5GB video → Split into 3 chunks of ~10min/500MB each
50min/720MB video → Split into 2 chunks of ~25min/360MB each
5min/300MB video → No split (under both limits)
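
The chunk-count rule above (take the larger of the size-based and duration-based counts) can be sketched as follows. The function name `plan_chunks` is hypothetical, and the sketch covers only the initial planning step: the validation and re-splitting pass in video_splitter.py is not modelled.

```python
import math

CHUNK_DURATION_MINUTES = 53     # per-chunk duration limit quoted above
BASE64_API_LIMIT_MB = 1000
BASE64_OVERHEAD = 1.37
VBR_SAFETY_MARGIN = 1.30

HARD_LIMIT_MB = BASE64_API_LIMIT_MB / BASE64_OVERHEAD      # ~730 MB
SAFE_TARGET_MB = HARD_LIMIT_MB / VBR_SAFETY_MARGIN         # ~560 MB


def plan_chunks(duration_min: float, size_mb: float) -> int:
    """Return how many chunks satisfy both the duration and the size limit."""
    chunks_by_duration = math.ceil(duration_min / CHUNK_DURATION_MINUTES)
    chunks_by_size = math.ceil(size_mb / SAFE_TARGET_MB)
    return max(1, chunks_by_duration, chunks_by_size)


# The three examples listed above.
for duration_min, size_mb in [(30, 1500), (50, 720), (5, 300)]:
    n = plan_chunks(duration_min, size_mb)
    print(f"{duration_min} min / {size_mb} MB -> {n} chunk(s)")
```

Dividing the hard limit by the VBR margin gives the safe target, so a chunk that overshoots its expected size by up to 30% should still fit under the hard limit.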

Queue Management (App.js:63-336)

Queue states: queued, processing, completed, failed, cancelled
Operations: Stop (cancel), Retry, Remove
Abort signal: Support for canceling in-flight requests

Batch Processing Architecture (video_processor.py)

Two-Stage Synthesis: Individual summaries → unified result
Prompt Consistency: Same prompt used for all videos in batch
Intelligent Strategy: Detects prompt type (meeting, documentation, generic)
Specialized Synthesis: Different synthesis strategies for different content types
- Meeting summaries: Consolidates discussion points and action items
- Documentation: Sequential step-by-step guide format
- With diagrams: Merges Mermaid diagrams intelligently
Model Consistency: Uses gemini-2.5-pro for both processing and synthesis
Enhanced Logging: Optional detailed logging for debugging
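
Prompt-type detection of the kind described above could be as simple as keyword matching. The sketch below is an illustration only: the keyword lists and the function names `detect_prompt_type` and `wants_diagrams` are assumptions, and the real detector in video_processor.py may use different terms or logic.

```python
PROMPT_KEYWORDS = {
    # Hypothetical keyword lists; the real detector may use different terms.
    "meeting": ("meeting", "action item", "minutes"),
    "documentation": ("documentation", "step-by-step", "guide"),
}


def detect_prompt_type(prompt: str) -> str:
    """Classify a prompt as 'meeting', 'documentation', or 'generic'."""
    text = prompt.lower()
    for prompt_type, keywords in PROMPT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return prompt_type
    return "generic"


def wants_diagrams(prompt: str) -> bool:
    """Return True when the prompt asks for Mermaid or other diagrams."""
    text = prompt.lower()
    return "diagram" in text or "mermaid" in text
```

Because the first matching type wins, a prompt that mentions both a meeting and a guide would be classified as a meeting under this ordering.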

Batch Processing Flow

Stage 1: Each video/chunk processed separately with context-aware prompts
Intermediate: Summaries collected with metadata (video name, chunk info)
Stage 2: AI synthesis combines all summaries into unified response
Traceability: Clear mapping of video → chunk → summary → final result
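
The two-stage flow above can be sketched with the model calls replaced by plain callables. Everything named here (`ChunkSummary`, `SynthesisError`, `run_batch`, and the `summarize`/`synthesize` parameters) is hypothetical and stands in for whatever video_processor.py actually does; the concatenation fallback follows the behaviour mentioned under Troubleshooting.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class SynthesisError(RuntimeError):
    """Raised by the synthesis step when it cannot produce a result (hypothetical)."""


@dataclass
class ChunkSummary:
    video_name: str
    chunk_index: int
    text: str


def run_batch(
    chunks: List[Tuple[str, int]],                       # (video_name, chunk_index) pairs
    prompt: str,
    summarize: Callable[[str, int, str], str],           # stand-in for the per-chunk model call
    synthesize: Callable[[str, List[ChunkSummary]], str],  # stand-in for the synthesis call
) -> str:
    """Stage 1: summarise each chunk; Stage 2: merge the summaries."""
    # Stage 1: the same prompt is applied to every chunk, and metadata is kept
    # alongside each summary so the final result stays traceable.
    summaries = [
        ChunkSummary(name, index, summarize(name, index, prompt))
        for name, index in chunks
    ]
    # Stage 2: combine the summaries, falling back to concatenation on failure.
    try:
        return synthesize(prompt, summaries)
    except SynthesisError:
        return "\n\n".join(
            f"[{s.video_name} #{s.chunk_index}] {s.text}" for s in summaries
        )
```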

Logging Levels

INFO (default): High-level progress, timing, success/failure
DEBUG (with env vars): Detailed prompts, summary previews, synthesis details

Example: Enable detailed logging for troubleshooting

# In backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true
BATCH_PROCESSING_LOG_SUMMARIES=true

Configuration Files

Backend Configuration

backend/.env: Environment variables for API keys and processing options
- GOOGLE_API_KEY: Your Gemini API key (required)
- VIDEO_PROCESSOR_MODEL: Model for individual video processing (default: gemini-2.5-pro)
- VIDEO_SYNTHESIS_MODEL: Model for batch synthesis (default: gemini-2.5-pro)
- CHUNK_DURATION_MINUTES: Max chunk duration (default: 53 minutes)
- VBR_SAFETY_MARGIN: Safety margin for variable bitrate videos (default: 1.30 = 30%)
- BASE64_API_LIMIT_MB: API limit for base64 requests (default: 1000MB)
- MAX_PARALLEL_CHUNKS: Concurrent chunk processing (default: 4)
- BATCH_PROCESSING_LOG_PROMPTS: Enable detailed prompt logging (default: false)
- BATCH_PROCESSING_LOG_SUMMARIES: Enable summary preview logging (default: false)
backend/run.py: Hypercorn server config (body size limits, timeouts)
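
One way to read these variables with the documented defaults is shown below. This is a sketch under assumptions: `BackendSettings`, `load_settings`, and the boolean parsing rule are hypothetical, and it assumes the values from backend/.env are already present in the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat 'true', '1', or 'yes' (case-insensitive) as True; anything else as False."""
    return env.get(name, "false").strip().lower() in {"true", "1", "yes"}


@dataclass(frozen=True)
class BackendSettings:
    google_api_key: str
    video_processor_model: str
    video_synthesis_model: str
    chunk_duration_minutes: int
    vbr_safety_margin: float
    base64_api_limit_mb: int
    max_parallel_chunks: int
    log_prompts: bool
    log_summaries: bool


def load_settings(env: Mapping[str, str] = os.environ) -> BackendSettings:
    """Build settings from environment variables, applying the documented defaults."""
    api_key = env.get("GOOGLE_API_KEY")
    if not api_key:
        raise KeyError("GOOGLE_API_KEY is required")
    return BackendSettings(
        google_api_key=api_key,
        video_processor_model=env.get("VIDEO_PROCESSOR_MODEL", "gemini-2.5-pro"),
        video_synthesis_model=env.get("VIDEO_SYNTHESIS_MODEL", "gemini-2.5-pro"),
        chunk_duration_minutes=int(env.get("CHUNK_DURATION_MINUTES", "53")),
        vbr_safety_margin=float(env.get("VBR_SAFETY_MARGIN", "1.30")),
        base64_api_limit_mb=int(env.get("BASE64_API_LIMIT_MB", "1000")),
        max_parallel_chunks=int(env.get("MAX_PARALLEL_CHUNKS", "4")),
        log_prompts=_flag(env, "BATCH_PROCESSING_LOG_PROMPTS"),
        log_summaries=_flag(env, "BATCH_PROCESSING_LOG_SUMMARIES"),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in a test without touching the real environment.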

Frontend Configuration

frontend/.env: REACT_APP_DISABLE_AUTH=true/false
frontend/public/config.js: Production config (committed)
frontend/public/config.local.js: Local dev override (not committed, in .gitignore)

Configuration Priority

config.local.js (local development) - highest priority
config.js (production) - fallback

Code Style Guidelines

Python (Backend)

Imports: Standard library → third-party → local imports
Formatting: PEP 8 compliant with 4-space indentation
Types: Use type hints for function parameters and return values
Naming: snake_case for variables/functions, PascalCase for classes
Error handling: Use try/except blocks with specific exception types
API Keys: Store in environment variables, never hardcode
Documentation: Use docstrings for functions and main modules
Max line length: 100 characters
Comments: Include helpful comments for complex operations

JavaScript/React (Frontend)

Formatting: 2-space indentation for React components
Naming: camelCase for variables/functions, PascalCase for components
State management: Use React hooks (useState, useEffect, useMsal)
Error handling: Try/catch with user-friendly error messages
API calls: Use authApiClient.js with abort signal support
Comments: JSDoc style for complex functions

Important Implementation Notes

Authentication

Controlled via .env: Do NOT remove MSAL code
Toggle: Set REACT_APP_DISABLE_AUTH=true to disable
Components: AuthProvider.js, authApiClient.js, authConfig.js
Session storage: Tokens stored in sessionStorage for implicit flow

File Upload

Chunked upload: All files use chunked upload (chunkedUploader.js)
Max file size: 5GB per file
Progress tracking: Real-time progress via callbacks
Supported formats: MP4, AVI, MOV, WMV, MKV, WEBM
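
The limits above could be mirrored on the server side roughly as follows. This is a sketch, not the project's code: the document only states that validation happens in VideoUpload.js, so the Python helpers `validate_upload` and `chunk_ranges` are hypothetical, as is the choice of binary units for the 5GB limit.

```python
from pathlib import Path
from typing import Iterator, Tuple

MAX_FILE_SIZE_BYTES = 5 * 1024**3   # 5GB; binary units are an assumption
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file type or size is outside the documented limits."""
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        raise ValueError("File exceeds the 5GB limit")


def chunk_ranges(size_bytes: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) byte offsets covering the file; end is exclusive."""
    for start in range(0, size_bytes, chunk_size):
        yield start, min(start + chunk_size, size_bytes)
```

The last range is shorter than `chunk_size` whenever the file size is not an exact multiple of it, which is the case a chunked uploader has to handle when reporting progress.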

PDF Generation

Dependencies: wkhtmltopdf, cairosvg, pdfkit
Mermaid support: Diagrams converted to PNG then embedded
Endpoint: POST /api/generate-pdf
Client: ResultDisplay.js handles download

Error Handling

Rate limiting: 400 INVALID_ARGUMENT → check logs, reduce parallel count
Large files: Use hybrid upload strategy (SIZE_THRESHOLD_MB)
Abort errors: Check for err.code === 'ERR_CANCELED' or abortSignal.aborted

Troubleshooting

Backend Issues

# Check service status
sudo systemctl status video-query

# View logs
journalctl -u video-query -f

# Restart service
sudo systemctl restart video-query

# Check ffmpeg installation
which ffprobe
ffprobe -version

# Python 3.10: Fix jose module SyntaxError
# If you see: "SyntaxError: Missing parentheses in call to 'print'"
cd backend
bash fix_jose.sh

Frontend Issues

# Clear browser cache
# Chrome/Firefox: Ctrl+Shift+R (force reload)

# Check config loading
# Browser console: window.__APP_CONFIG__

# Verify build
ls -la frontend/build/

# Check Apache/Nginx logs
tail -f /var/log/apache2/error.log
tail -f /var/log/nginx/error.log

Static Assets Not Loading (JS/CSS 404 errors):

Symptom: Loading failed for <script> with source "https://domain.com/static/js/main.xxx.js"
Cause: Application built without correct PUBLIC_URL
Solution: Rebuild with ./build.sh or PUBLIC_URL=/video_query npm run build

Rate Limiting Issues

Symptom: 400 INVALID_ARGUMENT after 3-4 videos
Check: Gemini API rate limits (5 RPM for free tier)
Solutions:
- Increase delay in video_processor.py (line 222)
- Reduce MAX_PARALLEL in App.js (line 245)
- Lower SIZE_THRESHOLD_MB in video_processor.py (line 163)

CORS Issues

Local dev: Verify config.local.js exists and points to localhost:5010
Production: Check Apache/Nginx proxy configuration
Backend: Verify CORS settings in app.py

Batch Processing Issues

Problem: Inconsistent or poor quality batch summaries

Diagnosis:

# Enable detailed logging in backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true
BATCH_PROCESSING_LOG_SUMMARIES=true

# Restart backend
sudo systemctl restart video-query

# Monitor logs to see:
# - What prompts were sent for each video
# - What summaries were generated
# - How synthesis combined them
journalctl -u video-query -f | grep "Batch"

Common Causes:

Wrong prompt type detected: Check logs for [Stage 2] Detected prompt type
- If wrong type, adjust prompt keywords (meeting, documentation, diagram, etc.)
Individual summaries too brief: Check [Stage 1] summary lengths
- Should be substantial (500+ chars typically)
Synthesis failure: Check for [Stage 2] Synthesis failed
- May fall back to simple concatenation

Problem: Cannot see what prompt was used for each video

Solution: Enable prompt logging

# In backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true

# Logs will show:
# Batch xyz: [Stage 1] Prompt for video 1:
# You are analyzing segment 1 of 3 from video "meeting1.mp4"...

Problem: Want to verify video-to-result mapping

Solution: Check traceability logs (always enabled)

journalctl -u video-query -f | grep "Traceability"

# Shows:
# Batch xyz: [Traceability] Video-to-summary mapping:
# Batch xyz:   - Video 1: meeting1.mp4 → Summary 1
# Batch xyz:   - Video 2: meeting2.mp4 → Summary 2

Problem: Batch processing taking too long

Solution: Check performance metrics

journalctl -u video-query -f | grep "Metrics"

# Shows:
# Batch xyz: [Metrics] Stage 1: 120.5s, Stage 2: 25.3s, Total: 145.8s
# Batch xyz: [Metrics] Avg time per video: 40.2s

# If Stage 1 is slow: Consider upgrading Gemini API tier for higher RPM
# If Stage 2 is slow: Synthesis model may be overloaded
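
To compare stage timings across many batches, the metrics line can be parsed with a regular expression. The sketch below assumes the log format matches the sample shown above exactly; the names `METRICS_RE` and `parse_metrics` are hypothetical.

```python
import re
from typing import Dict, Optional

# Matches lines such as:
# "Batch xyz: [Metrics] Stage 1: 120.5s, Stage 2: 25.3s, Total: 145.8s"
METRICS_RE = re.compile(
    r"\[Metrics\] Stage 1: (?P<stage1>[\d.]+)s, "
    r"Stage 2: (?P<stage2>[\d.]+)s, Total: (?P<total>[\d.]+)s"
)


def parse_metrics(line: str) -> Optional[Dict[str, float]]:
    """Return stage timings in seconds, or None if the line is not a timing line."""
    match = METRICS_RE.search(line)
    if match is None:
        return None
    return {name: float(value) for name, value in match.groupdict().items()}


sample = "Batch xyz: [Metrics] Stage 1: 120.5s, Stage 2: 25.3s, Total: 145.8s"
metrics = parse_metrics(sample)
slower_stage = max(("stage1", "stage2"), key=metrics.get)
print(metrics, slower_stage)
```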

Testing

Backend Testing

cd backend
source venv/bin/activate

# Test video processing
python test_api.py

# Test webhook
python test_webhook.py

# Manual API test
curl -X POST http://localhost:5010/api/process \
  -H "Content-Type: application/json" \
  -d '{"file_path": "/path/to/video.mp4", "filename": "video.mp4", "prompt": "Test prompt"}'

Frontend Testing

cd frontend

# Development mode
npm start

# Production build test
npm run build
npx serve -s build -l 3000

Log Extraction & Analytics

Extract User Logs

# Quick extraction
./quick_extract.sh

# Robust extraction with error handling
./extract_user_logs_robust.sh

# See LOG_EXTRACTION_README.md for details

Useful File Locations

Backend

Main app: backend/app.py
Video processor: backend/video_processor.py (Gemini API integration)
Video splitter: backend/video_splitter.py (53-min chunks)
Chunked upload: backend/chunked_upload.py
Authentication: backend/auth.py

Frontend

Main app: frontend/src/App.js (queue management, parallel processing)
Queue UI: frontend/src/components/AuthenticatedContent.js
Upload: frontend/src/components/VideoUpload.js
Results: frontend/src/components/ResultDisplay.js
Chunked uploader: frontend/src/utils/chunkedUploader.js

Configuration

Backend env: backend/.env
Frontend env: frontend/.env
Production config: frontend/public/config.js
Local config: frontend/public/config.local.js
Systemd service: backend/video-query.service

Dependencies Management

Backend Updates

cd backend
source venv/bin/activate
pip install --upgrade google-genai flask flask-cors pdfkit
pip freeze > requirements.txt

Frontend Updates

cd frontend
npm update
npm audit fix

Security Considerations

API Keys: Never commit .env files
Authentication: Azure AD B2C tokens in sessionStorage (consider security implications)
CORS: Specific origin allowlisting in production
File validation: Size and type checks in VideoUpload.js
Temporary files: Automatic cleanup in backend
Rate limiting: Built-in to prevent abuse

Support & Documentation

Main docs: See README.md for comprehensive feature documentation
Deployment: See DEPLOYMENT.md for detailed deployment guide
Log extraction: See LOG_EXTRACTION_README.md for analytics
Parallel processing: See PARALLEL_PROCESSING.md (if exists)
CORS fixes: See CORS_FIX_SUMMARY.md (if exists)
