CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
A full-stack video query application that uses Google Gemini for video analysis (default model: gemini-2.5-pro for both processing and synthesis). Features parallel processing, automatic video splitting, Azure AD B2C authentication (optional), chunked file uploads (up to 5GB), and PDF generation with Mermaid diagrams.
Tech Stack:
- Backend: Flask 3.1.0, Hypercorn 0.17.3, google-genai 1.45.0, pdfkit, ffmpeg
- Frontend: React 18.2.0, @azure/msal-react 3.0.12, Bootstrap 5.3.2, react-dropzone 14.2.3
Development Setup Commands
Backend Setup
cd backend
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# For Python 3.10 (use special requirements file)
pip install -r requirements-py310.txt
bash fix_jose.sh # Fix jose module conflict if needed
# For other Python versions
pip install -r requirements.txt
# Install system dependencies (Ubuntu/Debian)
sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev ffmpeg
# macOS
brew install cairo wkhtmltopdf ffmpeg
# Create .env file with configuration options
cat > .env << EOF
GOOGLE_API_KEY=your_api_key_here
# Optional: Model Configuration (default: gemini-2.5-pro for both)
VIDEO_PROCESSOR_MODEL=gemini-2.5-pro
VIDEO_SYNTHESIS_MODEL=gemini-2.5-pro
# Optional: Enhanced Logging (default: false)
BATCH_PROCESSING_LOG_PROMPTS=false
BATCH_PROCESSING_LOG_SUMMARIES=false
EOF
# Run development server
python3 run.py
# Server runs on http://0.0.0.0:5010
Frontend Setup
cd frontend
npm install
# Configure for local development (optional auth disable)
echo "REACT_APP_DISABLE_AUTH=true" > .env
# Start development server
npm start
# Server runs on http://localhost:3000
# Build for production
npm run build
Quick Restart (Development)
# From project root
./restart.sh
Build/Test Commands
Backend
- Run production server: cd backend && source venv/bin/activate && python3 run.py
- Test API: python test_api.py (in backend directory)
- Test webhook: python test_webhook.py
- Manual test: python test_webhook_manual.py
Frontend
- Development server: npm start (port 3000)
- Production build: ./build.sh (recommended - sets PUBLIC_URL automatically)
- Manual production build: PUBLIC_URL=/video_query npm run build (if not using script)
Video Processing
- Standalone script: python video_query.py <video_path> [--prompt "Your custom prompt"]
- Note: The standalone script is deprecated. Use the web application for full features.
Production Deployment
Backend Deployment (Ubuntu/CentOS)
1. Install system packages:
sudo apt-get update
sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ffmpeg python3-venv
2. Set up virtual environment:
cd /path/to/video-query/backend
python3 -m venv venv
source venv/bin/activate
# For Python 3.10
pip install -r requirements-py310.txt
bash fix_jose.sh
# For other Python versions
pip install -r requirements.txt
3. Configure environment:
cat > .env << EOF
GOOGLE_API_KEY=your_production_api_key
# Optional: Model Configuration
VIDEO_PROCESSOR_MODEL=gemini-2.5-pro
VIDEO_SYNTHESIS_MODEL=gemini-2.5-pro
# Optional: Enable detailed logging (useful for debugging, but increases log volume)
BATCH_PROCESSING_LOG_PROMPTS=false
BATCH_PROCESSING_LOG_SUMMARIES=false
EOF
4. Set up systemd service:
sudo cp video-query.service /etc/systemd/system/
# Edit service file to match your paths
sudo nano /etc/systemd/system/video-query.service
sudo systemctl daemon-reload
sudo systemctl enable video-query
sudo systemctl start video-query
sudo systemctl status video-query
5. Verify service:
curl http://localhost:5010/api/health # If health endpoint exists
journalctl -u video-query -f # View logs
Frontend Deployment
1. Update production configuration (frontend/public/config.js):
window.__APP_CONFIG__ = {
  "basePath": "/video-query",
  "domain": "https://your-domain.com",
  "api": {
    "videoProcessingEndpoint": "https://your-domain.com/video_query_back/api/process",
    "chunkedUploadEndpoint": "https://your-domain.com/video_query_back"
  }
};
2. Build for production:
cd frontend
# Option 1: Use build script (recommended - sets PUBLIC_URL automatically)
./build.sh
# Option 2: Manual build with PUBLIC_URL
PUBLIC_URL=/video_query npm run build
3. Deploy to web server:
sudo cp -r build/* /var/www/html/video-query/
sudo chown -R www-data:www-data /var/www/html/video-query/
4. Configure Apache (example):
<VirtualHost *:443>
    ServerName your-domain.com
    DocumentRoot /var/www/html
    # Frontend
    Alias /video-query /var/www/html/video-query
    <Directory /var/www/html/video-query>
        Options -Indexes +FollowSymLinks
        AllowOverride All
        Require all granted
    </Directory>
    # WebSocket support (if needed); listed first because ProxyPass rules are matched in order
    ProxyPass /video_query_back/ws ws://localhost:5010/ws
    ProxyPassReverse /video_query_back/ws ws://localhost:5010/ws
    # Backend proxy
    ProxyPass /video_query_back http://localhost:5010
    ProxyPassReverse /video_query_back http://localhost:5010
    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/your-cert.crt
    SSLCertificateKeyFile /etc/ssl/private/your-key.key
</VirtualHost>
5. Configure Nginx (alternative):
server {
    listen 443 ssl;
    server_name your-domain.com;
    ssl_certificate /etc/ssl/certs/your-cert.crt;
    ssl_certificate_key /etc/ssl/private/your-key.key;
    # Frontend
    location /video-query {
        alias /var/www/html/video-query;
        try_files $uri $uri/ /video-query/index.html;
    }
    # Backend proxy
    location /video_query_back {
        proxy_pass http://localhost:5010;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # Timeouts for long video processing
        proxy_read_timeout 3600s;
        proxy_connect_timeout 3600s;
        proxy_send_timeout 3600s;
    }
}
6. Restart web server:
# Apache
sudo systemctl restart apache2
# Nginx
sudo systemctl restart nginx
Azure AD B2C Authentication Setup (Optional)
- Configure Azure AD B2C tenant and register application
- Update frontend/src/auth/authConfig.js with your tenant details
- Set authentication in frontend/.env:
# Enable auth for production
REACT_APP_DISABLE_AUTH=false
# Disable auth for local dev
REACT_APP_DISABLE_AUTH=true
Key Architecture Components
Parallel Processing (App.js:244-268)
- Max concurrent videos: 2 (MAX_PARALLEL constant)
- Uses Promise.allSettled() for batch processing
- Each video gets an AbortController for cancellation
Rate Limiting (video_processor.py:221-224)
- Delay: 2 seconds between API calls
- Uses threading.Lock for thread safety
- Prevents Gemini API rate limit errors (5 RPM free tier)
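The behavior described above can be sketched as a small lock-guarded limiter. This is a minimal illustration, not the code in video_processor.py: the class name `MinIntervalLimiter` and the injectable `clock` and `sleep` parameters are assumptions, added so the behavior can be checked without real waiting.

```python
import threading
import time


class MinIntervalLimiter:
    """Enforce a minimum gap between calls across threads (hypothetical helper)."""

    def __init__(self, min_interval_s=2.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._last_call = None  # time of the previous call; None before the first

    def wait(self):
        """Block until at least min_interval_s has passed since the last call."""
        # Sleeping while holding the lock serializes callers, so two threads
        # can never both decide that "enough time has passed".
        with self._lock:
            now = self._clock()
            if self._last_call is not None:
                remaining = self._min_interval_s - (now - self._last_call)
                if remaining > 0:
                    self._sleep(remaining)
                    now = self._clock()
            self._last_call = now
```

A caller would invoke `limiter.wait()` immediately before each Gemini API request. Raising `min_interval_s` is the equivalent of increasing the delay mentioned under Troubleshooting.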
Upload Strategy (video_processor.py:388-450)
- Current approach: Base64 inline encoding for ALL videos
- Reason: File Upload API has known issues in SDK 1.45.0-1.49.0
- API limit: 1000MB (1GB) for base64-encoded requests
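- Base64 overhead: budgeted at +37% (1.37x multiplier); plain Base64 expands data by 4/3 (about 1.33x), so this leaves some headroom
- Effective limit: ~730MB raw video per chunk (1000MB ÷ 1.37), so the encoded request stays under the API limit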
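The arithmetic behind these limits can be checked with a short sketch. The constants are the figures quoted above; the function names are hypothetical and do not come from video_processor.py. Standard Base64 emits 4 output bytes for every 3 input bytes, which `encoded_len` reproduces, so the 1.37 multiplier is treated here as the project's conservative budget.

```python
import base64

BASE64_API_LIMIT_MB = 1000  # request limit quoted above
BASE64_OVERHEAD = 1.37      # project multiplier quoted above


def max_raw_mb(limit_mb=BASE64_API_LIMIT_MB, overhead=BASE64_OVERHEAD):
    """Largest raw size (MB) whose budgeted encoded size fits the limit."""
    return limit_mb / overhead


def fits_inline(raw_mb, limit_mb=BASE64_API_LIMIT_MB, overhead=BASE64_OVERHEAD):
    """True if a raw payload of raw_mb can be sent inline as Base64."""
    return raw_mb * overhead <= limit_mb


def encoded_len(raw_len):
    """Exact length of padded standard Base64 output for raw_len input bytes."""
    return 4 * ((raw_len + 2) // 3)


# Sanity check against the standard library encoder.
assert encoded_len(3000) == len(base64.b64encode(bytes(3000)))
```

With these numbers, `max_raw_mb()` is just under 730, which matches the effective limit listed above.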
Video Splitting (video_splitter.py) - Robust Multi-Constraint Algorithm
- Chunk duration limit: 53 minutes (Gemini API ~55 min limit)
- Chunk size limit: ~560MB safe target (with 30% VBR margin)
- Hard size limit: ~730MB maximum (1000MB ÷ 1.37 encoding overhead)
- Automatic splitting: Based on BOTH duration AND size constraints
- VBR handling: 30% safety margin for Variable Bitrate variance
- Validation & re-splitting: Automatic re-split if chunks exceed hard limit
- Algorithm: Uses max(chunks_by_size, chunks_by_duration) for safety
- Processing: Chunks processed in parallel on backend
Example:
- 30min/1.5GB video → Split into 3 chunks of ~10min/500MB each
- 50min/720MB video → Split into 2 chunks of ~25min/360MB each
- 5min/300MB video → No split (under both limits)
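The chunk-count rule can be written down in a few lines. This is a sketch under the limits stated above (53-minute chunks, 1000MB API limit, 1.37 encoding overhead, 1.30 VBR margin); the function name `plan_chunks` and its parameters are assumptions, and the validation and re-splitting step in video_splitter.py is not modeled.

```python
import math


def plan_chunks(size_mb, duration_min, max_chunk_min=53,
                api_limit_mb=1000, base64_overhead=1.37, vbr_margin=1.30):
    """Return how many chunks a video needs under both constraints."""
    hard_limit_mb = api_limit_mb / base64_overhead  # ~730MB raw per chunk
    safe_target_mb = hard_limit_mb / vbr_margin     # ~560MB with VBR headroom
    chunks_by_size = math.ceil(size_mb / safe_target_mb)
    chunks_by_duration = math.ceil(duration_min / max_chunk_min)
    # Take the stricter of the two constraints; never fewer than one chunk.
    return max(1, chunks_by_size, chunks_by_duration)


print(plan_chunks(1500, 30))  # 30min/1.5GB example
print(plan_chunks(720, 50))   # 50min/720MB example
print(plan_chunks(300, 5))    # 5min/300MB example
```

Under these assumptions the three examples above come out as 3, 2, and 1 chunks respectively.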
Queue Management (App.js:63-336)
- Queue states: queued, processing, completed, failed, cancelled
- Operations: Stop (cancel), Retry, Remove
- Abort signal: Support for canceling in-flight requests
Batch Processing Architecture (video_processor.py)
- Two-Stage Synthesis: Individual summaries → unified result
- Prompt Consistency: Same prompt used for all videos in batch
- Intelligent Strategy: Detects prompt type (meeting, documentation, generic)
- Specialized Synthesis: Different synthesis strategies for different content types
  - Meeting summaries: Consolidates discussion points and action items
  - Documentation: Sequential step-by-step guide format
  - With diagrams: Merges Mermaid diagrams intelligently
- Model Consistency: Uses gemini-2.5-pro for both processing and synthesis
- Enhanced Logging: Optional detailed logging for debugging
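Prompt type detection of this kind is often keyword based. The sketch below shows one way it might work; the keyword lists, the function names, and the first-match rule are all assumptions for illustration and are not taken from video_processor.py.

```python
# Hypothetical keyword table; the real detection rules may differ.
PROMPT_KEYWORDS = {
    "meeting": ("meeting", "action item", "minutes"),
    "documentation": ("documentation", "step-by-step", "guide"),
}


def detect_prompt_type(prompt):
    """Return 'meeting', 'documentation', or 'generic' for a user prompt."""
    text = prompt.lower()
    # Dicts preserve insertion order, so 'meeting' is checked first.
    for prompt_type, keywords in PROMPT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return prompt_type
    return "generic"


def wants_diagrams(prompt):
    """True if the prompt appears to ask for Mermaid diagrams."""
    text = prompt.lower()
    return "diagram" in text or "mermaid" in text
```

If the wrong synthesis strategy is chosen, a table like this is where prompt wording and detection would be reconciled.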
Batch Processing Flow
- Stage 1: Each video/chunk processed separately with context-aware prompts
- Intermediate: Summaries collected with metadata (video name, chunk info)
- Stage 2: AI synthesis combines all summaries into unified response
- Traceability: Clear mapping of video → chunk → summary → final result
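The two-stage flow can be sketched with the model calls passed in as plain functions. Everything named here (`Summary`, `run_batch`, the `summarize` and `synthesize` callables) is hypothetical. The sketch runs Stage 1 sequentially for clarity, whereas the backend processes chunks in parallel, and the concatenation fallback mirrors the behavior described under Troubleshooting.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    video_name: str
    chunk_index: int
    chunk_total: int
    text: str


def run_batch(chunks, prompt, summarize, synthesize):
    """chunks: iterable of (video_name, chunk_index, chunk_total, payload)."""
    summaries = []
    # Stage 1: one context-aware prompt per video/chunk.
    for video_name, index, total, payload in chunks:
        context = (
            f'You are analyzing segment {index} of {total} '
            f'from video "{video_name}". {prompt}'
        )
        text = summarize(payload, context)
        summaries.append(Summary(video_name, index, total, text))
    # Stage 2: combine all summaries into one result.
    try:
        result = synthesize(summaries, prompt)
    except Exception:
        # Assumed fallback: simple concatenation of the Stage 1 summaries.
        result = "\n\n".join(s.text for s in summaries)
    return result, summaries
```

Returning the `summaries` list alongside the result keeps the video → chunk → summary mapping available for traceability logging.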
Logging Levels
- INFO (default): High-level progress, timing, success/failure
- DEBUG (with env vars): Detailed prompts, summary previews, synthesis details
Example: Enable detailed logging for troubleshooting
# In backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true
BATCH_PROCESSING_LOG_SUMMARIES=true
Configuration Files
Backend Configuration
- backend/.env: Environment variables for API keys and processing options
  - GOOGLE_API_KEY: Your Gemini API key (required)
  - VIDEO_PROCESSOR_MODEL: Model for individual video processing (default: gemini-2.5-pro)
  - VIDEO_SYNTHESIS_MODEL: Model for batch synthesis (default: gemini-2.5-pro)
  - CHUNK_DURATION_MINUTES: Max chunk duration (default: 53 minutes)
  - VBR_SAFETY_MARGIN: Safety margin for variable bitrate videos (default: 1.30 = 30%)
  - BASE64_API_LIMIT_MB: API limit for base64 requests (default: 1000MB)
  - MAX_PARALLEL_CHUNKS: Concurrent chunk processing (default: 4)
  - BATCH_PROCESSING_LOG_PROMPTS: Enable detailed prompt logging (default: false)
  - BATCH_PROCESSING_LOG_SUMMARIES: Enable summary preview logging (default: false)
- backend/run.py: Hypercorn server config (body size limits, timeouts)
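As an illustration of how the backend/.env variables and their defaults fit together, here is a minimal loader. The variable names and defaults are the ones listed above; the `Settings` class, `load_settings`, and the rule that only the string "true" enables a flag are assumptions, not the backend's actual parsing code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    processor_model: str
    synthesis_model: str
    chunk_duration_minutes: int
    vbr_safety_margin: float
    base64_api_limit_mb: int
    max_parallel_chunks: int
    log_prompts: bool
    log_summaries: bool


def _flag(env, name):
    # Assumption: only the string "true" (any case) enables a flag.
    return env.get(name, "false").strip().lower() == "true"


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY")
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is required")
    return Settings(
        google_api_key=api_key,
        processor_model=env.get("VIDEO_PROCESSOR_MODEL", "gemini-2.5-pro"),
        synthesis_model=env.get("VIDEO_SYNTHESIS_MODEL", "gemini-2.5-pro"),
        chunk_duration_minutes=int(env.get("CHUNK_DURATION_MINUTES", "53")),
        vbr_safety_margin=float(env.get("VBR_SAFETY_MARGIN", "1.30")),
        base64_api_limit_mb=int(env.get("BASE64_API_LIMIT_MB", "1000")),
        max_parallel_chunks=int(env.get("MAX_PARALLEL_CHUNKS", "4")),
        log_prompts=_flag(env, "BATCH_PROCESSING_LOG_PROMPTS"),
        log_summaries=_flag(env, "BATCH_PROCESSING_LOG_SUMMARIES"),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to try different configurations in a test.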
Frontend Configuration
- frontend/.env: REACT_APP_DISABLE_AUTH=true/false
- frontend/public/config.js: Production config (committed)
- frontend/public/config.local.js: Local dev override (not committed, in .gitignore)
Configuration Priority
1. config.local.js (local development) - highest priority
2. config.js (production) - fallback
Code Style Guidelines
Python (Backend)
- Imports: Standard library → third-party → local imports
- Formatting: PEP 8 compliant with 4-space indentation
- Types: Use type hints for function parameters and return values
- Naming: snake_case for variables/functions, PascalCase for classes
- Error handling: Use try/except blocks with specific exception types
- API Keys: Store in environment variables, never hardcode
- Documentation: Use docstrings for functions and main modules
- Max line length: 100 characters
- Comments: Include helpful comments for complex operations
JavaScript/React (Frontend)
- Formatting: 2-space indentation for React components
- Naming: camelCase for variables/functions, PascalCase for components
- State management: Use React hooks (useState, useEffect, useMsal)
- Error handling: Try/catch with user-friendly error messages
- API calls: Use authApiClient.js with abort signal support
- Comments: JSDoc style for complex functions
Important Implementation Notes
Authentication
- Controlled via .env: Do NOT remove MSAL code
- Toggle: Set REACT_APP_DISABLE_AUTH=true to disable
- Components: AuthProvider.js, authApiClient.js, authConfig.js
- Session storage: Tokens stored in sessionStorage for implicit flow
File Upload
- Chunked upload: All files use chunked upload (chunkedUploader.js)
- Max file size: 5GB per file
- Progress tracking: Real-time progress via callbacks
- Supported formats: MP4, AVI, MOV, WMV, MKV, WEBM
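The size and format rules above could also be checked on the backend. The sketch below is an assumption about how such a check might look, not the logic in VideoUpload.js or chunked_upload.py. It treats 5GB as 5 × 1024³ bytes, and the function name `validate_upload` is hypothetical.

```python
import os

SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}
MAX_FILE_BYTES = 5 * 1024 ** 3  # 5GB, read here as GiB (assumption)


def validate_upload(filename, size_bytes):
    """Return (ok, reason) for a candidate upload."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return False, f"unsupported format: {extension or 'none'}"
    if size_bytes <= 0:
        return False, "file is empty"
    if size_bytes > MAX_FILE_BYTES:
        return False, "file exceeds 5GB limit"
    return True, "ok"
```

Checking the extension case-insensitively means a file named `clip.MP4` is accepted in the same way as `clip.mp4`.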
PDF Generation
- Dependencies: wkhtmltopdf, cairosvg, pdfkit
- Mermaid support: Diagrams converted to PNG then embedded
- Endpoint: POST /api/generate-pdf
- Client: ResultDisplay.js handles download
Error Handling
- Rate limiting: 400 INVALID_ARGUMENT → check logs, reduce parallel count
- Large files: Use hybrid upload strategy (SIZE_THRESHOLD_MB)
- Abort errors: Check for err.code === 'ERR_CANCELED' or abortSignal.aborted
Troubleshooting
Backend Issues
# Check service status
sudo systemctl status video-query
# View logs
journalctl -u video-query -f
# Restart service
sudo systemctl restart video-query
# Check ffmpeg installation
which ffprobe
ffprobe -version
# Python 3.10: Fix jose module SyntaxError
# If you see: "SyntaxError: Missing parentheses in call to 'print'"
cd backend
bash fix_jose.sh
Frontend Issues
# Clear browser cache
# Chrome/Firefox: Ctrl+Shift+R (force reload)
# Check config loading
# Browser console: window.__APP_CONFIG__
# Verify build
ls -la frontend/build/
# Check Apache/Nginx logs
tail -f /var/log/apache2/error.log
tail -f /var/log/nginx/error.log
Static Assets Not Loading (JS/CSS 404 errors):
- Symptom: Loading failed for <script> with source "https://domain.com/static/js/main.xxx.js"
- Cause: Application built without correct PUBLIC_URL
- Solution: Rebuild with ./build.sh or PUBLIC_URL=/video_query npm run build
Rate Limiting Issues
- Symptom: 400 INVALID_ARGUMENT after 3-4 videos
- Check: Gemini API rate limits (5 RPM for free tier)
- Solutions:
  - Increase delay in video_processor.py (line 222)
  - Reduce MAX_PARALLEL in App.js (line 245)
  - Lower SIZE_THRESHOLD_MB in video_processor.py (line 163)
CORS Issues
- Local dev: Verify config.local.js exists and points to localhost:5010
- Production: Check Apache/Nginx proxy configuration
- Backend: Verify CORS settings in app.py
Batch Processing Issues
Problem: Inconsistent or poor quality batch summaries
Diagnosis:
# Enable detailed logging in backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true
BATCH_PROCESSING_LOG_SUMMARIES=true
# Restart backend
sudo systemctl restart video-query
# Monitor logs to see:
# - What prompts were sent for each video
# - What summaries were generated
# - How synthesis combined them
journalctl -u video-query -f | grep "Batch"
Common Causes:
- Wrong prompt type detected: Check logs for [Stage 2] Detected prompt type
  - If wrong type, adjust prompt keywords (meeting, documentation, diagram, etc.)
- Individual summaries too brief: Check [Stage 1] summary lengths
  - Should be substantial (500+ chars typically)
- Synthesis failure: Check for [Stage 2] Synthesis failed
  - May fall back to simple concatenation
Problem: Cannot see what prompt was used for each video
Solution: Enable prompt logging
# In backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true
# Logs will show:
# Batch xyz: [Stage 1] Prompt for video 1:
# You are analyzing segment 1 of 3 from video "meeting1.mp4"...
Problem: Want to verify video-to-result mapping
Solution: Check traceability logs (always enabled)
journalctl -u video-query -f | grep "Traceability"
# Shows:
# Batch xyz: [Traceability] Video-to-summary mapping:
# Batch xyz: - Video 1: meeting1.mp4 → Summary 1
# Batch xyz: - Video 2: meeting2.mp4 → Summary 2
Problem: Batch processing taking too long
Solution: Check performance metrics
journalctl -u video-query -f | grep "Metrics"
# Shows:
# Batch xyz: [Metrics] Stage 1: 120.5s, Stage 2: 25.3s, Total: 145.8s
# Batch xyz: [Metrics] Avg time per video: 40.2s
# If Stage 1 is slow: Consider upgrading Gemini API tier for higher RPM
# If Stage 2 is slow: Synthesis model may be overloaded
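To compare stage timings across many batches, the metrics lines can be parsed with a short script. The pattern below assumes the exact log format shown in the example above; the function names are hypothetical.

```python
import re

# Matches lines like:
# Batch xyz: [Metrics] Stage 1: 120.5s, Stage 2: 25.3s, Total: 145.8s
METRICS_RE = re.compile(
    r"\[Metrics\] Stage 1: (?P<stage1>[\d.]+)s, "
    r"Stage 2: (?P<stage2>[\d.]+)s, "
    r"Total: (?P<total>[\d.]+)s"
)


def parse_metrics(line):
    """Return stage timings in seconds, or None if the line does not match."""
    match = METRICS_RE.search(line)
    if match is None:
        return None
    return {name: float(value) for name, value in match.groupdict().items()}


def slow_stage(metrics):
    """Name the stage that took longer ('stage1' or 'stage2')."""
    return "stage1" if metrics["stage1"] >= metrics["stage2"] else "stage2"
```

Feeding it the output of `journalctl -u video-query | grep "Metrics"` line by line shows which stage dominates over time.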
Testing
Backend Testing
cd backend
source venv/bin/activate
# Test video processing
python test_api.py
# Test webhook
python test_webhook.py
# Manual API test
curl -X POST http://localhost:5010/api/process \
-H "Content-Type: application/json" \
-d '{"file_path": "/path/to/video.mp4", "filename": "video.mp4", "prompt": "Test prompt"}'
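The same manual test can be driven from Python using only the standard library. This sketch builds the request shown in the curl command above; the helper names and the one-hour timeout are assumptions, and the response format is not documented here, so the body is returned as raw text.

```python
import json
import urllib.request


def build_process_request(file_path, filename, prompt,
                          base_url="http://localhost:5010"):
    """Build the POST request for /api/process without sending it."""
    body = json.dumps(
        {"file_path": file_path, "filename": filename, "prompt": prompt}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout_s=3600):
    """Send the request and return (status, body text). Needs a running backend."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status, response.read().decode("utf-8")


# Example (requires the backend on port 5010):
# status, text = send(build_process_request(
#     "/path/to/video.mp4", "video.mp4", "Test prompt"))
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made.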
Frontend Testing
cd frontend
# Development mode
npm start
# Production build test
npm run build
npx serve -s build -l 3000
Log Extraction & Analytics
Extract User Logs
# Quick extraction
./quick_extract.sh
# Robust extraction with error handling
./extract_user_logs_robust.sh
# See LOG_EXTRACTION_README.md for details
Useful File Locations
Backend
- Main app: backend/app.py
- Video processor: backend/video_processor.py (Gemini API integration)
- Video splitter: backend/video_splitter.py (53-min chunks)
- Chunked upload: backend/chunked_upload.py
- Authentication: backend/auth.py
Frontend
- Main app: frontend/src/App.js (queue management, parallel processing)
- Queue UI: frontend/src/components/AuthenticatedContent.js
- Upload: frontend/src/components/VideoUpload.js
- Results: frontend/src/components/ResultDisplay.js
- Chunked uploader: frontend/src/utils/chunkedUploader.js
Configuration
- Backend env: backend/.env
- Frontend env: frontend/.env
- Production config: frontend/public/config.js
- Local config: frontend/public/config.local.js
- Systemd service: backend/video-query.service
Dependencies Management
Backend Updates
cd backend
source venv/bin/activate
pip install --upgrade google-genai flask flask-cors pdfkit
pip freeze > requirements.txt
Frontend Updates
cd frontend
npm update
npm audit fix
Security Considerations
- API Keys: Never commit .env files
- Authentication: Azure AD B2C tokens in sessionStorage (consider security implications)
- CORS: Specific origin allowlisting in production
- File validation: Size and type checks in VideoUpload.js
- Temporary files: Automatic cleanup in backend
- Rate limiting: Built-in to prevent abuse
Support & Documentation
- Main docs: See README.md for comprehensive feature documentation
- Deployment: See DEPLOYMENT.md for detailed deployment guide
- Log extraction: See LOG_EXTRACTION_README.md for analytics
- Parallel processing: See PARALLEL_PROCESSING.md (if exists)
- CORS fixes: See CORS_FIX_SUMMARY.md (if exists)