
# Video Query Tool

A full-stack web application that processes videos with Google's Gemini AI model in two modes: Single Video and Batch. It features two-stage synthesis, automatic splitting of long videos, optional Azure AD B2C authentication, chunked uploads of up to 5GB per file, PDF generation with merged Mermaid diagrams, and usage tracking.

## Features

### Core Functionality
- **Dual Processing Modes**:
  - **Single Video Mode**: Process videos individually with per-video control
  - **Batch Mode**: Combine multiple related videos (up to 10) for unified analysis
- **Intelligent AI Synthesis**: Two-stage processing produces one coherent result from multiple videos or chunks
  - Stage 1: Each video/chunk → concise summary
  - Stage 2: All summaries → unified cohesive result
- **Video Processing**: Upload and analyze videos using the Google Gemini 2.0 Flash Exp model
- **Prompt Templates**:
  - Meeting Summary
  - Process/Tool Documentation
  - Process Documentation with Mermaid Charts
  - Custom Prompts
- **Large File Support**: Chunked upload system supporting files up to 5GB per file
- **PDF Generation**: Convert results to PDF with embedded Mermaid diagrams
- **Authentication**: Azure AD B2C integration (optional, controlled via .env)
- **Parallel Processing**: Process up to 2 videos simultaneously (single mode)
- **Long Video Support**: Automatic splitting and parallel chunk processing for videos > 54 minutes

### Technical Features
- **Explicit User Control**: No auto-processing - all videos require explicit "Process" button click
- **Batch Video Management**: Reorder, arrange, and remove videos before processing
- **Smart Diagram Merging**: Multiple Mermaid diagrams intelligently combined into one
- **Persistent Mode Selection**: Processing mode and batch queue persist across page refreshes
- **Multiple File Queue**: Upload multiple videos, manage queue (Stop, Retry, Remove)
- **Drag & Drop Upload**: Modern file upload interface with progress tracking
- **Real-time Status**: Live status updates (uploading → uploaded → processing → completed)
- **Queue Management**: Stop, retry, or remove videos from processing queue anytime
- **Automatic Video Splitting**: Videos > 54 minutes automatically split into 54-min chunks
- **Rate Limiting**: Built-in API rate limiting (2-second delay) to prevent quota errors
- **Error Handling**: Comprehensive error handling with retry capability
- **Processing Time Display**: Shows processing duration for each completed video/batch
- **Usage Analytics**: Automated tracking via webhook integration
- **Production Ready**: Systemd service configuration and deployment scripts
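
The status sequence listed above (uploading → uploaded → processing → completed) can be read as a small state machine. The following is a minimal sketch, not the application's actual code: the `error` state, the Stop edge back to `uploaded`, and the Retry edge are assumptions added for illustration, and `advance` is a hypothetical helper.

```python
from enum import Enum


class Status(Enum):
    UPLOADING = "uploading"
    UPLOADED = "uploaded"
    PROCESSING = "processing"
    COMPLETED = "completed"
    ERROR = "error"  # assumed state; this README does not name it


# The forward path comes from the README. Every edge involving ERROR, and
# the Stop edge from PROCESSING back to UPLOADED, is an assumption.
ALLOWED = {
    Status.UPLOADING: {Status.UPLOADED},
    Status.UPLOADED: {Status.PROCESSING},
    Status.PROCESSING: {Status.COMPLETED, Status.ERROR, Status.UPLOADED},
    Status.ERROR: {Status.PROCESSING},  # Retry (assumed)
    Status.COMPLETED: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return target if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


# Walk the documented happy path.
state = Status.UPLOADING
for step in (Status.UPLOADED, Status.PROCESSING, Status.COMPLETED):
    state = advance(state, step)
print(state.value)
```

Because `uploaded` has no automatic edge to `processing`, a model like this also reflects the explicit-control rule: something (a button click) has to request that transition.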

## Limitations

- **Video Length**: No hard limit; videos longer than 54 minutes are automatically split into 54-minute chunks
- **Single Chunk Limit**: Each chunk must be under 55 minutes (the splitter handles this automatically)
- **File Size**: Application supports uploads up to 5GB per file
- **Supported Formats**: MP4, AVI, MOV, WMV, MKV, WEBM
- **Parallel Processing**: Max 2 videos simultaneously in single mode (rate limit protection)
- **Batch Size**: Maximum 10 videos per batch processing session
- **API Rate Limits**: The Gemini free tier allows 5 requests per minute (RPM); the backend adds a 2-second delay between calls
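
The 54-minute splitting rule implies a simple split plan. The sketch below assumes fixed 54-minute windows with a shorter final chunk; `plan_chunks` is a hypothetical helper used only for illustration and is not taken from `video_splitter.py`.

```python
import math

CHUNK_MINUTES = 54  # chunk length stated in this README


def plan_chunks(duration_minutes, chunk_minutes=CHUNK_MINUTES):
    """Return (start, end) minute offsets for each chunk of a video."""
    if duration_minutes <= 0:
        raise ValueError("duration must be positive")
    count = math.ceil(duration_minutes / chunk_minutes)
    return [
        (i * chunk_minutes, min((i + 1) * chunk_minutes, duration_minutes))
        for i in range(count)
    ]


# A 130-minute video needs three chunks; the last one is shorter.
print(plan_chunks(130))
```

Under this plan a video of 54 minutes or less yields a single chunk, which matches the statement that only videos longer than 54 minutes are split.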

## Project Structure

```
video_query/
├── backend/                    # Flask/Hypercorn API server
│   ├── app.py                 # Main Flask application with PDF generation
│   ├── video_processor.py     # Gemini API integration, parallel processing, rate limiting
│   ├── video_splitter.py      # Video splitting for long videos (54-min chunks)
│   ├── auth.py                # Azure AD B2C authentication handlers
│   ├── chunked_upload.py      # Chunked file upload Blueprint
│   ├── run.py                 # Hypercorn production server
│   ├── requirements.txt       # Python dependencies
│   ├── .env                   # Environment variables (GOOGLE_API_KEY)
│   └── test_*.py              # API testing utilities
├── frontend/                   # React SPA
│   ├── src/
│   │   ├── components/        # React components
│   │   │   ├── VideoUpload.js    # Multi-file drag & drop upload
│   │   │   ├── PromptSelector.js # Mode selection and prompt editing
│   │   │   ├── ResultDisplay.js  # Results with PDF generation
│   │   │   ├── AuthenticatedContent.js # Queue management, processed list
│   │   │   └── Login.js         # Authentication interface
│   │   ├── auth/              # Authentication utilities
│   │   │   ├── authConfig.js     # Azure AD B2C configuration
│   │   │   ├── AuthProvider.js   # MSAL React provider
│   │   │   └── authApiClient.js  # Authenticated API client
│   │   └── utils/
│   │       ├── chunkedUploader.js # Large file upload handler
│   │       ├── configLoader.js    # Dynamic config loading
│   │       └── pathUtils.js       # Path utilities
│   ├── public/
│   │   ├── config.js              # Production config (committed)
│   │   ├── config.local.js        # Local dev config (not committed)
│   │   └── index.html             # Loads both configs
│   ├── package.json           # Node.js dependencies
│   ├── .env                   # Frontend environment variables
│   └── build/                 # Production build output
├── DEPLOYMENT.md              # Production deployment instructions
├── LOG_EXTRACTION_README.md   # Usage analytics documentation
├── CLAUDE.md                  # Development guidelines and build commands
├── restart.sh                 # Development restart script
├── quick_extract.sh           # Log extraction utility
└── extract_user_logs*.sh      # Advanced log processing
```

## Dependencies

### Backend Dependencies
- **Flask 3.1.0**: Web framework
- **google-genai 1.45.0**: Gemini AI SDK (updated API)
- **Hypercorn 0.17.3**: ASGI production server
- **python-jose**: JWT token validation for Azure AD
- **flask-cors 5.0.1**: Cross-origin resource sharing
- **pdfkit 1.0.0**: PDF generation from HTML
- **cairosvg 2.8.0**: SVG to PNG conversion for diagrams
- **Pillow 11.2.1**: Image processing
- **python-dotenv 1.1.0**: Environment variable management
- **ffmpeg-python**: Video splitting functionality

### Frontend Dependencies
- **React 18.2.0**: UI framework
- **@azure/msal-react 3.0.12**: Microsoft Authentication Library
- **axios 1.6.0**: HTTP client with abort signal support
- **bootstrap 5.3.2**: UI components and styling
- **mermaid 11.6.0**: Diagram generation
- **react-dropzone 14.2.3**: Multi-file upload interface
- **showdown 2.1.0**: Markdown to HTML conversion

## Setup Instructions

### Prerequisites
- Python 3.8+
- Node.js 16+
- Google Cloud API key with Gemini access
- Azure AD B2C tenant (optional, for authentication)
- wkhtmltopdf (for PDF generation)
- ffmpeg/ffprobe (for video splitting)

### Backend Setup

1. **Create and activate virtual environment**:
   ```bash
   cd backend
   python3 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

3. **Set up environment variables** (create `backend/.env`):
   ```bash
   GOOGLE_API_KEY=your_gemini_api_key_here
   ```

4. **Install system dependencies**:
   ```bash
   # Ubuntu/Debian:
   sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev ffmpeg

   # macOS:
   brew install cairo wkhtmltopdf ffmpeg
   ```

5. **Start development server**:
   ```bash
   python3 run.py
   # Server runs on http://0.0.0.0:5010
   ```

### Frontend Setup

1. **Install Node.js dependencies**:
   ```bash
   cd frontend
   npm install
   ```

2. **Configure authentication** (optional):
   - Edit `frontend/.env`:
     ```
     REACT_APP_DISABLE_AUTH=true  # Disable auth for local dev
     ```
   - For production, update `src/auth/authConfig.js` with Azure AD B2C details

3. **Configure backend URL for local development**:
   - `frontend/public/config.local.js` is already configured for `localhost:5010`
   - This file is listed in `.gitignore` and is not committed

4. **Start development server**:
   ```bash
   npm start
   # Server runs on http://localhost:3000
   ```

## Production Deployment

### System Requirements
- Ubuntu/CentOS server
- Apache/Nginx web server
- Python 3.8+ with virtual environment
- wkhtmltopdf system package
- ffmpeg/ffprobe for video processing
- Node.js for building frontend

### Backend Deployment

1. **Install system packages**:
   ```bash
   sudo apt-get update
   sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ffmpeg
   ```

2. **Set up virtual environment and install dependencies**:
   ```bash
   cd backend
   python3 -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt
   ```

3. **Create production .env file**:
   ```bash
   echo "GOOGLE_API_KEY=your_production_api_key" > .env
   ```

4. **Create systemd service** (see `backend/video-query.service`):
   ```bash
   # Run from the repository root (the previous steps leave you in backend/)
   sudo cp backend/video-query.service /etc/systemd/system/
   sudo systemctl daemon-reload
   sudo systemctl enable video-query
   sudo systemctl start video-query
   ```

### Frontend Deployment

1. **Update production config** (`frontend/public/config.js`):
   ```javascript
   window.__APP_CONFIG__ = {
     "basePath": "/video-query",
     "domain": "https://your-domain.com",
     "api": {
       "videoProcessingEndpoint": "https://your-domain.com/video_query_back/api/process",
       "chunkedUploadEndpoint": "https://your-domain.com/video_query_back"
     }
   };
   ```

2. **Build for production**:
   ```bash
   cd frontend
   npm run build
   ```

3. **Deploy to web server**:
   ```bash
   sudo cp -r build/* /var/www/html/video-query/
   ```

4. **Configure web server** (Apache example):
   ```apache
   <VirtualHost *:443>
       DocumentRoot /var/www/html

       # Frontend
       Alias /video-query /var/www/html/video-query

       # Backend proxy
       ProxyPass /video_query_back http://localhost:5010
       ProxyPassReverse /video_query_back http://localhost:5010
   </VirtualHost>
   ```

## API Reference

### Video Processing Endpoints
- **POST /api/process**: Single video processing endpoint
  - Accepts JSON: `file_path`, `filename`, `prompt` (for chunked uploads)
  - Returns: Processing result with content, processing time, chunks info
- **POST /api/process-batch**: Batch video processing endpoint
  - Accepts JSON: `videos` (array of {file_path, filename, order}), `prompt`, `batch_id`
  - Returns: Unified result for all videos, total chunks processed
  - Maximum 10 videos per batch
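
As an illustration of the batch request body, the sketch below builds a payload with the documented field names (`videos`, `file_path`, `filename`, `order`, `prompt`, `batch_id`) and enforces the 10-video limit on the client side. The value types (zero-based integer `order`, string `batch_id`), the example paths, and the helper `build_batch_payload` are assumptions, not confirmed details of the backend.

```python
import json

MAX_BATCH_VIDEOS = 10  # limit stated in this README


def build_batch_payload(videos, prompt, batch_id):
    """Build a JSON-serializable body for POST /api/process-batch.

    videos: list of (file_path, filename) tuples in the desired order.
    """
    if not videos:
        raise ValueError("a batch needs at least one video")
    if len(videos) > MAX_BATCH_VIDEOS:
        raise ValueError(f"a batch may contain at most {MAX_BATCH_VIDEOS} videos")
    return {
        "videos": [
            # Zero-based order is an assumption; the README does not say.
            {"file_path": path, "filename": name, "order": index}
            for index, (path, name) in enumerate(videos)
        ],
        "prompt": prompt,
        "batch_id": batch_id,
    }


payload = build_batch_payload(
    [("/tmp/uploads/part1.mp4", "part1.mp4"), ("/tmp/uploads/part2.mp4", "part2.mp4")],
    prompt="Summarize these meetings",
    batch_id="batch-001",
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be sent with any HTTP client as the JSON body of the request.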

### Chunked Upload Endpoints
- **POST /api/init-upload**: Initialize chunked upload session
- **POST /api/upload-chunk/<upload_id>**: Upload file chunk
- **POST /api/complete-upload/<upload_id>**: Mark upload complete
- **POST /api/cancel-upload/<upload_id>**: Cancel upload
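
The four endpoints above suggest a client flow of init, repeated chunk uploads, then complete (or cancel). The sketch below covers only the parts that do not depend on undocumented request or response formats: slicing a file into chunks and building the endpoint URLs. The 5 MiB chunk size and the helper names `iter_chunks` and `upload_urls` are assumptions.

```python
import io

CHUNK_SIZE = 5 * 1024 * 1024  # assumed chunk size; not stated in this README


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield (index, bytes) pairs read sequentially from a binary stream."""
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield index, data
        index += 1


def upload_urls(base_url, upload_id):
    """Return the documented endpoint URLs for one upload session."""
    base = base_url.rstrip("/")
    return {
        "init": f"{base}/api/init-upload",
        "chunk": f"{base}/api/upload-chunk/{upload_id}",
        "complete": f"{base}/api/complete-upload/{upload_id}",
        "cancel": f"{base}/api/cancel-upload/{upload_id}",
    }


# Demonstrate chunking on an in-memory stream with a tiny chunk size.
sizes = [len(data) for _, data in iter_chunks(io.BytesIO(b"x" * 25), chunk_size=10)]
print(sizes)
print(upload_urls("http://localhost:5010/", "abc123")["chunk"])
```

Reading the file incrementally keeps memory use bounded by the chunk size, which matters for uploads approaching the 5GB limit.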

### PDF Generation Endpoints
- **POST /api/generate-pdf**: Generate PDF from HTML with Mermaid diagrams
  - JSON data: `html`, `textDiagrams`, `diagramPngs`, `videoFileName`

### Authentication Endpoints (if enabled)
- **GET /api/auth-test**: Verify authentication status

## Configuration Files

### Backend Configuration
- **backend/.env**: Environment variables
  ```
  GOOGLE_API_KEY=your_api_key
  ```

### Frontend Configuration
- **frontend/.env**: React environment variables
  ```
  REACT_APP_DISABLE_AUTH=true  # Optional: disable auth for local dev
  ```
- **frontend/public/config.js**: Production configuration (committed to git)
- **frontend/public/config.local.js**: Local development override (not committed)

### Key Configuration Details
- **Parallel Processing**: Max 2 concurrent videos (App.js:245)
- **Rate Limiting**: 2-second delay between API calls (video_processor.py:224)
- **File Size Threshold**: 10MB for inline vs upload API (video_processor.py:167)
- **Video Chunk Duration**: 54 minutes (video_splitter.py)
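
The file size threshold can be pictured as a single comparison. This is a sketch under assumptions: it treats 10MB as 10 × 1024 × 1024 bytes and sends files exactly at the threshold through the upload API; the actual boundary handling in `video_processor.py` is not documented here, and `choose_transport` is a hypothetical name.

```python
INLINE_THRESHOLD_BYTES = 10 * 1024 * 1024  # 10MB, read as binary megabytes (assumed)


def choose_transport(size_bytes):
    """Pick how a video would be sent to the model, based on its size."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    # Strict "<" is an assumption about which side the boundary falls on.
    return "inline" if size_bytes < INLINE_THRESHOLD_BYTES else "upload_api"


print(choose_transport(2 * 1024 * 1024))
print(choose_transport(500 * 1024 * 1024))
```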

## Usage

### Local Development
1. Start backend: `cd backend && source venv/bin/activate && python3 run.py`
2. Start frontend: `cd frontend && npm start`
3. Open: http://localhost:3000

### Processing Videos

The application supports two processing modes, selected via a toggle at the top:

#### Single Video Mode (Default)
Process videos individually with per-video control:
1. **Select Mode**: Click "Single Video Mode" button at the top
2. **Upload**: Drag & drop videos or click to select files
3. **Choose Prompt**: Select a prompt template or write custom prompt
4. **Process Each Video**: Click individual "Process Video" button for each uploaded video
   - Videos show status: uploading → uploaded → processing → completed
   - Up to 2 videos process in parallel automatically
5. **Monitor Progress**: Watch real-time status updates and processing indicators
6. **Manage Queue**: Use Stop (⏸️), Retry (🔄), or Remove (🗑️) per video
7. **View Results**: Completed videos appear in "Processed Videos" section
8. **Download**: Click "Download PDF" or "Copy Formatted" for any result

#### Batch Mode
Process multiple related videos as one unified analysis:
1. **Select Mode**: Click "Batch Mode" button at the top
2. **Upload Videos**: Add multiple related videos (max 10 per batch)
3. **Arrange Videos**: Use Up/Down arrows to reorder videos in logical sequence
4. **Remove Unwanted**: Click Remove button to exclude videos from batch
5. **Choose Prompt**: Select or customize the analysis prompt
6. **Process Batch**: Click single "Process Batch" button to analyze all videos together
   - Backend automatically handles video splitting and chunking
   - Two-stage synthesis creates unified result across all videos
   - Multiple Mermaid diagrams merged into one comprehensive diagram
7. **View Results**: Single unified result appears for entire batch
8. **Download**: Generate PDF with combined analysis

**Key Differences**:
- **Single Mode**: Each video = separate result, manual per-video processing
- **Batch Mode**: All videos = one unified result, single batch processing
- **Explicit Control**: No auto-processing - all require button clicks

### Processing Long Videos
- Videos > 54 minutes automatically split into chunks
- Each chunk processed in parallel (backend handles this)
- Chunk results are combined into a single result by the second synthesis stage
- Processing time displayed for transparency
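
The flow above (parallel per-chunk processing followed by a combining step) can be sketched with the standard library. This is an illustration only: `summarize` and `combine` stand in for the model calls, `two_stage_synthesis` is a hypothetical function, and the worker count of 2 is an assumption rather than the backend's documented setting.

```python
from concurrent.futures import ThreadPoolExecutor


def two_stage_synthesis(chunks, summarize, combine, max_workers=2):
    """Stage 1: summarize each chunk in parallel. Stage 2: merge summaries."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order, so the summaries
        # stay aligned with the original chunk sequence.
        summaries = list(pool.map(summarize, chunks))
    return combine(summaries)


# Stand-in callables so the sketch runs without any API access.
result = two_stage_synthesis(
    ["intro", "demo", "q&a"],
    summarize=str.upper,
    combine=" | ".join,
)
print(result)
```

Keeping the summaries in chunk order matters for the second stage, since the unified result is meant to follow the video's timeline.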

## Development Utilities

- **restart.sh**: Quick development environment restart
- **backend/test_*.py**: API testing and validation scripts
- **backend/run.py**: Production server with optimized settings for large uploads
- **extract_user_logs*.sh**: Usage analytics extraction

## Security Features

- Azure AD B2C integration with JWT validation (optional)
- CORS protection with specific origin allowlisting
- Secure file upload validation
- Temporary file cleanup
- Token expiration handling
- Rate limiting to prevent API abuse
- Abort signal support for cancellation

## Troubleshooting

### Backend Issues
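- **400 INVALID_ARGUMENT**: Observed in this project mostly alongside rate limiting, although the Gemini API normally reports quota errors as 429 RESOURCE_EXHAUSTED and uses 400 for malformed requests; check the logs for details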
- **File upload errors**: Verify that ffmpeg and ffprobe are installed (`which ffprobe`)
- **PDF generation fails**: Ensure that wkhtmltopdf is installed

### Frontend Issues
- **CORS errors**: Check backend CORS settings in app.py
- **Changes not visible**: Clear browser cache (Ctrl+Shift+R)
- **Config not loading**: Verify config.js and config.local.js exist in public/

### Rate Limiting
- Backend: 2-second delay between API calls (automatic)
- Frontend: Max 2 parallel videos
- Free tier: 5 RPM limit enforced by Gemini API
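
A fixed delay between calls can be sketched as a minimum-interval limiter. The class below is an assumption-laden illustration, not the code in `video_processor.py`; `MinIntervalLimiter`, `max_calls_per_minute`, and `interval_for_rpm` are hypothetical names. Note that a 2-second gap by itself caps throughput at 30 calls per minute, so staying under a 5 RPM quota would also depend on how long each call takes, or on a longer interval (12 seconds for exactly 5 RPM).

```python
import time


class MinIntervalLimiter:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable for testing
        self._sleep = sleep  # injectable for testing
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def max_calls_per_minute(min_interval):
    """Upper bound on call rate implied by a fixed gap between calls."""
    return 60.0 / min_interval


def interval_for_rpm(rpm):
    """Gap in seconds needed to stay at or below a requests-per-minute quota."""
    return 60.0 / rpm


print(max_calls_per_minute(2.0))
print(interval_for_rpm(5))
```

In use, a caller would invoke `limiter.wait()` immediately before each API request.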

## License

This project is proprietary and confidential.