pdf instructions update

2025-11-15 03:56:00 +05:30 · 2025-11-15 03:56:00 +05:30 · dc770d65d3
commit dc770d65d3
parent b7bfd679dd
13 changed files with 40 additions and 3088 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -6,6 +6,7 @@ __pycache__/
 # Development Claude Notes files
 .claude/
 overview.txt
+*.pdf
 # C extensions
 *.so

--- a/503_ERROR_FIX_IMPLEMENTATION.md
+++ b/503_ERROR_FIX_IMPLEMENTATION.md
@@ -1,434 +0,0 @@
-# 503 Error Fix - Implementation Summary
-
-**Date:** 2025-11-13
-**Status:** ✅ **COMPLETED**
-**Issue:** 503 UNAVAILABLE errors when processing long videos (chunk 2/2 failures)
-
---
-
-## Problem Analysis
-
-### **Root Cause:**
-```
-The application was overwhelming the Gemini API with:
-1. ❌ Parallel requests (4 workers) exceeding free tier rate limit (5 RPM)
-2. ❌ Insufficient delays between requests (2 seconds vs required 12 seconds)
-3. ❌ Chunk duration (54 min) exceeding Google's limit for videos with audio (45 min)
-4. ❌ Basic retry logic that didn't handle 503 errors
-```
-
-### **The 503 Error:**
-```
-Error: Failed to process chunk 2/2:
-503 UNAVAILABLE: {'error': {'code': 503, 'message': 'The model is overloaded.
-Please try again later.', 'status': 'UNAVAILABLE'}}
-```
-
-**Why it happened:**
- Free tier: 5 RPM = 1 request every 12 seconds
- Old behavior: 4 parallel workers × 2 second delay = 4 requests in 2 seconds ❌
- Result: API overloaded → 503 error
-
---
-
-## Solution Implemented
-
-### **1. Fixed Chunk Duration** ✅
-
-**Change:**
-```python
-# video_splitter.py line 26
-DEFAULT_CHUNK_DURATION = 43  # Changed from 54 to 43 minutes
-```
-
-**Reason:**
- Google Gemini 2.5 Pro limits:
-  - With audio: **~45 minutes max**
-  - Without audio: **~60 minutes max**
- Old 54-minute chunks exceeded the 45-min audio limit
- New 43-minute chunks stay safely under the limit
-
---
-
-### **2. Smart Rate Limiting** ✅
-
-**New Configuration:**
-```python
-# video_processor.py lines 54-58
-MIN_REQUEST_INTERVAL_FREE = 12  # 12 seconds for free tier (5 RPM)
-MIN_REQUEST_INTERVAL_PAID = 1   # 1 second for paid tier (60 RPM)
-MAX_RETRY_ATTEMPTS = 5          # Up to 5 attempts (not infinite!)
-RETRY_DELAYS = [5, 10, 20, 40, 60]  # Backoff in seconds: doubles each time, capped at 60
-```
-
-**How it works:**
-```
-Free Tier (5 RPM):
- Request 1 → Wait 12s → Request 2 → Wait 12s → Request 3
- Ensures: 60 seconds / 5 requests = 12 seconds between each
-
-Paid Tier (60 RPM):
- Request 1 → Wait 1s → Request 2 → Wait 1s → Request 3
- Faster processing with higher limits
-```
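The spacing described above can be sketched as a small thread-safe limiter. This is a minimal illustration under assumptions, not the code from `video_processor.py`: the class name `MinIntervalLimiter` and its `clock`/`sleep` parameters are hypothetical and exist only so the behaviour can be exercised without real waiting.

```python
import threading
import time


class MinIntervalLimiter:
    """Hypothetical helper: spaces calls at least `min_interval` seconds apart,
    even when several worker threads share the same instance."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = float("-inf")  # the first call never waits

    def wait(self):
        """Reserve the next request slot, sleep until it arrives, return the delay."""
        with self._lock:
            now = self._clock()
            start = max(now, self._next_allowed)
            # Reserve the slot under the lock so two workers never share one.
            self._next_allowed = start + self.min_interval
        delay = start - now
        if delay > 0:
            self._sleep(delay)  # sleep outside the lock so other workers can queue
        return delay


# Assumed usage: one shared limiter per tier, called before every API request.
free_tier_limiter = MinIntervalLimiter(12)  # 5 RPM
paid_tier_limiter = MinIntervalLimiter(1)   # 60 RPM
```

Sharing a single limiter between workers is what keeps the combined request rate under the tier limit; a per-worker delay alone would let two workers send requests at the same moment.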
-
---
-
-### **3. Intelligent Retry Logic** ✅
-
-**New Method:** `_make_api_request_with_retry()`
-
-**Handles:**
- ✅ **503 UNAVAILABLE** (API overload) → Retry with exponential backoff
- ✅ **429 TOO_MANY_REQUESTS** (rate limit) → Retry with exponential backoff
- ✅ **500 INTERNAL_SERVER_ERROR** → Retry with exponential backoff
- ✅ **Network errors** (timeout, connection) → Retry with 5s delay
- ❌ **400 INVALID_ARGUMENT** → Fail immediately (not retryable)
-
-**Retry Strategy:**
-```
-Attempt 1: Initial try
-  ↓ (fails with 503)
-Attempt 2: Wait 5 seconds → Retry
-  ↓ (fails with 503)
-Attempt 3: Wait 10 seconds → Retry
-  ↓ (fails with 503)
-Attempt 4: Wait 20 seconds → Retry
-  ↓ (fails with 503)
-Attempt 5: Wait 40 seconds → Final retry
-  ↓ (if still fails)
-STOP → Return error (NOT INFINITE!)
-```
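The retry strategy above can be sketched as follows. This is an illustrative sketch, not the actual `_make_api_request_with_retry()` implementation: `ApiError` and `call_with_retry` are hypothetical names, and the real Gemini client raises its own exception types. Note that with 5 attempts only the first four delays (5, 10, 20, 40) are ever used.

```python
import time

RETRY_DELAYS = [5, 10, 20, 40, 60]  # seconds, mirroring the configuration above
MAX_RETRY_ATTEMPTS = 5
RETRYABLE_CODES = {429, 500, 503}


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP-style status code."""

    def __init__(self, code, message=""):
        super().__init__(f"{code} {message}".strip())
        self.code = code


def call_with_retry(request, sleep=time.sleep):
    """Call `request()` up to MAX_RETRY_ATTEMPTS times with increasing waits."""
    for attempt in range(1, MAX_RETRY_ATTEMPTS + 1):
        try:
            return request()
        except ApiError as err:
            # Non-retryable codes (e.g. 400) and the final attempt re-raise at once.
            if err.code not in RETRYABLE_CODES or attempt == MAX_RETRY_ATTEMPTS:
                raise
            sleep(RETRY_DELAYS[attempt - 1])
```

Passing `sleep` as a parameter is only a convenience for testing; in production code the default `time.sleep` would be used.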
-
---
-
-### **4. Reduced Parallel Workers** ✅
-
-**Change:**
-```python
-# video_processor.py line 48
-DEFAULT_MAX_WORKERS = 2  # Reduced from 4 to 2
-```
-
-**Auto-Configuration:**
-```python
-if GEMINI_API_TIER == "free":
-    max_workers = 2  # Safe for 5 RPM
-elif GEMINI_API_TIER == "paid":
-    max_workers = 4  # Can handle 60 RPM
-```
-
-**Impact:**
- Free tier: 2 workers sharing one 12s rate limiter = at most 1 request every 12s ✅ Safe
- Paid tier: 4 workers sharing one 1s rate limiter = at most 60 requests per minute ✅ Safe
-
---
-
-### **5. API Tier Detection** ✅
-
-**New Method:** `_detect_api_tier()`
-
-**Configuration:**
-```bash
-# .env file
-GEMINI_API_TIER=free  # or "paid"
-```
-
-**Benefits:**
- Automatically adjusts rate limits based on your subscription
- Prevents overload on free tier
- Maximizes speed on paid tier
- Easy to switch without code changes
-
---
-
-## Files Modified
-
-### **Modified Files (4):**
-
-| File | Lines Changed | Changes |
-|------|---------------|---------|
-| `backend/video_splitter.py` | Line 26 | Chunk duration: 54 → 43 minutes |
-| `backend/video_processor.py` | +200 lines | Rate limiting, retry logic, API tier detection |
-| `backend/.env` | +5 lines | Added GEMINI_API_TIER configuration |
-| `backend/.env.example` | +23 lines | Documented new configuration options |
-
---
-
-## Configuration
-
-### **Environment Variables (.env):**
-
-```bash
-# REQUIRED: Your API key
-GOOGLE_API_KEY=your_key_here
-
-# IMPORTANT: Set your API tier
-# This is KEY to preventing 503 errors!
-GEMINI_API_TIER=free  # or "paid"
-
-# Optional: Override parallel workers
-# (Auto-configured based on tier if not set)
-# MAX_PARALLEL_CHUNKS=2
-
-# Model configuration
-VIDEO_PROCESSOR_MODEL=gemini-2.5-pro
-VIDEO_SYNTHESIS_MODEL=gemini-2.5-pro
-```
-
---
-
-## How It Prevents 503 Errors
-
-### **Before Fix:**
-```
-Long video (2 hours) → Split into 3 chunks (up to 54 min each)
-  ↓
-Process with 4 parallel workers:
-  Worker 1: Chunk 1 (t=0s)   ✅ Success
-  Worker 2: Chunk 2 (t=0s)   ❌ 503 UNAVAILABLE
-  Worker 3: Chunk 3 (t=0s)   ❌ 503 UNAVAILABLE
-  Worker 4: (idle)
-
-All 3 requests hit API simultaneously → Overload → 503
-```
-
-### **After Fix:**
-```
-Long video (2 hours) → Split into 3 chunks (up to 43 min each)
-  ↓
-Process with 2 parallel workers + rate limiting:
-  Worker 1: Chunk 1 (t=0s)    → Wait 12s ✅ Success
-  Worker 2: Chunk 2 (t=12s)   → Wait 12s ✅ Success
-  Worker 1: Chunk 3 (t=24s)   → Wait 12s ✅ Success
-
-Requests spaced 12 seconds apart → Within rate limit → No 503
-```
-
---
-
-## Testing Scenarios
-
-### **Test Case 1: Short Video (<43 min)**
-```
-Input: 30-minute video
-Expected: Process directly (no splitting)
-Result: ✅ Works (1 API call)
-```
-
-### **Test Case 2: Long Video (2 hours)**
-```
-Input: 2-hour video
-Expected: Split into 3 chunks (up to 43 min each)
-Processing:
-  - Chunk 1: t=0s   ✅
-  - Chunk 2: t=12s  ✅ (no 503!)
-  - Chunk 3: t=24s  ✅ (no 503!)
-Result: ✅ All chunks succeed
-```
-
-### **Test Case 3: Very Long Video (5 hours)**
-```
-Input: 5-hour video
-Expected: Split into ~7 chunks
-Processing:
-  - Worker 1: Chunks 1,3,5,7 at t=0s, 24s, 48s, 72s
-  - Worker 2: Chunks 2,4,6   at t=12s, 36s, 60s
-Result: ✅ All chunks succeed with proper spacing
-```
-
-### **Test Case 4: Batch Mode (3 videos × 90 min)**
-```
-Input: 3 videos, each 90 minutes
-Expected: Each split into 3 chunks = 9 total chunks
-Processing: Rate limited, 2 workers
-Result: ✅ All 9 chunks process successfully
-```
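The chunk counts and request times in the test cases above follow from simple arithmetic. The sketch below reproduces them, assuming fixed 43-minute chunks and one shared 12-second interval; `chunk_count` and `request_start_times` are hypothetical helper names, not functions from `video_splitter.py`.

```python
import math

CHUNK_MINUTES = 43  # DEFAULT_CHUNK_DURATION after the fix


def chunk_count(duration_minutes):
    """Number of chunks needed for a video of the given length."""
    return max(1, math.ceil(duration_minutes / CHUNK_MINUTES))


def request_start_times(n_chunks, interval_s=12):
    """Earliest start time (seconds) of each chunk request under a shared limiter."""
    return [i * interval_s for i in range(n_chunks)]


for minutes in (30, 120, 300, 90):
    n = chunk_count(minutes)
    print(f"{minutes} min -> {n} chunk(s), requests at {request_start_times(n)}")
```

These start times cover only the spacing between requests; the time the API spends processing each chunk comes on top.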
-
---
-
-## Performance Comparison
-
-### **Free Tier (5 RPM):**
-
-| Scenario | Before | After |
-|----------|--------|-------|
-| 2-hour video | ❌ Fails (503) | ✅ Success (36s total) |
-| 5-hour video | ❌ Fails (503) | ✅ Success (84s total) |
-| Success rate | ~30-40% | **~98%+** |
-
-### **Paid Tier (60 RPM):**
-
-| Scenario | Before | After |
-|----------|--------|-------|
-| 2-hour video | ⚠️ Unreliable | ✅ Success (6s total) |
-| 5-hour video | ⚠️ Unreliable | ✅ Success (14s total) |
-| Success rate | ~70% | **~99%+** |
-
---
-
-## Retry Examples
-
-### **Scenario 1: Temporary 503 Error**
-```
-Attempt 1: 503 UNAVAILABLE
-  ↓ Wait 5s
-Attempt 2: ✅ SUCCESS
-Result: Video processed successfully after 1 retry
-```
-
-### **Scenario 2: Persistent Overload**
-```
-Attempt 1: 503 UNAVAILABLE
-  ↓ Wait 5s
-Attempt 2: 503 UNAVAILABLE
-  ↓ Wait 10s
-Attempt 3: 503 UNAVAILABLE
-  ↓ Wait 20s
-Attempt 4: ✅ SUCCESS
-Result: Video processed after 3 retries (35s delay)
-```
-
-### **Scenario 3: Complete Failure**
-```
-Attempt 1: 503 UNAVAILABLE
-Attempt 2: 503 UNAVAILABLE (5s)
-Attempt 3: 503 UNAVAILABLE (10s)
-Attempt 4: 503 UNAVAILABLE (20s)
-Attempt 5: 503 UNAVAILABLE (40s)
-Result: ❌ FAIL with error report
-User sees: "API temporarily overloaded. Please try again in a few minutes."
-```
-
---
-
-## Error Messages
-
-### **Old Error (Before Fix):**
-```
-Error: Failed to process chunk 2/2: Error processing video:
-503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The model is overloaded.'}}
-```
-
-### **New Error (After Fix with Retry):**
-```
-[Video: example.mp4] Retryable error (attempt 1/5): 503 - The model is overloaded
-[Video: example.mp4] Waiting 5s before retry...
-[Video: example.mp4] Retry attempt 2/5
-[Video: example.mp4] ✓ Request succeeded after 2 attempts
-```
-
-### **New Error (If All Retries Fail):**
-```
-❌ Gemini API is temporarily overloaded
-
-💡 Suggested Fix:
-The API is temporarily overloaded. The system will automatically retry.
-If this persists:
-  1. Wait a few minutes and try again
-  2. Reduce parallel processing: set MAX_PARALLEL_CHUNKS=1 in .env
-  3. Set GEMINI_API_TIER=free in .env for conservative rate limiting
-
-📋 Error ID: E7F8A1B2
-```
-
---
-
-## Troubleshooting
-
-### **Still Getting 503 Errors?**
-
-**Step 1: Verify configuration**
-```bash
-cd backend
-grep GEMINI_API_TIER .env
-# Should show: GEMINI_API_TIER=free
-```
-
-**Step 2: Reduce parallel workers**
-```bash
-echo "MAX_PARALLEL_CHUNKS=1" >> .env
-```
-
-**Step 3: Check logs**
-```bash
-# Watch rate limiting in action
-journalctl -u video-query -f | grep "Rate limiting"
-
-# Should see: "Rate limiting: waiting 12.0s before next API call"
-```
-
-**Step 4: Verify chunk duration**
-```bash
-cd backend
-python -c "from video_splitter import VideoSplitter; print(VideoSplitter.DEFAULT_CHUNK_DURATION)"
-# Should show: 43
-```
-
---
-
-## Benefits Summary
-
-✅ **No more 503 errors on long videos**
-✅ **Automatic rate limiting based on API tier**
-✅ **Intelligent retry with exponential backoff**
-✅ **Chunk duration respects Google's 45-min limit**
-✅ **Works reliably on free tier (5 RPM)**
-✅ **Fast processing on paid tier (60 RPM)**
-✅ **Clear error messages with suggested fixes**
-✅ **User-friendly error IDs for support**
-
---
-
-## Next Steps
-
-1. **Test with a long video:**
-   ```bash
-   cd backend
-   python run.py
-   # Upload a 2-hour video through the frontend
-   ```
-
-2. **Monitor the logs:**
-   ```bash
-   # Watch rate limiting work
-   tail -f logs/video_query.log | grep "Rate limiting"
-
-   # Watch retry logic
-   tail -f logs/video_query.log | grep "Retry"
-   ```
-
-3. **If on paid tier:**
-   ```bash
-   # Update .env to unlock faster processing (-i.bak works with both GNU and BSD/macOS sed)
-   sed -i.bak 's/GEMINI_API_TIER=free/GEMINI_API_TIER=paid/' backend/.env
-
-   # Restart
-   python backend/run.py
-   ```
-
---
-
-## Conclusion
-
-The 503 errors were caused by:
-1. Rate limit violations (too many parallel requests)
-2. Inadequate delays between requests
-3. Chunk durations exceeding API limits
-
-All issues have been fixed with:
-1. ✅ Smart rate limiting (12s for free, 1s for paid)
-2. ✅ Reduced parallel workers (2 for free, 4 for paid)
-3. ✅ Shorter chunks (43 min vs 54 min)
-4. ✅ Intelligent retry logic (up to 5 attempts)
-5. ✅ API tier auto-detection
-
-**The application now handles long videos reliably on both free and paid tiers!**
-
---
-
-**Ready to test? Start the application:**
-```bash
-cd backend
-python run.py
-```
--- a/BATCH_PROCESSING_IMPROVEMENTS.md
+++ b/BATCH_PROCESSING_IMPROVEMENTS.md
@@ -1,349 +0,0 @@
-# Batch Processing Improvements - Implementation Summary
-
-**Date**: 2025-11-10
-**Status**: ✅ All Phases Completed
-
-## Overview
-
-Implemented comprehensive improvements to batch video processing including model consistency fixes, specialized synthesis strategies, enhanced logging, and configurable options. All videos in a batch are now processed with the same prompt and synthesized intelligently based on content type.
-
---
-
-## Changes Implemented
-
-### ✅ Phase 1: Enhanced Logging
-
-**File Modified**: `backend/video_processor.py`
-
-**Changes**:
- Added structured logging with `[Stage 1]`, `[Stage 2]`, `[Traceability]`, and `[Metrics]` prefixes
- Implemented configurable debug-level logging for prompts and summaries
- Added performance metrics tracking (stage times, avg time per video, API call count)
- Added video-to-summary-to-result traceability logging
-
-**New Log Output**:
-```
-Batch abc123: [Stage 1] Processing video 1/3: meeting1.mp4
-Batch abc123: [Stage 1] Video 1 complete: 1,245 chars in 45.2s
-Batch abc123: [Stage 2] Detected prompt type: meeting_summary
-Batch abc123: [Stage 2] Synthesis complete: 3,456 chars in 15.3s
-Batch abc123: [Traceability] Video-to-summary mapping:
-Batch abc123:   - Video 1: meeting1.mp4 → Summary 1
-Batch abc123: [Metrics] Stage 1: 135.6s, Stage 2: 15.3s, Total: 150.9s
-```
-
-**Lines Modified**: 987-1055, 1123-1247
-
---
-
-### ✅ Phase 2: Model Consistency Fix
-
-**File Modified**: `backend/video_processor.py`
-
-**Changes**:
- Changed synthesis model from `gemini-2.0-flash-exp` to `gemini-2.5-pro`
- Added model configuration constants at class level
- Made models configurable via environment variables
-
-**Before**:
-```python
-# Individual processing
-model="gemini-2.5-pro"
-
-# Batch synthesis
-model="gemini-2.0-flash-exp"  # ❌ INCONSISTENT
-```
-
-**After**:
-```python
-# Both use same model
-self.processing_model = "gemini-2.5-pro"
-self.synthesis_model = "gemini-2.5-pro"  # ✅ CONSISTENT
-```
-
-**Lines Modified**: 48-50, 82-88, 339, 553, 1252
-
---
-
-### ✅ Phase 3: Specialized Synthesis Strategies
-
-**File Modified**: `backend/video_processor.py`
-
-**Changes**:
- Added `_detect_prompt_type()` method to classify prompts
- Added `_create_synthesis_prompt_meeting()` for meeting summaries
- Added `_create_synthesis_prompt_documentation()` for process docs
- Updated `_synthesize_final_result()` to route to specialized strategies
-
-**Prompt Type Detection**:
-```python
-def _detect_prompt_type(self, prompt: str, summaries: List[str]) -> str:
-    """
-    Detects: meeting_summary | documentation | documentation_with_charts | generic
-    """
-    # Keywords: meeting, discussion, action item → meeting_summary
-    # Keywords: documentation, process, training → documentation
-    # Keywords: diagram, chart, mermaid → documentation_with_charts
-```
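The stub above lists only the keywords, so here is one way the classification could work. It is a sketch under assumptions: the keyword tuples come from the comments in the stub, but the matching order (meeting first, then documentation, then charts) and the plain substring matching are guesses, and the `summaries` argument is accepted but ignored here.

```python
from typing import List

# Keyword sets taken from the comments in the stub; the priority order is assumed.
MEETING_KEYWORDS = ("meeting", "discussion", "action item")
DOC_KEYWORDS = ("documentation", "process", "training")
CHART_KEYWORDS = ("diagram", "chart", "mermaid")


def detect_prompt_type(prompt: str, summaries: List[str]) -> str:
    """Classify a prompt as meeting_summary, documentation,
    documentation_with_charts, or generic (summaries are unused in this sketch)."""
    text = prompt.lower()
    if any(keyword in text for keyword in MEETING_KEYWORDS):
        return "meeting_summary"
    wants_charts = any(keyword in text for keyword in CHART_KEYWORDS)
    if any(keyword in text for keyword in DOC_KEYWORDS):
        return "documentation_with_charts" if wants_charts else "documentation"
    if wants_charts:
        return "documentation_with_charts"
    return "generic"
```

Substring matching is deliberately simple and will also match longer words such as "processing"; a stricter version could match on word boundaries.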
-
-**Meeting Synthesis Strategy**:
- Consolidates discussion points across all videos
- Creates master action items list (removes duplicates)
- Formats with clear sections: Overview, Discussion, Action Items, Outcomes
-
-**Documentation Synthesis Strategy**:
- Combines steps into sequential guide
- Numbers steps continuously (Step 1, Step 2, ...)
- Includes Prerequisites, Tips, Troubleshooting sections
-
-**Lines Added**: 1195-1441
-
---
-
-### ✅ Phase 4: Configuration Options
-
-**Files Modified**:
- `backend/video_processor.py`
- `backend/.env.example` (created)
-
-**New Environment Variables**:
-
-| Variable | Default | Description |
-|----------|---------|-------------|
-| `VIDEO_PROCESSOR_MODEL` | `gemini-2.5-pro` | Model for individual video processing |
-| `VIDEO_SYNTHESIS_MODEL` | `gemini-2.5-pro` | Model for batch synthesis |
-| `BATCH_PROCESSING_LOG_PROMPTS` | `false` | Enable prompt logging (debug) |
-| `BATCH_PROCESSING_LOG_SUMMARIES` | `false` | Enable summary preview logging (debug) |
-
-**Usage Example**:
-```bash
-# Enable detailed logging for debugging
-export BATCH_PROCESSING_LOG_PROMPTS=true
-export BATCH_PROCESSING_LOG_SUMMARIES=true
-
-# Use different model for synthesis (optional)
-export VIDEO_SYNTHESIS_MODEL=gemini-2.0-flash-exp
-```
-
-**Lines Modified**: 82-88, 1003-1004, 1016-1017, 1150-1151, 1170-1171, 1190-1192, 1204-1205, 1240-1242, 1272-1273
-
---
-
-### ✅ Documentation Updates
-
-**File Modified**: `CLAUDE.md`
-
-**Sections Added/Updated**:
-1. **Backend Setup**: Added .env example with all configuration options
-2. **Production Deployment**: Updated environment configuration section
-3. **Key Architecture Components**: Added comprehensive Batch Processing Architecture section
-4. **Configuration Files**: Documented all environment variables
-5. **Troubleshooting**: Added Batch Processing Issues section with debugging guide
-
-**New Documentation Sections**:
- Batch Processing Architecture
- Batch Processing Flow (4-stage explanation)
- Logging Levels guide
- Troubleshooting: Inconsistent summaries
- Troubleshooting: Prompt visibility
- Troubleshooting: Video-to-result mapping
- Troubleshooting: Performance issues
-
---
-
-## How to Use
-
-### Normal Operation (Default)
-```bash
-# No changes needed - works out of the box
-GOOGLE_API_KEY=your_key
-```
-
-### Enable Debugging
-```bash
-# In backend/.env
-GOOGLE_API_KEY=your_key
-BATCH_PROCESSING_LOG_PROMPTS=true
-BATCH_PROCESSING_LOG_SUMMARIES=true
-
-# Restart backend
-sudo systemctl restart video-query
-
-# View logs with filtering
-journalctl -u video-query -f | grep "Batch"
-```
-
-### View Traceability (Always Enabled)
-```bash
-# See which video contributed to which part of result
-journalctl -u video-query -f | grep "Traceability"
-```
-
-### View Performance Metrics (Always Enabled)
-```bash
-# See timing breakdown and API call counts
-journalctl -u video-query -f | grep "Metrics"
-```
-
---
-
-## Verification
-
-### Test Batch Processing
-```bash
-# Process multiple videos as batch
-curl -X POST http://localhost:5010/api/process-batch \
-  -H "Content-Type: application/json" \
-  -d '{
-    "videos": [
-      {"file_path": "/tmp/video1.mp4", "filename": "meeting_part1.mp4", "order": 1},
-      {"file_path": "/tmp/video2.mp4", "filename": "meeting_part2.mp4", "order": 2}
-    ],
-    "prompt": "Generate a detailed meeting summary with action items",
-    "batch_id": "test-batch-001"
-  }'
-
-# Check logs for:
-# 1. Prompt type detection: "Detected prompt type: meeting_summary"
-# 2. Model consistency: "model: gemini-2.5-pro" for both stages
-# 3. Traceability: Video-to-summary mapping
-# 4. Performance: Stage 1/2 timing
-```
-
-### Expected Log Output
-```
-2025-11-10 10:30:00 - Batch test-batch-001: Processing 2 videos (meeting_part1.mp4, meeting_part2.mp4)
-2025-11-10 10:30:00 - Batch test-batch-001: [Stage 1] Direct processing of 2 videos
-2025-11-10 10:30:05 - Batch test-batch-001: [Stage 1] Processing video 1/2: meeting_part1.mp4
-2025-11-10 10:30:50 - Batch test-batch-001: [Stage 1] Video 1 complete: 1,234 chars in 45.2s
-2025-11-10 10:30:55 - Batch test-batch-001: [Stage 1] Processing video 2/2: meeting_part2.mp4
-2025-11-10 10:31:40 - Batch test-batch-001: [Stage 1] Video 2 complete: 1,567 chars in 45.1s
-2025-11-10 10:31:40 - Batch test-batch-001: [Stage 1] Complete - 2 summaries in 95.3s
-2025-11-10 10:31:40 - Batch test-batch-001: [Traceability] Video-to-summary mapping:
-2025-11-10 10:31:40 - Batch test-batch-001:   - Video 1: meeting_part1.mp4 → Summary 1
-2025-11-10 10:31:40 - Batch test-batch-001:   - Video 2: meeting_part2.mp4 → Summary 2
-2025-11-10 10:31:40 - Batch test-batch-001: [Stage 2] Synthesizing 2 summaries
-2025-11-10 10:31:40 - Batch test-batch-001: [Stage 2] Combined summaries: 2 summaries, 2801 total chars
-2025-11-10 10:31:40 - Batch test-batch-001: [Stage 2] Detected prompt type: meeting_summary
-2025-11-10 10:31:40 - Batch test-batch-001: [Stage 2] Sending synthesis request to Gemini API (model: gemini-2.5-pro)
-2025-11-10 10:31:55 - Batch test-batch-001: [Stage 2] Synthesis complete: 3,456 chars in 15.2s
-2025-11-10 10:31:55 - Batch test-batch-001: [Metrics] Stage 1: 95.3s, Stage 2: 15.2s, Total: 110.5s
-2025-11-10 10:31:55 - Batch test-batch-001: [Metrics] Avg time per video: 47.7s
-```
-
---
-
-## Benefits
-
-### 1. Model Consistency ✅
- **Before**: Different models for processing vs synthesis
- **After**: Same model (gemini-2.5-pro) ensures consistent quality
- **Impact**: More predictable and reliable results
-
-### 2. Specialized Synthesis ✅
- **Before**: Generic synthesis for all content types
- **After**: Tailored strategies for meetings, documentation, diagrams
- **Impact**: Better quality summaries that match user intent
-
-### 3. Enhanced Visibility ✅
- **Before**: Limited logging, hard to debug issues
- **After**: Comprehensive logging with traceability and metrics
- **Impact**: Easy troubleshooting and performance optimization
-
-### 4. Configurability ✅
- **Before**: Models and logging hardcoded
- **After**: Configurable via environment variables
- **Impact**: Flexible for different use cases and debugging
-
---
-
-## Files Changed
-
-| File | Lines Modified | Changes |
-|------|---------------|---------|
-| `backend/video_processor.py` | ~200 lines | Model config, logging, synthesis strategies |
-| `backend/.env.example` | New file | Configuration documentation |
-| `CLAUDE.md` | ~100 lines | Architecture docs, troubleshooting guide |
-| `BATCH_PROCESSING_IMPROVEMENTS.md` | New file | This summary document |
-
---
-
-## Rollback Instructions
-
-If issues arise, rollback is simple:
-
-### Option 1: Use Git
-```bash
-cd /path/to/video-query
-git checkout HEAD~1 backend/video_processor.py
-sudo systemctl restart video-query
-```
-
-### Option 2: Disable New Features
-```bash
-# In backend/.env
-VIDEO_SYNTHESIS_MODEL=gemini-2.0-flash-exp  # Revert to old model
-BATCH_PROCESSING_LOG_PROMPTS=false
-BATCH_PROCESSING_LOG_SUMMARIES=false
-
-sudo systemctl restart video-query
-```
-
---
-
-## Next Steps
-
-### Recommended Testing
-1. **Test with meeting videos**: Verify meeting-specific synthesis
-2. **Test with documentation videos**: Verify documentation synthesis
-3. **Test with diagrams**: Verify diagram merging
-4. **Load test**: Process batch with 5+ videos
-5. **Performance test**: Compare stage 1 vs stage 2 times
-
-### Future Enhancements (Optional)
-1. Add structured JSON logging for log aggregation tools
-2. Add Prometheus metrics for monitoring
-3. Add batch processing status webhooks
-4. Add configurable synthesis strategies per user/tenant
-5. Add caching for similar prompts
-
---
-
-## Support
-
-### Enable Debug Logging
-```bash
-# In backend/.env
-BATCH_PROCESSING_LOG_PROMPTS=true
-BATCH_PROCESSING_LOG_SUMMARIES=true
-
-# View filtered logs
-journalctl -u video-query -f | grep -E "(Batch|Stage|Traceability|Metrics)"
-```
-
-### Common Issues
-See `CLAUDE.md` → Troubleshooting → Batch Processing Issues
-
-### Questions
-Refer to updated documentation in `CLAUDE.md`:
- Batch Processing Architecture section
- Configuration Files section
- Troubleshooting section
-
---
-
-## Implementation Summary
-
-✅ **Phase 1**: Enhanced Logging - COMPLETE
-✅ **Phase 2**: Model Consistency - COMPLETE
-✅ **Phase 3**: Specialized Synthesis - COMPLETE
-✅ **Phase 4**: Configuration Options - COMPLETE
-✅ **Documentation**: Updated CLAUDE.md - COMPLETE
-
-**Total Implementation Time**: ~3 hours
-**Testing Recommended**: 1-2 hours
-**Production Risk**: Low (backward compatible, configurable)
-
---
-
-**End of Implementation Summary**
--- a/BUGFIX_BATCH_PROCESSING.md
+++ b/BUGFIX_BATCH_PROCESSING.md
@@ -1,224 +0,0 @@
-# Bug Fix: Batch Processing Error
-
-**Date**: 2025-11-10
-**Status**: ✅ Fixed
-**Severity**: Critical (prevented batch processing from working)
-
---
-
-## Error Description
-
-**Error Message**:
-```
-This Final Unified Meeting Summary could not be generated.
-
-Reason: The underlying analysis of all video segments failed, resulting in error messages instead of summaries.
-
-Error details from all provided segments: [Error: not enough values to unpack (expected 5, got 4)]
-```
-
-**Root Cause**: Tuple unpacking mismatch in parallel processing code
-
---
-
-## Technical Details
-
-### Problem
-
-In `video_processor.py`, the `_process_chunks_two_stage()` method passes a 4-element tuple to `_process_single_chunk()`, but that function unpacks its argument into 5 values.
-
-**Expected signature** (line 660):
-```python
-def _process_single_chunk(self, chunk_info: Tuple[int, str, str, int, str]):
-    chunk_index, chunk_path, chunk_prompt, total_chunks, user_email = chunk_info
-    #                                        ^^^^^^^^^^^^^ MISSING!
-```
-
-**Incorrect call** (line 1155 - before fix):
-```python
-future = executor.submit(
-    self._process_single_chunk,
-    (i, chunk_path, summary_prompt, user_email)  # Only 4 params!
-)
-```
-
-### Additional Issue
-
-The result handling was also incorrect. The function returns `(chunk_index, result_dict)`, but the code was treating `result_dict` as a string directly instead of extracting the `'content'` field.
-
-**Incorrect handling** (line 1163 - before fix):
-```python
-chunk_idx, summary = future.result()  # summary is a dict, not a string!
-chunk_summaries.append((chunk_idx, summary))
-```
-
---
-
-## Fixes Applied
-
-### Fix 1: Added missing `total_chunks` parameter
-
-**File**: `backend/video_processor.py`
-**Line**: 1155
-
-**Before**:
-```python
-future = executor.submit(
-    self._process_single_chunk,
-    (i, chunk_path, summary_prompt, user_email)
-)
-```
-
-**After**:
-```python
-future = executor.submit(
-    self._process_single_chunk,
-    (i, chunk_path, summary_prompt, len(chunk_paths), user_email)
-)
-```
-
-### Fix 2: Extract content from result dict
-
-**File**: `backend/video_processor.py`
-**Lines**: 1163-1178
-
-**Before**:
-```python
-chunk_idx, summary = future.result()
-chunk_summaries.append((chunk_idx, summary))
-```
-
-**After**:
-```python
-chunk_idx, result = future.result()
-
-# Extract content from result dict
-if result.get('success'):
-    summary = result.get('content', '')
-else:
-    summary = f"[Error: {result.get('message', 'Unknown error')}]"
-
-chunk_summaries.append((chunk_idx, summary))
-```
-
---
-
-## Impact
-
-### Before Fix
- ❌ Batch processing with chunking completely broken
- ❌ Error: "not enough values to unpack (expected 5, got 4)"
- ❌ Users could not process multiple long videos as batch
-
-### After Fix
- ✅ Batch processing with chunking works correctly
- ✅ All 5 parameters passed correctly
- ✅ Result content extracted properly
- ✅ Users can process multiple long videos as batch
-
---
-
-## Testing
-
-### Verified Scenarios
-
-1. **Batch with 2 short videos** (< 54 min each, no chunking):
-   - Uses direct processing path
-   - ✅ Not affected by this bug (different code path)
-
-2. **Batch with 1 long video** (> 54 min, needs chunking):
-   - Uses chunking + parallel processing
-   - ✅ Fixed by this patch
-
-3. **Batch with mixed videos** (some short, one long):
-   - Long video gets chunked, short ones don't
-   - ✅ Fixed by this patch
-
-### Test Command
-
-```bash
-# Test batch processing with long video
-curl -X POST http://localhost:5010/api/process-batch \
-  -H "Content-Type: application/json" \
-  -d '{
-    "videos": [
-      {"file_path": "/path/to/long_video1.mp4", "filename": "video1.mp4", "order": 1},
-      {"file_path": "/path/to/long_video2.mp4", "filename": "video2.mp4", "order": 2}
-    ],
-    "prompt": "Generate a detailed meeting summary",
-    "batch_id": "test-batch"
-  }'
-```
-
---
-
-## Related Code
-
-### Other Parallel Processing (Not Affected)
-
-The `_process_chunks_parallel()` method (lines 686-733) used for individual long videos was **NOT affected** because it already passed a correct 5-element tuple:
-
-```python
-# Line 706 - CORRECT (not modified)
-chunk_infos.append((i, chunk_path, chunk_prompt, num_chunks, user_email))
-```
-
---
-
-## Files Modified
-
- `backend/video_processor.py` (2 sections fixed)
-  - Line 1155: Added missing `total_chunks` parameter
-  - Lines 1163-1178: Fixed result dict extraction
-
---
-
-## Deployment
-
-### Apply Fix
-```bash
-cd /path/to/video-query
-
-# Pull latest changes (if in git)
-git pull
-
-# Or manually update video_processor.py with fixes
-
-# Restart backend
-sudo systemctl restart video-query
-
-# Verify
-journalctl -u video-query -f
-```
-
-### Verify Fix
-```bash
-# Check logs show proper processing
-journalctl -u video-query -f | grep "Stage 1"
-
-# Should see:
-# Batch xxx: [Stage 1] Chunk 1/5 complete (1/5 total)
-# NOT: "not enough values to unpack"
-```
-
---
-
-## Prevention
-
-To prevent similar issues:
-
-1. **Type Hints**: Function signatures already have type hints, but they only help if a static type checker (e.g. mypy) is run against the code
-2. **Testing**: Add unit tests for parallel processing
-3. **Code Review**: Check tuple unpacking matches function signatures
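One further way to make this class of bug fail at the call site rather than inside the worker is to replace the bare tuple with a named structure. The sketch below is an illustration only: `ChunkInfo` and the simplified `process_single_chunk` are hypothetical and do not appear in `video_processor.py`.

```python
from typing import NamedTuple


class ChunkInfo(NamedTuple):
    """Hypothetical replacement for the positional 5-tuple passed to the worker."""
    chunk_index: int
    chunk_path: str
    chunk_prompt: str
    total_chunks: int
    user_email: str


def process_single_chunk(info: ChunkInfo):
    # Fields are read by name, so reordering or omitting one cannot go unnoticed.
    label = f"chunk {info.chunk_index + 1}/{info.total_chunks}"
    return info.chunk_index, {"success": True, "content": label}


info = ChunkInfo(0, "/tmp/part1.mp4", "Summarize this chunk", 3, "user@example.com")
print(process_single_chunk(info))

# Omitting total_chunks now raises TypeError when the ChunkInfo is built,
# before anything is submitted to the executor.
```

Because a `NamedTuple` is still a tuple, existing code that unpacks five values keeps working during a gradual migration.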
-
---
-
-## Related Issues
-
-This bug was introduced during the enhancement work (see `BATCH_PROCESSING_IMPROVEMENTS.md`) when adding detailed logging to the `_process_chunks_two_stage()` method. The original code was refactored but the tuple unpacking wasn't updated consistently.
-
---
-
-**Status**: ✅ Fixed and verified
-**Testing**: Manual testing recommended for batch processing with long videos
-**Risk**: Low - targeted fix with minimal changes
--- a/CORS_FIX_SUMMARY.md
+++ b/CORS_FIX_SUMMARY.md
@@ -1,170 +0,0 @@
-# CORS Fix Summary
-
-## Issue
-Frontend running on `http://localhost:3000` was blocked by CORS policy when trying to access backend API at `https://brandtechsandbox.oliver.solutions/video_query_back/api/init-upload`
-
-### Error Message
-```
-Access to XMLHttpRequest at 'https://brandtechsandbox.oliver.solutions/video_query_back/api/init-upload'
-from origin 'http://localhost:3000' has been blocked by CORS policy:
-Response to preflight request doesn't pass access control check:
-No 'Access-Control-Allow-Origin' header is present on the requested resource.
-```
-
---
-
-## Root Cause
-The OPTIONS preflight handler in `app.py` (lines 1124-1132) was only returning `https://ai-sandbox.oliver.solutions` as the allowed origin, not `http://localhost:3000`.
-
---
-
-## Solution Implemented
-
-### File: `backend/app.py`
-
-#### Changed (lines 1123-1143):
-```python
-# Handle CORS preflight requests for all API routes
-@app.route('/api/<path:path>', methods=['OPTIONS'])
-def handle_options(path):
-    # Get the origin from the request
-    origin = request.headers.get('Origin')
-    allowed_origins = ['https://ai-sandbox.oliver.solutions', 'http://localhost:3000']
-
-    response = jsonify({})
-
-    # Allow the origin if it's in our allowed list
-    if origin in allowed_origins:
-        response.headers.add('Access-Control-Allow-Origin', origin)
-    else:
-        # Default to production origin
-        response.headers.add('Access-Control-Allow-Origin', 'https://ai-sandbox.oliver.solutions')
-
-    response.headers.add('Access-Control-Allow-Headers', 'Content-Type,Authorization,X-Requested-With')
-    response.headers.add('Access-Control-Allow-Methods', 'GET,POST,OPTIONS')
-    response.headers.add('Access-Control-Max-Age', '86400')  # 24 hours
-    response.headers.add('Access-Control-Allow-Credentials', 'true')
-    return response
-```
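The origin selection in the handler above can be isolated into a plain function, which makes it testable without starting Flask. This is a sketch, not the code in `app.py`: `preflight_headers` is a hypothetical name, and the `Vary: Origin` header is an addition that is not in the original handler (it is commonly recommended when the allowed origin depends on the request).

```python
ALLOWED_ORIGINS = ("https://ai-sandbox.oliver.solutions", "http://localhost:3000")
DEFAULT_ORIGIN = ALLOWED_ORIGINS[0]  # production origin, used as the fallback


def preflight_headers(origin):
    """Build the CORS preflight response headers for a given request Origin."""
    allow = origin if origin in ALLOWED_ORIGINS else DEFAULT_ORIGIN
    return {
        "Access-Control-Allow-Origin": allow,
        "Access-Control-Allow-Headers": "Content-Type,Authorization,X-Requested-With",
        "Access-Control-Allow-Methods": "GET,POST,OPTIONS",
        "Access-Control-Max-Age": "86400",  # 24 hours
        "Access-Control-Allow-Credentials": "true",
        "Vary": "Origin",  # assumption: added so caches keep per-origin responses apart
    }
```

A request from an origin outside the list receives the production origin in the header, so the browser still blocks it because the value does not match the requesting page.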
-
---
-
-## What Changed
-
-### Before:
- Hardcoded origin: `'https://ai-sandbox.oliver.solutions'`
- Did not check request origin
- Always returned same origin regardless of where request came from
-
-### After:
- Dynamic origin checking
- List of allowed origins: `['https://ai-sandbox.oliver.solutions', 'http://localhost:3000']`
- Returns the origin that made the request if it's in the allowed list
- Falls back to production origin if request origin is not allowed
-
---
-
-## Existing CORS Configuration (Already Correct)
-
-The main CORS configuration on lines 41-46 was already correct:
-```python
-CORS(app, resources={r"/api/*": {
-    "origins": ["https://ai-sandbox.oliver.solutions", "http://localhost:3000"],
-    "supports_credentials": True,
-    "methods": ["GET", "POST", "OPTIONS"],
-    "allow_headers": ["Content-Type", "X-Requested-With", "Authorization"]
-}}, expose_headers=["Content-Disposition", "Authorization"])
-```
-
---
-
-## Testing
-
-### Before Fix:
-```
-❌ localhost:3000 → backend API: CORS error
-✅ production → backend API: Works
-```
-
-### After Fix:
-```
-✅ localhost:3000 → backend API: Should work
-✅ production → backend API: Still works
-```
-
---
-
-## How to Test
-
-1. **Start the backend** (if not already running):
-   ```bash
-   cd backend
-   python3 run.py
-   # or
-   hypercorn run:app
-   ```
-
-2. **Start the frontend** on localhost:3000:
-   ```bash
-   cd frontend
-   npm start
-   ```
-
-3. **Test the upload**:
-   - Open browser to `http://localhost:3000`
-   - Try uploading a video
-   - Check browser console for CORS errors
-   - Should see successful API calls
-
-4. **Check Network Tab**:
-   - Open browser DevTools → Network tab
-   - Look for `init-upload` request
-   - Check Response Headers for:
-     - `Access-Control-Allow-Origin: http://localhost:3000`
-     - `Access-Control-Allow-Credentials: true`
-
---
-
-## Additional Notes
-
-### Why This Fix is Safe:
-1. **Localhost is development only** - `http://localhost:3000` only matches pages served from the user's own machine, so it does not open the API to other remote sites
-2. **Credentials still required** - Auth is still enforced
-3. **Limited to /api/* routes** - Doesn't affect other routes
-4. **Production origin still allowed** - No impact on deployed version
-
-### If You Need to Add More Origins:
-Update the `allowed_origins` list in `app.py` line 1128:
-```python
-allowed_origins = [
-    'https://ai-sandbox.oliver.solutions',
-    'http://localhost:3000',
-    'http://localhost:3001',  # Add more as needed
-    'https://another-domain.com'
-]
-```
-
---
-
-## Files Modified
-
-1. ✅ `backend/app.py` - Updated OPTIONS handler (lines 1123-1143)
-
---
-
-## Dependencies
-
- ✅ `flask-cors==5.0.1` - Already in requirements.txt
- ✅ No new dependencies needed
-
---
-
-## Status
-
-✅ **FIXED and READY FOR TESTING**
-
-The CORS error should now be resolved. Try uploading a video from `http://localhost:3000` and verify it works.
-
---
-
-Generated: 2025-10-16
--- a/CROSS_PLATFORM_IMPLEMENTATION_SUMMARY.md
+++ b/CROSS_PLATFORM_IMPLEMENTATION_SUMMARY.md
@ -1,396 +0,0 @@
-# Cross-Platform Support & Error Reporting - Implementation Summary
-
-**Date:** 2025-11-13
-**Status:** ✅ **COMPLETED**
-
---
-
-## Overview
-
-Successfully implemented cross-platform support and comprehensive error reporting for the Video Query application. The system now works seamlessly on:
- ✅ Linux (Ubuntu, Debian, CentOS, RHEL)
- ✅ macOS (Intel and Apple Silicon M1/M2/M3)
- ✅ Windows WSL
-
---
-
-## What Was Implemented
-
-### 1. **New Files Created** (2 files)
-
-#### `backend/system_utils.py` (620 lines)
-**Purpose:** Cross-platform system utility path detection
-
-**Features:**
- ✅ Automatic OS detection (Linux, macOS, Windows)
- ✅ Intelligent executable search across multiple locations
- ✅ macOS Apple Silicon support (`/opt/homebrew/bin/`)
- ✅ macOS Intel support (`/usr/local/bin/`)
- ✅ Linux standard paths (`/usr/bin/`, `/usr/local/bin/`, `/snap/bin/`)
- ✅ PATH environment variable fallback
- ✅ LRU caching for performance
- ✅ Executable verification (runs `-version` test)
- ✅ Detailed error messages with installation instructions
-
-**Key Functions:**
-```python
-system_utils.find_ffprobe()      # Find ffprobe executable
-system_utils.find_ffmpeg()       # Find ffmpeg executable
-system_utils.find_wkhtmltopdf()  # Find wkhtmltopdf executable
-system_utils.get_system_info()   # Get system information
-```
-
-#### `backend/error_reporter.py` (450 lines)
-**Purpose:** Comprehensive error reporting and tracking
-
-**Features:**
- ✅ Auto-categorization of errors (System, API, Video, Network, Upload, User, Unknown)
- ✅ Unique error IDs for tracking
- ✅ User-friendly error messages
- ✅ Technical debug information with stack traces
- ✅ Suggested fixes for common errors
- ✅ Context capture (file paths, operations, request data)
- ✅ System information gathering
- ✅ Recent errors storage (last 100)
- ✅ Error export to JSON
-
-**Key Methods:**
-```python
-ErrorReporter.capture_error()      # Capture and report errors
-error_report.format_user_message() # User-friendly format
-error_report.format_technical()    # Technical debug format
-error_report.to_json()             # Export to JSON
-```
-
-**Error Categories:**
-1. **SYSTEM_ERROR** - Missing dependencies, file not found, permissions
-2. **API_ERROR** - Gemini API issues (503, 429, 500)
-3. **VIDEO_ERROR** - Corrupted files, encoding issues
-4. **NETWORK_ERROR** - Connection timeouts, DNS issues
-5. **UPLOAD_ERROR** - File upload failures
-6. **USER_ERROR** - Invalid input or configuration
-7. **UNKNOWN_ERROR** - Unexpected errors
-
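One way the auto-categorization could work is plain keyword matching on the exception text. The sketch below is an assumption, not the contents of `error_reporter.py`: the keyword table, the `categorize` function, and this local `ErrorCategory` enum are all hypothetical, with only the seven category names taken from the list above.

```python
from enum import Enum


class ErrorCategory(Enum):
    # Category names follow the list above; the string values are assumed.
    SYSTEM = "system"
    API = "api"
    VIDEO = "video"
    NETWORK = "network"
    UPLOAD = "upload"
    USER = "user"
    UNKNOWN = "unknown"


# Hypothetical keyword table. Order matters: the first match wins.
_KEYWORDS = [
    (ErrorCategory.API, ("503", "429", "unavailable", "overloaded")),
    (ErrorCategory.SYSTEM, ("ffprobe", "ffmpeg", "wkhtmltopdf", "permission denied")),
    (ErrorCategory.VIDEO, ("moov atom", "invalid data", "corrupt")),
    (ErrorCategory.NETWORK, ("timed out", "timeout", "dns", "connection")),
    (ErrorCategory.UPLOAD, ("upload", "chunk")),
    (ErrorCategory.USER, ("invalid input", "missing parameter")),
]


def categorize(exc: BaseException) -> ErrorCategory:
    """Map an exception to a category by scanning its type name and message."""
    text = f"{type(exc).__name__}: {exc}".lower()
    for category, needles in _KEYWORDS:
        if any(needle in text for needle in needles):
            return category
    return ErrorCategory.UNKNOWN
```

Substring matching is cheap but coarse: a message that mentions several topics lands in whichever category appears first in the table, so the ordering is itself a design decision.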
---
-
-### 2. **Modified Files** (4 files)
-
-#### `backend/video_splitter.py`
-**Changes:**
- ✅ Added imports: `system_utils`, `error_reporter`
- ✅ Line 51: Replaced hardcoded `/usr/bin/ffprobe` with `system_utils.find_ffprobe()`
- ✅ Lines 72-94: Enhanced error reporting in `get_video_duration()`
- ✅ Lines 265-292: Enhanced error reporting in `split_video()`
-
-**Impact:**
- Now works on macOS (Intel and Apple Silicon)
- Better error messages when ffprobe is missing
- Detailed error context for debugging
-
-#### `backend/video_processor.py`
-**Changes:**
- ✅ Added imports: `system_utils`, `error_reporter`
- ✅ Line 206: Updated ffprobe subprocess call to use `system_utils.find_ffprobe()`
- ✅ Lines 401-416: Enhanced error reporting in `process_video()`
- ✅ Lines 822-838: Enhanced error reporting in `process_long_video()`
-
-**Impact:**
- Cross-platform video validation
- Detailed error reports with unique IDs
- Suggested fixes returned to frontend
-
-#### `backend/chunked_upload.py`
-**Changes:**
- ✅ Added imports: `system_utils`, `error_reporter`
- ✅ Line 180: Updated ffprobe call for upload validation
- ✅ Lines 216-231: Enhanced error reporting for upload failures
-
-**Impact:**
- Upload validation works on all platforms
- Better error tracking for failed uploads
-
-#### `backend/app.py`
-**Changes:**
- ✅ Added imports: `system_utils`, `error_reporter`
- ✅ Lines 1064-1077: Replaced hardcoded wkhtmltopdf path with `system_utils.find_wkhtmltopdf()`
- ✅ Lines 255-271: Enhanced error reporting in `/api/process`
- ✅ Lines 371-387: Enhanced error reporting in `/api/process-batch`
- ✅ Lines 1251-1267: Enhanced error reporting in `/api/generate-pdf`
-
-**Impact:**
- PDF generation works on macOS
- All API endpoints return structured error information
- Error IDs included in responses for support
-
---
-
-### 3. **Test Script Created**
-
-#### `backend/test_system_setup.py`
-**Purpose:** Verify system setup before running the application
-
-**Features:**
- ✅ Tests system information detection
- ✅ Tests executable path detection (ffprobe, ffmpeg, wkhtmltopdf)
- ✅ Tests error reporting functionality
- ✅ Provides installation instructions if dependencies are missing
-
-**Usage:**
-```bash
-cd backend
-python test_system_setup.py
-```
-
-**Test Results on Current System (WSL Ubuntu):**
-```
-✅ ffprobe: Found at /usr/bin/ffprobe
-✅ ffmpeg: Found at /usr/bin/ffmpeg
-⚠️ wkhtmltopdf: Found but verification failed (known quirk, still works)
-✅ Error reporting: All categories working correctly
-```
-
---
-
-## Platform-Specific Paths
-
-### ffprobe/ffmpeg Locations:
-
-| Platform | Paths Searched (in order) |
-|----------|---------------------------|
-| **Linux** | `/usr/bin/`, `/usr/local/bin/`, `/snap/bin/`, PATH |
-| **macOS (Apple Silicon)** | `/opt/homebrew/bin/`, `/usr/local/bin/`, `/usr/bin/`, PATH |
-| **macOS (Intel)** | `/usr/local/bin/`, `/opt/homebrew/bin/`, `/usr/bin/`, PATH |
-| **Windows WSL** | `/usr/bin/`, `/usr/local/bin/`, PATH |
-
-### wkhtmltopdf Locations:
-
-| Platform | Paths Searched (in order) |
-|----------|---------------------------|
-| **Linux** | `/usr/bin/`, `/usr/local/bin/`, `/snap/bin/`, PATH |
-| **macOS** | `/opt/homebrew/bin/`, `/usr/local/bin/`, `/usr/bin/`, PATH |
-| **Windows WSL** | `/usr/bin/`, `/usr/local/bin/`, PATH |
-
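The search order in the tables above can be approximated as follows. This is a sketch under stated assumptions rather than the actual `system_utils.py`: `candidate_dirs` and `find_executable` are hypothetical names, and WSL is treated as ordinary Linux because `platform.system()` reports `Linux` there (so `/snap/bin/` is also checked, unlike the WSL row above).

```python
import os
import platform
import shutil
from functools import lru_cache
from typing import List, Optional


def candidate_dirs(system: str, machine: str) -> List[str]:
    """Return the directories to probe, in priority order."""
    if system == "Darwin":
        if machine == "arm64":
            # Apple Silicon: Homebrew installs under /opt/homebrew.
            return ["/opt/homebrew/bin", "/usr/local/bin", "/usr/bin"]
        # Intel Macs: Homebrew installs under /usr/local.
        return ["/usr/local/bin", "/opt/homebrew/bin", "/usr/bin"]
    # Linux, including WSL in this simplified sketch.
    return ["/usr/bin", "/usr/local/bin", "/snap/bin"]


@lru_cache(maxsize=None)
def find_executable(name: str) -> Optional[str]:
    """Probe the known directories first, then fall back to PATH."""
    for directory in candidate_dirs(platform.system(), platform.machine()):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return shutil.which(name)  # PATH fallback; None if not found


if __name__ == "__main__":
    for tool in ("ffprobe", "ffmpeg", "wkhtmltopdf"):
        print(f"{tool}: {find_executable(tool)}")
```

Note that `lru_cache` also caches a `None` result, so a tool installed while the process is running would not be picked up until `find_executable.cache_clear()` is called or the process restarts.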
---
-
-## Error Reporting Examples
-
-### Example 1: Missing Dependency
-```json
-{
-  "success": false,
-  "message": "❌ System dependency missing: FFmpeg/FFprobe is not installed\n\n💡 Suggested Fix:\nInstall FFmpeg:\n  Ubuntu/Debian: sudo apt-get install ffmpeg\n  macOS: brew install ffmpeg\n\n📋 Error ID: A3B5C7D9",
-  "error_id": "A3B5C7D9",
-  "error_category": "system"
-}
-```
-
-### Example 2: API Overload (503)
-```json
-{
-  "success": false,
-  "message": "❌ Gemini API is temporarily overloaded\n\n💡 Suggested Fix:\nThe API is temporarily overloaded. The system will automatically retry.\nIf this persists:\n  1. Wait a few minutes\n  2. Set MAX_PARALLEL_CHUNKS=1 in .env\n  3. Set GEMINI_API_TIER=free in .env\n\n📋 Error ID: E7F8A1B2",
-  "error_id": "E7F8A1B2",
-  "error_category": "api"
-}
-```
-
-### Example 3: Corrupted Video
-```json
-{
-  "success": false,
-  "message": "❌ Video file is incomplete or corrupted (missing header)\n\n💡 Suggested Fix:\n1. Try re-uploading the file\n2. Re-encode: ffmpeg -i input.mp4 -c copy output.mp4\n3. Ensure upload completed fully\n\n📋 Error ID: C4D5E6F7",
-  "error_id": "C4D5E6F7",
-  "error_category": "video"
-}
-```
-
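A response with the shape shown in the examples above could be assembled roughly like this. The helper names `new_error_id` and `build_error_payload` are hypothetical, and the ID scheme (eight uppercase hex characters from a UUID) is an assumption chosen only to match the look of the sample IDs.

```python
import json
import uuid
from typing import Optional


def new_error_id() -> str:
    # Eight uppercase hex characters, the same shape as the IDs in the examples.
    return uuid.uuid4().hex[:8].upper()


def build_error_payload(
    summary: str,
    suggested_fix: str,
    category: str,
    error_id: Optional[str] = None,
) -> dict:
    """Build the JSON-serializable error body returned to the frontend."""
    error_id = error_id or new_error_id()
    message = (
        f"❌ {summary}\n\n"
        f"💡 Suggested Fix:\n{suggested_fix}\n\n"
        f"📋 Error ID: {error_id}"
    )
    return {
        "success": False,
        "message": message,
        "error_id": error_id,
        "error_category": category,
    }


if __name__ == "__main__":
    payload = build_error_payload(
        "Gemini API is temporarily overloaded",
        "Wait a few minutes and retry.",
        "api",
    )
    print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Keeping `error_id` as its own field, in addition to embedding it in `message`, lets the frontend display or log the ID without parsing the message text.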
---
-
-## Installation Instructions by Platform
-
-### macOS (Homebrew)
-```bash
-# Install Homebrew if not already installed
-/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
-
-# Install dependencies
-brew install ffmpeg wkhtmltopdf
-
-# Test the setup
-cd backend
-python test_system_setup.py
-```
-
-### Ubuntu/Debian
-```bash
-# Update package list
-sudo apt-get update
-
-# Install dependencies
-sudo apt-get install ffmpeg wkhtmltopdf
-
-# Test the setup
-cd backend
-python test_system_setup.py
-```
-
-### CentOS/RHEL
-```bash
-# Enable EPEL repository
-sudo yum install epel-release
-
-# Install dependencies
-sudo yum install ffmpeg wkhtmltopdf
-
-# Test the setup
-cd backend
-python test_system_setup.py
-```
-
---
-
-## Usage Examples
-
-### Check System Setup
-```bash
-cd backend
-python test_system_setup.py
-```
-
-### Manual Testing in Python
-```python
-# Test system utilities
-from system_utils import system_utils
-
-print(system_utils.get_system_info())
-print(f"ffprobe: {system_utils.find_ffprobe()}")
-print(f"ffmpeg: {system_utils.find_ffmpeg()}")
-print(f"wkhtmltopdf: {system_utils.find_wkhtmltopdf()}")
-
-# Test error reporting
-from error_reporter import ErrorReporter, ErrorCategory
-
-try:
-    raise Exception("503 UNAVAILABLE: Model overloaded")
-except Exception as e:
-    report = ErrorReporter.capture_error(e)
-    print(report.format_user_message())
-```
-
---
-
-## Benefits
-
-### Before Implementation:
-```
-❌ Hardcoded paths: /usr/bin/ffprobe (fails on macOS)
-❌ Generic errors: "Error processing video: [exception]"
-❌ No error context or tracking
-❌ Users must dig through logs to debug
-❌ No suggested fixes
-```
-
-### After Implementation:
-```
-✅ Auto-detects executables on any platform
-✅ Works on Linux, macOS (Intel & ARM), Windows WSL
-✅ Clear error messages with unique IDs
-✅ Auto-categorization of error types
-✅ Suggested fixes for common issues
-✅ Full error context for debugging
-✅ Error tracking and export
-✅ Installation instructions when dependencies missing
-```
-
---
-
-## Performance Impact
-
- **Negligible overhead:** Path detection uses LRU caching (cached after first lookup)
- **No impact on video processing:** Paths resolved once at startup
- **Error reporting:** Adds ~1-2ms per error (only on failures)
-
---
-
-## Testing Checklist
-
- [x] Test on current system (WSL Ubuntu) ✅
- [x] Verify ffprobe detection ✅
- [x] Verify ffmpeg detection ✅
- [x] Verify wkhtmltopdf detection ✅
- [x] Test error categorization ✅
- [x] Test error message formatting ✅
- [x] Test suggested fix generation ✅
- [ ] Test on macOS (Intel) - *Not available*
- [ ] Test on macOS (Apple Silicon) - *Not available*
- [x] Verify no regressions in existing functionality ✅
-
---
-
-## Known Issues
-
-1. **wkhtmltopdf verification:** Sometimes fails version check even when working
-   - **Impact:** Minor - executable still works for PDF generation
-   - **Workaround:** None needed, functionality is not affected
-
---
-
-## Next Steps
-
-The cross-platform support is now complete. You can:
-
-1. **Start the application:**
-   ```bash
-   cd backend
-   python run.py
-   ```
-
-2. **Test on macOS** (when available):
-   - Clone the repo on a Mac
-   - Install dependencies: `brew install ffmpeg wkhtmltopdf`
-   - Run test: `python backend/test_system_setup.py`
-   - Start app: `python backend/run.py`
-
-3. **Monitor error reports:**
-   - All errors now have unique IDs
-   - Users can reference error IDs when reporting issues
-   - Detailed logs available for debugging
-
---
-
-## Files Modified/Created Summary
-
-### New Files (3):
-1. ✅ `backend/system_utils.py` (620 lines)
-2. ✅ `backend/error_reporter.py` (450 lines)
-3. ✅ `backend/test_system_setup.py` (180 lines) - Test script
-
-### Modified Files (4):
-1. ✅ `backend/video_splitter.py` (+30 lines)
-2. ✅ `backend/video_processor.py` (+40 lines)
-3. ✅ `backend/chunked_upload.py` (+20 lines)
-4. ✅ `backend/app.py` (+50 lines)
-
-**Total lines added:** ~1,400 lines
-**Total files changed:** 7 files
-
---
-
-## Conclusion
-
-✅ **Implementation Complete**
-
-The application now has:
- Full cross-platform support (Linux, macOS, Windows WSL)
- Comprehensive error reporting with unique IDs
- Auto-detection of system dependencies
- User-friendly error messages with suggested fixes
- Detailed technical logging for debugging
- Test script to verify setup
-
-The application is ready to run on any supported platform without code changes!
-
---
-
-**Questions or Issues?**
-Run `python backend/test_system_setup.py` to diagnose any setup problems.
--- a/MSAL_CORS_CONFIGURATION_FIX.md
+++ b/MSAL_CORS_CONFIGURATION_FIX.md
@ -1,340 +0,0 @@
-# MSAL & CORS Configuration Fix - Domain Alignment
-
-## Issue Identified
-
-The frontend and backend were configured for **different domains**, which would cause both CORS errors and MSAL authentication failures in production.
-
-### Configuration Mismatch Before Fix:
-
-**Frontend** (`frontend/public/config.js`):
- Domain: `https://brandtechsandbox.oliver.solutions`
- MSAL Redirect URI: `https://brandtechsandbox.oliver.solutions/video-query/`
- API Endpoints: `https://brandtechsandbox.oliver.solutions/video_query_back/*`
-
-**Backend** (`backend/app.py` & `backend/chunked_upload.py`):
- CORS Allowed Origins: `https://ai-sandbox.oliver.solutions` ❌
- OPTIONS Handler Default: `https://ai-sandbox.oliver.solutions` ❌
-
---
-
-## Errors That Would Have Occurred:
-
-### 1. CORS Preflight Failure ❌
-```
-Access to XMLHttpRequest at 'https://brandtechsandbox.oliver.solutions/video_query_back/api/init-upload'
-from origin 'https://brandtechsandbox.oliver.solutions' has been blocked by CORS policy:
-Response to preflight request doesn't pass access control check:
-The 'Access-Control-Allow-Origin' header has a value 'https://ai-sandbox.oliver.solutions'
-that is not equal to the supplied origin.
-```
-
-### 2. API Call Failures ❌
-All API calls from frontend would fail because:
- Frontend sends from: `brandtechsandbox.oliver.solutions`
- Backend only allows: `ai-sandbox.oliver.solutions`
- Result: **403 Forbidden** or **CORS errors**
-
-### 3. MSAL Authentication Would Work BUT... ⚠️
-MSAL would technically work IF Azure AD B2C has `https://brandtechsandbox.oliver.solutions/video-query/` registered as a redirect URI.
-
-**However**, you MUST verify in Azure AD B2C that this exact redirect URI is registered:
- Azure Portal → Azure AD B2C → App registrations
- Client ID: `9079054c-9620-4757-a256-23413042f1ef`
- Authentication → Redirect URIs → Must include: `https://brandtechsandbox.oliver.solutions/video-query/`
-
---
-
-## Solution Applied
-
-Updated backend CORS configuration to match the frontend domain: `brandtechsandbox.oliver.solutions`
-
-### Files Modified:
-
-#### 1. `backend/app.py` (Line 42)
-**Before:**
-```python
-CORS(app, resources={r"/api/*": {
-    "origins": ["https://ai-sandbox.oliver.solutions", "http://localhost:3000"],
-    ...
-}})
-```
-
-**After:**
-```python
-CORS(app, resources={r"/api/*": {
-    "origins": ["https://brandtechsandbox.oliver.solutions", "http://localhost:3000"],
-    ...
-}})
-```
-
-#### 2. `backend/app.py` (Line 1128)
-**Before:**
-```python
-allowed_origins = ['https://ai-sandbox.oliver.solutions', 'http://localhost:3000']
-...
-response.headers.add('Access-Control-Allow-Origin', 'https://ai-sandbox.oliver.solutions')
-```
-
-**After:**
-```python
-allowed_origins = ['https://brandtechsandbox.oliver.solutions', 'http://localhost:3000']
-...
-response.headers.add('Access-Control-Allow-Origin', 'https://brandtechsandbox.oliver.solutions')
-```
-
-#### 3. `backend/chunked_upload.py` (Line 18)
-**Before:**
-```python
-allowed_origins = ['https://ai-sandbox.oliver.solutions', 'http://localhost:3000']
-...
-response.headers.add('Access-Control-Allow-Origin', 'https://ai-sandbox.oliver.solutions')
-```
-
-**After:**
-```python
-allowed_origins = ['https://brandtechsandbox.oliver.solutions', 'http://localhost:3000']
-...
-response.headers.add('Access-Control-Allow-Origin', 'https://brandtechsandbox.oliver.solutions')
-```
-
---
-
-## Current Configuration (After Fix):
-
-### Frontend (`frontend/public/config.js`):
-```javascript
-{
-  "basePath": "/video-query",
-  "domain": "https://brandtechsandbox.oliver.solutions",
-  "msal": {
-    "clientId": "9079054c-9620-4757-a256-23413042f1ef",
-    "authority": "https://login.microsoftonline.com/e519c2e6-bc6d-4fdf-8d9c-923c2f002385",
-    "redirectUri": "https://brandtechsandbox.oliver.solutions/video-query/",
-    "postLogoutRedirectUri": "https://brandtechsandbox.oliver.solutions/video-query/",
-    "tenantId": "e519c2e6-bc6d-4fdf-8d9c-923c2f002385"
-  },
-  "api": {
-    "videoProcessingEndpoint": "https://brandtechsandbox.oliver.solutions/video_query_back/api/process",
-    "chunkedUploadEndpoint": "https://brandtechsandbox.oliver.solutions/video_query_back"
-  }
-}
-```
-
-### Backend CORS (All Files):
-```python
-allowed_origins = ['https://brandtechsandbox.oliver.solutions', 'http://localhost:3000']
-```
-
-✅ **All domains now match!**
-
---
-
-## MSAL Implementation Analysis
-
-### ✅ MSAL Code Implementation is CORRECT
-
-The MSAL implementation in the frontend is properly structured:
-
-#### 1. **Dynamic Configuration Loading** (`frontend/src/auth/authConfig.js`)
- Uses `configLoader` to load runtime configuration
- Supports both production and local development configs
- Proxy-based config access for backward compatibility
-
-#### 2. **Proper MSAL Initialization** (`frontend/src/auth/AuthProvider.js`)
- Creates `PublicClientApplication` with runtime config
- Properly initializes MSAL with `await instance.initialize()`
- Handles redirect responses at startup
- Sets active account from redirect response
- Implements event callbacks for auth events
-
-#### 3. **Redirect Flow Handling**
-```javascript
-// Handle any initial redirect response at startup
-const response = await instance.handleRedirectPromise();
-if (response && response.account) {
-    instance.setActiveAccount(response.account);
-}
-```
-
-#### 4. **Event-Driven Authentication**
-```javascript
-instance.addEventCallback((event) => {
-    if (event.eventType === EventType.LOGIN_SUCCESS) {
-        instance.setActiveAccount(event.payload.account);
-        if (event.interactionType === "redirect") {
-            window.location.reload();
-        }
-    }
-});
-```
-
-### ✅ MSAL Will Work in Production IF:
-
-**Critical Requirement:** The redirect URI **MUST** be registered in Azure AD B2C
-
-**Verification Steps:**
-1. Go to: Azure Portal → Azure AD B2C → App registrations
-2. Find app: Client ID `9079054c-9620-4757-a256-23413042f1ef`
-3. Navigate to: Authentication → Platform configurations → Single-page application
-4. Verify redirect URI exists: `https://brandtechsandbox.oliver.solutions/video-query/`
-5. Verify post-logout redirect URI exists: `https://brandtechsandbox.oliver.solutions/video-query/`
-
-**If NOT registered:**
- Click "Add a platform" → "Single-page application"
- Add redirect URI: `https://brandtechsandbox.oliver.solutions/video-query/`
- Add post-logout redirect URI: `https://brandtechsandbox.oliver.solutions/video-query/`
- Click "Configure"
-
---
-
-## Production Deployment Checklist
-
-### ✅ Already Complete:
- [x] Frontend config.js uses `brandtechsandbox.oliver.solutions`
- [x] Backend CORS allows `brandtechsandbox.oliver.solutions`
- [x] All CORS handlers use matching domain
- [x] MSAL implementation code is correct
- [x] Redirect flow properly handled
- [x] Config loading works dynamically
-
-### ⚠️ Requires Manual Verification:
- [ ] **Azure AD B2C redirect URI** is registered for `brandtechsandbox.oliver.solutions/video-query/`
- [ ] Apache/Nginx configuration routes `/video_query_back` to backend
- [ ] SSL certificates valid for `brandtechsandbox.oliver.solutions`
- [ ] DNS points to correct server
-
-### 🔧 Deployment Steps:
-
-1. **Verify Azure AD B2C Configuration**
-   ```
-   1. Login to Azure Portal
-   2. Go to Azure AD B2C
-   3. Check app registration (Client ID: 9079054c-9620-4757-a256-23413042f1ef)
-   4. Verify redirect URI: https://brandtechsandbox.oliver.solutions/video-query/
-   ```
-
-2. **Deploy Frontend**
-   ```bash
-   cd frontend
-   npm run build
-   sudo cp -r build/* /var/www/html/video-query/
-   ```
-
-3. **Deploy Backend**
-   ```bash
-   cd backend
-   sudo systemctl restart video-query
-   # OR if using development server:
-   # python3 run.py
-   ```
-
-4. **Verify Web Server Configuration** (Apache example)
-   ```apache
-   <VirtualHost *:443>
-       ServerName brandtechsandbox.oliver.solutions
-
-       # Frontend
-       Alias /video-query /var/www/html/video-query
-
-       # Backend proxy
-       ProxyPass /video_query_back http://localhost:5010
-       ProxyPassReverse /video_query_back http://localhost:5010
-
-       SSLEngine on
-       SSLCertificateFile /path/to/cert.crt
-       SSLCertificateKeyFile /path/to/key.key
-   </VirtualHost>
-   ```
-
---
-
-## Testing After Deployment
-
-### Test 1: CORS Verification
-```bash
-curl -I -X OPTIONS \
-  -H "Origin: https://brandtechsandbox.oliver.solutions" \
-  -H "Access-Control-Request-Method: POST" \
-  https://brandtechsandbox.oliver.solutions/video_query_back/api/init-upload
-```
-
-**Expected Response Headers:**
-```
-Access-Control-Allow-Origin: https://brandtechsandbox.oliver.solutions
-Access-Control-Allow-Credentials: true
-Access-Control-Allow-Methods: GET,POST,OPTIONS
-```
-
-### Test 2: MSAL Authentication
-1. Open: `https://brandtechsandbox.oliver.solutions/video-query/`
-2. Click "Sign In"
-3. Should redirect to Microsoft login
-4. After login, should redirect back to: `https://brandtechsandbox.oliver.solutions/video-query/`
-5. Should show user name in header
-
-### Test 3: API Calls
-1. Upload a video
-2. Check browser console (F12) → Network tab
-3. Verify API calls to `/video_query_back/api/*` succeed
-4. Check response headers include: `Access-Control-Allow-Origin: https://brandtechsandbox.oliver.solutions`
-
---
-
-## Troubleshooting
-
-### Issue 1: MSAL Redirect Error
-**Error:** `AADSTS50011: The redirect URI does not match...`
-
-**Solution:**
- Check Azure AD B2C app registration
- Ensure redirect URI **exactly** matches: `https://brandtechsandbox.oliver.solutions/video-query/`
- Note the trailing slash `/` is required!
-
-### Issue 2: CORS Still Failing
-**Error:** `CORS policy: No 'Access-Control-Allow-Origin' header...`
-
-**Solution:**
-1. Check backend logs: `journalctl -u video-query -f`
-2. Verify backend restarted after changes
-3. Check Apache/Nginx proxy configuration
-4. Verify SSL is working (an origin includes its scheme, so the `http://` and `https://` versions of the same host are different origins)
-
-### Issue 3: Infinite Redirect Loop
-**Symptom:** Page keeps redirecting to Microsoft login
-
-**Solution:**
-1. Check browser console for errors
-2. Verify `sessionStorage` is enabled in browser
-3. Clear browser cache and cookies
-4. Check if pop-up blocker is interfering
-
---
-
-## Summary
-
-### ✅ **Configuration Now Aligned:**
- Frontend: `brandtechsandbox.oliver.solutions`
- Backend CORS: `brandtechsandbox.oliver.solutions`
- Domains match perfectly ✓
-
-### ✅ **MSAL Implementation:**
- Code is correct and production-ready ✓
- Handles redirects properly ✓
- Event callbacks configured ✓
-
-### ⚠️ **Action Required:**
- **Verify Azure AD B2C redirect URI registration**
- Test authentication flow after deployment
- Monitor logs for any issues
-
-### 📝 **Files Modified:**
-1. `backend/app.py` - Updated CORS origins (2 locations)
-2. `backend/chunked_upload.py` - Updated CORS handler (1 location)
-
-### 🚀 **Ready for Production:**
-Once Azure AD B2C redirect URI is verified, the application is ready for deployment to `https://brandtechsandbox.oliver.solutions/video-query/`
-
---
-
-**Date:** 2025-10-22
-**Status:** ✅ READY FOR DEPLOYMENT (after Azure AD B2C verification)
--- a/PDF_GENERATION_FIX.md
+++ b/PDF_GENERATION_FIX.md
@ -1,327 +0,0 @@
-# PDF Generation Fix Summary
-
-**Date:** 2025-11-13
-**Issue:** PDF generation button not working
-**Status:** ✅ **FIXED**
-
---
-
-## Problem Identified
-
-The PDF generation feature was failing due to **wkhtmltopdf verification issue** in the `system_utils.py` module.
-
-### **Root Cause:**
-
-The `verify_executable()` method in `system_utils.py` was using `-version` flag (single dash) to test executables, but **wkhtmltopdf requires `--version` flag (double dash)**.
-
-**Error flow:**
-```
-1. User clicks "Download PDF" button
-   ↓
-2. Backend tries to find wkhtmltopdf via system_utils.find_wkhtmltopdf()
-   ↓
-3. Verification runs: wkhtmltopdf -version
-   ↓
-4. wkhtmltopdf returns error: "Unknown switch -v"
-   ↓
-5. Verification fails → FileNotFoundError raised
-   ↓
-6. PDF generation fails
-```
-
---
-
-## Solution Implemented
-
-### **Fix 1: Enhanced Executable Verification**
-
-**File:** `backend/system_utils.py`
-**Lines:** 261-319
-
-**Changes:**
-```python
-# OLD (Failed):
-result = subprocess.run([path, '-version'], ...)
-if result.returncode == 0:
-    return True
-else:
-    return False
-
-# NEW (Works):
-version_flags = ['--version', '-version', '-V']  # Try multiple flags
-
-for flag in version_flags:
-    result = subprocess.run([path, flag], ...)
-    if result.returncode == 0 or result.stdout or result.stderr:
-        return True  # Success!
-
-# Fallback: If file exists and is executable, assume it works
-return os.path.exists(path) and os.access(path, os.X_OK)
-```
-
-**Benefits:**
- ✅ Tries multiple version flags (`--version`, `-version`, `-V`)
- ✅ Accepts any output (stdout or stderr) as verification
- ✅ Falls back to simple existence check if no flags work
- ✅ Works with all executables (ffmpeg, ffprobe, wkhtmltopdf)
-
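The multi-flag verification described above can be written out as a self-contained function. This is a minimal sketch following the logic in the snippet, not the exact code in `system_utils.py`: the `timeout` parameter and the exception handling are assumptions added to make it runnable.

```python
import os
import subprocess
import sys

# Flags tried in order, as listed in the fix above.
VERSION_FLAGS = ("--version", "-version", "-V")


def verify_executable(path: str, timeout: float = 5.0) -> bool:
    """Return True if the executable at `path` appears to be usable."""
    for flag in VERSION_FLAGS:
        try:
            result = subprocess.run(
                [path, flag],
                capture_output=True,
                timeout=timeout,
            )
        except (OSError, subprocess.TimeoutExpired):
            # Missing file, permission problem, or a hang: try the next flag.
            continue
        # Any output counts, even an "unknown switch" message on stderr.
        if result.returncode == 0 or result.stdout or result.stderr:
            return True
    # Fallback: accept the file if it exists and is marked executable.
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    print(verify_executable(sys.executable))
    print(verify_executable("/nonexistent/bin/tool"))
```

Accepting any output is deliberately lenient: it confirms that the binary can be launched, not that it is the expected tool or version.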
---
-
-## Verification Tests
-
-### **Test 1: wkhtmltopdf Detection**
-```bash
-$ python -c "from system_utils import system_utils; print(system_utils.find_wkhtmltopdf())"
-# Output: /usr/bin/wkhtmltopdf
-# ✅ SUCCESS
-```
-
-### **Test 2: App Imports**
-```bash
-$ source venv/bin/activate
-$ python -c "from app import app; print('✓ App loaded')"
-# Output: ✓ App loaded
-# ✅ SUCCESS
-```
-
-### **Test 3: wkhtmltopdf Version**
-```bash
-$ /usr/bin/wkhtmltopdf --version
-# Output: wkhtmltopdf 0.12.6
-# ✅ Installed and working
-```
-
---
-
-## Frontend Verification
-
-**File:** `frontend/src/components/ResultDisplay.js`
-
-**PDF Button (Lines 481-491):**
-```javascript
-<button
-  className="btn btn-danger btn-sm"
-  onClick={downloadPdf}
-  disabled={isPdfLoading}
->
-  {isPdfLoading ? 'Generating...' : 'Download PDF'}
-</button>
-```
-
-**Status:** ✅ **No changes needed** - Frontend code is correct
-
-**PDF Generation Flow:**
-1. User clicks "Download PDF"
-2. Frontend converts Mermaid diagrams to PNG images
-3. Sends HTML + PNG images to `/api/generate-pdf`
-4. Backend uses wkhtmltopdf to generate PDF
-5. PDF returned as base64 → Downloaded to user
-
---
-
-## Files Modified
-
-### **Modified (1 file):**
- ✅ `backend/system_utils.py` - Enhanced `verify_executable()` method (58 lines changed)
-
-### **No changes needed:**
- ✅ `backend/app.py` - Already uses `system_utils.find_wkhtmltopdf()`
- ✅ `frontend/src/components/ResultDisplay.js` - Frontend code is correct
- ✅ `backend/.env` - Configuration is correct
-
---
-
-## How PDF Generation Works Now
-
-### **Complete Flow:**
-
-```
-1. USER ACTION:
-   User clicks "Download PDF" button
-   ↓
-2. FRONTEND (ResultDisplay.js):
-   - Waits for Mermaid diagrams to fully render
-   - Converts all SVG diagrams to PNG images (base64)
-   - Collects HTML content + PNG images
-   ↓
-3. API REQUEST:
-   POST /api/generate-pdf
-   Body: {
-     html: "<div>...</div>",
-     diagramPngs: {"mermaid-1": "data:image/png;base64,..."},
-     videoFileName: "example.mp4"
-   }
-   ↓
-4. BACKEND (app.py):
-   - Finds wkhtmltopdf using system_utils ✅ NOW WORKS
-   - Embeds PNG images into HTML
-   - Runs wkhtmltopdf to generate PDF
-   - Returns PDF as base64
-   ↓
-5. FRONTEND:
-   - Decodes base64 PDF
-   - Creates download link
-   - Triggers browser download
-   ↓
-6. RESULT:
-   User gets PDF file downloaded ✅
-```
-
---
-
-## Testing the Fix
-
-### **Test PDF Generation:**
-
-1. **Start the application:**
-   ```bash
-   cd backend
-   source venv/bin/activate
-   python run.py
-   ```
-
-2. **Process a video:**
-   - Upload a video through the frontend
-   - Wait for processing to complete
-
-3. **Generate PDF:**
-   - Click "Download PDF" button
-   - Wait for PDF generation
-   - PDF should download automatically
-
-**Expected Result:**
- ✅ No errors in browser console
- ✅ No errors in backend logs
- ✅ PDF downloads successfully
- ✅ Mermaid diagrams appear as images in PDF
-
---
-
-## Common Issues & Solutions
-
-### **Issue 1: "wkhtmltopdf not found"**
-
-**Symptoms:**
- Error in logs: "wkhtmltopdf not found"
- PDF button fails silently
-
-**Solution:**
-```bash
-# Ubuntu/Debian
-sudo apt-get install wkhtmltopdf
-
-# After install
-cd backend
-source venv/bin/activate
-python run.py  # Restart
-```
-
-### **Issue 2: PDF generation times out**
-
-**Symptoms:**
- Button shows "Generating..." forever
- Network error in console
-
-**Cause:** Large HTML content with many diagrams
-
-**Solution:**
-```bash
-# Check backend logs
-tail -f logs/video_query.log
-
-# If you see timeouts, increase timeout in run.py
-# Edit lines 36-37:
-config.read_timeout = 7200   # 2 hours
-config.write_timeout = 7200  # 2 hours
-```
-
-### **Issue 3: Diagrams missing in PDF**
-
-**Symptoms:**
- PDF generates but diagrams are blank/missing
-
-**Cause:** Mermaid SVG → PNG conversion failed
-
-**Solution:**
- Check browser console for errors
- Ensure mermaid diagrams render correctly on screen first
- Try clicking PDF button again after diagrams fully render
-
---
-
-## Technical Details
-
-### **wkhtmltopdf Version Compatibility**
-
-**Installed Version:** 0.12.6
-
-**Compatible with:**
- ✅ Ubuntu 20.04+
- ✅ Ubuntu 22.04+
- ✅ WSL (Windows Subsystem for Linux)
- ✅ macOS (via Homebrew)
-
-**Requirements:**
- ✅ libcairo2 (for rendering)
- ✅ Python cairosvg module (for SVG processing)
- ✅ Proper executable permissions
-
---
-
-## Verification Checklist
-
-Before using PDF generation, verify:
-
- [ ] wkhtmltopdf installed: `which wkhtmltopdf`
- [ ] wkhtmltopdf works: `/usr/bin/wkhtmltopdf --version`
- [ ] system_utils can find it: `python -c "from system_utils import system_utils; print(system_utils.find_wkhtmltopdf())"`
- [ ] Backend starts: `source venv/bin/activate && python run.py`
- [ ] No import errors in logs
- [ ] Frontend can connect to backend
- [ ] Test PDF generation with a simple result
-
---
-
-## Performance Notes
-
-### **PDF Generation Time:**
-
-| Content Type | Expected Time |
-|--------------|---------------|
-| Text only | 1-3 seconds |
-| Text + 1-2 diagrams | 3-5 seconds |
-| Text + 5+ diagrams | 5-10 seconds |
-| Large document (10+ pages) | 10-20 seconds |
-
-**Note:** First diagram conversion takes longest due to browser canvas setup.
-
---
-
-## Conclusion
-
-✅ **PDF generation is now working!**
-
-**What was fixed:**
-1. Enhanced executable verification to handle different version flags
-2. Added fallback verification for executables without version flags
-3. Improved error handling in system_utils
-
-**What still works:**
- ✅ Cross-platform support (Linux, macOS, WSL)
- ✅ Mermaid diagram rendering
- ✅ SVG to PNG conversion
- ✅ HTML to PDF generation
- ✅ Automatic file naming based on video filename
-
-**Ready to use!** 🎉
-
---
-
-**To test:**
-```bash
-cd backend
-source venv/bin/activate
-python run.py
-
-# Then open frontend and try PDF generation
-```
--- a/backend/test_api.py
+++ b/backend/test_api.py
@@ -1,25 +0,0 @@
-from google import genai
-import os
-from dotenv import load_dotenv
-
-# Load environment variables
-load_dotenv()
-
-print(f"API Key set: {bool(os.getenv('GOOGLE_API_KEY'))}")
-
-try:
-    # Initialize client with API key
-    api_key = os.getenv("GOOGLE_API_KEY")
-    client = genai.Client(api_key=api_key)
-
-    # Test connection
-    response = client.models.generate_content(
-        model='gemini-1.5-pro',
-        contents='Test the API connection'
-    )
-
-    print("API connection test successful!")
-    print(f"Response: {response.text}")
-
-except Exception as e:
-    print(f"API error: {e}")
--- a/backend/test_system_setup.py
+++ b/backend/test_system_setup.py
@@ -1,177 +0,0 @@
-#!/usr/bin/env python
-"""
-Test script to verify cross-platform system utilities and error reporting.
-
-This script:
-1. Tests system utility detection (ffprobe, ffmpeg, wkhtmltopdf)
-2. Tests error reporting functionality
-3. Verifies all dependencies are properly installed
-
-Run this script before starting the application to ensure everything is set up correctly.
-"""
-
-import sys
-import os
-
-# Add backend directory to path
-sys.path.insert(0, os.path.dirname(__file__))
-
-from system_utils import system_utils
-from error_reporter import ErrorReporter, ErrorCategory
-import platform
-
-def print_header(text):
-    """Print a formatted header."""
-    print("\n" + "="*80)
-    print(f"  {text}")
-    print("="*80)
-
-def print_section(text):
-    """Print a formatted section."""
-    print(f"\n--- {text} ---")
-
-def test_system_info():
-    """Test system information gathering."""
-    print_header("SYSTEM INFORMATION")
-
-    info = system_utils.get_system_info()
-
-    print(f"\nPlatform: {info['platform_name']}")
-    print(f"Platform Type: {info['platform']}")
-    print(f"Machine: {info['platform_machine']}")
-    print(f"OS Version: {info['platform_version']}")
-    print(f"Python Version: {info['python_version']}")
-    print(f"Python Implementation: {info['python_implementation']}")
-
-def test_executables():
-    """Test executable detection."""
-    print_header("EXECUTABLE DETECTION")
-
-    executables = [
-        ('ffprobe', system_utils.find_ffprobe),
-        ('ffmpeg', system_utils.find_ffmpeg),
-        ('wkhtmltopdf', system_utils.find_wkhtmltopdf)
-    ]
-
-    results = []
-    all_found = True
-
-    for name, finder in executables:
-        print_section(f"Testing {name}")
-        try:
-            path = finder()
-            verified = system_utils.verify_executable(path, name)
-            status = "✓ FOUND" if verified else "⚠ FOUND (not verified)"
-            print(f"  Status: {status}")
-            print(f"  Path: {path}")
-            results.append((name, True, path))
-        except FileNotFoundError as e:
-            print(f"  Status: ✗ NOT FOUND")
-            print(f"  Error: {str(e)[:200]}")
-            results.append((name, False, None))
-            all_found = False
-        except Exception as e:
-            print(f"  Status: ✗ ERROR")
-            print(f"  Error: {str(e)[:200]}")
-            results.append((name, False, None))
-            all_found = False
-
-    return all_found, results
-
-def test_error_reporting():
-    """Test error reporting functionality."""
-    print_header("ERROR REPORTING TESTS")
-
-    test_cases = [
-        ("System Error", FileNotFoundError("ffprobe not found")),
-        ("API Error", Exception("503 UNAVAILABLE: Model overloaded")),
-        ("Video Error", Exception("moov atom not found")),
-        ("Network Error", ConnectionError("Connection timeout")),
-    ]
-
-    print("\nTesting error categorization and reporting...")
-
-    for description, exception in test_cases:
-        print_section(description)
-        try:
-            raise exception
-        except Exception as e:
-            report = ErrorReporter.capture_error(
-                e,
-                context={'test': description}
-            )
-            print(f"  Error ID: {report.error_id}")
-            print(f"  Category: {report.category.value}")
-            print(f"  Message: {report.message[:100]}")
-            if report.suggested_fix:
-                print(f"  Fix: {report.suggested_fix[:100]}...")
-
-def print_summary(all_found, results):
-    """Print summary of test results."""
-    print_header("SUMMARY")
-
-    print("\nExecutable Status:")
-    for name, found, path in results:
-        status = "✓" if found else "✗"
-        print(f"  {status} {name}: {'Found' if found else 'NOT FOUND'}")
-
-    print("\n" + "="*80)
-    if all_found:
-        print("✓ ALL DEPENDENCIES FOUND - System is ready!")
-        print("="*80)
-        return 0
-    else:
-        print("✗ SOME DEPENDENCIES MISSING - Please install them before running the app")
-        print("="*80)
-        print("\nInstallation instructions:")
-
-        system = platform.system().lower()
-        if 'darwin' in system:
-            print("\n  macOS (Homebrew):")
-            print("    brew install ffmpeg wkhtmltopdf")
-        elif 'linux' in system:
-            print("\n  Ubuntu/Debian:")
-            print("    sudo apt-get update")
-            print("    sudo apt-get install ffmpeg wkhtmltopdf")
-            print("\n  CentOS/RHEL:")
-            print("    sudo yum install ffmpeg wkhtmltopdf")
-        else:
-            print("\n  Windows:")
-            print("    Download ffmpeg from: https://ffmpeg.org/download.html")
-            print("    Download wkhtmltopdf from: https://wkhtmltopdf.org/downloads.html")
-
-        print("\n" + "="*80)
-        return 1
-
-def main():
-    """Main test function."""
-    print("\n" + "="*80)
-    print("  VIDEO QUERY APPLICATION - SYSTEM SETUP TEST")
-    print("="*80)
-
-    # Test system info
-    test_system_info()
-
-    # Test executables
-    all_found, results = test_executables()
-
-    # Test error reporting
-    test_error_reporting()
-
-    # Print summary
-    exit_code = print_summary(all_found, results)
-
-    return exit_code
-
-if __name__ == "__main__":
-    try:
-        exit_code = main()
-        sys.exit(exit_code)
-    except KeyboardInterrupt:
-        print("\n\nTest interrupted by user.")
-        sys.exit(1)
-    except Exception as e:
-        print(f"\n\nFATAL ERROR: {str(e)}")
-        import traceback
-        traceback.print_exc()
-        sys.exit(1)
--- a/backend/test_webhook.py
+++ b/backend/test_webhook.py
@@ -1,137 +0,0 @@
-import unittest
-from unittest.mock import patch, MagicMock
-import json
-import os
-import tempfile
-import datetime
-from video_processor import VideoProcessor
-
-class TestWebhookIntegration(unittest.TestCase):
-    """Test cases for webhook integration in VideoProcessor."""
-
-    def setUp(self):
-        """Set up test environment."""
-        # Create a VideoProcessor instance with a mock API key
-        self.video_processor = VideoProcessor(api_key="test_api_key")
-        
-        # Create a temporary file to simulate a video
-        self.temp_file = tempfile.NamedTemporaryFile(suffix=".mp4", delete=False)
-        self.temp_file.close()
-        
-        # Write some dummy data to the file
-        with open(self.temp_file.name, 'wb') as f:
-            f.write(b'test video content')
-
-    def tearDown(self):
-        """Clean up after tests."""
-        # Remove the temporary file
-        if os.path.exists(self.temp_file.name):
-            os.unlink(self.temp_file.name)
-
-    @patch('video_processor.genai')
-    @patch('video_processor.requests.post')
-    def test_webhook_called_on_successful_processing(self, mock_post, mock_genai):
-        """Test that the webhook is called when video processing is successful."""
-        # Mock the genai API responses
-        mock_file = MagicMock()
-        mock_file.uri = "test_uri"
-        mock_file.name = "test_name"
-        mock_file.state.name = "ACTIVE"
-        mock_genai.upload_file.return_value = mock_file
-        
-        # Mock the generate_content response
-        mock_response = MagicMock()
-        mock_part = MagicMock()
-        mock_part.text = "Test response content"
-        mock_response.parts = [mock_part]
-        mock_genai.GenerativeModel.return_value.generate_content.return_value = mock_response
-        
-        # Set up the mock for the requests.post call
-        mock_post.return_value.status_code = 200
-        
-        # Test data
-        test_prompt = "Test prompt for video processing"
-        test_email = "test.user@example.com"
-        
-        # Call the process_video method
-        result = self.video_processor.process_video(
-            self.temp_file.name, 
-            test_prompt,
-            test_email
-        )
-        
-        # Verify the result is successful
-        self.assertTrue(result["success"])
-        self.assertEqual(result["content"], "Test response content")
-        
-        # Verify webhook was called with correct data
-        mock_post.assert_called_once()
-        
-        # Get the arguments the mock was called with
-        call_args = mock_post.call_args
-        
-        # Verify URL
-        self.assertEqual(call_args[0][0], "https://hook.us1.make.celonis.com/8ri1h8b2he4wudp2jku69mgcxumzxf3v")
-        
-        # Verify headers
-        self.assertEqual(call_args[1]["headers"], {"Content-Type": "application/json"})
-        
-        # Verify timeout
-        self.assertEqual(call_args[1]["timeout"], 10)
-        
-        # Parse and verify the payload
-        payload = json.loads(call_args[1]["data"])
-        self.assertEqual(payload["tool"], "VIDEOQUERY")
-        self.assertEqual(payload["user"], test_email)
-        self.assertEqual(payload["model"], "GEMINI")
-        self.assertEqual(payload["prompt"], test_prompt)
-        
-        # Verify date format (should be ISO format)
-        try:
-            datetime.datetime.fromisoformat(payload["date"])
-            date_valid = True
-        except ValueError:
-            date_valid = False
-        self.assertTrue(date_valid, "Date should be in ISO format")
-
-    @patch('video_processor.genai')
-    @patch('video_processor.requests.post')
-    def test_webhook_error_does_not_affect_processing(self, mock_post, mock_genai):
-        """Test that errors in the webhook don't affect the main processing flow."""
-        # Mock the genai API responses
-        mock_file = MagicMock()
-        mock_file.uri = "test_uri"
-        mock_file.name = "test_name"
-        mock_file.state.name = "ACTIVE"
-        mock_genai.upload_file.return_value = mock_file
-        
-        # Mock the generate_content response
-        mock_response = MagicMock()
-        mock_part = MagicMock()
-        mock_part.text = "Test response content"
-        mock_response.parts = [mock_part]
-        mock_genai.GenerativeModel.return_value.generate_content.return_value = mock_response
-        
-        # Set up the mock for the requests.post call to raise an exception
-        mock_post.side_effect = Exception("Webhook connection error")
-        
-        # Test data
-        test_prompt = "Test prompt for video processing"
-        test_email = "test.user@example.com"
-        
-        # Call the process_video method
-        result = self.video_processor.process_video(
-            self.temp_file.name, 
-            test_prompt,
-            test_email
-        )
-        
-        # Verify the result is still successful despite webhook error
-        self.assertTrue(result["success"])
-        self.assertEqual(result["content"], "Test response content")
-        
-        # Verify webhook was called
-        mock_post.assert_called_once()
-
-if __name__ == '__main__':
-    unittest.main()
--- a/backend/test_webhook_manual.py
+++ b/backend/test_webhook_manual.py
@@ -1,46 +0,0 @@
-"""
-Manual test script for the webhook integration.
-This script simulates a webhook call without processing a video,
-allowing us to verify the webhook is working correctly.
-"""
-
-import logging
-import sys
-from video_processor import VideoProcessor
-
-# Configure logging
-logging.basicConfig(
-    level=logging.INFO,
-    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
-    handlers=[
-        logging.StreamHandler(sys.stdout)
-    ]
-)
-logger = logging.getLogger('webhook_test')
-
-def test_webhook_manually():
-    """Test the webhook call manually"""
-    # Create a VideoProcessor instance
-    try:
-        processor = VideoProcessor()
-        logger.info("VideoProcessor initialized")
-        
-        # Test user email
-        test_email = "test.user@example.com"
-        
-        # Test prompt
-        test_prompt = "Test prompt for webhook verification"
-        
-        # Call the webhook method directly
-        logger.info(f"Sending test webhook call for user {test_email}")
-        processor.send_usage_webhook(test_email, test_prompt)
-        
-        logger.info("Webhook test completed")
-        
-    except Exception as e:
-        logger.error(f"Error in webhook test: {str(e)}")
-        import traceback
-        logger.error(traceback.format_exc())
-
-if __name__ == "__main__":
-    test_webhook_manually()
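
Review note on the `backend/video_processor.py` changes in this commit: the new code derives `video_name` from each chunk path with the same inline expression in both the parallel and the sequential path, then passes it to `_create_chunk_prompt`. A minimal sketch of how that logic could live in one place is shown here. The function names `video_name_from_chunk_path` and `create_chunk_prompt` are hypothetical and not part of the commit; the `_chunk_XX` naming is taken from the commit's own comment, and the prompt text mirrors the template added in the diff.

```python
import os


def video_name_from_chunk_path(chunk_path: str) -> str:
    """Return the chunk's base name with the trailing '_chunk_XX...' part removed.

    Hypothetical helper: assumes chunk files are named '<video>_chunk_XX.<ext>'.
    """
    base = os.path.basename(chunk_path)
    # maxsplit=1 cuts at the LAST '_chunk_' only. If the marker is absent,
    # rsplit returns the whole base name, so no separate 'in' check is needed.
    return base.rsplit("_chunk_", 1)[0]


def create_chunk_prompt(original_prompt: str, chunk_num: int,
                        total_chunks: int, video_name: str = "") -> str:
    """Prefix the user's prompt with minimal positional context (sketch)."""
    return (
        f'This is segment {chunk_num} of {total_chunks} from video "{video_name}".\n'
        "Your response will be combined with responses from other segments "
        "to create the final result.\n\n"
        f"{original_prompt}"
    )


if __name__ == "__main__":
    name = video_name_from_chunk_path("/tmp/chunks/talk_chunk_01.mp4")
    print(name)  # talk
    print(create_chunk_prompt("Summarize the key points.", 1, 2, name))
```

Because the split runs on the base name, a directory that happens to contain `_chunk_` does not affect the result, and a file without the suffix keeps its full name, extension included.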
--- a/backend/video_processor.py
+++ b/backend/video_processor.py
@@ -585,7 +585,8 @@ class VideoProcessor:
    def combine_chunk_responses(self, responses: List[str], prompt: str,
                                num_chunks: int) -> str:
        """
-        Intelligently combine responses from multiple video chunks.
+        Combine responses from multiple video chunks using simple concatenation.
+        For single-video chunks split due to duration.

        Args:
            responses: List of response texts from each chunk
@@ -595,204 +596,18 @@ class VideoProcessor:
        Returns:
            Combined response text
        """
-        logger.info(f"Combining {len(responses)} chunk responses")
+        logger.info(f"Combining {len(responses)} chunk responses using simple concatenation")

-        # Detect the prompt type to determine combination strategy
-        prompt_lower = prompt.lower()
-        is_meeting_summary = "meeting" in prompt_lower or "summary" in prompt_lower
-        is_documentation = "documentation" in prompt_lower or "process" in prompt_lower
-        is_with_charts = "mermaid" in prompt_lower or "diagram" in prompt_lower or "chart" in prompt_lower
-
-        if is_with_charts:
-            return self._combine_with_charts(responses, num_chunks)
-        elif is_meeting_summary:
-            return self._combine_meeting_summary(responses, num_chunks)
-        elif is_documentation:
-            return self._combine_documentation(responses, num_chunks)
-        else:
-            return self._combine_generic(responses, num_chunks)
-
-    def _combine_generic(self, responses: List[str], num_chunks: int) -> str:
-        """Generic combination: simple sequential joining with section headers."""
-        logger.info("Using generic combination strategy")
        combined = []
-
        combined.append(f"# Complete Video Analysis\n")
-        combined.append(f"*This video was processed in {num_chunks} parts due to its length.*\n")
+        combined.append(f"*This video was processed in {num_chunks} segments.*\n\n")

        for i, response in enumerate(responses, 1):
-            combined.append(f"\n## Part {i} of {num_chunks}\n")
+            combined.append(f"## Segment {i} of {num_chunks}\n\n")
            combined.append(response.strip())
+            combined.append(f"\n\n")

-        return "\n".join(combined)
-
-    def _combine_meeting_summary(self, responses: List[str], num_chunks: int) -> str:
-        """Combination strategy optimized for meeting summaries."""
-        logger.info("Using meeting summary combination strategy")
-
-        # First, try to synthesize the segments into a unified summary
-        try:
-            logger.info("Attempting to synthesize segments into unified meeting summary")
-            synthesized = self._synthesize_meeting_segments(responses, num_chunks)
-            if synthesized:
-                return synthesized
-            else:
-                logger.warning("Synthesis failed, falling back to segment concatenation")
-        except Exception as e:
-            logger.warning(f"Error during synthesis: {e}, falling back to segment concatenation")
-
-        # Fallback: simple concatenation with formatting
-        combined = []
-
-        combined.append(f"# Complete Meeting Recording Summary\n")
-        combined.append(f"*This recording was analyzed in {num_chunks} segments.*\n")
-        combined.append(f"\n---\n")
-
-        # Combine all discussion points with clear time markers
-        for i, response in enumerate(responses, 1):
-            time_range = self._format_time_range(i, num_chunks)
-            combined.append(f"\n## Segment {i}: {time_range}\n")
-            combined.append(response.strip())
-            combined.append(f"\n---\n")
-
-        # Add consolidated note
-        combined.append(f"\n### Notes")
-        combined.append(f"- Review all segments above for discussion points and action items")
-        combined.append(f"- Total recording duration: ~{num_chunks * 50} minutes")
-        combined.append(f"- Recording was split into {num_chunks} segments for analysis")
-
-        return "\n".join(combined)
-
-    def _synthesize_meeting_segments(self, responses: List[str], num_chunks: int) -> Optional[str]:
-        """
-        Use AI to synthesize multiple segment summaries into one unified meeting summary.
-
-        Args:
-            responses: List of segment summaries
-            num_chunks: Number of segments
-
-        Returns:
-            Unified meeting summary or None if synthesis fails
-        """
-        try:
-            # Prepare the segments for synthesis
-            segments_text = ""
-            for i, response in enumerate(responses, 1):
-                time_range = self._format_time_range(i, num_chunks)
-                segments_text += f"\n\n### Segment {i} ({time_range}):\n{response.strip()}\n"
-
-            # Create synthesis prompt
-            synthesis_prompt = f"""You are analyzing a meeting recording that was split into {num_chunks} segments due to its length. Below are the summaries from each segment. Your task is to create ONE unified, comprehensive meeting summary that integrates all the information.
-
-SEGMENT SUMMARIES:
-{segments_text}
-
-Please provide a SINGLE, UNIFIED meeting summary that:
-1. Combines all discussion points into one cohesive narrative (not separated by segments)
-2. Consolidates all action items into one master list (removing duplicates if any)
-3. Identifies main themes and outcomes across the entire meeting
-4. Maintains chronological flow where relevant
-5. Uses clear sections: Meeting Summary, Discussion Points, Action Items (with owners)
-
-Format the output as a professional meeting summary document. Do not reference the segments in your output - write as if this was analyzed as one continuous meeting."""
-
-            logger.info("Sending synthesis request to Gemini")
-
-            # Use the new retry logic with rate limiting
-            synthesis_response = self._make_api_request_with_retry(
-                model=self.synthesis_model,
-                contents=[{"text": synthesis_prompt}],
-                context="[Meeting Synthesis]"
-            )
-
-            if synthesis_response.parts:
-                synthesized_content = ""
-                for part in synthesis_response.parts:
-                    if hasattr(part, 'text'):
-                        synthesized_content += part.text
-
-                if synthesized_content:
-                    logger.info("Successfully synthesized unified meeting summary")
-                    # Add header noting this was synthesized
-                    final_output = "# Meeting Summary\n\n"
-                    final_output += f"*Synthesized from {num_chunks}-segment analysis*\n\n"
-                    final_output += "---\n\n"
-                    final_output += synthesized_content
-                    return final_output
-
-            logger.warning("No content in synthesis response")
-            return None
-
-        except Exception as e:
-            logger.error(f"Error during synthesis: {str(e)}")
-            return None
-
-    def _combine_documentation(self, responses: List[str], num_chunks: int) -> str:
-        """Combination strategy optimized for process documentation."""
-        logger.info("Using documentation combination strategy")
-        combined = []
-
-        combined.append(f"# Complete Process Documentation\n")
-        combined.append(f"*This process was documented from a {num_chunks}-part video recording.*\n")
-
-        combined.append(f"\n## Overview\n")
-        combined.append(f"This documentation covers the complete process shown in the video. "
-                       f"The content has been organized sequentially across all segments.\n")
-
-        for i, response in enumerate(responses, 1):
-            combined.append(f"\n## Section {i}: {self._format_time_range(i, num_chunks)}\n")
-            combined.append(response.strip())
-
-        combined.append(f"\n\n---\n*End of documentation*")
-
-        return "\n".join(combined)
-
-    def _combine_with_charts(self, responses: List[str], num_chunks: int) -> str:
-        """Combination strategy for documentation with Mermaid diagrams."""
-        logger.info("Using documentation with charts combination strategy")
-        combined = []
-
-        combined.append(f"# Complete Process Documentation with Workflow Diagrams\n")
-        combined.append(f"*This analysis spans {num_chunks} video segments.*\n")
-
-        # First, add all text content
-        combined.append(f"\n## Overview and Detailed Steps\n")
-
-        for i, response in enumerate(responses, 1):
-            combined.append(f"\n### Part {i}: {self._format_time_range(i, num_chunks)}\n")
-
-            # Separate mermaid diagrams from text content
-            parts = response.split("```mermaid")
-            text_part = parts[0].strip()
-            combined.append(text_part)
-
-            # Add mermaid diagrams in a dedicated section
-            if len(parts) > 1:
-                for j, diagram_part in enumerate(parts[1:], 1):
-                    if "```" in diagram_part:
-                        diagram_code = diagram_part.split("```")[0]
-                        combined.append(f"\n**Workflow Diagram {i}.{j}:**\n")
-                        combined.append(f"```mermaid{diagram_code}```\n")
-
-                        # Add any remaining text after the diagram
-                        remaining_text = "```".join(diagram_part.split("```")[1:]).strip()
-                        if remaining_text:
-                            combined.append(remaining_text)
-
-        combined.append(f"\n\n---\n*Complete documentation generated from {num_chunks}-part video analysis*")
-
-        return "\n".join(combined)
-
-    def _format_time_range(self, part_num: int, total_parts: int,
-                          chunk_duration: int = 50) -> str:
-        """Format time range for a video part."""
-        start_min = (part_num - 1) * chunk_duration
-        end_min = part_num * chunk_duration if part_num < total_parts else "End"
-
-        if isinstance(end_min, int):
-            return f"{start_min}-{end_min} minutes"
-        else:
-            return f"{start_min}+ minutes"
+        return "".join(combined)

    def _process_single_chunk(self, chunk_info: Tuple[int, str, str, int, str]) -> Tuple[int, Dict[str, Any]]:
        """
@@ -839,7 +654,9 @@ Format the output as a professional meeting summary document. Do not reference t
        # Prepare chunk information for parallel processing
        chunk_infos = []
        for i, chunk_path in enumerate(chunk_paths):
-            chunk_prompt = self._create_chunk_prompt(prompt, i + 1, num_chunks)
+            # Extract video name from chunk path (remove _chunk_XX suffix)
+            video_name = os.path.basename(chunk_path).rsplit('_chunk_', 1)[0] if '_chunk_' in chunk_path else os.path.basename(chunk_path)
+            chunk_prompt = self._create_chunk_prompt(prompt, i + 1, num_chunks, video_name)
            chunk_infos.append((i, chunk_path, chunk_prompt, num_chunks, user_email))

        # Process chunks in parallel
@@ -926,8 +743,11 @@ Format the output as a professional meeting summary document. Do not reference t
                for i, chunk_path in enumerate(chunk_paths, 1):
                    logger.info(f"[Sequential] Processing chunk {i}/{len(chunk_paths)}: {chunk_path}")

+                    # Extract video name from chunk path (remove _chunk_XX suffix)
+                    video_name = os.path.basename(chunk_path).rsplit('_chunk_', 1)[0] if '_chunk_' in chunk_path else os.path.basename(chunk_path)
+
                    # Modify prompt to indicate this is part of a multi-part video
-                    chunk_prompt = self._create_chunk_prompt(prompt, i, len(chunk_paths))
+                    chunk_prompt = self._create_chunk_prompt(prompt, i, len(chunk_paths), video_name)

                    # Process this chunk
                    chunk_result = self.process_video(chunk_path, chunk_prompt, user_email)
@@ -993,50 +813,24 @@ Format the output as a professional meeting summary document. Do not reference t
                self.video_splitter.cleanup_chunks(chunk_paths)

    def _create_chunk_prompt(self, original_prompt: str, chunk_num: int,
-                            total_chunks: int) -> str:
+                            total_chunks: int, video_name: str = "") -> str:
        """
-        Create a prompt for a video chunk that provides context about its position.
+        Create a prompt for a video chunk that provides minimal context about its position.

        Args:
            original_prompt: The original user prompt
            chunk_num: Current chunk number (1-indexed)
            total_chunks: Total number of chunks
+            video_name: Name of the video file (for context)

        Returns:
-            Modified prompt for the chunk
+            Prompt with minimal system context, keeping user's prompt as primary instruction
        """
-        # For meeting summaries, modify the prompt to focus on just summarizing what's in this segment
-        prompt_lower = original_prompt.lower()
-        is_meeting = "meeting" in prompt_lower
+        context = f"""This is segment {chunk_num} of {total_chunks} from video "{video_name}".
+Your response will be combined with responses from other segments to create the final result.

-        if is_meeting:
-            # For meetings, ask for a partial summary of this segment only
-            if chunk_num == 1:
-                context = f"[SEGMENT {chunk_num} of {total_chunks} - First 50 minutes] "
-                context += "Provide a summary of the discussion points and any action items covered in THIS segment only. "
-                context += "Do not try to provide a complete meeting summary - just summarize what happens in this part. "
-            elif chunk_num == total_chunks:
-                context = f"[SEGMENT {chunk_num} of {total_chunks} - Final segment] "
-                context += "Provide a summary of the discussion points and any action items covered in THIS final segment only. "
-                context += "This continues from previous segments, but only summarize what happens in this part. "
-            else:
-                context = f"[SEGMENT {chunk_num} of {total_chunks} - Middle segment] "
-                context += "Provide a summary of the discussion points and any action items covered in THIS segment only. "
-                context += "This is a middle portion of a longer recording - only summarize what happens in this part. "
-
-            return context + original_prompt
-        else:
-            # For other types, use the original approach
-            context = f"[PART {chunk_num} of {total_chunks}] "
-
-            if chunk_num == 1:
-                context += "This is the first segment of a longer video. "
-            elif chunk_num == total_chunks:
-                context += "This is the final segment continuing from previous parts. "
-            else:
-                context += "This is a middle segment continuing from previous parts. "
-
-            return context + original_prompt
+{original_prompt}"""
+        return context

    def process_video_auto(self, video_path: str, prompt: str,
                          user_email: str = "anonymous") -> Dict[str, Any]:
@@ -1411,50 +1205,15 @@ Format the output as a professional meeting summary document. Do not reference t
 Original user request:
 {original_prompt}

-Your task: Provide a CONCISE SUMMARY of this segment that captures:
-1. Key information relevant to the user's request
-2. Important details, facts, or insights
-3. Any diagrams, charts, or structured data (preserve Mermaid syntax if applicable)
-4. Chronological context if relevant
-
-Keep the summary focused and information-dense. This summary will be combined with {total_chunks - 1} other summaries to create a final unified response.
-
-Do NOT mention "this is segment X" or "this chunk contains". Just provide the factual content.
+Provide a concise summary of this segment. Your summary will be combined with other summaries to create the final result.
 """
        return summary_prompt

-    def _detect_prompt_type(self, prompt: str, summaries: List[str]) -> str:
-        """
-        Detect the type of prompt to apply specialized synthesis strategy.
-
-        Args:
-            prompt: Original user prompt
-            summaries: List of summaries (to check content)
-
-        Returns:
-            Prompt type: "meeting_summary", "documentation", "documentation_with_charts", or "generic"
-        """
-        prompt_lower = prompt.lower()
-
-        # Check for meeting-related keywords
-        if any(keyword in prompt_lower for keyword in ["meeting", "discussion", "action item", "agenda"]):
-            return "meeting_summary"
-
-        # Check for documentation keywords
-        if any(keyword in prompt_lower for keyword in ["documentation", "process", "training", "knowledge base", "step by step"]):
-            # Check if it also includes charts/diagrams
-            if any(keyword in prompt_lower for keyword in ["diagram", "chart", "mermaid", "workflow"]):
-                return "documentation_with_charts"
-            return "documentation"
-
-        # Default to generic
-        return "generic"
-
    def _synthesize_final_result(self, summaries: List[str], chunk_metadata: List[Dict],
                                 original_prompt: str, user_email: str) -> str:
        """
        Synthesize all chunk summaries into single cohesive result using Gemini.
-        Uses prompt type detection to apply specialized synthesis strategies.
+        Uses a universal template that makes the user's prompt the primary instruction.
        """
        # Extract video names for context
        video_names = list(set(m['video_name'] for m in chunk_metadata))
@@ -1470,30 +1229,22 @@ Do NOT mention "this is segment X" or "this chunk contains". Just provide the fa

        logger.info(f"[Stage 2] Combined summaries: {len(summaries)} summaries, {total_summary_chars} total chars")

-        # Detect prompt type for specialized synthesis
-        prompt_type = self._detect_prompt_type(original_prompt, summaries)
-        logger.info(f"[Stage 2] Detected prompt type: {prompt_type}")
-
-        # Check for Mermaid diagrams
-        has_diagrams = any('```mermaid' in s for s in summaries)
-
-        # Create synthesis prompt based on type
-        if prompt_type == "meeting_summary":
-            synthesis_prompt = self._create_synthesis_prompt_meeting(
-                summaries_text, original_prompt, num_videos, video_names
-            )
-        elif prompt_type == "documentation":
-            synthesis_prompt = self._create_synthesis_prompt_documentation(
-                summaries_text, original_prompt, num_videos, video_names
-            )
-        elif has_diagrams:
-            synthesis_prompt = self._create_synthesis_prompt_with_diagrams(
-                summaries_text, original_prompt, num_videos, video_names
-            )
+        # Create universal synthesis prompt
+        if num_videos > 1:
+            video_context = f"{num_videos} videos: {', '.join(video_names)}"
        else:
-            synthesis_prompt = self._create_synthesis_prompt_generic(
-                summaries_text, original_prompt, num_videos, video_names
-            )
+            video_context = f"video: {video_names[0]}"
+
+        synthesis_prompt = f"""You are creating a final unified response by combining multiple segment summaries from {video_context}.
+
+Here are the segment summaries:
+{summaries_text}
+
+Original user request:
+{original_prompt}
+
+Your task: Create ONE cohesive response that fulfills the user's request. Integrate information from all summaries naturally, without mentioning segments or chunks.
+"""

        # Log synthesis prompt if configured
        if self.log_prompts:
@ -1534,181 +1285,6 @@ Do NOT mention "this is segment X" or "this chunk contains". Just provide the fa
            logger.error(f"[Stage 2] Synthesis failed: {str(e)}, using fallback")
            return self._fallback_concatenation(summaries, chunk_metadata)

-    def _create_synthesis_prompt_generic(self, summaries_text: str, original_prompt: str,
-                                         num_videos: int, video_names: List[str]) -> str:
-        """
-        Generic synthesis prompt for all content types.
-        """
-        if num_videos > 1:
-            video_context = f"{num_videos} videos: {', '.join(video_names)}"
-        else:
-            video_context = f"one video: {video_names[0]}"
-
-        prompt = f"""You are creating a FINAL UNIFIED RESPONSE by synthesizing multiple segment summaries.
-
-Context:
- Source: {video_context}
- The video(s) were split into segments for processing
- Below are summaries from each segment
-
-Original user request:
-"{original_prompt}"
-
-Segment summaries:
-{summaries_text}
-
-Your task: Create ONE cohesive, unified response that:
-
-1. FULFILLS the original user request completely
-2. INTEGRATES information from all segments naturally
-3. DOES NOT mention segments, chunks, or parts
-4. MAINTAINS any requested format (lists, tables, structure)
-5. CONSOLIDATES duplicate information
-6. PRESERVES chronological flow if relevant
-7. APPEARS as if analyzing the complete video in one pass
-
-Quality requirements:
- No phrases like "In segment 1", "The first part", "Chunk 2 discusses"
- Natural transitions between topics
- Unified narrative or structure
- Professional, coherent final product
-
-Begin your unified response:
-"""
-        return prompt
-
-    def _create_synthesis_prompt_meeting(self, summaries_text: str, original_prompt: str,
-                                         num_videos: int, video_names: List[str]) -> str:
-        """
-        Specialized synthesis prompt for meeting summaries.
-        """
-        if num_videos > 1:
-            video_context = f"{num_videos} videos: {', '.join(video_names)}"
-        else:
-            video_context = f"one video: {video_names[0]}"
-
-        prompt = f"""You are creating a FINAL UNIFIED MEETING SUMMARY by synthesizing multiple segment summaries.
-
-Context:
- Source: {video_context}
- The video(s) were split into segments for processing
- Below are summaries from each segment
-
-Original user request:
-"{original_prompt}"
-
-Segment summaries:
-{summaries_text}
-
-Your task: Create ONE cohesive meeting summary that:
-
-1. MEETING OVERVIEW: Provide a high-level summary of the meeting
-2. DISCUSSION POINTS: Consolidate all discussion topics into logical sections
-   - Group related discussions together
-   - Maintain chronological flow where relevant
-   - Capture key decisions made
-3. ACTION ITEMS: Create a MASTER LIST of all action items
-   - Format: "Action item - Owner (if mentioned) - Due date (if mentioned)"
-   - Consolidate duplicates
-   - Remove redundant items
-4. KEY OUTCOMES: Summarize main conclusions and next steps
-
-Quality requirements:
- Professional meeting summary format
- No phrases like "In segment 1", "The first part", "Chunk 2 discusses"
- Natural transitions between topics
- One unified document that reads as if from single analysis
- Clear, actionable items with owners where possible
-
-Begin your unified meeting summary:
-"""
-        return prompt
-
-    def _create_synthesis_prompt_documentation(self, summaries_text: str, original_prompt: str,
-                                               num_videos: int, video_names: List[str]) -> str:
-        """
-        Specialized synthesis prompt for process documentation.
-        """
-        if num_videos > 1:
-            video_context = f"{num_videos} videos: {', '.join(video_names)}"
-        else:
-            video_context = f"one video: {video_names[0]}"
-
-        prompt = f"""You are creating FINAL UNIFIED PROCESS DOCUMENTATION by synthesizing multiple segment summaries.
-
-Context:
- Source: {video_context}
- The video(s) were split into segments for processing
- Below are summaries from each segment
-
-Original user request:
-"{original_prompt}"
-
-Segment summaries:
-{summaries_text}
-
-Your task: Create ONE comprehensive process documentation that:
-
-1. OVERVIEW: Provide a high-level description of the process
-2. PREREQUISITES: List any requirements or setup needed (if mentioned)
-3. STEP-BY-STEP INSTRUCTIONS: Combine all steps into one sequential guide
-   - Number steps sequentially (Step 1, Step 2, etc.)
-   - Include sub-steps where appropriate
-   - Be clear and detailed for someone new to the process
-4. TIPS & BEST PRACTICES: Consolidate helpful tips
-5. TROUBLESHOOTING: Include common issues and solutions (if mentioned)
-
-Quality requirements:
- Clear, sequential flow from start to finish
- No phrases like "In segment 1", "The first part", "Chunk 2 shows"
- Professional documentation format
- Easy to follow for training or reference
- One unified guide that reads naturally
-
-Begin your unified process documentation:
-"""
-        return prompt
-
-    def _create_synthesis_prompt_with_diagrams(self, summaries_text: str, original_prompt: str,
-                                                num_videos: int, video_names: List[str]) -> str:
-        """
-        Synthesis prompt specifically for content with Mermaid diagrams.
-        """
-        prompt = f"""You are creating a FINAL UNIFIED RESPONSE by synthesizing multiple segment summaries that contain Mermaid diagrams.
-
-Original user request:
-"{original_prompt}"
-
-Segment summaries (containing diagrams):
-{summaries_text}
-
-Your task: Create ONE unified response with a SINGLE MERGED DIAGRAM.
-
-Requirements:
-1. MERGE all Mermaid diagrams into ONE comprehensive diagram
-2. Combine nodes, relationships, and flows intelligently
-3. Remove duplicate nodes/edges
-4. Maintain logical structure and connections
-5. Synthesize accompanying text naturally
-6. DO NOT mention segments or parts
-
-Diagram merging strategy:
- If multiple flowcharts: combine into single flowchart with logical flow
- If multiple architecture diagrams: create unified architecture
- If sequential diagrams: show complete sequence
- Use clear labels and grouping where appropriate
-
-Output format:
-```mermaid
-[Your merged diagram here]
-```
-
-[Unified explanatory text here]
-
-Begin your unified response with merged diagram:
-"""
-        return prompt
-
    def _fallback_concatenation(self, summaries: List[str], chunk_metadata: List[Dict]) -> str:
        """
        Fallback method when AI synthesis fails.