video-query/BATCH_PROCESSING_IMPROVEMENTS.md

Batch Processing Improvements - Implementation Summary

Date: 2025-11-10
Status: All Phases Completed

Overview

Improved batch video processing in four areas: a model consistency fix, specialized synthesis strategies, enhanced logging, and configuration through environment variables. Every video in a batch is now processed with the same prompt, and the per-video summaries are then synthesized with a strategy selected from the detected prompt type.


Changes Implemented

Phase 1: Enhanced Logging

File Modified: backend/video_processor.py

Changes:

  • Added structured logging with [Stage 1], [Stage 2], [Traceability], and [Metrics] prefixes
  • Implemented configurable debug-level logging for prompts and summaries
  • Added performance metrics tracking (stage times, avg time per video, API call count)
  • Added video-to-summary-to-result traceability logging

New Log Output:

Batch abc123: [Stage 1] Processing video 1/3: meeting1.mp4
Batch abc123: [Stage 1] Video 1 complete: 1,245 chars in 45.2s
Batch abc123: [Stage 2] Detected prompt type: meeting_summary
Batch abc123: [Stage 2] Synthesis complete: 3,456 chars in 15.3s
Batch abc123: [Traceability] Video-to-summary mapping:
Batch abc123:   - Video 1: meeting1.mp4 → Summary 1
Batch abc123: [Metrics] Stage 1: 135.6s, Stage 2: 15.3s, Total: 150.9s
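
The prefix convention above can be sketched with a small helper. This is a minimal illustration, not the code in backend/video_processor.py: the `BatchLog` class, its method names, and the logger name are hypothetical, and only the message formats are taken from the log output shown above.

```python
import logging
from typing import Dict, List

logger = logging.getLogger("video_processor")  # logger name is an assumption


class BatchLog:
    """Hypothetical helper that prefixes messages with the batch ID and a tag."""

    def __init__(self, batch_id: str) -> None:
        self.batch_id = batch_id
        self.stage_seconds: Dict[str, float] = {}

    def info(self, tag: str, message: str) -> str:
        # Produces e.g. "Batch abc123: [Stage 1] Processing video 1/3: meeting1.mp4"
        line = f"Batch {self.batch_id}: [{tag}] {message}"
        logger.info(line)
        return line

    def record_stage(self, stage: str, seconds: float) -> None:
        # Store the wall-clock duration of a stage for the metrics line.
        self.stage_seconds[stage] = seconds

    def metrics_line(self) -> str:
        # One entry per recorded stage, followed by the summed total.
        parts = [f"{name}: {secs:.1f}s" for name, secs in self.stage_seconds.items()]
        total = sum(self.stage_seconds.values())
        return self.info("Metrics", ", ".join(parts + [f"Total: {total:.1f}s"]))

    def traceability_lines(self, filenames: List[str]) -> List[str]:
        # Video N always maps to Summary N because videos are processed in order.
        lines = [self.info("Traceability", "Video-to-summary mapping:")]
        for index, name in enumerate(filenames, start=1):
            line = f"Batch {self.batch_id}:   - Video {index}: {name} → Summary {index}"
            logger.info(line)
            lines.append(line)
        return lines
```

Returning the formatted line from each method keeps the helper easy to check without configuring a log handler.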

Lines Modified: 987-1055, 1123-1247


Phase 2: Model Consistency Fix

File Modified: backend/video_processor.py

Changes:

  • Changed synthesis model from gemini-2.0-flash-exp to gemini-2.5-pro
  • Added model configuration constants at class level
  • Made models configurable via environment variables

Before:

# Individual processing
model="gemini-2.5-pro"

# Batch synthesis
model="gemini-2.0-flash-exp"  # ❌ INCONSISTENT

After:

# Both use same model
self.processing_model = "gemini-2.5-pro"
self.synthesis_model = "gemini-2.5-pro"  # ✅ CONSISTENT

Lines Modified: 48-50, 82-88, 339, 553, 1252
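
As a rough sketch of how class-level defaults and environment overrides might fit together: the class body and constant names below are assumptions for illustration. Only the attribute names, the environment variable names, and the default model come from this document.

```python
import os

DEFAULT_MODEL = "gemini-2.5-pro"


class VideoProcessor:
    """Stand-in for the real class; only the model wiring is shown."""

    # Class-level defaults, used when no environment override is set.
    # These constant names are hypothetical.
    DEFAULT_PROCESSING_MODEL = DEFAULT_MODEL
    DEFAULT_SYNTHESIS_MODEL = DEFAULT_MODEL

    def __init__(self) -> None:
        # Both stages fall back to the same default, so they stay consistent
        # unless an operator deliberately overrides one of them.
        self.processing_model = os.getenv(
            "VIDEO_PROCESSOR_MODEL", self.DEFAULT_PROCESSING_MODEL
        )
        self.synthesis_model = os.getenv(
            "VIDEO_SYNTHESIS_MODEL", self.DEFAULT_SYNTHESIS_MODEL
        )
```

Reading the variables in `__init__` rather than at import time means a restarted service picks up new values without further code changes.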


Phase 3: Specialized Synthesis Strategies

File Modified: backend/video_processor.py

Changes:

  • Added _detect_prompt_type() method to classify prompts
  • Added _create_synthesis_prompt_meeting() for meeting summaries
  • Added _create_synthesis_prompt_documentation() for process docs
  • Updated _synthesize_final_result() to route to specialized strategies

Prompt Type Detection:

def _detect_prompt_type(self, prompt: str, summaries: List[str]) -> str:
    """
    Detects: meeting_summary | documentation | documentation_with_charts | generic
    """
    # Keywords: meeting, discussion, action item → meeting_summary
    # Keywords: documentation, process, training → documentation
    # Keywords: diagram, chart, mermaid → documentation_with_charts
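
The keyword rules in the comments above could be realized along the following lines. This is a sketch under stated assumptions: the matching is plain case-insensitive substring search, chart keywords take precedence over the other groups, and the `summaries` argument is accepted but not inspected. The real method may differ on all three points.

```python
from typing import List

# Keyword groups copied from the comments in the method above.
MEETING_KEYWORDS = ("meeting", "discussion", "action item")
DOCUMENTATION_KEYWORDS = ("documentation", "process", "training")
CHART_KEYWORDS = ("diagram", "chart", "mermaid")


def detect_prompt_type(prompt: str, summaries: List[str]) -> str:
    """Classify a prompt by keyword; precedence order is an assumption."""
    text = prompt.lower()
    # Chart keywords are checked first so that a documentation prompt that
    # also asks for diagrams is not classified as plain documentation.
    if any(keyword in text for keyword in CHART_KEYWORDS):
        return "documentation_with_charts"
    if any(keyword in text for keyword in MEETING_KEYWORDS):
        return "meeting_summary"
    if any(keyword in text for keyword in DOCUMENTATION_KEYWORDS):
        return "documentation"
    return "generic"
```

Substring matching is deliberately loose: "process" also matches "processing", which may or may not be desirable for a given prompt set.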

Meeting Synthesis Strategy:

  • Consolidates discussion points across all videos
  • Creates master action items list (removes duplicates)
  • Formats with clear sections: Overview, Discussion, Action Items, Outcomes

Documentation Synthesis Strategy:

  • Combines steps into sequential guide
  • Numbers steps continuously (Step 1, Step 2, ...)
  • Includes Prerequisites, Tips, Troubleshooting sections
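
One way to picture the routing between strategies is a lookup table from prompt type to a prompt builder. The function names, the prompt wording, and the choice to send documentation_with_charts to the documentation builder are all assumptions made for this sketch. Only the section names and the strategy goals are taken from the lists above.

```python
from typing import Callable, Dict, List

MEETING_SECTIONS = ("Overview", "Discussion", "Action Items", "Outcomes")
DOCUMENTATION_SECTIONS = ("Prerequisites", "Tips", "Troubleshooting")


def _join_summaries(summaries: List[str]) -> str:
    # Label each summary so the model can keep the video order straight.
    return "\n\n".join(
        f"Summary {index}:\n{summary}"
        for index, summary in enumerate(summaries, start=1)
    )


def synthesis_prompt_meeting(prompt: str, summaries: List[str]) -> str:
    return (
        f"Original request: {prompt}\n"
        "Consolidate the discussion points across all videos and build one "
        "master list of action items with duplicates removed.\n"
        f"Use these sections: {', '.join(MEETING_SECTIONS)}.\n\n"
        + _join_summaries(summaries)
    )


def synthesis_prompt_documentation(prompt: str, summaries: List[str]) -> str:
    return (
        f"Original request: {prompt}\n"
        "Combine the steps into one sequential guide and number them "
        "continuously (Step 1, Step 2, ...).\n"
        f"Include these sections: {', '.join(DOCUMENTATION_SECTIONS)}.\n\n"
        + _join_summaries(summaries)
    )


def synthesis_prompt_generic(prompt: str, summaries: List[str]) -> str:
    return f"Original request: {prompt}\n\n" + _join_summaries(summaries)


# Hypothetical routing table; unknown types fall back to the generic builder.
STRATEGIES: Dict[str, Callable[[str, List[str]], str]] = {
    "meeting_summary": synthesis_prompt_meeting,
    "documentation": synthesis_prompt_documentation,
    "documentation_with_charts": synthesis_prompt_documentation,
}


def build_synthesis_prompt(prompt_type: str, prompt: str, summaries: List[str]) -> str:
    builder = STRATEGIES.get(prompt_type, synthesis_prompt_generic)
    return builder(prompt, summaries)
```

A table lookup keeps the fallback explicit, so adding a new prompt type means adding one builder and one table entry.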

Lines Added: 1195-1441


Phase 4: Configuration Options

Files Modified:

  • backend/video_processor.py
  • backend/.env.example (created)

New Environment Variables:

| Variable | Default | Description |
|---|---|---|
| VIDEO_PROCESSOR_MODEL | gemini-2.5-pro | Model for individual video processing |
| VIDEO_SYNTHESIS_MODEL | gemini-2.5-pro | Model for batch synthesis |
| BATCH_PROCESSING_LOG_PROMPTS | false | Enable prompt logging (debug) |
| BATCH_PROCESSING_LOG_SUMMARIES | false | Enable summary preview logging (debug) |

Usage Example:

# Enable detailed logging for debugging
export BATCH_PROCESSING_LOG_PROMPTS=true
export BATCH_PROCESSING_LOG_SUMMARIES=true

# Use different model for synthesis (optional)
export VIDEO_SYNTHESIS_MODEL=gemini-2.0-flash-exp
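
For the two boolean flags, the backend has to turn a string such as "true" into a Python bool. The sketch below shows one plausible way to do that and to gate debug output on the result. The helper names, the set of accepted truthy strings, the 200-character preview limit, and the debug message wording are assumptions, not the actual implementation.

```python
import logging
import os
from typing import List

logger = logging.getLogger("video_processor")  # logger name is an assumption


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean environment variable; accepted values are an assumption."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def preview(text: str, limit: int = 200) -> str:
    # Truncate long summaries so debug logs stay readable.
    return text if len(text) <= limit else text[:limit] + "..."


def log_debug_details(batch_id: str, prompt: str, summaries: List[str]) -> None:
    # Each flag independently controls one kind of debug output.
    if env_flag("BATCH_PROCESSING_LOG_PROMPTS"):
        logger.debug("Batch %s: prompt: %s", batch_id, prompt)
    if env_flag("BATCH_PROCESSING_LOG_SUMMARIES"):
        for index, summary in enumerate(summaries, start=1):
            logger.debug("Batch %s: summary %d preview: %s", batch_id, index, preview(summary))
```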

Lines Modified: 82-88, 1003-1004, 1016-1017, 1150-1151, 1170-1171, 1190-1192, 1204-1205, 1240-1242, 1272-1273


Documentation Updates

File Modified: CLAUDE.md

Sections Added/Updated:

  1. Backend Setup: Added .env example with all configuration options
  2. Production Deployment: Updated environment configuration section
  3. Key Architecture Components: Added comprehensive Batch Processing Architecture section
  4. Configuration Files: Documented all environment variables
  5. Troubleshooting: Added Batch Processing Issues section with debugging guide

New Documentation Sections:

  • Batch Processing Architecture
  • Batch Processing Flow (4-stage explanation)
  • Logging Levels guide
  • Troubleshooting: Inconsistent summaries
  • Troubleshooting: Prompt visibility
  • Troubleshooting: Video-to-result mapping
  • Troubleshooting: Performance issues

How to Use

Normal Operation (Default)

# No changes needed - works out of the box
GOOGLE_API_KEY=your_key

Enable Debugging

# In backend/.env
GOOGLE_API_KEY=your_key
BATCH_PROCESSING_LOG_PROMPTS=true
BATCH_PROCESSING_LOG_SUMMARIES=true

# Restart backend
sudo systemctl restart video-query

# View logs with filtering
journalctl -u video-query -f | grep "Batch"

View Traceability (Always Enabled)

# See which video maps to which summary in the result
journalctl -u video-query -f | grep "Traceability"

View Performance Metrics (Always Enabled)

# See timing breakdown and API call counts
journalctl -u video-query -f | grep "Metrics"

Verification

Test Batch Processing

# Process multiple videos as batch
curl -X POST http://localhost:5010/api/process-batch \
  -H "Content-Type: application/json" \
  -d '{
    "videos": [
      {"file_path": "/tmp/video1.mp4", "filename": "meeting_part1.mp4", "order": 1},
      {"file_path": "/tmp/video2.mp4", "filename": "meeting_part2.mp4", "order": 2}
    ],
    "prompt": "Generate a detailed meeting summary with action items",
    "batch_id": "test-batch-001"
  }'

# Check logs for:
# 1. Prompt type detection: "Detected prompt type: meeting_summary"
# 2. Model consistency: "model: gemini-2.5-pro" for both stages
# 3. Traceability: Video-to-summary mapping
# 4. Performance: Stage 1/2 timing
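
The same request can be built from Python, which is convenient in a test script. This sketch uses only the standard library. The endpoint, port, and payload fields are copied from the curl command above, while the function names and the 600-second timeout are assumptions. The response body is returned as raw text because its format is not described in this document.

```python
import json
import urllib.request
from typing import Dict, List, Union

Video = Dict[str, Union[str, int]]


def build_batch_request(
    videos: List[Video],
    prompt: str,
    batch_id: str,
    base_url: str = "http://localhost:5010",
) -> urllib.request.Request:
    """Build the POST request for /api/process-batch without sending it."""
    payload = {"videos": videos, "prompt": prompt, "batch_id": batch_id}
    return urllib.request.Request(
        url=f"{base_url}/api/process-batch",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_batch_request(request: urllib.request.Request, timeout: float = 600.0) -> str:
    # Batch processing can take minutes, so the timeout is generous.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `send_batch_request(build_batch_request(videos, prompt, "test-batch-001"))` against a running backend should produce the same log lines as the curl command.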

Expected Log Output

2025-11-10 10:30:00 - Batch test-batch-001: Processing 2 videos (meeting_part1.mp4, meeting_part2.mp4)
2025-11-10 10:30:00 - Batch test-batch-001: [Stage 1] Direct processing of 2 videos
2025-11-10 10:30:05 - Batch test-batch-001: [Stage 1] Processing video 1/2: meeting_part1.mp4
2025-11-10 10:30:50 - Batch test-batch-001: [Stage 1] Video 1 complete: 1,234 chars in 45.2s
2025-11-10 10:30:55 - Batch test-batch-001: [Stage 1] Processing video 2/2: meeting_part2.mp4
2025-11-10 10:31:40 - Batch test-batch-001: [Stage 1] Video 2 complete: 1,567 chars in 45.1s
2025-11-10 10:31:40 - Batch test-batch-001: [Stage 1] Complete - 2 summaries in 95.3s
2025-11-10 10:31:40 - Batch test-batch-001: [Traceability] Video-to-summary mapping:
2025-11-10 10:31:40 - Batch test-batch-001:   - Video 1: meeting_part1.mp4 → Summary 1
2025-11-10 10:31:40 - Batch test-batch-001:   - Video 2: meeting_part2.mp4 → Summary 2
2025-11-10 10:31:40 - Batch test-batch-001: [Stage 2] Synthesizing 2 summaries
2025-11-10 10:31:40 - Batch test-batch-001: [Stage 2] Combined summaries: 2 summaries, 2801 total chars
2025-11-10 10:31:40 - Batch test-batch-001: [Stage 2] Detected prompt type: meeting_summary
2025-11-10 10:31:40 - Batch test-batch-001: [Stage 2] Sending synthesis request to Gemini API (model: gemini-2.5-pro)
2025-11-10 10:31:55 - Batch test-batch-001: [Stage 2] Synthesis complete: 3,456 chars in 15.2s
2025-11-10 10:31:55 - Batch test-batch-001: [Metrics] Stage 1: 95.3s, Stage 2: 15.2s, Total: 110.5s
2025-11-10 10:31:55 - Batch test-batch-001: [Metrics] Avg time per video: 47.7s
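
The four checks listed under Test Batch Processing can be automated against captured log lines. The checker below is a hypothetical sketch: the function name and the 0.2-second tolerance are assumptions, and the patterns are taken only from the expected output shown above.

```python
import re
from typing import Dict, List

# Matches the summary metrics line, e.g.
# "[Metrics] Stage 1: 95.3s, Stage 2: 15.2s, Total: 110.5s"
METRICS_RE = re.compile(
    r"\[Metrics\] Stage 1: ([\d.]+)s, Stage 2: ([\d.]+)s, Total: ([\d.]+)s"
)


def check_batch_log(lines: List[str], expected_type: str, expected_model: str) -> Dict[str, bool]:
    """Return one pass/fail flag per verification item."""
    text = "\n".join(lines)
    checks = {
        "prompt_type": f"Detected prompt type: {expected_type}" in text,
        "model": f"(model: {expected_model})" in text,
        "traceability": "[Traceability] Video-to-summary mapping:" in text,
        "metrics": False,
    }
    match = METRICS_RE.search(text)
    if match:
        stage1, stage2, total = (float(group) for group in match.groups())
        # Each value is rounded to one decimal, so allow a small tolerance.
        checks["metrics"] = abs(stage1 + stage2 - total) < 0.2
    return checks
```

For the sample output above, the metrics check passes because 95.3 + 15.2 = 110.5.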

Benefits

1. Model Consistency

  • Before: Different models for processing vs synthesis
  • After: Both stages default to the same model (gemini-2.5-pro), giving consistent quality
  • Impact: More predictable and reliable results

2. Specialized Synthesis

  • Before: Generic synthesis for all content types
  • After: Tailored strategies for meetings, documentation, diagrams
  • Impact: Better quality summaries that match user intent

3. Enhanced Visibility

  • Before: Limited logging, hard to debug issues
  • After: Comprehensive logging with traceability and metrics
  • Impact: Easy troubleshooting and performance optimization

4. Configurability

  • Before: Models and logging hardcoded
  • After: Configurable via environment variables
  • Impact: Flexible for different use cases and debugging

Files Changed

| File | Lines Modified | Changes |
|---|---|---|
| backend/video_processor.py | ~200 lines | Model config, logging, synthesis strategies |
| backend/.env.example | New file | Configuration documentation |
| CLAUDE.md | ~100 lines | Architecture docs, troubleshooting guide |
| BATCH_PROCESSING_IMPROVEMENTS.md | New file | This summary document |

Rollback Instructions

If issues arise, rollback is simple:

Option 1: Use Git

cd /path/to/video-query
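# Assumes these changes landed in the most recent commit; otherwise use the commit just before them
git checkout HEAD~1 backend/video_processor.py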
sudo systemctl restart video-query

Option 2: Disable New Features

# In backend/.env
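# Revert to the old synthesis model (comment kept on its own line because some env-file loaders do not strip inline comments)
VIDEO_SYNTHESIS_MODEL=gemini-2.0-flash-exp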
BATCH_PROCESSING_LOG_PROMPTS=false
BATCH_PROCESSING_LOG_SUMMARIES=false

sudo systemctl restart video-query

Next Steps

  1. Test with meeting videos: Verify meeting-specific synthesis
  2. Test with documentation videos: Verify documentation synthesis
  3. Test with diagrams: Verify diagram merging
  4. Load test: Process batch with 5+ videos
  5. Performance test: Compare stage 1 vs stage 2 times

Future Enhancements (Optional)

  1. Add structured JSON logging for log aggregation tools
  2. Add Prometheus metrics for monitoring
  3. Add batch processing status webhooks
  4. Add configurable synthesis strategies per user/tenant
  5. Add caching for similar prompts

Support

Enable Debug Logging

# In backend/.env
BATCH_PROCESSING_LOG_PROMPTS=true
BATCH_PROCESSING_LOG_SUMMARIES=true

# View filtered logs
journalctl -u video-query -f | grep -E "(Batch|Stage|Traceability|Metrics)"

Common Issues

See CLAUDE.md → Troubleshooting → Batch Processing Issues

Questions

Refer to updated documentation in CLAUDE.md:

  • Batch Processing Architecture section
  • Configuration Files section
  • Troubleshooting section

Implementation Summary

  • Phase 1: Enhanced Logging - COMPLETE
  • Phase 2: Model Consistency - COMPLETE
  • Phase 3: Specialized Synthesis - COMPLETE
  • Phase 4: Configuration Options - COMPLETE
  • Documentation: Updated CLAUDE.md - COMPLETE

Total Implementation Time: ~3 hours
Testing Recommended: 1-2 hours
Production Risk: Low (backward compatible, configurable)


End of Implementation Summary