video-master-adapt/AI_FALLBACK_GUIDE.md
nickviljoen 891c36bbfb Add standalone desktop application with web interface
2025-12-31 09:49:04 +02:00


AI Vision Fallback - Smart Matching Guide

Overview

The Video Matcher now features smart fallback matching that combines the speed of fast mode with the accuracy of AI vision when needed.

How It Works

Two-Stage Matching Process

┌─────────────────────────────────────┐
│  Stage 1: Fast Mode (Default)       │
│  - Frame hashing                    │
│  - Audio fingerprinting             │
│  - ~5-10 seconds per video          │
└─────────────────────────────────────┘
              ↓
        Match Found? ──YES──> ✅ Done (Fast)
              ↓ NO
┌─────────────────────────────────────┐
│  Stage 2: AI Vision Fallback        │
│  - OpenAI GPT-4V analysis           │
│  - Cross-aspect ratio detection     │
│  - ~30-60 seconds per video         │
└─────────────────────────────────────┘
              ↓
        Match Found? ──YES──> ✅ Done (AI Vision)
              ↓ NO
         ❌ No Match Found
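
The flow above can be sketched in Python as follows. This is a minimal sketch, not the project's actual code: `fast_match` and `ai_vision_match` are hypothetical callables standing in for the two stages, and `MatchResult` is an illustrative container. The `method` strings mirror the match methods listed under Technical Details.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MatchResult:
    master: Optional[str]  # matched master filename, or None
    method: str            # "fast", "ai_vision_fallback", or "none"


# Hypothetical stage signature: takes a video path, returns a master name or None.
Stage = Callable[[str], Optional[str]]


def match_with_fallback(video_path: str, fast_match: Stage, ai_vision_match: Stage,
                        enable_ai_fallback: bool = True) -> MatchResult:
    # Stage 1: cheap frame-hash + audio-fingerprint comparison.
    master = fast_match(video_path)
    if master is not None:
        return MatchResult(master, "fast")

    # Stage 2: only reached when stage 1 found nothing.
    if enable_ai_fallback:
        master = ai_vision_match(video_path)
        if master is not None:
            return MatchResult(master, "ai_vision_fallback")

    return MatchResult(None, "none")
```

Because stage 2 runs only after stage 1 returns nothing, the slower and paid path is never taken for videos that fast mode can already match.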

When AI Fallback Activates

AI fallback kicks in automatically whenever fast mode finds no match for a video. The most common cause is an adaptation that looks different from its master, for example:

  • 1x1 adaptation from 16:9 master (letterboxed/cropped)
  • 9:16 adaptation from 16:9 master
  • Heavy visual edits or effects

AI fallback does NOT activate when:

  • Fast mode already found a match (the first attempt succeeded)
  • Video has the same aspect ratio as its master, which fast mode normally matches

Performance Impact

Typical Batch (39 videos)

Scenario 1: All Same Aspect Ratio

  • Fast mode matches: 39/39
  • AI fallback used: 0
  • Total time: ~6-8 minutes (5-10 sec each)

Scenario 2: 1 Cross-Aspect Video

  • Fast mode matches: 38/39
  • AI fallback used: 1
  • Total time: ~7-9 minutes (38 fast + 1 slow)

Scenario 3: 10 Cross-Aspect Videos

  • Fast mode matches: 29/39
  • AI fallback used: 10
  • Total time: ~10-15 minutes (29 fast + 10 slow)
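
The scenario totals can be sanity-checked from the per-video ranges. The sketch below only multiplies counts by the quoted ranges (5-10 s for fast mode, 30-60 s for AI Vision) and follows the guide's "N fast + M slow" accounting. It ignores the fast attempt that precedes each fallback and any fixed overhead. The raw sums come out somewhat lower than the quoted totals, so those totals presumably include overhead such as fingerprinting; that reading is an assumption.

```python
from typing import Tuple

FAST_SECONDS = (5, 10)   # per-video range quoted for fast mode
AI_SECONDS = (30, 60)    # per-video range quoted for AI Vision fallback


def estimate_minutes(fast_count: int, ai_count: int) -> Tuple[float, float]:
    """Return (low, high) matching time in minutes, ignoring fixed overhead."""
    low = fast_count * FAST_SECONDS[0] + ai_count * AI_SECONDS[0]
    high = fast_count * FAST_SECONDS[1] + ai_count * AI_SECONDS[1]
    return low / 60, high / 60


# The three scenarios above: (fast matches, AI fallbacks).
for fast, ai in [(39, 0), (38, 1), (29, 10)]:
    low, high = estimate_minutes(fast, ai)
    print(f"{fast} fast + {ai} AI: {low:.1f}-{high:.1f} min")
```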

UI Indicators

Progress Bar

Real-time progress shown during matching:

━━━━━━━━━━━━━━━━━━━━━ 15 / 39
Processing: adaptation_video_15.mp4

Results Summary

38 matched, 1 unmatched out of 39 total videos
🤖 1 matched using AI Vision fallback (cross-aspect ratio)

Individual Results

Videos matched via AI fallback show a badge:

✅ video_name.mp4  🤖 AI Vision
Matched Master: master_name.mp4
Confidence: 85.3%
Audio Score: 92.1%
Matched using AI Vision (likely cross-aspect ratio)

CSV Export

Exported results include match method:

Adaptation,Matched,Master,Confidence,Audio Score,Match Method
video1.mp4,Yes,master1.mp4,95.2%,94.1%,Fast
video2.mp4,Yes,master2.mp4,85.3%,92.1%,AI Vision
video3.mp4,No,,,0.0%,No Match
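
To see how often the fallback was used in a batch, the export can be tallied by its Match Method column. A minimal sketch using only the standard library; the sample rows are copied from the format above, and `count_methods` is an illustrative helper, not part of the app.

```python
import csv
import io
from collections import Counter

# Sample rows copied from the export format shown above.
SAMPLE = """Adaptation,Matched,Master,Confidence,Audio Score,Match Method
video1.mp4,Yes,master1.mp4,95.2%,94.1%,Fast
video2.mp4,Yes,master2.mp4,85.3%,92.1%,AI Vision
video3.mp4,No,,,0.0%,No Match
"""


def count_methods(csv_text: str) -> Counter:
    """Count rows per 'Match Method' value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Match Method"] for row in reader)


counts = count_methods(SAMPLE)
print(counts)
```

For a real export, read the file's text and pass it to `count_methods` in place of `SAMPLE`.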

Requirements for AI Fallback

OpenAI API Key

AI fallback requires an OpenAI API key in your .env file:

OPENAI_API_KEY=sk-...your-key-here...

Cost Considerations

  • Per video: ~$0.01-0.05 (GPT-4V pricing)
  • Typical batch: 1-2 cross-aspect videos = ~$0.01-0.10 total
  • Worst case: All 39 videos = ~$0.40-2.00 total
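
These figures are the per-video range multiplied by the number of fallbacks. A small sketch of that arithmetic, assuming the $0.01-0.05 range quoted above; actual OpenAI charges depend on current pricing and how many frames are sent.

```python
from typing import Tuple

COST_PER_VIDEO = (0.01, 0.05)  # USD range quoted above for one AI fallback


def estimate_cost(fallback_count: int) -> Tuple[float, float]:
    """Return (low, high) estimated cost in USD, rounded to cents."""
    low = fallback_count * COST_PER_VIDEO[0]
    high = fallback_count * COST_PER_VIDEO[1]
    return round(low, 2), round(high, 2)


print(estimate_cost(2))   # upper end of a typical batch
print(estimate_cost(39))  # worst case: every video needs fallback
```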

No API Key?

If no API key is configured:

  • Fast mode still works normally
  • AI fallback will be skipped with a warning in logs
  • Cross-aspect videos may not match
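
The skip-with-warning behaviour can be pictured as a simple guard. This is a sketch of the idea only: the logger name and the `ai_fallback_available` helper are hypothetical, and the app's real check may differ.

```python
import logging
import os

logger = logging.getLogger("video_matcher")  # hypothetical logger name


def ai_fallback_available(env=os.environ) -> bool:
    """Return True only if a non-empty OPENAI_API_KEY is configured."""
    if env.get("OPENAI_API_KEY", "").strip():
        return True
    logger.warning("AI vision fallback skipped: no OPENAI_API_KEY configured")
    return False
```

Passing a plain dict as `env` makes the guard easy to exercise without touching the real environment.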

Disabling AI Fallback

If you want to disable the AI fallback feature:

Option 1: Environment Variable

Add to your .env file:

DISABLE_AI_FALLBACK=1

Option 2: Code Change

In app.py, modify the match call:

match_result = matcher.match_video(
    video_path=adaptation_path,
    enable_ai_fallback=False  # Disable AI fallback
)
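
The two options can be combined by deriving the `enable_ai_fallback` argument from the environment variable. A hedged sketch: the guide documents only `DISABLE_AI_FALLBACK=1`, so accepting "true" and "yes" here is an assumption, and `ai_fallback_enabled` is an illustrative helper rather than something the app is known to define.

```python
import os


def ai_fallback_enabled(env=os.environ) -> bool:
    """Interpret DISABLE_AI_FALLBACK; "1", "true", and "yes" disable the fallback."""
    value = env.get("DISABLE_AI_FALLBACK", "").strip().lower()
    return value not in {"1", "true", "yes"}
```

The match call shown above could then pass `enable_ai_fallback=ai_fallback_enabled()` instead of a hard-coded `False`.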

Monitoring in Terminal

Watch the terminal for fallback activity:

INFO - Matching video1.mp4 (mode: FAST)
INFO - Found 1 matches for video1.mp4

INFO - Matching video2.mp4 (mode: FAST)
INFO - No match found in fast mode for video2.mp4, trying AI vision fallback...
INFO - ✓ AI vision fallback found match for video2.mp4

Troubleshooting

AI Fallback Not Working

Check 1: API Key Set?

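# In .env file
OPENAI_API_KEY=sk-...

# Verify the entry exists (prints 1 if the key line is present)
grep -c '^OPENAI_API_KEY=' .env

# Only shows a value if the key is also exported in your current shell
echo $OPENAI_API_KEY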

Check 2: Internet Connection? AI fallback needs an internet connection to call the OpenAI API.

Check 3: Terminal Logs? Look for errors like:

WARNING - AI vision fallback failed for video.mp4: No API key found

AI Fallback Takes Forever

Check 1: How Many Videos? Each AI fallback takes 30-60 seconds. If many videos need fallback:

  • 5 videos = 2.5-5 minutes
  • 10 videos = 5-10 minutes

Check 2: API Rate Limits? OpenAI may rate-limit requests when many are sent in a short period:

  • Wait a moment and retry
  • Check OpenAI dashboard for limits
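
"Wait a moment and retry" is commonly automated with exponential backoff. The sketch below is generic and makes no claim about how the app or the OpenAI client handles rate limits: `RateLimited` is a hypothetical exception that your own API wrapper would raise, and the delays are illustrative.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised by the caller's API wrapper on a rate-limit response."""


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 4,
                      base_delay: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimited with doubling delays between attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of retries; surface the error
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry logic can be tested without actually waiting.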

False Positives from AI

If AI fallback matches incorrectly:

Option 1: Adjust Thresholds

// In standalone.html or API call
{
  "threshold": 0.85,           // Increase from 0.80
  "min_avg_similarity": 0.92   // Increase from 0.90
}

Option 2: Disable AI Fallback. See the "Disabling AI Fallback" section above.

Best Practices

1. Group by Aspect Ratio

Process videos with same aspect ratio together:

  • First batch: 16:9 adaptations (all fast mode)
  • Second batch: 1x1 adaptations (may need AI fallback)

2. Check Results

Review videos matched via AI fallback:

  • Look for 🤖 AI Vision badge
  • Verify confidence scores are high (>85%)
  • Manually check if uncertain
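
This review step can be partly automated from the CSV export. A minimal sketch, assuming the column names and the "AI Vision" label shown in the CSV Export section; the 85% cut-off comes from the bullet above, and `needs_review` is an illustrative helper.

```python
import csv
import io
from typing import List

REVIEW_BELOW = 85.0  # percent; AI matches at or below this get a manual check


def needs_review(csv_text: str) -> List[str]:
    """Return adaptations matched via AI Vision whose confidence is not above 85%."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Match Method"] != "AI Vision":
            continue
        confidence = float(row["Confidence"].rstrip("%"))
        if confidence <= REVIEW_BELOW:
            flagged.append(row["Adaptation"])
    return flagged
```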

3. Monitor Costs

If processing many cross-aspect videos:

  • Track AI fallback usage in results
  • Estimate costs: count × $0.01-0.05
  • Set OpenAI billing limits

4. Use Terminal Logs

Keep terminal visible to see:

  • Which videos trigger fallback
  • Success/failure of AI matching
  • Any errors or warnings

Technical Details

Match Methods

  • fast: Matched using frame hashing + audio fingerprinting
  • ai_vision_fallback: Matched using OpenAI GPT-4V after fast mode failed
  • none: No match found in either mode

Confidence Scores

  • Fast mode: Based on frame hash similarity + audio score
  • AI vision: Based on GPT-4V similarity assessment + audio score
  • Both modes: Higher score = more confident match

Why AI Vision for Cross-Aspect?

GPT-4V can "understand" that a 1x1 letterboxed video is the same content as a 16:9 master, even though the pixels are completely different. Traditional frame hashing can't detect this.
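
A toy example shows why a pixel-based hash struggles here. The guide does not say which hashing algorithm Video Matcher uses, so the sketch below uses a simple average hash on an 8x8 grayscale grid as a stand-in, and `letterbox` is a crude imitation of squeezing a frame and adding black bars.

```python
def average_hash(pixels):
    """64-bit average hash of an 8x8 grayscale grid: 1 where pixel > mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def letterbox(pixels):
    """Squeeze the frame to half height and pad with black bars (toy version)."""
    squeezed = pixels[::2]                       # keep every other row: 8 -> 4
    top = [[0] * 8 for _ in range(2)]
    bottom = [[0] * 8 for _ in range(2)]
    return top + squeezed + bottom


# Toy frame: a smooth gradient that gets brighter toward the bottom.
frame = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]

same = hamming(average_hash(frame), average_hash(frame))
boxed = hamming(average_hash(frame), average_hash(letterbox(frame)))
print(same, boxed)
```

The identical frame has distance 0, while the letterboxed copy differs in a large share of its 64 bits even though a person would recognise the same content.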

Summary

| Feature | Fast Mode Only | With AI Fallback |
|---|---|---|
| Speed | Very Fast | Fast (most videos) |
| Accuracy | Good | Excellent |
| Cross-Aspect | Limited | Yes |
| Cost | $0 | ~$0.01-0.05 per fallback |
| Internet | Not needed | Required (fallback only) |
| API Key | Not needed | Required (fallback only) |

Bottom Line: AI fallback gives you the best of both worlds - fast processing for most videos, with intelligent fallback for tricky cross-aspect ratio cases.