AI Vision Fallback - Smart Matching Guide
Overview
The Video Matcher now features smart fallback matching that combines the speed of fast mode with the accuracy of AI vision when needed.
How It Works
Two-Stage Matching Process
┌─────────────────────────────────────┐
│ Stage 1: Fast Mode (Default)        │
│ - Frame hashing                     │
│ - Audio fingerprinting              │
│ - ~5-10 seconds per video           │
└─────────────────────────────────────┘
                   ↓
             Match Found? ──YES──> ✅ Done (Fast)
                   ↓ NO
┌─────────────────────────────────────┐
│ Stage 2: AI Vision Fallback         │
│ - OpenAI GPT-4V analysis            │
│ - Cross-aspect ratio detection      │
│ - ~30-60 seconds per video          │
└─────────────────────────────────────┘
                   ↓
             Match Found? ──YES──> ✅ Done (AI Vision)
                   ↓ NO
           ❌ No Match Found
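The two-stage flow can be sketched in a few lines of Python. This is a minimal illustration, not the Video Matcher's actual code: `MatchResult`, `match_with_fallback`, and the two matcher callables are hypothetical names, and each matcher is assumed to return a `(master, confidence)` pair or `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MatchResult:
    master: str        # filename of the matched master
    confidence: float  # 0.0 to 1.0
    method: str        # "fast" or "ai_vision_fallback"


# Hypothetical matcher signature: adaptation path -> (master, confidence) or None.
Matcher = Callable[[str], Optional[tuple[str, float]]]


def match_with_fallback(
    video_path: str,
    fast_match: Matcher,
    ai_vision_match: Matcher,
    enable_ai_fallback: bool = True,
) -> Optional[MatchResult]:
    """Try fast mode first; call AI vision only when fast mode finds nothing."""
    hit = fast_match(video_path)
    if hit is not None:
        return MatchResult(hit[0], hit[1], "fast")
    if not enable_ai_fallback:
        return None
    hit = ai_vision_match(video_path)
    if hit is not None:
        return MatchResult(hit[0], hit[1], "ai_vision_fallback")
    return None
```

Because the slow stage is only reached after the fast stage returns nothing, a batch where every video matches in fast mode never calls the AI matcher at all.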
When AI Fallback Activates
AI fallback automatically kicks in when fast mode finds no match. This typically happens when the video differs visually from its master:
- ✅ Different aspect ratio than the masters
  - 1x1 adaptation from 16:9 master (letterboxed/cropped)
  - 9:16 adaptation from 16:9 master
- ✅ Heavy visual edits or effects
AI fallback does NOT activate when:
- ❌ Fast mode already found a match (the first attempt succeeded)
- ❌ Video has the same aspect ratio as its master (fast mode normally matches these)
Performance Impact
Typical Batch (39 videos)
Scenario 1: All Same Aspect Ratio
- Fast mode matches: 39/39
- AI fallback used: 0
- Total time: ~6-8 minutes (5-10 sec each)
Scenario 2: 1 Cross-Aspect Video
- Fast mode matches: 38/39
- AI fallback used: 1
- Total time: ~7-9 minutes (38 fast + 1 slow)
Scenario 3: 10 Cross-Aspect Videos
- Fast mode matches: 29/39
- AI fallback used: 10
- Total time: ~10-15 minutes (29 fast + 10 slow)
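For a rough estimate of your own batch, the per-video figures from the two stages can be plugged into a small helper. This is a back-of-the-envelope sketch: `estimate_minutes` is a hypothetical name, and it assumes every video passes through fast mode once and each fallback video additionally pays the AI vision time. Real totals, such as the ones quoted above, can come out higher because the sketch ignores any per-batch overhead.

```python
FAST_SECONDS = (5, 10)       # per video, from the Stage 1 figures
FALLBACK_SECONDS = (30, 60)  # per video, from the Stage 2 figures


def estimate_minutes(total_videos: int, fallback_videos: int) -> tuple[float, float]:
    """Return a (low, high) processing-time estimate in minutes."""
    low = total_videos * FAST_SECONDS[0] + fallback_videos * FALLBACK_SECONDS[0]
    high = total_videos * FAST_SECONDS[1] + fallback_videos * FALLBACK_SECONDS[1]
    return low / 60, high / 60


low, high = estimate_minutes(total_videos=39, fallback_videos=10)
print(f"Estimated time: {low:.1f} to {high:.1f} minutes")
```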
UI Indicators
Progress Bar
Real-time progress shown during matching:
━━━━━━━━━━━━━━━━━━━━━ 15 / 39
Processing: adaptation_video_15.mp4
Results Summary
38 matched, 1 unmatched out of 39 total videos
🤖 1 matched using AI Vision fallback (cross-aspect ratio)
Individual Results
Videos matched via AI fallback show a badge:
✅ video_name.mp4 🤖 AI Vision
Matched Master: master_name.mp4
Confidence: 85.3%
Audio Score: 92.1%
Matched using AI Vision (likely cross-aspect ratio)
CSV Export
Exported results include match method:
Adaptation,Matched,Master,Confidence,Audio Score,Match Method
video1.mp4,Yes,master1.mp4,95.2%,94.1%,Fast
video2.mp4,Yes,master2.mp4,85.3%,92.1%,AI Vision
video3.mp4,No,,,0.0%,No Match
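Since the match method is a regular column, the export is easy to summarize with Python's standard `csv` module. The sketch below counts rows per method; `count_match_methods` is a hypothetical helper, and the column name is taken from the header shown above.

```python
import csv
import io
from collections import Counter


def count_match_methods(csv_text: str) -> Counter:
    """Count rows per value of the 'Match Method' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Match Method"] for row in reader)


sample = """Adaptation,Matched,Master,Confidence,Audio Score,Match Method
video1.mp4,Yes,master1.mp4,95.2%,94.1%,Fast
video2.mp4,Yes,master2.mp4,85.3%,92.1%,AI Vision
video3.mp4,No,,,0.0%,No Match
"""

counts = count_match_methods(sample)
print(counts["AI Vision"])  # 1
```

For a real export, read the file's contents and pass them in instead of `sample`.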
Requirements for AI Fallback
OpenAI API Key
AI fallback requires an OpenAI API key in your .env file:
OPENAI_API_KEY=sk-...your-key-here...
Cost Considerations
- Per video: ~$0.01-0.05 (GPT-4V pricing)
- Typical batch: 1-2 cross-aspect videos = ~$0.02-0.10 total
- Worst case: All 39 videos = ~$0.40-2.00 total
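The batch figures follow directly from the per-video range. A minimal sketch of that arithmetic, with `estimate_cost` as a hypothetical helper and the $0.01-0.05 range taken from the list above (actual OpenAI pricing may differ):

```python
COST_PER_FALLBACK_USD = (0.01, 0.05)  # per-video range quoted above


def estimate_cost(fallback_videos: int) -> tuple[float, float]:
    """Return the (low, high) cost in USD for a number of fallback videos."""
    low, high = COST_PER_FALLBACK_USD
    return round(fallback_videos * low, 2), round(fallback_videos * high, 2)


print(estimate_cost(2))   # (0.02, 0.1)
print(estimate_cost(39))  # (0.39, 1.95)
```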
No API Key?
If no API key is configured:
- Fast mode still works normally
- AI fallback will be skipped with a warning in logs
- Cross-aspect videos may not match
Disabling AI Fallback
If you want to disable the AI fallback feature:
Option 1: Environment Variable
Add to your .env file:
DISABLE_AI_FALLBACK=1
Option 2: Code Change
In app.py, modify the match call:
match_result = matcher.match_video(
    video_path=adaptation_path,
    enable_ai_fallback=False,  # Disable AI fallback
)
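The two settings (the disable flag and the API key) can be combined into a single check before matching. How Video Matcher actually reads them is not shown in this guide, so treat the following as a sketch under assumptions: `ai_fallback_enabled` is a hypothetical helper, and it assumes only the value `1` disables the fallback.

```python
import logging
import os
from typing import Mapping

logger = logging.getLogger(__name__)


def ai_fallback_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Decide whether AI fallback should run, based on environment settings."""
    # Assumption: DISABLE_AI_FALLBACK=1 is the only value that disables it.
    if env.get("DISABLE_AI_FALLBACK", "") == "1":
        return False
    # Without a key the fallback cannot call OpenAI, so skip it with a warning.
    if not env.get("OPENAI_API_KEY"):
        logger.warning("No OPENAI_API_KEY set; AI vision fallback will be skipped")
        return False
    return True
```

The result could then be passed as `enable_ai_fallback=ai_fallback_enabled()` in the `match_video` call.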
Monitoring in Terminal
Watch the terminal for fallback activity:
INFO - Matching video1.mp4 (mode: FAST)
INFO - Found 1 matches for video1.mp4
INFO - Matching video2.mp4 (mode: FAST)
INFO - No match found in fast mode for video2.mp4, trying AI vision fallback...
INFO - ✓ AI vision fallback found match for video2.mp4
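If you save the terminal output to a file, the fallback activity can be pulled out with two regular expressions. The patterns below are written against the log lines shown above and will need adjusting if the real messages differ; `summarize_fallback` is a hypothetical helper.

```python
import re

FALLBACK_TRIED = re.compile(r"No match found in fast mode for (.+?), trying AI vision fallback")
FALLBACK_OK = re.compile(r"AI vision fallback found match for (.+)")


def summarize_fallback(log_lines):
    """Return (videos that triggered fallback, videos the fallback matched)."""
    tried, matched = [], []
    for line in log_lines:
        if m := FALLBACK_TRIED.search(line):
            tried.append(m.group(1).strip())
        elif m := FALLBACK_OK.search(line):
            matched.append(m.group(1).strip())
    return tried, matched


log = [
    "INFO - Matching video1.mp4 (mode: FAST)",
    "INFO - Found 1 matches for video1.mp4",
    "INFO - Matching video2.mp4 (mode: FAST)",
    "INFO - No match found in fast mode for video2.mp4, trying AI vision fallback...",
    "INFO - ✓ AI vision fallback found match for video2.mp4",
]
print(summarize_fallback(log))  # (['video2.mp4'], ['video2.mp4'])
```

Videos that appear in the first list but not the second went through the fallback and still ended up unmatched.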
Troubleshooting
AI Fallback Not Working
Check 1: API Key Set?
# In .env file
OPENAI_API_KEY=sk-...
# Check whether the key is also exported in your shell
# (a key that is set only in .env will not show up here)
echo $OPENAI_API_KEY
Check 2: Internet Connection?
AI fallback requires an internet connection to call the OpenAI API.
Check 3: Terminal Logs?
Look for errors like:
WARNING - AI vision fallback failed for video.mp4: No API key found
AI Fallback Takes Forever
Check 1: How Many Videos?
Each AI fallback takes 30-60 seconds. If many videos need fallback:
- 5 videos = 2-5 minutes
- 10 videos = 5-10 minutes
Check 2: API Rate Limits?
OpenAI may rate-limit you if you send many requests in a short time:
- Wait a moment and retry
- Check OpenAI dashboard for limits
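"Wait a moment and retry" is usually automated with exponential backoff. The sketch below is generic and does not depend on any particular client library: `retry_with_backoff` is a hypothetical helper, `call` stands for whatever function performs the AI vision request, and `is_rate_limit` is a predicate you supply to recognize your client's rate-limit error.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    is_rate_limit: Callable[[Exception], bool],
    max_attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, doubling the wait (2s, 4s, 8s, ...) after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            # Re-raise anything that is not a rate limit, or the final failure.
            if last_attempt or not is_rate_limit(exc):
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the delays can be checked in tests without actually waiting.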
False Positives from AI
If AI fallback matches incorrectly:
Option 1: Adjust Thresholds
// In standalone.html or API call
{
  "threshold": 0.85,           // Increase from 0.80
  "min_avg_similarity": 0.92   // Increase from 0.90
}
Option 2: Disable AI Fallback
See the "Disabling AI Fallback" section above.
Best Practices
1. Group by Aspect Ratio
Process videos with same aspect ratio together:
- First batch: 16:9 adaptations (all fast mode)
- Second batch: 1x1 adaptations (may need AI fallback)
2. Check Results
Review videos matched via AI fallback:
- Look for 🤖 AI Vision badge
- Verify confidence scores are high (>85%)
- Manually check if uncertain
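This review step can be partly automated from the CSV export. The sketch below lists AI Vision matches whose confidence is not above a cutoff; `needs_review` is a hypothetical helper, the column names and the "AI Vision" label come from the export format shown earlier, and the 85% default mirrors the guideline above.

```python
import csv
import io


def needs_review(csv_text: str, min_confidence: float = 85.0) -> list[str]:
    """List adaptations matched via AI Vision at or below the confidence cutoff."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Match Method"] != "AI Vision":
            continue
        # Confidence is exported as a percentage string such as "85.3%".
        confidence = float(row["Confidence"].rstrip("%"))
        if confidence <= min_confidence:
            flagged.append(row["Adaptation"])
    return flagged
```

Anything the function returns is a candidate for a manual side-by-side check against the matched master.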
3. Monitor Costs
If processing many cross-aspect videos:
- Track AI fallback usage in results
- Estimate costs: count × $0.01-0.05
- Set OpenAI billing limits
4. Use Terminal Logs
Keep terminal visible to see:
- Which videos trigger fallback
- Success/failure of AI matching
- Any errors or warnings
Technical Details
Match Methods
- fast: Matched using frame hashing + audio fingerprinting
- ai_vision_fallback: Matched using OpenAI GPT-4V after fast mode failed
- none: No match found in either mode
Confidence Scores
- Fast mode: Based on frame hash similarity + audio score
- AI vision: Based on GPT-4V similarity assessment + audio score
- Both modes: Higher score = more confident match
Why AI Vision for Cross-Aspect?
GPT-4V can "understand" that a 1x1 letterboxed video is the same content as a 16:9 master, even though the pixels are completely different. Traditional frame hashing can't detect this.
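A toy example makes the limitation concrete. The sketch below uses a simple average hash as a stand-in for frame hashing (the Video Matcher's real hashing method is not described in this guide, so this is an assumption) and compares a wide frame with the same frame letterboxed into a square.

```python
def resize_nearest(frame, size=8):
    """Downscale a 2D grid of brightness values to size x size (nearest neighbour)."""
    height, width = len(frame), len(frame[0])
    return [
        [frame[y * height // size][x * width // size] for x in range(size)]
        for y in range(size)
    ]


def average_hash(frame, size=8):
    """Return size*size bits: 1 where a downscaled pixel is brighter than the mean."""
    small = resize_nearest(frame, size)
    pixels = [value for row in small for value in row]
    mean = sum(pixels) / len(pixels)
    return [1 if value > mean else 0 for value in pixels]


def hamming(a, b):
    """Count positions where two hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def letterbox_to_square(frame):
    """Pad a wide frame with black bars above and below until it is square."""
    height, width = len(frame), len(frame[0])
    top = (width - height) // 2
    bottom = width - height - top
    return (
        [[0] * width for _ in range(top)]
        + frame
        + [[0] * width for _ in range(bottom)]
    )


# A 16x8 frame with a left-to-right brightness gradient stands in for a wide master.
master = [[x * 16 for x in range(16)] for _ in range(8)]
adaptation = letterbox_to_square(master)

print(hamming(average_hash(master), average_hash(master)))      # 0: identical frames
print(hamming(average_hash(master), average_hash(adaptation)))  # 24 of 64 bits differ
```

The content is unchanged, yet more than a third of the hash bits flip because the black bars shift both the sampled pixels and the mean brightness. That is the kind of gap the AI vision stage is meant to close.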
Summary
| Feature | Fast Mode Only | With AI Fallback |
|---|---|---|
| Speed | ⚡ Very Fast | ⚡ Fast (most videos) |
| Accuracy | ✅ Good | ✅✅ Excellent |
| Cross-Aspect | ❌ Limited | ✅ Yes |
| Cost | $0 | ~$0.01-0.05 per fallback |
| Internet | ❌ Not needed | ✅ Required (fallback only) |
| API Key | ❌ Not needed | ✅ Required (fallback only) |
Bottom Line: AI fallback gives you the best of both worlds - fast processing for most videos, with intelligent fallback for tricky cross-aspect ratio cases.