video-master-adapt/CHANGELOG.md
2025-10-15 16:25:04 +02:00

Changelog

All notable changes to this project will be documented in this file.

[2.0.1] - 2025-10-10

🚀 Performance Optimization

Smart AI Triggering

  • Intelligent AI activation - Only triggers when truly needed:
    • No matches found (likely cross-aspect)
    • Incomplete frame coverage (< 100%)
    • Skipped for perfect matches (100% coverage)
  • Up to 97% cost reduction - In typical batches, only 1-2 of 39 adaptations use AI
  • Faster processing - Seconds instead of minutes for perfect matches
  • Cost transparency - Shows savings when AI is skipped
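
The triggering rule above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `should_use_ai` and its parameters `match_count` and `frame_coverage` are hypothetical, and coverage is assumed to be expressed as a fraction between 0.0 and 1.0.

```python
def should_use_ai(match_count: int, frame_coverage: float) -> bool:
    """Decide whether the AI Vision fallback is needed (hypothetical helper).

    match_count: number of masters matched by perceptual hashing.
    frame_coverage: fraction of adaptation frames covered by hash matches (0.0-1.0).
    """
    # No hash matches at all: likely a cross-aspect adaptation, so ask the AI.
    if match_count == 0:
        return True
    # Some frames are still unexplained by hash matches.
    if frame_coverage < 1.0:
        return True
    # Perfect match (100% coverage): skip the AI call and its cost.
    return False
```

Under these assumptions, a perfectly matched adaptation never reaches the API, which is where the cost reduction comes from.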

📚 Documentation

  • Updated README with smart triggering examples
  • Enhanced AI Vision guide with cost optimization
  • Added real-world batch processing examples

💰 Cost Impact

Before optimization:

  • 39 adaptations × ~$0.30 each (50 masters per adaptation) = $11.70 (all use AI)

After optimization:

  • 38 perfect matches: $0.00 (AI skipped)
  • 1 cross-aspect: $0.30 (AI used)
  • Total: $0.30 (97% savings!)
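
The arithmetic behind these figures can be checked with a short script. It is a sketch only: the constant and function names are made up for illustration, and the flat $0.30 per AI-assisted adaptation is taken from the example above rather than from measured billing data.

```python
# Assumed cost (USD) of one adaptation compared against 50 masters with AI Vision.
COST_PER_AI_ADAPTATION = 0.30


def batch_cost(ai_adaptations: int) -> float:
    """Total cost of a batch in which `ai_adaptations` adaptations use AI."""
    return round(ai_adaptations * COST_PER_AI_ADAPTATION, 2)


def savings_percent(before: float, after: float) -> float:
    """Relative saving between two batch costs, in percent."""
    return round((before - after) / before * 100, 1)


before = batch_cost(39)  # every adaptation uses AI
after = batch_cost(1)    # only the single cross-aspect adaptation uses AI
print(f"before=${before:.2f} after=${after:.2f} savings={savings_percent(before, after)}%")
```

If two adaptations in the batch needed AI instead of one, the same calculation gives a saving of roughly 95%.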

[2.0.0] - 2025-10-10

🚀 Major Features

AI Vision Integration (Tier 2 Matching)

  • Added GPT-4o vision model for semantic video comparison
  • Cross-aspect-ratio detection - Matches 16:9 masters to 1:1, 9:16, 4:5 adaptations
  • Intelligent text/logo ignoring - Focuses on people, products, settings
  • Crop detection - Identifies when adaptations are cropped/zoomed from masters
  • Human-readable explanations - AI provides reasoning for matches
  • Automatic fallback - Triggers when perceptual hashing fails or confidence < 90%
  • Cost tracking - Shows estimated OpenAI API cost per comparison (~$0.005-0.007)

Enhancements

  • Improved CLI output - Added "Method" column showing "Hash" or "AI Vision"
  • AI Vision analysis display - Shows crop detection and reasoning in results
  • Enhanced prompts - Optimized GPT-4o prompt for better cross-aspect detection
  • Environment configuration - Added .env support with python-dotenv
  • Comprehensive documentation - Updated README with AI Vision setup and usage

🐛 Bug Fixes

  • Fixed ffmpeg frame extraction - Corrected scale filter syntax for ffmpeg-python
  • Updated to gpt-4o model - Replaced deprecated gpt-4-vision-preview
  • Removed ORB matching - Eliminated false positives from feature matching

📦 Dependencies

  • Added openai>=1.12.0 - OpenAI GPT-4o integration
  • Added python-dotenv>=1.0.0 - Environment variable management
  • Removed opencv-python - No longer needed after removing ORB

📚 Documentation

  • Updated README.md with AI Vision features and setup
  • Enhanced .env.example with detailed configuration guide
  • Added privacy and security notes for AI Vision
  • Updated architecture diagram to show 3-tier system
  • Added cost estimates and performance metrics

🔧 Technical Changes

  • Created src/video_matcher/ai_vision.py module
  • Integrated AI Vision into matcher.py as Tier 2 fallback
  • Updated CLI to display AI Vision results
  • Modified fingerprinter to remove ORB code
  • Simplified matching to perceptual hash + AI Vision only

💰 Cost Information

AI Vision Pricing (GPT-4o):

  • ~$0.005-0.007 per comparison (10 images)
  • 50 masters: ~$0.25-0.35 per adaptation
  • Very affordable for production use!
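
The per-adaptation range follows directly from the per-comparison estimate. The helper below is a hypothetical sketch (`adaptation_cost_range` is not part of the project) that multiplies the quoted per-comparison range by the number of masters.

```python
from typing import Tuple

# Per-comparison cost range in USD, as quoted above (estimates, not billed amounts).
COST_LOW = 0.005
COST_HIGH = 0.007


def adaptation_cost_range(num_masters: int) -> Tuple[float, float]:
    """Estimated (low, high) cost of comparing one adaptation against all masters."""
    return (round(num_masters * COST_LOW, 2), round(num_masters * COST_HIGH, 2))


low, high = adaptation_cost_range(50)
print(f"50 masters: ${low:.2f}-${high:.2f} per adaptation")
```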

🎯 What's Fixed

  • Removed: ORB feature matching (caused false positives)
  • Fixed: Cross-aspect-ratio matching (16:9 → 1:1, 9:16)
  • Fixed: Text/logo variations no longer cause mismatches
  • Fixed: Cropped adaptations now correctly match source masters

🚀 Migration Guide

From v1.x to v2.0:

  1. Update dependencies:

    pip install -r requirements.txt
    
  2. (Optional) Set up AI Vision:

    cp .env.example .env
    # Edit .env and add your OpenAI API key
    
  3. Re-test your matches - results will be more accurate!

Breaking Changes:

  • None - v2.0 is fully backward compatible
  • ORB feature matching is removed, but spatial (perceptual hash) matching remains
  • AI Vision is optional (gracefully disabled without API key)
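
One way the optional behaviour could work is a simple check for the API key before any AI call is attempted. This is an assumption-laden sketch: the function `ai_vision_enabled` is hypothetical, and the variable name `OPENAI_API_KEY` is assumed to be the one set in `.env`.

```python
import os
from typing import Mapping, Optional


def ai_vision_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only if a non-empty API key is configured (hypothetical helper)."""
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    # Treat a missing or whitespace-only key as "AI Vision disabled".
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not ai_vision_enabled():
    print("AI Vision disabled: no API key found, using hash matching only")
```

Passing the environment as a parameter keeps the check easy to test without touching real credentials.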

[1.0.0] - 2025-10-08

Initial Release

  • Spatial-only perceptual hash matching
  • Audio fingerprinting with Chromaprint
  • Multi-master detection
  • Batch processing with HTML reports
  • Rich CLI interface
  • ORB feature matching (later removed in v2.0)