master_adapt_detect/README.md
2025-10-01 14:32:55 -05:00

Master Image Detection Application

This application uses the Google Gemini 2.5 Pro API to detect which master images appear in each layout image.

Features

  • Filename-based IDs: Master images are identified by their filenames (without the .jpg extension)
  • Comprehensive Detection: Finds exact matches, cropped versions, and scaled or rotated images
  • Detailed Results: JSON output with layout filenames and detected master filenames
  • Optimized Processing: Sequential processing with master images uploaded only once
  • Progress Tracking: Real-time progress updates and periodic saves during batch processing
  • Error Handling: Automatic retries and graceful error recovery
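
The sequential processing, periodic saves, and retry behavior listed above can be pictured with a short sketch. This is not the code in image_detector.py: the helper names (with_retries, process_batch), the exponential backoff, and the save_every interval are all assumptions made for illustration, and detect stands in for whatever function calls Gemini.

```python
import json
import time
from pathlib import Path


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            sleep(base_delay * 2 ** attempt)


def process_batch(layouts, detect, out_path, save_every=5, sleep=time.sleep):
    """Process layouts one at a time, writing partial results periodically."""
    results = {}
    for index, layout in enumerate(layouts, start=1):
        try:
            results[layout] = with_retries(lambda: detect(layout), sleep=sleep)
        except Exception as exc:
            # Graceful recovery: record the failure and continue with the next layout.
            results[layout] = {"error": str(exc)}
        print(f"[{index}/{len(layouts)}] {layout}")
        # Save every few layouts and once more at the end of the batch.
        if index % save_every == 0 or index == len(layouts):
            Path(out_path).write_text(json.dumps({"results": results}, indent=2))
    return results
```

Saving partial results means an interrupted run loses at most save_every layouts of work.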

Setup

  1. Install Dependencies:

    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    
  2. Configure API Key:

    • The API key is already set in the .env file
    • Ensure the .env file exists and contains your Gemini API key
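
To check that the key is readable before starting a batch, a small standard-library sketch like the one below may help. The variable name GEMINI_API_KEY is an assumption (this README does not state which name the .env file uses), and the application itself may load the file differently.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = {}
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env", name="GEMINI_API_KEY"):
    """Prefer the process environment, then fall back to the .env file."""
    # NOTE: "GEMINI_API_KEY" is a hypothetical name; use the one in your .env.
    key = os.environ.get(name) or load_env_file(path).get(name)
    if not key:
        raise RuntimeError(f"{name} not found in environment or {path}")
    return key
```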

Usage

Activate the virtual environment first:

source venv/bin/activate

Command Line Options

# Test with 1 layout
python image_detector.py --test

# Process first 10 layouts
python image_detector.py --limit 10

# Process all layouts
python image_detector.py --all

# Custom output filename
python image_detector.py --limit 50 --output my_batch_results

# Custom paths
python image_detector.py --all --master-path /path/to/masters --layout-path /path/to/layouts

Help

python image_detector.py --help
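
For readers who want to see how these options might fit together, here is a rough argparse sketch that mirrors the flags documented above. It is an illustration only, not the parser in image_detector.py: the mutually exclusive grouping of --test, --limit, and --all, the default paths (taken from the directory names in this README), and the fallback to all layouts when no mode is given are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="image_detector.py")
    # Assumption: the three modes cannot be combined.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--test", action="store_true", help="process a single layout")
    mode.add_argument("--limit", type=int, metavar="N", help="process the first N layouts")
    mode.add_argument("--all", action="store_true", help="process every layout")
    parser.add_argument("--output", help="base name for the results file")
    parser.add_argument("--master-path", default="master_images")
    parser.add_argument("--layout-path", default="layouts")
    return parser


def layouts_to_process(args, layouts):
    """Select which layouts to run based on the parsed mode flags."""
    if args.test:
        return layouts[:1]
    if args.limit is not None:
        return layouts[: args.limit]
    return list(layouts)  # --all, or (by assumption) no mode flag given
```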

Common Commands

# Quick test
python image_detector.py --test

# Small batch
python image_detector.py --limit 10

# Full processing (all 306 layouts) - optimized sequential
python image_detector.py --all

Output Format

Results are saved as JSON with this structure:

{
  "metadata": {
    "total_layouts_processed": 1,
    "total_master_images": 41,
    "master_images_available": ["1011A_1011_05", "1011A_1011_06", ...]
  },
  "results": {
    "6814786": {
      "layout_filename": "6814786.jpg",
      "detected_master_ids": ["1011A_1011_05"],
      "detected_master_filenames": ["1011A_1011_05.jpg"],
      "analysis": "Detailed analysis of what was found..."
    }
  }
}

Key Output Fields

  • layout_filename: The layout image filename
  • detected_master_ids: Master image IDs (filenames without .jpg)
  • detected_master_filenames: Full master image filenames with .jpg extension
  • analysis: Gemini's detailed explanation of the detection
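
Because results are keyed by layout, a common follow-up question is which layouts contain a given master image. The sketch below inverts the structure shown above using only the documented fields; the function names are illustrative and not part of the application.

```python
import json
from collections import defaultdict


def load_report(path):
    """Read a results file produced by the detector."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def layouts_by_master(report):
    """Map each detected master ID to the layout filenames it appears in."""
    index = defaultdict(list)
    for layout_id, entry in report["results"].items():
        for master_id in entry.get("detected_master_ids", []):
            # Fall back to the results key if layout_filename is absent.
            index[master_id].append(entry.get("layout_filename", layout_id))
    return dict(index)
```

Pass the dictionary returned by load_report to layouts_by_master to get the reverse lookup.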

Directory Structure

├── master_images/     # 41 master images to detect
├── layouts/          # 299+ layout images to analyze
├── results/          # JSON output files
├── venv/            # Python virtual environment
├── image_detector.py # Main application
├── test_simple.py   # API connection tester
├── requirements.txt # Dependencies
└── .env            # API configuration
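
Since master IDs come from filenames, the list of IDs can be derived directly from the master_images/ directory. The following is a minimal sketch of that idea, not the application's own code; accepting an upper-case .JPG suffix is an assumption.

```python
from pathlib import Path


def master_ids(master_dir="master_images"):
    """Return sorted master IDs: .jpg filenames with the extension removed."""
    return sorted(
        p.stem
        for p in Path(master_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".jpg"
    )
```

Comparing len(master_ids()) with total_master_images in the results metadata is a quick sanity check that every master image was picked up.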

Example Results

Layout 6814786.jpg contains master image 1011A_1011_05.jpg (cropped version).