Master Adapt Detector Architecture Diagram
This document illustrates the architecture and processing flow of the Master Adapt Detector application, which combines multiple AI models with computer vision techniques to detect master images within layout images.
High-Level Architecture
graph TB
subgraph "Entry Point"
CLI[cli.py - Command Line Interface]
end
subgraph "Core Detection Engines"
GD[Gemini Detector<br/>gemini_detector.py]
OD[OpenAI Detector<br/>openai_detector.py]
VD[Vector Detector<br/>vector_detector.py]
HD[Hybrid Detector<br/>hybrid_detector.py]
end
subgraph "Panel Splitting System"
PS[Panel Splitter<br/>panel_splitter.py]
AS[Advanced Splitter<br/>advanced_splitter.py]
SS[Simple Splitter<br/>simple_splitter.py]
end
subgraph "Support Systems"
MM[Memory Manager<br/>memory_manager.py]
LC[Logging Config<br/>logging_config.py]
PD[Process Detection<br/>process_detection.py]
end
subgraph "AI APIs"
GEMINI[Google Gemini 2.5 Pro]
OPENAI[OpenAI o3]
VERTEX[Google Vertex AI<br/>Vector Embeddings]
end
subgraph "Computer Vision"
OPENCV[OpenCV<br/>Feature Detection]
AKAZE[AKAZE Features]
RANSAC[RANSAC Homography]
end
subgraph "Data Storage"
MI[Master Images<br/>master_images/]
LI[Layout Images<br/>layouts/]
RES[Results<br/>results/]
EMB[Embeddings Cache<br/>embeddings_cache/]
end
CLI --> GD
CLI --> OD
CLI --> VD
CLI --> HD
HD --> OD
HD --> VD
HD --> PS
HD --> AS
HD --> SS
GD --> GEMINI
OD --> OPENAI
VD --> VERTEX
HD --> OPENCV
HD --> AKAZE
HD --> RANSAC
PS --> OPENCV
AS --> OPENCV
SS --> OPENCV
GD --> MM
OD --> MM
VD --> MM
HD --> MM
CLI --> LC
PD --> LC
GD --> MI
OD --> MI
VD --> MI
HD --> MI
GD --> LI
OD --> LI
VD --> LI
HD --> LI
GD --> RES
OD --> RES
VD --> RES
HD --> RES
VD --> EMB
HD --> EMB
Detailed Application Flow
flowchart TD
START([Application Start]) --> PARSE[Parse CLI Arguments]
PARSE --> MODE{Select Mode}
MODE -->|--hybrid| HYBRID[Hybrid Mode]
MODE -->|--openai| OPENAI_MODE[OpenAI Mode]
MODE -->|--vector-mode| VECTOR_MODE[Vector Mode]
MODE -->|default| GEMINI_MODE[Gemini Mode]
subgraph "Hybrid Mode Processing"
HYBRID --> LOAD_MASTERS[Load Master Images]
LOAD_MASTERS --> INIT_EMBED{Vector Mode?}
INIT_EMBED -->|Yes| GEN_EMBED[Generate Master Embeddings]
INIT_EMBED -->|No| INIT_CV[Initialize OpenCV Components]
GEN_EMBED --> PROCESS_LAYOUT[Process Layout]
INIT_CV --> PROCESS_LAYOUT
PROCESS_LAYOUT --> COUNT_PANELS[Count Panels with OpenAI o3]
COUNT_PANELS --> DETECT_CENSOR[Detect Censorship with OpenAI o3]
DETECT_CENSOR --> PANEL_CHECK{Panel Count ≤ Threshold?}
PANEL_CHECK -->|Yes| LOCAL_ANALYSIS[Local Analysis]
PANEL_CHECK -->|No| SPLIT_ANALYSIS[Split + Analysis]
LOCAL_ANALYSIS --> VECTOR_CHECK{Vector Mode?}
VECTOR_CHECK -->|Yes| VECTOR_SIM[Vector Similarity]
VECTOR_CHECK -->|No| INLIER_ANALYSIS[Inlier Analysis]
SPLIT_ANALYSIS --> SPLIT_PANELS[Split Panels]
SPLIT_PANELS --> SPLIT_VECTOR_CHECK{Vector Mode?}
SPLIT_VECTOR_CHECK -->|Yes| SPLIT_VECTOR[Split + Vector Similarity]
SPLIT_VECTOR_CHECK -->|No| SPLIT_INLIER[Split + Inlier Analysis]
VECTOR_SIM --> APPLY_REFINEMENT
INLIER_ANALYSIS --> APPLY_REFINEMENT
SPLIT_VECTOR --> APPLY_REFINEMENT
SPLIT_INLIER --> APPLY_REFINEMENT
APPLY_REFINEMENT[Apply CEN Refinement] --> DEDUP[Deduplication]
DEDUP --> TRUNCATE[Truncate to Panel Count]
TRUNCATE --> FALLBACK_CHECK{Fallback Enabled?}
FALLBACK_CHECK -->|Yes & Needed| FALLBACK[OpenAI One-at-a-Time Fallback]
FALLBACK_CHECK -->|No| SAVE_RESULTS
FALLBACK --> SAVE_RESULTS[Save Results]
end
subgraph "OpenAI Mode Processing"
OPENAI_MODE --> LOAD_MASTERS_O[Load Master Images]
LOAD_MASTERS_O --> ONE_AT_TIME{One-at-a-Time?}
ONE_AT_TIME -->|Yes| PARALLEL_MASTERS[Parallel Master Processing]
ONE_AT_TIME -->|No| BATCH_PROCESS[Batch Processing]
PARALLEL_MASTERS --> PANEL_AWARE{Panel-Aware Refinement?}
PANEL_AWARE -->|Yes| COUNT_PANELS_O[Count Panels] --> INLIER_REFINE[Inlier Refinement]
PANEL_AWARE -->|No| APPLY_CEN_O[Apply CEN Refinement]
INLIER_REFINE --> APPLY_CEN_O
BATCH_PROCESS --> APPLY_CEN_O
APPLY_CEN_O --> SAVE_RESULTS_O[Save Results]
end
subgraph "Vector Mode Processing"
VECTOR_MODE --> LOAD_MASTERS_V[Load Master Images]
LOAD_MASTERS_V --> GEN_EMBED_V[Generate Master Embeddings]
GEN_EMBED_V --> SPLITTING_CHECK{Splitting Enabled?}
SPLITTING_CHECK -->|Yes| SPLIT_LAYOUT[Split Layout]
SPLITTING_CHECK -->|No| COMPARE_EMBED[Compare Embeddings]
SPLIT_LAYOUT --> COMPARE_SPLITS[Compare Split Embeddings]
COMPARE_SPLITS --> SAVE_RESULTS_V[Save Results]
COMPARE_EMBED --> SAVE_RESULTS_V
end
subgraph "Gemini Mode Processing"
GEMINI_MODE --> LOAD_MASTERS_G[Load Master Images]
LOAD_MASTERS_G --> GEMINI_ONE_AT_TIME{One-at-a-Time?}
GEMINI_ONE_AT_TIME -->|Yes| PARALLEL_MASTERS_G[Parallel Master Processing]
GEMINI_ONE_AT_TIME -->|No| BATCH_PROCESS_G[Batch Processing]
PARALLEL_MASTERS_G --> APPLY_CEN_G[Apply CEN Refinement]
BATCH_PROCESS_G --> APPLY_CEN_G
APPLY_CEN_G --> SAVE_RESULTS_G[Save Results]
end
SAVE_RESULTS --> END([End])
SAVE_RESULTS_O --> END
SAVE_RESULTS_V --> END
SAVE_RESULTS_G --> END
Panel Splitting Architecture
graph TB
subgraph "Panel Splitting System"
INPUT[Layout Image] --> DETECTOR{Splitter Type}
DETECTOR -->|Basic| PANEL_SPLITTER[PanelSplitter]
DETECTOR -->|Advanced| ADVANCED_SPLITTER[AdvancedPanelSplitter]
DETECTOR -->|Simple| SIMPLE_SPLITTER[SimplePanelSplitter]
subgraph "PanelSplitter Methods"
PANEL_SPLITTER --> EDGE_DETECT[Edge Detection]
PANEL_SPLITTER --> CONTOUR_FIND[Contour Finding]
PANEL_SPLITTER --> HIST_ANALYSIS[Histogram Analysis]
PANEL_SPLITTER --> KMEANS[K-Means Clustering]
end
subgraph "AdvancedPanelSplitter Methods"
ADVANCED_SPLITTER --> SOBEL[Sobel Edge Detection]
ADVANCED_SPLITTER --> GUTTER_DETECT[Gutter Detection]
ADVANCED_SPLITTER --> ENERGY_ANALYSIS[Energy Analysis]
ADVANCED_SPLITTER --> PERCENTILE_THRESH[Percentile Thresholding]
end
subgraph "SimplePanelSplitter Methods"
SIMPLE_SPLITTER --> EVEN_SPLIT[Even Division]
SIMPLE_SPLITTER --> PANEL_COUNT[Use Panel Count]
end
EDGE_DETECT --> SPLIT_RESULTS[Split Results]
CONTOUR_FIND --> SPLIT_RESULTS
HIST_ANALYSIS --> SPLIT_RESULTS
KMEANS --> SPLIT_RESULTS
SOBEL --> SPLIT_RESULTS
GUTTER_DETECT --> SPLIT_RESULTS
ENERGY_ANALYSIS --> SPLIT_RESULTS
PERCENTILE_THRESH --> SPLIT_RESULTS
EVEN_SPLIT --> SPLIT_RESULTS
PANEL_COUNT --> SPLIT_RESULTS
end
SPLIT_RESULTS --> INDIVIDUAL_PANELS[Individual Panel Images]
INDIVIDUAL_PANELS --> MATCH_PROCESS[Match Each Panel to Masters]
Memory Management and Multiprocessing
graph TB
subgraph "Memory Management System"
MEMORY_MANAGER[Memory Manager] --> MONITOR[Monitor Usage]
MONITOR --> THRESH_CHECK{Usage > Threshold?}
THRESH_CHECK -->|Yes| THROTTLE[Throttle Processes]
THRESH_CHECK -->|No| CONTINUE[Continue Processing]
THROTTLE --> WAIT[Wait for Memory]
WAIT --> REDUCE_WORKERS[Reduce Worker Count]
REDUCE_WORKERS --> CONTINUE
CONTINUE --> PROCESS_POOL[Process Pool Executor]
PROCESS_POOL --> WORKER1[Worker Process 1]
PROCESS_POOL --> WORKER2[Worker Process 2]
PROCESS_POOL --> WORKERN[Worker Process N]
subgraph "Worker Process"
WORKER1 --> ISOLATED_ENV[Isolated Environment]
ISOLATED_ENV --> LOAD_MODELS[Load Models]
LOAD_MODELS --> PROCESS_TASK[Process Task]
PROCESS_TASK --> CLEANUP[Cleanup]
end
WORKER2 --> ISOLATED_ENV
WORKERN --> ISOLATED_ENV
end
subgraph "Feature Limiting"
PROCESS_TASK --> FEATURE_COUNT[Count Features]
FEATURE_COUNT --> FEATURE_CHECK{Features > Limit?}
FEATURE_CHECK -->|Yes| LIMIT_FEATURES[Limit Features]
FEATURE_CHECK -->|No| PROCEED[Proceed]
LIMIT_FEATURES --> PROCEED
end
Data Flow and Storage
graph LR
subgraph "Input Data"
MI[Master Images<br/>41 images]
LI[Layout Images<br/>299+ images]
end
subgraph "Processing Cache"
TEMP[Temp Processed Images]
EMB_CACHE[Embeddings Cache]
SPLITS[Split Panel Images]
end
subgraph "Output Data"
JSON[JSON Results]
LOGS[Log Files]
DEBUG[Debug Images]
CROPS[Crop Images]
end
MI --> TEMP
LI --> TEMP
TEMP --> EMB_CACHE
TEMP --> SPLITS
EMB_CACHE --> JSON
SPLITS --> JSON
JSON --> LOGS
JSON --> DEBUG
JSON --> CROPS
subgraph "Result Structure"
JSON --> METADATA[Metadata]
JSON --> LAYOUT_RESULTS[Layout Results]
METADATA --> TOTAL_LAYOUTS[Total Layouts]
METADATA --> MASTER_COUNT[Master Count]
METADATA --> PROVIDER[Provider Info]
METADATA --> PROCESSING_MODE[Processing Mode]
LAYOUT_RESULTS --> DETECTED_MASTERS[Detected Masters]
LAYOUT_RESULTS --> ANALYSIS[Analysis Text]
LAYOUT_RESULTS --> CONFIDENCE[Confidence Score]
LAYOUT_RESULTS --> PANEL_INFO[Panel Information]
end
Key Components and Their Roles
1. Command-Line Interface (cli.py)
- Purpose: Command-line interface for the application
- Features: Argument parsing, mode selection, batch processing options
- Modes: Gemini, OpenAI, Vector, Hybrid
- Options: Test mode, batch processing, custom outputs, splitting options
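A minimal sketch of the mode selection, using only the three flags shown in the application flow diagram (`--hybrid`, `--openai`, `--vector-mode`) with Gemini as the default. The function names and the precedence order when several flags are passed are assumptions for illustration, not the actual contents of cli.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the mode flags named in the flow diagram."""
    parser = argparse.ArgumentParser(description="Detect master images in layouts")
    parser.add_argument("--hybrid", action="store_true", help="use hybrid mode")
    parser.add_argument("--openai", action="store_true", help="use OpenAI mode")
    parser.add_argument("--vector-mode", action="store_true", help="use vector mode")
    return parser


def select_mode(args: argparse.Namespace) -> str:
    """Return the processing mode; the precedence order here is an assumption."""
    if args.hybrid:
        return "hybrid"
    if args.openai:
        return "openai"
    if args.vector_mode:
        return "vector"
    return "gemini"  # default mode when no flag is given
```

Hybrid is checked first because the hybrid flow itself branches on vector mode, which suggests the two flags can be combined.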
2. Detection Engines
Hybrid Detector (hybrid_detector.py)
- Purpose: Cost-efficient detection combining OpenAI panel counting with local analysis
- Features:
- Panel threshold-based routing
- Vector similarity or inlier analysis
- Automatic fallback to OpenAI one-at-a-time
- CEN refinement and deduplication
- Workflow: Panel count → Route to local/split analysis → Apply refinements
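The deduplication and truncation steps could look roughly like the sketch below. It assumes each match is a `(master_id, score)` pair, that duplicates are resolved by keeping the highest score per master, and that the final list is cut to the detected panel count. The function name and the data shape are hypothetical.

```python
from typing import Dict, List, Tuple


def dedup_and_truncate(
    matches: List[Tuple[str, float]], panel_count: int
) -> List[Tuple[str, float]]:
    """Keep the best score per master, then keep at most panel_count matches."""
    best: Dict[str, float] = {}
    for master_id, score in matches:
        if score > best.get(master_id, float("-inf")):
            best[master_id] = score
    # Highest-scoring masters first
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[: max(panel_count, 0)]
```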
OpenAI Detector (openai_detector.py)
- Purpose: Uses OpenAI o3 model for image matching
- Features:
- One-at-a-time processing with multiprocessing
- Panel-aware refinement
- Image preprocessing (greyscale, contrast)
- API: OpenAI o3 vision model
Vector Detector (vector_detector.py)
- Purpose: Uses Google Vertex AI embeddings for similarity matching
- Features:
- 1408-dimensional embeddings
- Cosine similarity matching
- Embedding caching
- API: Google Vertex AI Multimodal Embeddings
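One plausible shape for the embedding cache is sketched below. It assumes embeddings are stored as JSON files keyed by a SHA-256 hash of the image bytes. The `embed_fn` callable stands in for the Vertex AI call and is hypothetical, as is the on-disk layout.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List


def cached_embedding(
    image_path: Path,
    cache_dir: Path,
    embed_fn: Callable[[bytes], List[float]],
) -> List[float]:
    """Return the embedding for an image, calling embed_fn only on a cache miss."""
    data = image_path.read_bytes()
    key = hashlib.sha256(data).hexdigest()  # content hash, so renamed files still hit
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    embedding = embed_fn(data)  # hypothetical stand-in for the embeddings API call
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(embedding))
    return embedding
```

Hashing the content rather than the filename means an edited image under the same name is re-embedded instead of returning a stale vector.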
Gemini Detector (gemini_detector.py)
- Purpose: Uses Google Gemini 2.5 Pro for image analysis
- Features:
- Batch processing
- Safety settings handling
- Image preprocessing
- API: Google Gemini 2.5 Pro
3. Panel Splitting System
Panel Splitter (panel_splitter.py)
- Purpose: Basic multi-method panel splitting
- Methods: Edge detection, contour finding, histogram analysis, K-means clustering
Advanced Splitter (advanced_splitter.py)
- Purpose: Advanced edge detection and gutter analysis
- Methods: Sobel edge detection, gutter detection, energy analysis, percentile thresholding
Simple Splitter (simple_splitter.py)
- Purpose: Simple even division based on panel count
- Methods: Even division, panel count-based splitting
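Even division can be illustrated by computing crop boxes from the image size and the panel count. This sketch assumes panels sit side by side, so the split runs along the width. The function name, the box format, and that orientation are assumptions rather than the behaviour of simple_splitter.py.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def even_split_boxes(width: int, height: int, panel_count: int) -> List[Box]:
    """Divide an image of the given size into panel_count vertical strips."""
    if panel_count < 1:
        raise ValueError("panel_count must be at least 1")
    # Integer edges; the last edge always equals width, so no pixels are lost
    edges = [i * width // panel_count for i in range(panel_count + 1)]
    return [(edges[i], 0, edges[i + 1], height) for i in range(panel_count)]
```

Each box can then be passed to an image library's crop routine to produce the individual panel images.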
4. Support Systems
Memory Manager (memory_manager.py)
- Purpose: Prevents memory exhaustion during processing
- Features: Memory monitoring, worker throttling, safe execution decorators
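The throttling decision can be sketched as a pure function of the current memory usage. The 80% threshold and the halving policy below are assumptions chosen for illustration. In practice the usage figure could come from something like `psutil.virtual_memory().percent`.

```python
def throttled_worker_count(
    requested_workers: int,
    memory_percent: float,
    threshold_percent: float = 80.0,  # assumed threshold, not taken from the project
    min_workers: int = 1,
) -> int:
    """Return how many workers to run given the current memory usage."""
    if memory_percent <= threshold_percent:
        return max(min_workers, requested_workers)
    # Over the threshold: halve the pool, but never drop below min_workers
    return max(min_workers, requested_workers // 2)
```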
Logging Config (logging_config.py)
- Purpose: Dual logging to terminal and file
- Features: System info logging, exception tracking, memory usage logging
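Dual logging can be set up with the standard `logging` module by attaching one stream handler and one file handler to the same logger. The function name, the logger name, and the message format below are assumptions. The sketch only shows the general pattern.

```python
import logging
import sys


def setup_logging(
    log_path: str,
    name: str = "master_adapt_detector",  # hypothetical logger name
    level: int = logging.INFO,
) -> logging.Logger:
    """Configure a logger that writes every record to both stdout and a file."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate output if called more than once
    logger.propagate = False
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    handlers = (
        logging.StreamHandler(sys.stdout),
        logging.FileHandler(log_path, encoding="utf-8"),
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```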
Process Detection (process_detection.py)
- Purpose: Standalone functions for multiprocessing
- Features: Process isolation, error handling, resource cleanup
5. Key Algorithms
Inlier Analysis (OpenCV)
- Purpose: Local feature matching using computer vision
- Algorithm: AKAZE features → RANSAC homography → Inlier counting
- Advantage: No API costs, fast processing
Vector Similarity (Vertex AI)
- Purpose: Semantic similarity using embeddings
- Algorithm: Image embeddings → Cosine similarity → Threshold matching
- Advantage: Semantic understanding, good for transformed images
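The embeddings → cosine similarity → threshold chain reduces to a few lines. This sketch uses plain Python lists so it runs without dependencies. The function names and the default threshold of 0.8 are assumptions, and a real implementation would likely vectorise the computation over the 1408-dimensional embeddings.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def match_masters(
    layout_embedding: Sequence[float],
    master_embeddings: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # assumed cutoff, to be tuned on real data
) -> List[Tuple[str, float]]:
    """Return masters whose similarity meets the threshold, best first."""
    scored = [
        (name, cosine_similarity(layout_embedding, emb))
        for name, emb in master_embeddings.items()
    ]
    kept = [item for item in scored if item[1] >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```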
Panel Detection (OpenAI o3)
- Purpose: Intelligent panel counting and censorship detection
- Algorithm: Vision model analysis → Panel count + censorship status
- Advantage: Accurate panel analysis, handles complex layouts
6. Processing Modes
Hybrid Mode (Recommended)
- Strategy: OpenAI panel counting + local analysis for efficiency
- Routing: ≤2 panels → local analysis, ≥3 panels → split + analysis
- Fallback: OpenAI one-at-a-time if insufficient matches
- Cost: ~1 API call per layout, versus ~41 (one per master image) for pure OpenAI one-at-a-time processing
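The routing and fallback rules above can be expressed as two small predicates. The default threshold of 2 follows the routing rule stated here. Treating "insufficient matches" as fewer matches than panels is an assumption, as are the function names.

```python
def choose_route(panel_count: int, threshold: int = 2) -> str:
    """Route layouts at or below the panel threshold to local analysis."""
    return "local_analysis" if panel_count <= threshold else "split_analysis"


def needs_fallback(
    matches_found: int, panel_count: int, fallback_enabled: bool = True
) -> bool:
    """Assumed rule: fall back when fewer masters were matched than panels counted."""
    return fallback_enabled and matches_found < panel_count
```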
OpenAI Mode
- Strategy: Pure OpenAI o3 processing
- Options: Batch or one-at-a-time with panel-aware refinement
- Cost: High API usage but highest accuracy
Vector Mode
- Strategy: Pure vector embedding similarity
- Options: Splitting modes for multi-panel layouts
- Cost: No further API costs once embeddings have been generated and cached
Gemini Mode
- Strategy: Google Gemini 2.5 Pro processing
- Options: Batch or one-at-a-time processing
- Cost: Lower than OpenAI but higher than vector
This architecture supports four processing strategies (Hybrid, OpenAI, Vector, and Gemini) for master image detection, each trading accuracy against API cost so the mode can be chosen to fit the use case.