master_adapt_detect/MEMORY_FIX_SUMMARY.md
2025-10-01 14:32:55 -05:00


Memory Management Fix Summary

Problem Analysis

The application crashed from memory exhaustion when processing images with high feature counts (64,509 features detected in the layout image). The failure occurred in the hybrid detector's local inlier analysis: 14 concurrent processes were working through 41 masters, each process holding large feature sets in memory, which drove memory usage past available RAM and caused swap thrashing.

Root Cause

  • High feature count: 64,509 features in the layout image
  • Concurrent processing: 14 processes × 41 masters = 574 concurrent operations
  • Memory multiplication: Each process holding large feature sets in memory
  • No memory limits: No safeguards against memory exhaustion

Solutions Implemented

1. Memory Manager (memory_manager.py)

  • Real-time monitoring: Tracks memory and swap usage percentages
  • Safety checks: Prevents execution when memory > 75% (swap usage only warns, does not block)
  • Dynamic process limiting: Adjusts worker count based on available memory
  • Memory-safe execution decorator: Ensures functions run only when memory is safe
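
As a rough illustration of the safety check and decorator described above, here is a minimal sketch. The name memory_safe_execution comes from this summary, but its signature, the read_usage callable, and MemoryUnsafeError are assumptions made for the example, not the actual contents of memory_manager.py. The 75% and 30% values are taken from the Memory Thresholds section. In a real run, read_usage would probably wrap something like psutil.virtual_memory().percent and psutil.swap_memory().percent.

```python
import functools
import logging

logger = logging.getLogger("memory_manager_sketch")

MAX_MEMORY_PERCENT = 75.0  # blocking limit, from "Memory Thresholds"
MAX_SWAP_PERCENT = 30.0    # swap limit; per the summary, swap only warns


class MemoryUnsafeError(RuntimeError):
    """Hypothetical error raised when memory is above the blocking limit."""


def is_memory_safe(memory_percent, swap_percent):
    """Return True when memory is below the limit; high swap only logs a warning."""
    if swap_percent > MAX_SWAP_PERCENT:
        logger.warning("Swap usage high: %.1f%%", swap_percent)
    return memory_percent < MAX_MEMORY_PERCENT


def memory_safe_execution(read_usage):
    """Decorator factory: call the wrapped function only if memory is safe.

    read_usage is an assumed callable returning (memory_percent, swap_percent).
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            memory_percent, swap_percent = read_usage()
            if not is_memory_safe(memory_percent, swap_percent):
                raise MemoryUnsafeError(
                    f"memory at {memory_percent:.1f}% is not below "
                    f"{MAX_MEMORY_PERCENT:.0f}%"
                )
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

Passing the reader in as a parameter keeps the check testable with fixed numbers, independent of the machine's real memory state.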

2. Feature Limiting

  • Maximum features per image: Limited to 10,000 features max
  • Smart reduction: Keeps best features based on response strength
  • Dynamic adjustment: Reduces features based on total count (e.g., 64K → 32K → 10K)
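
The "keep the best features" step can be sketched as a top-N selection by response strength. The function name reduce_feature_count appears in this summary, but the signature below is an assumption. The sketch only requires that each keypoint has a numeric response attribute (as OpenCV's cv2.KeyPoint does) and that descriptors can be indexed by position.

```python
import heapq

MAX_FEATURES = 10_000  # per-image cap stated in this summary


def reduce_feature_count(keypoints, descriptors=None, max_features=MAX_FEATURES):
    """Keep the max_features keypoints with the highest response.

    Returns (keypoints, descriptors) with matching rows kept together.
    Inputs at or under the cap are returned unchanged.
    """
    if len(keypoints) <= max_features:
        return list(keypoints), descriptors

    # Select indices rather than objects so descriptors stay aligned.
    keep = heapq.nlargest(
        max_features,
        range(len(keypoints)),
        key=lambda i: keypoints[i].response,
    )
    keep.sort()  # preserve the original detection order

    kept_keypoints = [keypoints[i] for i in keep]
    # For a NumPy descriptor array, descriptors[keep] would do the same job.
    kept_descriptors = (
        None if descriptors is None else [descriptors[i] for i in keep]
    )
    return kept_keypoints, kept_descriptors
```

Selecting by index keeps each descriptor row attached to its keypoint, which matters because matching relies on the two staying in step.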

3. Dynamic Worker Adjustment

  • Feature-based scaling:
    • >50,000 features: workers ÷ 2
    • >30,000 features: workers × 0.75
    • <30,000 features: normal workers
  • Memory-based limiting: Further reduces the worker count based on available memory
  • Conservative defaults: Assumes 2 GB per process for safety
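
The two-stage worker calculation above can be sketched as follows. The function names are hypothetical, and the feature tiers are read as "more than 50,000" and "more than 30,000", which is an assumption about the intended boundaries. The 2 GB per process figure is the one stated in this summary.

```python
GB_PER_PROCESS = 2.0  # conservative per-process estimate from this summary


def scale_workers_by_features(base_workers, feature_count):
    """Apply the feature-count tiers; always keep at least one worker."""
    if feature_count > 50_000:
        scaled = base_workers // 2
    elif feature_count > 30_000:
        scaled = int(base_workers * 0.75)
    else:
        scaled = base_workers
    return max(1, scaled)


def limit_workers_by_memory(workers, available_gb, gb_per_process=GB_PER_PROCESS):
    """Cap workers at how many processes the available memory can hold."""
    affordable = int(available_gb // gb_per_process)
    return max(1, min(workers, affordable))


def choose_worker_count(base_workers, feature_count, available_gb):
    """Feature-based scaling first, then the memory-based cap."""
    scaled = scale_workers_by_features(base_workers, feature_count)
    return limit_workers_by_memory(scaled, available_gb)
```

With the numbers from the crash (14 workers, 64,509 features), the feature tier halves the pool to 7. If only 8 GB were available (a made-up figure for illustration), the memory cap would reduce it further to 4.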

4. Enhanced Monitoring

  • Progress with memory: Shows memory usage every 10 completed masters
  • Early warnings: Alerts when memory > 80% or swap > 20%
  • Detailed crash logging: Logs system and process memory at crash time
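
A minimal sketch of the periodic reporting and early warnings, using the 10-master interval and the 80% / 20% warning levels listed above. The function names and log message formats are assumptions. The memory and swap percentages are passed in so the logic does not depend on any particular system library.

```python
import logging

logger = logging.getLogger("hybrid_detector_sketch")

MEMORY_WARN_PERCENT = 80.0  # early-warning level for memory
SWAP_WARN_PERCENT = 20.0    # early-warning level for swap
REPORT_EVERY = 10           # report after every 10 completed masters


def memory_warnings(memory_percent, swap_percent):
    """Return warning messages for any level above its threshold."""
    warnings = []
    if memory_percent > MEMORY_WARN_PERCENT:
        warnings.append(f"High memory usage: {memory_percent:.1f}%")
    if swap_percent > SWAP_WARN_PERCENT:
        warnings.append(f"High swap usage: {swap_percent:.1f}%")
    return warnings


def report_progress(completed, total, memory_percent, swap_percent):
    """Log progress with memory usage on every 10th completed master.

    Returns True when a report was logged, False otherwise.
    """
    if completed % REPORT_EVERY != 0:
        return False
    logger.info(
        "Completed %d/%d masters (memory %.1f%%, swap %.1f%%)",
        completed, total, memory_percent, swap_percent,
    )
    for message in memory_warnings(memory_percent, swap_percent):
        logger.warning(message)
    return True
```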

5. Memory Cleanup

  • Forced garbage collection: Runs gc.collect() after processing
  • Process isolation: Each master processed in separate process
  • Resource cleanup: Proper cleanup of temporary files and objects
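
One way to make the cleanup steps hard to skip is to wrap per-master work in a context manager, so temporary files are removed and gc.collect() runs even when processing raises. This is a sketch under that assumption. The name scratch_workspace and the directory prefix are hypothetical and do not come from hybrid_detector.py.

```python
import gc
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_workspace(prefix="master_"):
    """Yield a temporary directory; on exit, delete it and collect garbage."""
    path = tempfile.mkdtemp(prefix=prefix)
    try:
        yield path
    finally:
        # Runs on success and on error, so cleanup cannot be forgotten.
        shutil.rmtree(path, ignore_errors=True)
        gc.collect()
```

Inside a worker, the body of the with block would hold the per-master processing. Large objects created there become unreachable when the block exits, which gives the forced collection something to reclaim.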

Key Changes Made

hybrid_detector.py

  • Added memory manager initialization
  • Modified process_single_master_inlier_analysis() to limit features
  • Updated detect_with_local_inlier_analysis() for dynamic worker adjustment
  • Added memory monitoring during processing
  • Added memory cleanup after processing

memory_manager.py (NEW)

  • MemoryManager class for monitoring and control
  • memory_safe_execution decorator
  • reduce_feature_count() function for feature limiting
  • Dynamic process count calculation

logging_config.py

  • Enhanced crash logging with system memory details
  • Added memory warning logging function
  • Improved resource usage reporting

Memory Protection Features

Before Processing

  • Check if memory usage is safe (< 75%)
  • Wait for memory to return to safe levels if needed
  • Dynamically adjust worker count based on available memory
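
The "wait for memory to return to safe levels" step can be sketched as a polling loop with a timeout. The function name, the timeout and poll interval defaults, and the injected read_memory_percent, sleep, and clock parameters are all assumptions made for illustration. Only the 75% safe level comes from this document.

```python
import time

SAFE_MEMORY_PERCENT = 75.0  # "safe" means below this level


def wait_for_safe_memory(
    read_memory_percent,
    timeout_s=60.0,
    poll_s=1.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll until memory usage drops below the safe level.

    Returns True once memory is safe, or False if timeout_s elapses first.
    """
    deadline = clock() + timeout_s
    while True:
        if read_memory_percent() < SAFE_MEMORY_PERCENT:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

Returning False on timeout, instead of blocking forever, leaves the caller free to decide whether to abort or proceed with fewer workers.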

During Processing

  • Monitor memory usage every 10 completed masters
  • Log warnings when memory > 80% or swap > 20%
  • Limit features to prevent memory explosion

After Processing

  • Force garbage collection to free memory
  • Clean up temporary files and objects
  • Log final memory usage

Expected Results

  • No more out-of-memory crashes: Memory usage stays within safe limits
  • Better performance: Reduced memory pressure means less swapping
  • Graceful degradation: Automatically reduces parallelism when needed
  • Better monitoring: Real-time memory usage reporting

Usage

The fixes are applied automatically when using the hybrid detector. No changes to command-line usage are needed:

python cli.py --all --hybrid  # Will now use memory-safe processing

Testing

Run the test suite to verify fixes:

python test_memory_fix.py

Memory Thresholds

  • Maximum memory: 75% (was unlimited)
  • Maximum swap: 30% (was unlimited)
  • Feature limit: 10,000 per image (was unlimited)
  • Dynamic workers: Based on feature count and memory availability