Memory Management Fix Summary
Problem Analysis
The application was crashing due to memory exhaustion when processing images with high feature counts (64,509 features detected). The issue occurred in the hybrid detector's local inlier analysis when 14 concurrent processes were trying to process 41 masters simultaneously, causing massive memory usage and swap thrashing.
Root Cause
- High feature count: 64,509 features in layout image
- Concurrent processing: 14 processes × 41 masters = 574 concurrent operations
- Memory multiplication: Each process holding large feature sets in memory
- No memory limits: No safeguards against memory exhaustion
Solutions Implemented
1. Memory Manager (memory_manager.py)
- Real-time monitoring: Tracks memory and swap usage percentages
- Safety checks: Prevents execution when memory > 80% (swap usage only warns, does not block)
- Dynamic process limiting: Adjusts worker count based on available memory
- Memory-safe execution decorator: Ensures functions run only when memory is safe
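As a rough illustration of the memory-safe execution decorator listed above, the sketch below blocks a call when memory use exceeds a threshold and only warns on swap. The names memory_safe, MemoryUnsafeError and the injected probe callable are hypothetical and are not taken from memory_manager.py; the 80% and 20% defaults follow the figures quoted in this summary.

```python
import functools
from typing import Callable, Tuple

# Hypothetical probe type: returns (memory_percent, swap_percent).
# Assumption: if the project uses psutil, a real probe could be
#   lambda: (psutil.virtual_memory().percent, psutil.swap_memory().percent)
UsageProbe = Callable[[], Tuple[float, float]]


class MemoryUnsafeError(RuntimeError):
    """Raised when memory usage is above the blocking threshold."""


def memory_safe(probe: UsageProbe,
                max_memory_percent: float = 80.0,
                warn_swap_percent: float = 20.0):
    """Build a decorator that refuses to run a function when memory is high."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            mem, swap = probe()
            if swap > warn_swap_percent:
                # Swap usage only warns; it never blocks execution.
                print(f"warning: swap usage at {swap:.0f}%")
            if mem > max_memory_percent:
                raise MemoryUnsafeError(f"memory usage at {mem:.0f}%")
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

Passing the probe in as a parameter keeps the check easy to test with fixed readings; the real decorator may read system memory directly instead.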
2. Feature Limiting
- Maximum features per image: 10,000
- Smart reduction: Keeps best features based on response strength
- Dynamic adjustment: Reduces features based on total count (e.g., 64K → 32K → 10K)
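The idea of keeping the best features by response strength can be sketched as below. This is not the project's reduce_feature_count(); the Feature class and keep_strongest name are hypothetical stand-ins, with a response field mirroring the attribute found on OpenCV keypoints (whether the project uses OpenCV is an assumption).

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass
class Feature:
    """Stand-in for a detected keypoint; only the response score matters here."""
    x: float
    y: float
    response: float


def keep_strongest(features: List[Feature],
                   max_features: int = 10_000) -> List[Feature]:
    """Return at most max_features features, keeping the highest responses."""
    if len(features) <= max_features:
        # Already within the limit: return a copy in the original order.
        return list(features)
    # nlargest avoids fully sorting a very large feature list.
    return heapq.nlargest(max_features, features, key=lambda f: f.response)
```

For a list of 64,509 features and the default limit, this would return the 10,000 features with the largest response values.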
3. Dynamic Worker Adjustment
- Feature-based scaling:
  - Over 50,000 features: workers ÷ 2
  - Over 30,000 features: workers × 0.75
  - Under 30,000 features: normal workers
- Memory-based limiting: Further reduces based on available memory
- Conservative defaults: Assumes 2GB per process for safety
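The two scaling rules above can be combined roughly as follows. This is a minimal sketch: the function name choose_worker_count is hypothetical, the thresholds are read as "more than 50,000" and "more than 30,000" features, and rounding down with a floor of one worker is an assumption rather than documented behavior.

```python
def choose_worker_count(requested_workers: int,
                        feature_count: int,
                        available_memory_gb: float,
                        memory_per_process_gb: float = 2.0) -> int:
    """Pick a worker count from feature count and available memory."""
    # Feature-based scaling.
    if feature_count > 50_000:
        workers = requested_workers // 2
    elif feature_count > 30_000:
        workers = int(requested_workers * 0.75)
    else:
        workers = requested_workers

    # Memory-based limiting: budget 2 GB per process by default.
    memory_cap = int(available_memory_gb // memory_per_process_gb)

    # Never drop below one worker, so processing can still proceed.
    return max(1, min(workers, memory_cap))
```

With the numbers from the problem analysis, choose_worker_count(14, 64_509, 32.0) gives 7 workers; with only 8 GB available the memory cap lowers that to 4.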
4. Enhanced Monitoring
- Progress with memory: Shows memory usage every 10 completed masters
- Early warnings: Alerts when memory > 80% or swap > 20%
- Detailed crash logging: Logs system and process memory at crash time
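A small sketch of the reporting cadence and warning thresholds described above. The helper names should_report and memory_warnings are hypothetical; the actual logging functions in logging_config.py are not shown in this summary.

```python
from typing import List


def should_report(completed: int, every: int = 10) -> bool:
    """True on every 10th completed master (10, 20, 30, ...)."""
    return completed > 0 and completed % every == 0


def memory_warnings(memory_percent: float,
                    swap_percent: float,
                    memory_limit: float = 80.0,
                    swap_limit: float = 20.0) -> List[str]:
    """Return warning messages for readings above the thresholds."""
    warnings = []
    if memory_percent > memory_limit:
        warnings.append(f"memory usage {memory_percent:.0f}% > {memory_limit:.0f}%")
    if swap_percent > swap_limit:
        warnings.append(f"swap usage {swap_percent:.0f}% > {swap_limit:.0f}%")
    return warnings
```

For a run of 41 masters, this cadence would report after masters 10, 20, 30 and 40.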
5. Memory Cleanup
- Forced garbage collection: Runs gc.collect() after processing
- Process isolation: Each master processed in separate process
- Resource cleanup: Proper cleanup of temporary files and objects
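One way to guarantee the cleanup steps above even when a task fails is a try/finally wrapper. The sketch below is illustrative only: run_with_cleanup and the scratch-directory convention are hypothetical and are not taken from hybrid_detector.py.

```python
import gc
import shutil
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_cleanup(task: Callable[[Path], T]) -> T:
    """Run task with a scratch directory, then remove it and collect garbage."""
    scratch = Path(tempfile.mkdtemp(prefix="master_"))
    try:
        return task(scratch)
    finally:
        # Runs on success and on error: delete temp files, then force a GC pass.
        shutil.rmtree(scratch, ignore_errors=True)
        gc.collect()
```

For the process isolation part, the standard library offers multiprocessing.Pool(maxtasksperchild=1), which replaces each worker process after a single task; whether the project achieves isolation this way is not stated here.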
Key Changes Made
hybrid_detector.py
- Added memory manager initialization
- Modified process_single_master_inlier_analysis() to limit features
- Updated detect_with_local_inlier_analysis() for dynamic worker adjustment
- Added memory monitoring during processing
- Added memory cleanup after processing
memory_manager.py (NEW)
- MemoryManager class for monitoring and control
- memory_safe_execution decorator
- reduce_feature_count() function for feature limiting
- Dynamic process count calculation
logging_config.py
- Enhanced crash logging with system memory details
- Added memory warning logging function
- Improved resource usage reporting
Memory Protection Features
Before Processing
- Check if memory usage is safe (< 75%)
- Wait for memory to return to safe levels if needed
- Dynamically adjust worker count based on available memory
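The "wait for memory to return to safe levels" step might look something like the polling loop below. This is a sketch under assumptions: wait_until_memory_safe, the timeout, and the polling interval are all hypothetical, and the memory reading is injected so the loop can be exercised without touching the real system.

```python
import time
from typing import Callable


def wait_until_memory_safe(read_memory_percent: Callable[[], float],
                           safe_percent: float = 75.0,
                           timeout_s: float = 60.0,
                           poll_s: float = 1.0,
                           sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until memory usage drops below safe_percent or the timeout expires.

    Returns True if memory became safe, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if read_memory_percent() < safe_percent:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)
```

Returning False on timeout leaves the caller free to decide whether to abort or to continue with fewer workers.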
During Processing
- Monitor memory usage every 10 completed masters
- Log warnings when memory > 80% or swap > 20%
- Limit features to prevent memory explosion
After Processing
- Force garbage collection to free memory
- Clean up temporary files and objects
- Log final memory usage
Expected Results
- No more crashes: Memory usage stays within safe limits
- Better performance: Reduced memory pressure = less swap usage
- Graceful degradation: Automatically reduces parallelism when needed
- Better monitoring: Real-time memory usage reporting
Usage
The fixes are automatically applied when using the hybrid detector. No changes needed to command line usage:
python cli.py --all --hybrid # Will now use memory-safe processing
Testing
Run the test suite to verify fixes:
python test_memory_fix.py
Memory Thresholds
- Maximum memory: 75% (was unlimited)
- Maximum swap: 30% (was unlimited)
- Feature limit: 10,000 per image (was unlimited)
- Dynamic workers: Based on feature count and memory availability