7.5 KiB
One-at-a-Time Cost Tracking Implementation
This document describes the implementation of detailed cost tracking for the one-at-a-time detection mode, which makes individual API calls for each master image.
Implementation Overview
The one-at-a-time mode now tracks the cost of each individual API call made to the OpenAI o3 model, providing detailed insights into the cost structure of this high-accuracy detection method.
Key Features Implemented
1. Token Usage Extraction in Multiprocessing
Each worker process now extracts token usage data from the OpenAI API response:
# In process_single_master_detection_openai()
token_usage_data = None
if hasattr(response, 'usage') and response.usage:
token_usage_data = {
'prompt_tokens': response.usage.prompt_tokens,
'completion_tokens': response.usage.completion_tokens,
'total_tokens': response.usage.total_tokens,
'cached_tokens': getattr(response.usage, 'cached_tokens', 0)
}
# Include in return value
result['token_usage'] = token_usage_data
2. Cost Tracking in Main Process
The main process collects token usage data from all worker processes and tracks costs:
# Track cost for this API call if token usage data is available
if 'token_usage' in result and result['token_usage']:
token_data = result['token_usage']
api_call_cost = cost_calculator.track_api_call(
operation_type="one_at_a_time_detection",
prompt_tokens=token_data['prompt_tokens'],
completion_tokens=token_data['completion_tokens'],
cached_tokens=token_data['cached_tokens'],
layout_name=layout_name,
master_id=master_id
)
3. Real-time Cost Progress
During processing, the system shows cost progress every 10 completed masters:
Processing 10/41 masters...
→ API call cost: $0.0234 (Running total: $0.2340)
Processing 20/41 masters...
→ API call cost: $0.0198 (Running total: $0.4538)
4. Detailed Cost Analysis
The final results include comprehensive cost information:
'analysis': 'Process-based one-at-a-time analysis completed. Made 41 separate API calls (one per master). Found 2 exact matches out of 41 masters checked using 8 concurrent processes.',
'api_calls_made': 41, # One API call per master
Cost Comparison: One-at-a-Time vs Hybrid Mode
One-at-a-Time Mode
- API Calls: 41 calls (one per master image)
- Typical Cost: $0.50 - $2.00 per layout
- Accuracy: Highest (individual comparison)
- Use Case: When maximum accuracy is required
Hybrid Mode
- API Calls: 1 call (panel counting + censorship)
- Typical Cost: $0.01 - $0.05 per layout
- Accuracy: Very good (local analysis for simple layouts)
- Use Case: Cost-efficient processing of large batches
Cost Savings
Hybrid mode provides approximately 95-98% cost savings compared to one-at-a-time mode while maintaining good accuracy for most layouts.
Usage Examples
Enable One-at-a-Time Cost Tracking
# Basic one-at-a-time with cost tracking
python cli.py --test --openai --one-at-a-time --enable-cost-tracking
# With detailed cost report
python cli.py --test --openai --one-at-a-time --enable-cost-tracking --cost-report
# With lower concurrency for better cost monitoring
python cli.py --test --openai --one-at-a-time --concurrent-workers 3 --enable-cost-tracking
Hybrid Mode with Fallback
# Hybrid mode with fallback to one-at-a-time when needed
python cli.py --test --hybrid --fallback-one-at-a-time --enable-cost-tracking
Cost Tracking Output
Session Summary
COST TRACKING SUMMARY
============================================================
Total cost: $1.2345
Total tokens: 45,678
- Input tokens: 28,456
- Output tokens: 17,222
- Cached tokens: 3,456
Total API calls: 41
Layouts processed: 1
Averages:
- Cost per layout: $1.2345
- Tokens per layout: 45,678.0
- API calls per layout: 41.0
- Cost per 1K tokens: $0.0270
Operation breakdown:
- one_at_a_time_detection: 41 calls
============================================================
Cost Report JSON
{
"session_summary": {
"operation_breakdown": {
"one_at_a_time_detection": 41
}
},
"detailed_api_calls": [
{
"operation_type": "one_at_a_time_detection",
"master_id": "1011A_1011_05",
"token_usage": {
"prompt_tokens": 1200,
"completion_tokens": 150,
"total_tokens": 1350,
"cached_tokens": 0
},
"total_cost": 0.0036,
"layout_name": "test_layout.jpg"
}
]
}
Integration with Hybrid Mode
The one-at-a-time cost tracking also works with the hybrid mode's fallback mechanism:
Hybrid Mode Fallback
When hybrid mode uses the fallback to one-at-a-time detection:
- Initial API call: Panel counting + censorship detection
- Fallback API calls: 41 individual master comparisons
- Total API calls: 42 (1 + 41)
- Cost tracking: Tracks both operation types separately
Example Hybrid Fallback Cost Breakdown
Operation breakdown:
- panel_counting_censorship: 1 call
- one_at_a_time_detection: 41 calls
Testing
Run the comprehensive test to see one-at-a-time cost tracking in action:
python test_one_at_a_time_cost_tracking.py
This test will:
- Run one-at-a-time mode with cost tracking
- Show real-time cost progress
- Generate detailed cost report
- Compare costs with hybrid mode
Technical Details
Multiprocessing Architecture
- Worker processes: Extract token usage from API responses
- Main process: Collects token data and tracks costs
- No shared state: Each process handles its own API calls
- Thread-safe: Cost tracking is done in the main process only
Error Handling
- Missing token data: Warns when API response lacks usage information
- API failures: Handles cases where individual API calls fail
- Graceful degradation: Cost tracking failure doesn't break processing
Performance Impact
- Minimal overhead: Token extraction adds negligible processing time
- Memory efficient: Token data is small and temporary
- No API rate impact: No additional API calls are made
Benefits
1. Detailed Cost Visibility
- See exact cost per master image comparison
- Identify expensive vs. cheap operations
- Track cost trends over time
2. Cost Optimization
- Compare one-at-a-time vs. hybrid mode costs
- Make informed decisions about detection method
- Optimize concurrent workers for cost efficiency
3. Budget Planning
- Accurate cost estimates for large batches
- Understand cost implications of different modes
- Set appropriate spending limits
4. Performance Analysis
- Correlate cost with accuracy
- Identify optimal worker counts
- Monitor API efficiency
Future Enhancements
Potential improvements include:
- Per-master cost optimization: Identify which masters are most expensive
- Dynamic worker adjustment: Reduce workers when costs are high
- Cost-based fallback: Use cost thresholds to decide between modes
- Master image prioritization: Process cheaper masters first
Conclusion
The one-at-a-time cost tracking implementation provides complete visibility into the cost structure of the most accurate detection method. Combined with hybrid mode cost tracking, users can make informed decisions about the trade-offs between accuracy and cost.
The implementation maintains the existing performance characteristics while adding comprehensive cost monitoring capabilities that help optimize both accuracy and budget.