220 lines
No EOL
8.3 KiB
Markdown
220 lines
No EOL
8.3 KiB
Markdown
# OpenAI API Cost Tracking Verification Report
|
|
|
|
## Executive Summary
|
|
|
|
✅ **All OpenAI API calls are properly instrumented with cost tracking**
|
|
|
|
After comprehensive code examination, I can confirm that **all 4 OpenAI API calls** in the codebase have been properly instrumented with token usage extraction and cost tracking.
|
|
|
|
## Complete API Call Inventory
|
|
|
|
### API Call 1: One-at-a-Time Detection (Multiprocessing)
|
|
- **Location**: `openai_detector.py:140`
|
|
- **Function**: `process_single_master_detection_openai()`
|
|
- **Operation Type**: `"one_at_a_time_detection"`
|
|
- **Cost Tracking**: ✅ **IMPLEMENTED**
|
|
- **Method**: Token usage extracted in worker process, cost tracked in main process
|
|
- **Usage**: Individual master image comparisons with multiprocessing
|
|
|
|
```python
|
|
# Line 140: API call in worker process
|
|
response = client.chat.completions.create(...)
|
|
|
|
# Lines 167-173: Token usage extraction
|
|
token_usage_data = {
|
|
'prompt_tokens': response.usage.prompt_tokens,
|
|
'completion_tokens': response.usage.completion_tokens,
|
|
'total_tokens': response.usage.total_tokens,
|
|
'cached_tokens': getattr(response.usage, 'cached_tokens', 0)
|
|
}
|
|
|
|
# Lines 617-626: Cost tracking in main process
|
|
cost_calculator.track_api_call(
|
|
operation_type="one_at_a_time_detection",
|
|
prompt_tokens=token_data['prompt_tokens'],
|
|
completion_tokens=token_data['completion_tokens'],
|
|
cached_tokens=token_data['cached_tokens'],
|
|
layout_name=layout_name,
|
|
master_id=master_id
|
|
)
|
|
```
|
|
|
|
### API Call 2: Regular Detection (Batch)
|
|
- **Location**: `openai_detector.py:424`
|
|
- **Function**: `make_robust_api_call()`
|
|
- **Operation Type**: `"detection"`
|
|
- **Cost Tracking**: ✅ **IMPLEMENTED**
|
|
- **Method**: Direct cost tracking in same process
|
|
- **Usage**: Batch comparison of all masters against layout
|
|
|
|
```python
|
|
# Line 424: API call
|
|
response = self.client.chat.completions.create(...)
|
|
|
|
# Lines 436-444: Cost tracking
|
|
if hasattr(response, 'usage') and response.usage:
|
|
token_usage = extract_token_usage_from_response(response)
|
|
cost_calculator.track_api_call(
|
|
operation_type="detection",
|
|
prompt_tokens=token_usage.prompt_tokens,
|
|
completion_tokens=token_usage.completion_tokens,
|
|
cached_tokens=token_usage.cached_tokens,
|
|
layout_name=operation_name
|
|
)
|
|
```
|
|
|
|
### API Call 3: Censorship Detection (Standalone)
|
|
- **Location**: `openai_detector.py:1012`
|
|
- **Function**: `detect_layout_censorship()`
|
|
- **Operation Type**: `"censorship_detection"`
|
|
- **Cost Tracking**: ✅ **IMPLEMENTED**
|
|
- **Method**: Direct cost tracking in same process
|
|
- **Usage**: Standalone censorship analysis
|
|
|
|
```python
|
|
# Line 1012: API call
|
|
response = self.client.chat.completions.create(...)
|
|
|
|
# Lines 1034-1041: Cost tracking
|
|
if hasattr(response, 'usage') and response.usage:
|
|
token_usage = extract_token_usage_from_response(response)
|
|
cost_calculator.track_api_call(
|
|
operation_type="censorship_detection",
|
|
prompt_tokens=token_usage.prompt_tokens,
|
|
completion_tokens=token_usage.completion_tokens,
|
|
cached_tokens=token_usage.cached_tokens,
|
|
layout_name=Path(layout_path).name
|
|
)
|
|
```
|
|
|
|
### API Call 4: Combined Panel Counting + Censorship
|
|
- **Location**: `openai_detector.py:1283`
|
|
- **Function**: `count_panels_and_detect_censorship()`
|
|
- **Operation Type**: `"panel_counting_censorship"`
|
|
- **Cost Tracking**: ✅ **IMPLEMENTED**
|
|
- **Method**: Direct cost tracking in same process
|
|
- **Usage**: Hybrid mode primary API call
|
|
|
|
```python
|
|
# Line 1283: API call
|
|
response = self.client.chat.completions.create(...)
|
|
|
|
# Lines 1304-1312: Cost tracking
|
|
if hasattr(response, 'usage') and response.usage:
|
|
token_usage = extract_token_usage_from_response(response)
|
|
cost_calculator.track_api_call(
|
|
operation_type="panel_counting_censorship",
|
|
prompt_tokens=token_usage.prompt_tokens,
|
|
completion_tokens=token_usage.completion_tokens,
|
|
cached_tokens=token_usage.cached_tokens,
|
|
layout_name=layout_name
|
|
)
|
|
```
|
|
|
|
## Cost Tracking Architecture
|
|
|
|
### Operation Types Tracked
|
|
1. **`one_at_a_time_detection`**: Individual master comparisons (41 calls per layout)
|
|
2. **`detection`**: Batch master comparisons (1 call per layout)
|
|
3. **`censorship_detection`**: Standalone censorship analysis (1 call per layout)
|
|
4. **`panel_counting_censorship`**: Combined analysis for hybrid mode (1 call per layout)
|
|
|
|
### Multiprocessing Handling
|
|
- **Worker processes**: Extract token usage data from API responses
|
|
- **Main process**: Collects token data and performs cost calculations
|
|
- **Thread-safe**: No shared state between processes
|
|
- **Error handling**: Graceful handling of missing token data
|
|
|
|
### Cost Tracking Features
|
|
- **Real-time tracking**: Cost calculated immediately after each API call
|
|
- **Per-layout breakdown**: Cost associated with specific layout files
|
|
- **Master-level granularity**: Individual costs for one-at-a-time mode
|
|
- **Session summaries**: Comprehensive cost reporting across all operations
|
|
|
|
## Verification Methods Used
|
|
|
|
### 1. **Code Search**
|
|
- Searched for all `client.chat.completions.create` calls
|
|
- Verified each call has corresponding cost tracking
|
|
- Confirmed no orphaned API calls exist
|
|
|
|
### 2. **Manual Code Review**
|
|
- Examined each API call location
|
|
- Verified token extraction implementation
|
|
- Confirmed cost tracking integration
|
|
|
|
### 3. **Architecture Analysis**
|
|
- Analyzed multiprocessing token data flow
|
|
- Verified main process cost collection
|
|
- Confirmed operation type categorization
|
|
|
|
## Cost Tracking Coverage Summary
|
|
|
|
| API Call Location | Function | Operation Type | Cost Tracking | Status |
|
|
|------------------|----------|----------------|---------------|---------|
|
|
| `openai_detector.py:140` | `process_single_master_detection_openai()` | `one_at_a_time_detection` | ✅ | Complete |
|
|
| `openai_detector.py:424` | `make_robust_api_call()` | `detection` | ✅ | Complete |
|
|
| `openai_detector.py:1012` | `detect_layout_censorship()` | `censorship_detection` | ✅ | Complete |
|
|
| `openai_detector.py:1283` | `count_panels_and_detect_censorship()` | `panel_counting_censorship` | ✅ | Complete |
|
|
|
|
## Usage Mode Coverage
|
|
|
|
### ✅ **OpenAI Mode (Regular)**
|
|
- **API Call**: `detection` (1 call per layout)
|
|
- **Cost Tracking**: Fully implemented
|
|
- **Usage**: `--openai`
|
|
|
|
### ✅ **OpenAI Mode (One-at-a-Time)**
|
|
- **API Call**: `one_at_a_time_detection` (41 calls per layout)
|
|
- **Cost Tracking**: Fully implemented with multiprocessing support
|
|
- **Usage**: `--openai --one-at-a-time`
|
|
|
|
### ✅ **Hybrid Mode**
|
|
- **API Call**: `panel_counting_censorship` (1 call per layout)
|
|
- **Cost Tracking**: Fully implemented
|
|
- **Usage**: `--hybrid`
|
|
|
|
### ✅ **Hybrid Mode with Fallback**
|
|
- **API Calls**: `panel_counting_censorship` + `one_at_a_time_detection` (1 + 41 calls)
|
|
- **Cost Tracking**: Both operation types tracked separately
|
|
- **Usage**: `--hybrid --fallback-one-at-a-time`
|
|
|
|
### ✅ **CEN Refinement**
|
|
- **API Call**: `censorship_detection` (additional call when needed)
|
|
- **Cost Tracking**: Fully implemented
|
|
- **Usage**: `--refinement-mode`
|
|
|
|
## Token Usage Data Captured
|
|
|
|
For each API call, the following token data is captured:
|
|
- **Prompt tokens**: Input tokens sent to the API
|
|
- **Completion tokens**: Output tokens generated by the API
|
|
- **Total tokens**: Sum of prompt and completion tokens
|
|
- **Cached tokens**: Tokens from cached input (if applicable)
|
|
|
|
## Cost Calculation
|
|
|
|
Using OpenAI o3 pricing:
|
|
- **Input tokens**: $2.00 per million tokens
|
|
- **Cached input**: $0.50 per million tokens
|
|
- **Output tokens**: $8.00 per million tokens
|
|
|
|
## Error Handling
|
|
|
|
All API calls include proper error handling for cost tracking:
|
|
- **Missing usage data**: Graceful handling when API response lacks token information
|
|
- **API failures**: Cost tracking doesn't interfere with error handling
|
|
- **Multiprocessing errors**: Worker process failures don't break cost tracking
|
|
|
|
## Testing Coverage
|
|
|
|
Cost tracking can be tested with:
|
|
- **Unit tests**: `test_cost_calculator.py`
|
|
- **Integration tests**: `test_cost_tracking_integration.py`
|
|
- **One-at-a-time tests**: `test_one_at_a_time_cost_tracking.py`
|
|
|
|
## Conclusion
|
|
|
|
✅ **VERIFICATION COMPLETE**: All OpenAI API calls in the codebase are properly instrumented with comprehensive cost tracking. The implementation covers all usage modes, operation types, and edge cases including multiprocessing and error handling.
|
|
|
|
The cost tracking system provides complete visibility into OpenAI API usage costs across all detection modes and operational scenarios. |