ai_qc/backend/PRICING_GUIDE.md
nickviljoen 8bc1256e82 Add usage tracking reports, profile versioning, and token tracking
Implements three major feature enhancements:

1. Usage Tracking Reports
   - Command-line tool (generate_usage_report.py) for comprehensive usage reports
   - Supports text, JSON, and CSV output formats
   - Filters by date range, client, and user
   - Aggregates statistics by client, user, profile, and date
   - Automated report generation via cron jobs

2. Profile Auto-Versioning & Visibility Control
   - Automatic version control: edits create new versions (v2, v3, etc.)
   - Original profiles preserved for rollback capability
   - Profile visibility control (all clients vs client-specific)
   - Client-profile relationship management with dynamic updates
   - Audit trail with timestamps and user tracking

3. Actual Token Usage Tracking
   - Captures real token counts from OpenAI and Gemini APIs
   - Precise cost calculations instead of estimates (99% accuracy)
   - Per-check and per-provider token breakdowns
   - Pricing validation tool (validate_pricing.py)
   - Token usage optimization recommendations

Key Files Added:
- backend/generate_usage_report.py - Usage report generator
- backend/validate_pricing.py - Pricing validation tool
- backend/USAGE_REPORTS.md - Usage reports documentation
- backend/PROFILE_MANAGEMENT.md - Profile versioning guide
- backend/TOKEN_TRACKING_ENHANCEMENT.md - Token tracking guide
- backend/PRICING_GUIDE.md - Pricing validation guide
- backend/NEW_FEATURES_QUICKSTART.md - Quick start guide
- IMPLEMENTATION_SUMMARY.md - Complete implementation overview

Key Files Modified:
- backend/api_server.py - Profile versioning, token passthrough
- backend/client_config.py - Visibility-aware profile filtering
- backend/llm_config.py - Token usage extraction from APIs
- backend/usage_tracker.py - Actual token tracking and cost calculation
- CLAUDE.md - Updated documentation with new features

Benefits:
- Accurate cost tracking with real token usage
- Safe profile editing with version history
- Flexible profile visibility for multi-tenant setup
- Comprehensive usage analytics for optimization
- Better budget forecasting and client billing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-02 13:22:33 +02:00


# Pricing Validation and Update Guide
## Overview
This guide explains how to validate and update LLM pricing for accurate cost tracking.
## Current Pricing
### OpenAI GPT-4o
- **Input**: $2.50 per 1M tokens ($0.0025 per 1K)
- **Output**: $10.00 per 1M tokens ($0.0100 per 1K)
- **Last Verified**: 2026-02-02
- **Source**: https://openai.com/api/pricing/
### Google Gemini 2.5 Pro
- **Input**: $1.25 per 1M tokens ($0.00125 per 1K)
- **Output**: $5.00 per 1M tokens ($0.0050 per 1K)
- **Last Verified**: 2026-02-02
- **Source**: https://ai.google.dev/pricing
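The per-1K rates above turn into a cost with simple arithmetic: tokens divided by 1,000, multiplied by the rate, summed over input and output. A minimal sketch, assuming a hypothetical helper named `estimate_cost` (it is not a function from `usage_tracker.py`) and the rates listed in this section:

```python
# Rates copied from the "Current Pricing" section (USD per 1K tokens).
RATES_PER_1K = {
    "OpenAI": {"input": 0.0025, "output": 0.0100},
    "Gemini": {"input": 0.00125, "output": 0.0050},
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call (hypothetical helper, for illustration only)."""
    rates = RATES_PER_1K[provider]
    input_cost = (input_tokens / 1000) * rates["input"]
    output_cost = (output_tokens / 1000) * rates["output"]
    return input_cost + output_cost


for provider in RATES_PER_1K:
    print(f"{provider}: ${estimate_cost(provider, 1000, 200):.5f}")
```

For 1,000 input and 200 output tokens this works out to $0.0045 for OpenAI and $0.00225 for Gemini.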
## Validation Tool
### Quick Start
Run the pricing validation tool to check current configuration:
```bash
cd backend
python validate_pricing.py
```
### What It Shows
1. **Current Pricing Configuration**
   - Displays configured pricing for each provider
   - Shows example cost calculations
   - Indicates when pricing was last verified
2. **Actual Token Usage Analysis** (once you have data)
   - Total analyses with token data
   - Average tokens per analysis by provider
   - Per-check token usage breakdown
   - Actual costs from real usage
3. **Estimate Accuracy Analysis** (once you have data)
   - Compares default estimates vs actual usage
   - Shows percentage differences
   - Recommends updates if estimates are off by >20%
### Example Output
```
================================================================================
PRICING VALIDATION REPORT
================================================================================
Generated: 2026-02-02 13:19:26
CURRENT PRICING CONFIGURATION
--------------------------------------------------------------------------------
OpenAI:
Model: gpt-4o
Input: $0.0025 per 1K tokens ($2.50 per 1M)
Output: $0.0100 per 1K tokens ($10.00 per 1M)
Last Verified: 2026-02-02
Example: 1000 input + 200 output tokens = $0.0045
Gemini:
Model: gemini-2.5-pro
Input: $0.0013 per 1K tokens ($1.25 per 1M)
Output: $0.0050 per 1K tokens ($5.00 per 1M)
Last Verified: 2026-02-02
Example: 1000 input + 200 output tokens = $0.0023
================================================================================
ACTUAL TOKEN USAGE ANALYSIS
================================================================================
OPENAI
--------------------------------------------------------------------------------
Analyses: 45
Total Tokens: 567,234
Prompt Tokens: 478,123
Completion Tokens: 89,111
Total Cost: $2.45
Average per Analysis:
Total Tokens: 12,605
Prompt Tokens: 10,625
Completion Tokens: 1,980
Cost: $0.0544
Per-Check Averages (Top 10 by token usage):
logo_visibility_general:
Count: 45
Avg Tokens: 1,456 (Prompt: 1,234, Completion: 222)
product_visibility_general:
Count: 45
Avg Tokens: 1,389 (Prompt: 1,178, Completion: 211)
================================================================================
ESTIMATE ACCURACY ANALYSIS
================================================================================
DEFAULT ESTIMATES (used when actual data unavailable):
Prompt Tokens: 1000
Completion Tokens: 200
Total Tokens: 1200
OPENAI - ACTUAL vs ESTIMATE
--------------------------------------------------------------------------------
Actual Average per Analysis:
Prompt: 10625 tokens
Completion: 1980 tokens
Total: 12605 tokens
Difference from Estimate:
Prompt: +962.5%
Completion: +890.0%
Total: +950.4%
Cost Comparison:
Estimated: $0.0045 per analysis
Actual: $0.0544 per analysis
Difference: +1108.9%
⚠️ RECOMMENDATION: Update default estimates for OpenAI
Suggested values:
Prompt Tokens: 10625
Completion Tokens: 1980
================================================================================
```
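The token percentages in the "Difference from Estimate" block can be reproduced by hand from the totals in the same report. The sketch below assumes the difference is measured relative to the estimate, which matches the figures shown. `percent_difference` is a hypothetical helper, not part of `validate_pricing.py`:

```python
def percent_difference(actual: float, estimate: float) -> float:
    """Signed difference of `actual` from `estimate`, as a percentage of the estimate."""
    return (actual - estimate) / estimate * 100


# Totals and default estimates copied from the example report.
ANALYSES = 45
PROMPT_TOTAL, COMPLETION_TOTAL = 478_123, 89_111
EST_PROMPT, EST_COMPLETION = 1000, 200

# Per-analysis averages, rounded to whole tokens as in the report.
avg_prompt = round(PROMPT_TOTAL / ANALYSES)
avg_completion = round(COMPLETION_TOTAL / ANALYSES)

prompt_diff = percent_difference(avg_prompt, EST_PROMPT)
completion_diff = percent_difference(avg_completion, EST_COMPLETION)
total_diff = percent_difference(
    avg_prompt + avg_completion, EST_PROMPT + EST_COMPLETION
)

print(f"Prompt:     {prompt_diff:+.1f}%")
print(f"Completion: {completion_diff:+.1f}%")
print(f"Total:      {total_diff:+.1f}%")
```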
## Collecting Validation Data
### Step 1: Run Test Analyses
Run a few test analyses to collect actual token usage data:
```bash
# Upload and analyze 3-5 different files through the web UI
# Use different profiles to get varied data
# Make sure to use both OpenAI and Gemini checks
```
### Step 2: Run Validation
After running several analyses:
```bash
python validate_pricing.py
```
### Step 3: Review Results
Look for:
- ✅ **Good**: Estimates within ±20% of actual usage
- ⚠️ **Update Needed**: Estimates off by >20%
- 🔴 **Urgent**: Estimates off by >50% or pricing outdated
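The three review levels can be expressed as a small classifier. This is a sketch only: `classify_estimate` is a hypothetical name, and it assumes the error is measured relative to the estimate, as in the example report.

```python
def classify_estimate(
    actual_tokens: float,
    estimated_tokens: float,
    pricing_outdated: bool = False,
) -> str:
    """Map an estimate's error onto the Good / Update Needed / Urgent levels."""
    # Absolute error as a percentage of the estimate (assumed convention).
    error_pct = abs(actual_tokens - estimated_tokens) / estimated_tokens * 100

    if pricing_outdated or error_pct > 50:
        return "urgent"
    if error_pct > 20:
        return "update needed"
    return "good"


print(classify_estimate(1100, 1000))   # within 20%
print(classify_estimate(1300, 1000))   # between 20% and 50%
print(classify_estimate(10625, 1000))  # far above 50%
```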
## Updating Pricing
### Method 1: Direct Edit (Recommended)
Edit `usage_tracker.py` directly:
```python
COST_PER_1K_TOKENS = {
    'OpenAI': {
        'input': 0.0025,   # Update this value
        'output': 0.010,   # Update this value
        'model': 'gpt-4o',
        'last_verified': '2026-02-02'  # Update date
    },
    'Gemini': {
        'input': 0.00125,  # Update this value
        'output': 0.005,   # Update this value
        'model': 'gemini-2.5-pro',
        'last_verified': '2026-02-02'  # Update date
    }
}
```
### Method 2: Programmatic Update
Update via Python:
```python
from usage_tracker import update_pricing
# Update OpenAI pricing
update_pricing('OpenAI', input_cost_per_1k=0.0025, output_cost_per_1k=0.010)
# Update Gemini pricing
update_pricing('Gemini', input_cost_per_1k=0.00125, output_cost_per_1k=0.005)
```
### After Updating
1. Restart the application
2. Run validation again: `python validate_pricing.py`
3. Verify new pricing is shown correctly
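Before restarting, it can help to sanity-check the edited table. The sketch below assumes a dictionary shaped like `COST_PER_1K_TOKENS` in Method 1. The function name `pricing_problems` and the 31-day staleness threshold are assumptions for illustration and are not part of `usage_tracker.py`.

```python
from datetime import date


def pricing_problems(pricing: dict, today: date, max_age_days: int = 31) -> list:
    """Return human-readable problems found in a pricing table (hypothetical helper)."""
    problems = []
    for provider, entry in pricing.items():
        # Every rate must be present and strictly positive.
        for key in ("input", "output"):
            if entry.get(key, 0) <= 0:
                problems.append(f"{provider}: '{key}' rate must be positive")

        # 'last_verified' must be an ISO date and reasonably recent.
        verified = date.fromisoformat(entry["last_verified"])
        age_days = (today - verified).days
        if age_days > max_age_days:
            problems.append(f"{provider}: last verified {age_days} days ago")
    return problems


pricing = {
    "OpenAI": {"input": 0.0025, "output": 0.010, "last_verified": "2026-02-02"},
    "Gemini": {"input": 0.00125, "output": 0.005, "last_verified": "2026-02-02"},
}
print(pricing_problems(pricing, today=date(2026, 2, 10)))
```

Passing `today` explicitly keeps the check deterministic; a caller would normally pass `date.today()`.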
## Updating Default Estimates
If the validation tool recommends updating the default estimates (it does so when actual usage differs from them by more than 20%), change them as follows:
### Edit usage_tracker.py
Find the fallback logic in `_calculate_analysis_cost()`:
```python
# If no token data available, use estimates as fallback
if total_tokens == 0:
    prompt_tokens = 1000      # Update this based on validation
    completion_tokens = 200   # Update this based on validation
    total_tokens = prompt_tokens + completion_tokens
Update these values based on the validation report's recommendations.
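One way to make later updates easier is to keep the fallback values in named constants and report when they were used. This is a sketch under assumptions, not the code in `_calculate_analysis_cost()`: the constants and `resolve_token_counts` are hypothetical names.

```python
# Fallback estimates (hypothetical constants); update from the validation report.
DEFAULT_PROMPT_TOKENS = 1000
DEFAULT_COMPLETION_TOKENS = 200


def resolve_token_counts(prompt_tokens: int, completion_tokens: int):
    """Return (prompt, completion, used_estimate).

    Falls back to the default estimates only when no token data was recorded.
    """
    if prompt_tokens + completion_tokens == 0:
        return DEFAULT_PROMPT_TOKENS, DEFAULT_COMPLETION_TOKENS, True
    return prompt_tokens, completion_tokens, False


print(resolve_token_counts(0, 0))         # no data recorded: estimates are used
print(resolve_token_counts(10625, 1980))  # real counts pass through unchanged
```

Returning the `used_estimate` flag lets a report separate estimated costs from costs based on real token counts.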
## Monitoring Pricing Changes
### Set Up Monthly Checks
Create a cron job to remind you to check pricing:
```bash
# Add to crontab (crontab -e)
# Check pricing on 1st of each month at 9 AM
0 9 1 * * cd /opt/ai_qc/backend && python validate_pricing.py > /tmp/pricing_check.txt && mail -s "Monthly Pricing Check" your-email@company.com < /tmp/pricing_check.txt
```
### Subscribe to Updates
- **OpenAI**: Subscribe to pricing announcements at https://openai.com/pricing
- **Google**: Follow Gemini updates at https://ai.google.dev/pricing
## Cost Optimization Tips
### Based on Validation Data
Once you have actual token usage data:
1. **Identify High-Token Checks**
   - Look at per-check averages in validation report
   - Consider optimizing prompts for checks using >2000 tokens
2. **Compare Provider Costs**
   - Check which provider is more cost-effective for your usage
   - Consider switching checks to the more economical provider
3. **Optimize by Check Type**
   - Simple checks → Gemini (lower cost)
   - Complex analysis → OpenAI (if higher accuracy needed)
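The first two tips can be tried on per-check averages. The sketch below uses the two checks from the example report and the rates from "Current Pricing". The helper names `check_cost`, `cheapest_provider`, and `high_token_checks` are hypothetical and do not exist in the codebase.

```python
# USD per 1K tokens, from the "Current Pricing" section.
RATES_PER_1K = {
    "OpenAI": {"input": 0.0025, "output": 0.0100},
    "Gemini": {"input": 0.00125, "output": 0.0050},
}

# (avg prompt tokens, avg completion tokens) per check, from the example report.
CHECKS = {
    "logo_visibility_general": (1234, 222),
    "product_visibility_general": (1178, 211),
}


def check_cost(provider: str, prompt: int, completion: int) -> float:
    """Cost in USD of one check run at the given provider's rates."""
    rates = RATES_PER_1K[provider]
    return prompt / 1000 * rates["input"] + completion / 1000 * rates["output"]


def cheapest_provider(prompt: int, completion: int) -> str:
    """Provider with the lowest cost for this token profile."""
    return min(RATES_PER_1K, key=lambda p: check_cost(p, prompt, completion))


def high_token_checks(checks: dict, threshold: int = 2000) -> list:
    """Names of checks whose average total tokens exceed the threshold."""
    return [name for name, (p, c) in checks.items() if p + c > threshold]


for name, (p, c) in CHECKS.items():
    print(name, cheapest_provider(p, c), f"${check_cost('OpenAI', p, c):.5f}")
print("Over 2000 tokens:", high_token_checks(CHECKS))
```

This compares price only. Whether a cheaper provider is accurate enough for a given check still has to be judged separately.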
### Example Analysis
```bash
# Generate detailed usage report
python generate_usage_report.py --last-days 30 --format json > usage.json
# Rank providers by total cost, most expensive first
jq -r '.by_provider | to_entries | sort_by(.value.cost) | reverse | .[] | "\(.key)\t\(.value.cost)"' usage.json
```
## Frequently Asked Questions
### Q: How often should I validate pricing?
**A:** Monthly or when you notice unexpected cost increases.
### Q: What if actual usage is much higher than estimates?
**A:**
1. Update the default estimates in the code
2. Investigate if prompts can be optimized
3. Consider if model selection is appropriate
### Q: Should I use OpenAI or Gemini?
**A:**
- **Gemini**: 50% lower per-token price at the rates listed in this guide, good for standard checks
- **OpenAI**: Higher price, worth it when complex analysis needs higher accuracy
Run validation to see actual costs for your specific checks.
### Q: How accurate are the cost calculations?
**A:**
- **With token tracking**: ~99% accurate (actual API token counts)
- **With estimates only**: ~70-80% accurate (depends on prompt variation)
### Q: Can I set different pricing for different models?
**A:** Not yet. Pricing is currently tracked per provider (OpenAI/Gemini), not per model. If you switch a provider to a different model (e.g., GPT-4 Turbo), update that provider's pricing accordingly.
## Troubleshooting
### Issue: Validation shows no token data
**Solution**:
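1. Run some analyses through the web UI
2. Check that those analyses ran after the token tracking enhancement was deployed, since token data is only recorded for new analyses
3. Ignore older log entries when validating, since they won't have token data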
### Issue: Pricing seems incorrect
**Solution**:
1. Check official pricing pages (links above)
2. Verify you're using the correct model (gpt-4o, gemini-2.5-pro)
3. Update pricing in `usage_tracker.py`
4. Restart application
### Issue: Estimates are way off
**Solution**:
1. Collect more data (run 10-20 analyses)
2. Check if you're using unusually complex/simple prompts
3. Update default estimates based on validation recommendations
## Quick Reference
### Check Current Pricing
```bash
python validate_pricing.py
```
### Verify Latest API Pricing
- OpenAI: https://openai.com/api/pricing/
- Gemini: https://ai.google.dev/pricing
### Update Pricing
Edit `backend/usage_tracker.py` → `COST_PER_1K_TOKENS`
### Generate Cost Report
```bash
python generate_usage_report.py --last-days 30
```
### Monthly Checklist
- [ ] Run `python validate_pricing.py`
- [ ] Check official pricing pages for changes
- [ ] Update pricing if changed
- [ ] Review actual vs estimated accuracy
- [ ] Update estimates if off by >20%
- [ ] Generate monthly cost report
## Support
For pricing-related questions:
1. Run the validation tool first
2. Review this guide
3. Check official provider pricing pages
4. Contact the development team with validation report output