ai_qc/backend/PRICING_GUIDE.md
nickviljoen 8bc1256e82 Add usage tracking reports, profile versioning, and token tracking
2026-02-02 13:22:33 +02:00

Pricing Validation and Update Guide

Overview

This guide explains how to validate and update LLM pricing for accurate cost tracking.

Current Pricing

OpenAI GPT-4o

  • Input: $2.50 per 1M tokens ($0.0025 per 1K)
  • Output: $10.00 per 1M tokens ($0.0100 per 1K)
  • Last Verified: 2026-02-02
  • Source: https://openai.com/api/pricing/

Google Gemini 2.5 Pro

  • Input: $1.25 per 1M tokens ($0.00125 per 1K)
  • Output: $5.00 per 1M tokens ($0.0050 per 1K)
  • Last Verified: 2026-02-02
  • Source: https://ai.google.dev/pricing
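As a worked example of how these rates turn into a per-request cost, the sketch below prices a request of 1,000 input and 200 output tokens for each provider. `PRICES_PER_1M` and `request_cost` are illustrative names chosen for this example; they are not taken from the project code.

```python
# Illustrative only: PRICES_PER_1M and request_cost are hypothetical names,
# filled in with the per-1M-token rates listed in this guide.
PRICES_PER_1M = {
    "OpenAI": {"input": 2.50, "output": 10.00},
    "Gemini": {"input": 1.25, "output": 5.00},
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-1M-token rates."""
    rates = PRICES_PER_1M[provider]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


for provider in PRICES_PER_1M:
    # Prints one line per provider, e.g. the cost of 1000 input + 200 output tokens.
    print(f"{provider}: ${request_cost(provider, 1000, 200):.5f}")
```

At these rates the same request costs $0.0045 on OpenAI and $0.00225 on Gemini.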

Validation Tool

Quick Start

Run the pricing validation tool to check current configuration:

cd backend
python validate_pricing.py

What It Shows

  1. Current Pricing Configuration

    • Displays configured pricing for each provider
    • Shows example cost calculations
    • Indicates when pricing was last verified
  2. Actual Token Usage Analysis (once you have data)

    • Total analyses with token data
    • Average tokens per analysis by provider
    • Per-check token usage breakdown
    • Actual costs from real usage
  3. Estimate Accuracy Analysis (once you have data)

    • Compares default estimates vs actual usage
    • Shows percentage differences
    • Recommends updates if estimates are off by >20%
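The percentage comparison in item 3 can be sketched as follows. This assumes the difference is measured relative to the estimate, which is consistent with the figures in the example output (an actual average of 10,625 prompt tokens against an estimate of 1,000 gives +962.5%). The function names are hypothetical and are not the tool's own.

```python
def percent_difference(actual: float, estimate: float) -> float:
    """Signed difference of actual from estimate, as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("estimate must be non-zero")
    return (actual - estimate) / estimate * 100.0


def needs_update(actual: float, estimate: float, threshold_pct: float = 20.0) -> bool:
    """True when the estimate is off by more than threshold_pct in either direction."""
    return abs(percent_difference(actual, estimate)) > threshold_pct


print(percent_difference(10625, 1000))  # prompt tokens from the example output
print(needs_update(10625, 1000))
```

Taking the absolute value means an estimate that is too high triggers the same recommendation as one that is too low.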

Example Output

================================================================================
PRICING VALIDATION REPORT
================================================================================
Generated: 2026-02-02 13:19:26

CURRENT PRICING CONFIGURATION
--------------------------------------------------------------------------------

OpenAI:
  Model: gpt-4o
  Input: $0.0025 per 1K tokens ($2.50 per 1M)
  Output: $0.0100 per 1K tokens ($10.00 per 1M)
  Last Verified: 2026-02-02
  Example: 1000 input + 200 output tokens = $0.0045

Gemini:
  Model: gemini-2.5-pro
  Input: $0.0013 per 1K tokens ($1.25 per 1M)
  Output: $0.0050 per 1K tokens ($5.00 per 1M)
  Last Verified: 2026-02-02
  Example: 1000 input + 200 output tokens = $0.0023

================================================================================
ACTUAL TOKEN USAGE ANALYSIS
================================================================================

OPENAI
--------------------------------------------------------------------------------
Analyses: 45
Total Tokens: 567,234
Prompt Tokens: 478,123
Completion Tokens: 89,111
Total Cost: $2.45

Average per Analysis:
  Total Tokens: 12,605
  Prompt Tokens: 10,625
  Completion Tokens: 1,980
  Cost: $0.0544

Per-Check Averages (Top 10 by token usage):
  logo_visibility_general:
    Count: 45
    Avg Tokens: 1,456 (Prompt: 1,234, Completion: 222)
  product_visibility_general:
    Count: 45
    Avg Tokens: 1,389 (Prompt: 1,178, Completion: 211)

================================================================================
ESTIMATE ACCURACY ANALYSIS
================================================================================

DEFAULT ESTIMATES (used when actual data unavailable):
  Prompt Tokens: 1000
  Completion Tokens: 200
  Total Tokens: 1200

OPENAI - ACTUAL vs ESTIMATE
--------------------------------------------------------------------------------
Actual Average per Analysis:
  Prompt: 10625 tokens
  Completion: 1980 tokens
  Total: 12605 tokens

Difference from Estimate:
  Prompt: +962.5%
  Completion: +890.0%
  Total: +950.4%

Cost Comparison:
  Estimated: $0.0045 per analysis
  Actual: $0.0544 per analysis
  Difference: +1108.9%

⚠️  RECOMMENDATION: Update default estimates for OpenAI
   Suggested values:
     Prompt Tokens: 10625
     Completion Tokens: 1980

================================================================================

Collecting Validation Data

Step 1: Run Test Analyses

Run a few test analyses to collect actual token usage data:

# Upload and analyze 3-5 different files through the web UI
# Use different profiles to get varied data
# Make sure to use both OpenAI and Gemini checks

Step 2: Run Validation

After running several analyses:

python validate_pricing.py

Step 3: Review Results

Look for:

  • ✅ Good: Estimates within ±20% of actual usage
  • ⚠️ Update Needed: Estimates off by more than 20%, up to 50%
  • 🔴 Urgent: Estimates off by more than 50%, or pricing outdated
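The three review categories can be expressed as a small classifier. This is a minimal sketch that uses the 20% and 50% thresholds from the list above; `classify_estimate` is a hypothetical helper and not part of validate_pricing.py.

```python
def classify_estimate(diff_pct: float, pricing_outdated: bool = False) -> str:
    """Map a signed percentage difference to one of the review categories."""
    magnitude = abs(diff_pct)
    if pricing_outdated or magnitude > 50.0:
        return "urgent"
    if magnitude > 20.0:
        return "update needed"
    return "good"


# A few sample differences, in percent.
for diff in (12.0, -35.0, 962.5):
    print(f"{diff:+.1f}% -> {classify_estimate(diff)}")
```

Checking the urgent condition first keeps the categories mutually exclusive, so a 60% miss is reported as urgent rather than as a plain update.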

Updating Pricing

Method 1: Manual Edit

Edit COST_PER_1K_TOKENS in usage_tracker.py directly:

COST_PER_1K_TOKENS = {
    'OpenAI': {
        'input': 0.0025,   # Update this value
        'output': 0.010,   # Update this value
        'model': 'gpt-4o',
        'last_verified': '2026-02-02'  # Update date
    },
    'Gemini': {
        'input': 0.00125,  # Update this value
        'output': 0.005,   # Update this value
        'model': 'gemini-2.5-pro',
        'last_verified': '2026-02-02'  # Update date
    }
}
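Because each entry carries a `last_verified` date, stale pricing can be detected mechanically. The sketch below is one possible check, assuming ISO-formatted dates as shown above. `stale_providers` and the 31-day window are assumptions made for illustration, not existing project behavior.

```python
from datetime import date

# Trimmed copy of the structure shown above, keeping only the fields this check reads.
COST_PER_1K_TOKENS = {
    "OpenAI": {"model": "gpt-4o", "last_verified": "2026-02-02"},
    "Gemini": {"model": "gemini-2.5-pro", "last_verified": "2026-02-02"},
}


def stale_providers(pricing: dict, today: date, max_age_days: int = 31) -> list[str]:
    """Return providers whose last_verified date is more than max_age_days old."""
    stale = []
    for provider, entry in pricing.items():
        verified = date.fromisoformat(entry["last_verified"])
        if (today - verified).days > max_age_days:
            stale.append(provider)
    return stale


print(stale_providers(COST_PER_1K_TOKENS, today=date.today()))
```

Passing `today` explicitly keeps the function easy to test with fixed dates.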

Method 2: Programmatic Update

Update via Python:

from usage_tracker import update_pricing

# Update OpenAI pricing
update_pricing('OpenAI', input_cost_per_1k=0.0025, output_cost_per_1k=0.010)

# Update Gemini pricing
update_pricing('Gemini', input_cost_per_1k=0.00125, output_cost_per_1k=0.005)

After Updating

  1. Restart the application
  2. Run validation again: python validate_pricing.py
  3. Verify new pricing is shown correctly

Updating Default Estimates

If the validation tool recommends updating estimates (when actual usage differs from the defaults by more than 20%):

Edit usage_tracker.py

Find the fallback logic in _calculate_analysis_cost():

# If no token data available, use estimates as fallback
if total_tokens == 0:
    prompt_tokens = 1000  # Update this based on validation
    completion_tokens = 200  # Update this based on validation
    total_tokens = prompt_tokens + completion_tokens

Update these values based on the validation report's recommendations.
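To show how the fallback values feed into a cost figure, here is a standalone sketch of the same idea. It is not the actual body of `_calculate_analysis_cost()`: the function name, signature, and constants below are assumptions made for illustration.

```python
# Hypothetical module-level defaults; in the guide these are inline literals.
DEFAULT_PROMPT_TOKENS = 1000
DEFAULT_COMPLETION_TOKENS = 200


def analysis_cost(prompt_tokens: int, completion_tokens: int,
                  input_per_1k: float, output_per_1k: float) -> float:
    """Cost of one analysis, using the default estimates when no token data exists."""
    if prompt_tokens + completion_tokens == 0:
        # No token data was recorded, so fall back to the estimates.
        prompt_tokens = DEFAULT_PROMPT_TOKENS
        completion_tokens = DEFAULT_COMPLETION_TOKENS
    return (prompt_tokens / 1000) * input_per_1k + (completion_tokens / 1000) * output_per_1k


# With no token data, the OpenAI rates yield the estimated per-analysis cost.
print(f"${analysis_cost(0, 0, 0.0025, 0.010):.4f}")
```

Keeping the defaults in named constants would make them a single place to edit when the validation report suggests new values.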

Monitoring Pricing Changes

Set Up Monthly Checks

Create a cron job to remind you to check pricing:

# Add to crontab (crontab -e)
# Check pricing on 1st of each month at 9 AM
0 9 1 * * cd /opt/ai_qc/backend && python validate_pricing.py > /tmp/pricing_check.txt && mail -s "Monthly Pricing Check" your-email@company.com < /tmp/pricing_check.txt

Subscribe to Updates

Cost Optimization Tips

Based on Validation Data

Once you have actual token usage data:

  1. Identify High-Token Checks

    • Look at per-check averages in validation report
    • Consider optimizing prompts for checks using >2000 tokens
  2. Compare Provider Costs

    • Check which provider is more cost-effective for your usage
    • Consider switching checks to the more economical provider
  3. Optimize by Check Type

    • Simple checks → Gemini (lower cost)
    • Complex analysis → OpenAI (if higher accuracy needed)
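Step 1 above can be automated once the per-check averages are available as a mapping. The sketch below flags checks whose average exceeds 2,000 tokens. The input shape and the `high_token_checks` helper are assumptions, and the third check in the sample data is invented for the example.

```python
def high_token_checks(avg_tokens_by_check: dict[str, float],
                      limit: int = 2000) -> list[tuple[str, float]]:
    """Return (check, avg_tokens) pairs above limit, highest usage first."""
    flagged = [(name, avg) for name, avg in avg_tokens_by_check.items() if avg > limit]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# First two values come from the example report; the last entry is made up.
sample = {
    "logo_visibility_general": 1456,
    "product_visibility_general": 1389,
    "hypothetical_long_check": 2750,
}
print(high_token_checks(sample))
```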

Example Analysis

# Generate detailed usage report
python generate_usage_report.py --last-days 30 --format json > usage.json

# Rank providers by cost, most expensive first
jq -r '.by_provider | to_entries | sort_by(-.value.cost) | .[] | "\(.key)\t\(.value.cost)"' usage.json
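The same ranking can be done in Python when jq is not available. This sketch assumes the JSON report has the shape the jq filter above implies: a `by_provider` object whose values carry a numeric `cost` field. The sample report is made-up data used only to make the example runnable.

```python
import json


def providers_by_cost(report: dict) -> list[tuple[str, float]]:
    """Return (provider, cost) pairs sorted from most to least expensive."""
    by_provider = report.get("by_provider", {})
    pairs = [(name, stats["cost"]) for name, stats in by_provider.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up sample standing in for the contents of usage.json.
SAMPLE_REPORT = json.loads(
    '{"by_provider": {"Gemini": {"cost": 0.80}, "OpenAI": {"cost": 2.45}}}'
)

for name, cost in providers_by_cost(SAMPLE_REPORT):
    print(f"{name}: ${cost:.2f}")
```

To run it against a real report, load the file with `json.load` and pass the result to `providers_by_cost` in place of `SAMPLE_REPORT`.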

Frequently Asked Questions

Q: How often should I validate pricing?

A: Monthly or when you notice unexpected cost increases.

Q: What if actual usage is much higher than estimates?

A:

  1. Update the default estimates in the code
  2. Investigate if prompts can be optimized
  3. Consider if model selection is appropriate

Q: Should I use OpenAI or Gemini?

A:

  • Gemini: 50% lower per-token price at the rates listed above, good for standard checks
  • OpenAI: Higher accuracy for complex analysis

Run validation to see actual costs for your specific checks.

Q: How accurate are the cost calculations?

A:

  • With token tracking: ~99% accurate (actual API token counts)
  • With estimates only: ~70-80% accurate (depends on prompt variation)

Q: Can I set different pricing for different models?

A: Not per model. Pricing is currently tracked per provider (OpenAI/Gemini), so if you switch to a different model (e.g., GPT-4 Turbo), update that provider's pricing entry accordingly.

Troubleshooting

Issue: Validation shows no token data

Solution:

  1. Run some analyses through the web UI
  2. Token tracking only works on NEW analyses (after the enhancement)
  3. Old log entries won't have token data

Issue: Pricing seems incorrect

Solution:

  1. Check official pricing pages (links above)
  2. Verify you're using the correct model (gpt-4o, gemini-2.5-pro)
  3. Update pricing in usage_tracker.py
  4. Restart application

Issue: Estimates are way off

Solution:

  1. Collect more data (run 10-20 analyses)
  2. Check if you're using unusually complex/simple prompts
  3. Update default estimates based on validation recommendations

Quick Reference

Check Current Pricing

python validate_pricing.py

Verify Latest API Pricing

  • OpenAI: https://openai.com/api/pricing/
  • Gemini: https://ai.google.dev/pricing

Update Pricing

Edit COST_PER_1K_TOKENS in backend/usage_tracker.py

Generate Cost Report

python generate_usage_report.py --last-days 30

Monthly Checklist

  • Run python validate_pricing.py
  • Check official pricing pages for changes
  • Update pricing if changed
  • Review actual vs estimated accuracy
  • Update estimates if off by >20%
  • Generate monthly cost report

Support

For pricing-related questions:

  1. Run the validation tool first
  2. Review this guide
  3. Check official provider pricing pages
  4. Contact the development team with validation report output