ai_qc/backend/USAGE_REPORTS.md
nickviljoen 8bc1256e82 Add usage tracking reports, profile versioning, and token tracking
Implements three major feature enhancements:

1. Usage Tracking Reports
   - Command-line tool (generate_usage_report.py) for comprehensive usage reports
   - Supports text, JSON, and CSV output formats
   - Filters by date range, client, and user
   - Aggregates statistics by client, user, profile, and date
   - Automated report generation via cron jobs

2. Profile Auto-Versioning & Visibility Control
   - Automatic version control: edits create new versions (v2, v3, etc.)
   - Original profiles preserved for rollback capability
   - Profile visibility control (all clients vs client-specific)
   - Client-profile relationship management with dynamic updates
   - Audit trail with timestamps and user tracking

3. Actual Token Usage Tracking
   - Captures real token counts from OpenAI and Gemini APIs
   - Precise cost calculations instead of estimates (99% accuracy)
   - Per-check and per-provider token breakdowns
   - Pricing validation tool (validate_pricing.py)
   - Token usage optimization recommendations

Key Files Added:
- backend/generate_usage_report.py - Usage report generator
- backend/validate_pricing.py - Pricing validation tool
- backend/USAGE_REPORTS.md - Usage reports documentation
- backend/PROFILE_MANAGEMENT.md - Profile versioning guide
- backend/TOKEN_TRACKING_ENHANCEMENT.md - Token tracking guide
- backend/PRICING_GUIDE.md - Pricing validation guide
- backend/NEW_FEATURES_QUICKSTART.md - Quick start guide
- IMPLEMENTATION_SUMMARY.md - Complete implementation overview

Key Files Modified:
- backend/api_server.py - Profile versioning, token passthrough
- backend/client_config.py - Visibility-aware profile filtering
- backend/llm_config.py - Token usage extraction from APIs
- backend/usage_tracker.py - Actual token tracking and cost calculation
- CLAUDE.md - Updated documentation with new features

Benefits:
- Accurate cost tracking with real token usage
- Safe profile editing with version history
- Flexible profile visibility for multi-tenant setup
- Comprehensive usage analytics for optimization
- Better budget forecasting and client billing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-02 13:22:33 +02:00

302 lines
7.3 KiB
Markdown

# Usage Reports Guide
This guide explains how to generate and interpret usage reports from the AI QC system.
## Quick Start
### Generate a Basic Report
```bash
# From the backend directory
cd backend
# Generate report for all time
python generate_usage_report.py
# Generate report for last 7 days
python generate_usage_report.py --last-days 7
# Generate report for last 30 days
python generate_usage_report.py --last-days 30
```
### Generate Reports by Date Range
```bash
# Specific date range
python generate_usage_report.py --start-date 2026-02-01 --end-date 2026-02-28
# From a specific date to today
python generate_usage_report.py --start-date 2026-02-01
```
### Filter by Client
```bash
# Only show data for a specific client
python generate_usage_report.py --client diageo --last-days 30
python generate_usage_report.py --client unilever --last-days 7
python generate_usage_report.py --client loreal --last-days 30
python generate_usage_report.py --client general --last-days 30
```
## Output Formats
### Text Format (Default)
Human-readable format with sections for summary, by client, by user, and by profile.
```bash
python generate_usage_report.py --format text
```
### JSON Format
Machine-readable JSON format for integration with other tools.
```bash
python generate_usage_report.py --format json --output report.json
```
### CSV Format
Spreadsheet-compatible format for analysis in Excel or Google Sheets.
```bash
python generate_usage_report.py --format csv --output report.csv
```
## Save Report to File
```bash
# Save text report
python generate_usage_report.py --last-days 30 --output monthly_report.txt
# Save JSON report
python generate_usage_report.py --last-days 30 --format json --output monthly_report.json
# Save CSV report
python generate_usage_report.py --last-days 30 --format csv --output monthly_report.csv
```
## Example Reports
### Example 1: Monthly Report for All Clients
```bash
python generate_usage_report.py --last-days 30 --output reports/$(date +%Y-%m)_monthly_report.txt
```
### Example 2: Weekly Report by Client
```bash
# Generate individual reports for each client
for client in diageo unilever loreal general; do
python generate_usage_report.py --client $client --last-days 7 \
--output reports/${client}_weekly_$(date +%Y%m%d).txt
done
```
### Example 3: JSON Report for API Integration
```bash
python generate_usage_report.py --last-days 30 --format json | curl -X POST \
-H "Content-Type: application/json" \
-d @- \
https://your-analytics-api.com/reports
```
## Report Sections
### Summary
- **Total Analyses**: Total number of files analyzed
- **Total QC Checks**: Total number of individual QC checks performed
- **Total Estimated Cost**: Estimated cost in USD based on LLM usage
- **Average Checks per Analysis**: Average number of checks per file
- **Average Cost per Analysis**: Average cost per file analyzed
### Usage by Client
For each client:
- Number of analyses
- Total QC checks performed
- Number of unique users
- Average quality score
- Estimated cost
- Top 5 most-used profiles
### Usage by User
For each user:
- Name and email
- Number of analyses
- Total QC checks
- Average quality score
- Estimated cost
- Breakdown of clients used
### Usage by Profile
For each QC profile:
- Number of times used
- Total checks performed
- Average quality score
- List of clients using this profile
### Usage by Date
- Daily breakdown showing:
- Number of analyses per day
- Cost per day
## Cost Estimation
Cost estimates are based on:
- **OpenAI GPT-4**: ~$2.50 per 1M input tokens, ~$10.00 per 1M output tokens
- **Google Gemini 2.5 Pro**: ~$1.25 per 1M input tokens, ~$5.00 per 1M output tokens
Average estimates per check:
- ~1000 input tokens (prompt + image description)
- ~200 output tokens (analysis results)
**Note**: These are estimates and may vary based on actual usage patterns.
## Automation
### Daily Report Cron Job
Add to your crontab (`crontab -e`):
```bash
# Generate daily report at 8 AM
0 8 * * * cd /opt/ai_qc/backend && python generate_usage_report.py --last-days 1 --output reports/daily_$(date +\%Y\%m\%d).txt
# Generate weekly report every Monday at 9 AM
0 9 * * 1 cd /opt/ai_qc/backend && python generate_usage_report.py --last-days 7 --output reports/weekly_$(date +\%Y\%m\%d).txt
# Generate monthly report on 1st of each month at 10 AM
0 10 1 * * cd /opt/ai_qc/backend && python generate_usage_report.py --last-days 30 --output reports/monthly_$(date +\%Y\%m).txt
```
### Email Report Script
Create a bash script to email reports:
```bash
#!/bin/bash
# email_report.sh
REPORT_DATE=$(date +%Y-%m-%d)
REPORT_FILE="/tmp/ai_qc_report_${REPORT_DATE}.txt"
# Generate report
cd /opt/ai_qc/backend
python generate_usage_report.py --last-days 7 --output "$REPORT_FILE"
# Email report
mail -s "AI QC Weekly Report - ${REPORT_DATE}" \
your-email@company.com < "$REPORT_FILE"
# Clean up
rm "$REPORT_FILE"
```
## Troubleshooting
### No Data Found
If you see "No usage data found", check:
1. Usage logs exist in `backend/usage_logs/`
2. Date range is correct
3. Client filter (if used) matches existing client IDs
### Permission Errors
Ensure the script has read access to the usage logs directory:
```bash
chmod +r backend/usage_logs/*.jsonl
```
### Import Errors
Make sure you're running from the backend directory:
```bash
cd backend
python generate_usage_report.py
```
## Advanced Usage
### Combine with Other Tools
```bash
# Count total analyses
python generate_usage_report.py --format json | jq '.total_analyses'
# Get top user by usage
python generate_usage_report.py --format json | jq -r '.by_user | to_entries | sort_by(.value.count) | reverse | .[0].value.email'
# Calculate total cost for specific date range
python generate_usage_report.py --start-date 2026-02-01 --end-date 2026-02-28 --format json | jq '.total_cost'
```
### Monitor Trends
```bash
# Generate monthly reports and compare
for month in 01 02 03; do
python generate_usage_report.py \
--start-date 2026-${month}-01 \
--end-date 2026-${month}-31 \
--format json \
--output reports/2026-${month}.json
done
# Analyze trends
jq -s '[.[] | {month: (.by_date | keys | min), total: .total_analyses, cost: .total_cost}]' reports/2026-*.json
```
## Report Examples
### Sample Text Report
```
================================================================================
AI QC USAGE REPORT
================================================================================
Generated: 2026-02-02 12:00:00
SUMMARY
--------------------------------------------------------------------------------
Total Analyses: 156
Total QC Checks: 1,560
Total Estimated Cost: $78.00 USD
Average Checks per Analysis: 10.0
Average Cost per Analysis: $0.5000 USD
USAGE BY CLIENT
--------------------------------------------------------------------------------
DIAGEO
Analyses: 45
QC Checks: 495
Unique Users: 3
Average Score: 87.5/100
Estimated Cost: $24.75 USD
Top Profiles:
• diageo_key_visual: 30 analyses
• diageo_packaging: 15 analyses
UNILEVER
Analyses: 38
QC Checks: 570
Unique Users: 2
Average Score: 89.2/100
Estimated Cost: $28.50 USD
Top Profiles:
• unilever_key_visual: 25 analyses
• unilever_packaging: 13 analyses
...
```
## Support
For issues or questions about usage reports:
1. Check this guide first
2. Review log files in `backend/usage_logs/`
3. Contact the development team