Compare commits


2 commits

Author SHA1 Message Date
nickviljoen
35672fee30 docs(hp): add CLAUDE_HP.md client doc + link from main CLAUDE.md
Documents the hp_copy_review profile, the Source Messaging Excel
reference-asset flow, the excel_processor pattern, cycle-1 shipped
state on v1.4.0, known limitations carried into cycle 2, and the
cycle 2/3 roadmap (Word/PPT processor, Box picker — deferred until
HP team feedback indicates they're needed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 13:28:17 +02:00
nickviljoen
a47be6f2fd feat(reports): surface est. cost + token usage on every per-analysis HTML
Standard QC reports (single + parallel) gain a 6th summary-grid card
"Tokens / Est. cost" showing aggregated token usage and a Gemini/OpenAI
priced estimate. Document-mode and document-diff reports gain a muted
footer line in the overall card matching the existing diff-report
"Tokens:" line.

New helpers in usage_tracker.py:
- estimate_cost(prompt, completion, provider) for single-call costs
- estimate_cost_for_checks(check_results) aggregating dict/list shapes

Cost rates already maintained in COST_PER_1K_TOKENS; no pricing change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 13:26:38 +02:00
6 changed files with 179 additions and 4 deletions


@@ -102,7 +102,7 @@ Profiles define check sets, weights, and LLM assignments. Profiles can be marked
| Honda | generic only | [CLAUDE_HONDA.md](CLAUDE_HONDA.md) |
| Rank | generic only | [CLAUDE_RANK.md](CLAUDE_RANK.md) |
| Google | generic only | _scope pending_ |
-| HP | generic only | _scope pending_ |
+| HP | `hp_copy_review` (1, copy review vs canonical Source Messaging) | [CLAUDE_HP.md](CLAUDE_HP.md) |
| Ferrero | generic only | _scope pending_ |
| General | generic only | [CLAUDE_GENERAL.md](CLAUDE_GENERAL.md) |

CLAUDE_HP.md Normal file

@@ -0,0 +1,108 @@
# HP Client Documentation
> Referenced from main CLAUDE.md. Detailed HP QC profile, Source Messaging reference-asset flow, Excel processor notes, and roadmap.
## Overview
HP QC is built around **copy review against canonical Source Messaging**. HP supplies an Excel workbook per product variant (e.g. Messi-Core, Messi-Mainstream) containing the approved KSPs, body copy, disclaimers, spec call-outs, and brand-name forms. The QC compares all visible copy on a marketing asset against that workbook and reports every discrepancy.
**Status (2026-05-17):** HP cycle 1 live on **prod (`v1.4.0`)** since 2026-05-17. Single profile (`hp_copy_review`), single check, Source Messaging Excel ingestion via `excel_processor.py`, structured-findings table in the report. Cycle 2 (Word/PPT processor) and cycle 3 (Box picker) deferred — scope only when HP team's first-week feedback indicates they need them.
## HP Profiles
### `hp_copy_review` — copy QC against Source Messaging (1 check)
`mode: asset`, `visibility: client_specific`, weight 10.0. Default profile for the HP client.
| Check | What it does | Weight |
|------|--------------|--------|
| `hp_copy_review` | Single Gemini 2.5 Pro call comparing every visible claim/headline/body/disclaimer/spec call-out/brand mention against the canonical Source Messaging Markdown. Returns 0–10 score + `findings[]` array (priority/category/quote/issue/suggested_fix/source_reference). | 10.0 |
If no Source Messaging reference asset is attached at QC time, the LLM is instructed to return score 0 with an explanatory message rather than grade blind. The score-zero rule is baked into the prompt, not enforced as a pre-LLM short-circuit — so it still costs one Gemini call.
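The response shape described in the table above could be modelled roughly as follows. This is a minimal sketch: the field names come from the `findings[]` description, but the `validate_response` helper, the priority label `high`, and all example values are illustrative assumptions rather than the actual implementation.

```python
from typing import Any, Dict, List

# Field names taken from the findings[] description in the check table.
FINDING_FIELDS = (
    "priority", "category", "quote", "issue", "suggested_fix", "source_reference",
)


def validate_response(response: Dict[str, Any]) -> List[Dict[str, str]]:
    """Hypothetical validator: check the 0-10 score and return normalised findings."""
    score = response.get("score")
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0 <= score <= 10:
        raise ValueError(f"score must be a number in 0-10, got {score!r}")
    findings = []
    for item in response.get("findings") or []:
        # Missing fields become empty strings so a renderer can rely on every key.
        findings.append({field: str(item.get(field, "")) for field in FINDING_FIELDS})
    return findings


# Illustrative response; the values are invented for the example.
example = {
    "score": 7,
    "findings": [
        {
            "priority": "high",  # assumed label; the real priority vocabulary is not documented here
            "category": "disclaimer",
            "quote": "Example claim text",
            "issue": "Footnote marker missing",
            "suggested_fix": "Add the numbered disclaimer from Source Messaging",
            "source_reference": "Disclaimers / Footnotes",
        }
    ],
}
normalised = validate_response(example)
```

Normalising every finding to the same six keys keeps a downstream table renderer free of per-field existence checks.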
## Source Messaging reference-asset flow
HP's workflow centres on uploading the product's Source Messaging `.xlsx` as a Reference Asset (Settings → Reference Assets → HP) **before** running QC. The pipeline:
1. **Upload** — user uploads `messi_core.xlsx` (or similar) via the Reference Assets UI.
2. **Dispatch** (`api_server.py`, `/api/brand_guidelines` POST handler) — the `.xlsx` extension triggers a two-step dispatch:
   - First tries `localization_processor.parse_localization_matrix` (preserves existing localization-matrix workflow for other clients).
   - Falls back to `excel_processor.process_excel_file` for HP Source Messaging (`asset_type='source_messaging'`).
3. **Extraction + summarisation** (`backend/excel_processor.py`) — openpyxl reads every sheet, Gemini 2.5 Pro summarises the raw cell content into structured Markdown under `brand_guidelines/files/{file_id}_summary.md`. Output sections:
   - `## Product / Variant`
   - `## Key Selling Points (KSPs)` — with ultra-short / short / medium / long variants
   - `## Disclaimers / Footnotes` — numbered, with anchor claim
   - `## Approved Brand and Product Names` — exact trademark glyphs (™, ®, ©)
   - `## Variant Notes / Watch-outs`
   - `## Verboten Phrasing`
4. **Injection at QC time** — `get_reference_asset_content()` in `api_server.py` reads the `summary_path` from the file record and prepends the Markdown summary to the `hp_copy_review` check prompt as `Source Messaging Summary (extracted from <filename>):`.
5. **LLM call** — Gemini 2.5 Pro evaluates the marketing asset against the canonical Source Messaging Markdown, returning a structured JSON response.
6. **Rendering** — the `findings[]` array renders as a priority-coloured table via `_render_findings_table()` (HTML-escaped, used by both HTML report generators).
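Steps 2 and 4 of the pipeline above can be sketched as two small functions. This is an illustration under stated assumptions, not the code in `api_server.py`: the parser and processor are passed in as callables so the block runs standalone, the `localization_matrix` asset type is a hypothetical label, and treating a parser exception as "not a matrix" is a guess about how the fallback is triggered.

```python
from typing import Callable, Optional, Tuple

Summary = Tuple[str, str]  # (summary_text, summary_path)


def dispatch_xlsx(
    file_path: str,
    file_id: str,
    parse_matrix: Callable[[str], Optional[dict]],
    process_excel: Callable[[str, str], Summary],
) -> dict:
    """Try the localization-matrix parser first, then fall back to the Excel processor."""
    try:
        matrix = parse_matrix(file_path)
    except Exception:
        matrix = None  # assumption: a parse failure means "not a localization matrix"
    if matrix:
        return {"asset_type": "localization_matrix", "matrix": matrix}  # hypothetical label
    _summary_text, summary_path = process_excel(file_path, file_id)
    return {"asset_type": "source_messaging", "summary_path": summary_path}


def inject_summary(check_prompt: str, summary_md: str, filename: str) -> str:
    """Prepend the Markdown summary under the header quoted in step 4."""
    header = f"Source Messaging Summary (extracted from {filename}):"
    return f"{header}\n{summary_md}\n\n{check_prompt}"


# Usage with stand-in callables in place of the real processors.
record = dispatch_xlsx(
    "messi_core.xlsx",
    "abc",
    parse_matrix=lambda path: None,
    process_excel=lambda path, fid: ("## Product / Variant", f"brand_guidelines/files/{fid}_summary.md"),
)
prompt = inject_summary("Review all visible copy.", "## Product / Variant", "messi_core.xlsx")
```

Trying the older parser first keeps the existing localization-matrix behaviour unchanged for other clients, at the cost of one extra parse attempt per HP upload.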
`backend/excel_processor.py` **never raises** — on extraction or summarisation failure it writes a degraded summary embedding the raw extracted text so the reference asset stays usable.
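The never-raises contract might look something like the sketch below. The function name, the injected `summarise` and `write` callables, and the wording of the degraded summary are all assumptions made so the example is self-contained; only the behaviour (fall back to the raw extracted text instead of raising) comes from the description above.

```python
from typing import Callable, Tuple


def summarise_never_raises(
    raw_text: str,
    summary_path: str,
    summarise: Callable[[str], str],
    write: Callable[[str, str], None],
) -> Tuple[str, str]:
    """Return (summary_text, summary_path); fall back to a degraded summary on failure."""
    try:
        summary = summarise(raw_text)
    except Exception as exc:  # broad on purpose: the caller must never see an exception
        summary = (
            "## Degraded summary\n"
            f"Summarisation failed ({type(exc).__name__}); raw extracted text follows.\n\n"
            f"{raw_text}"
        )
    try:
        write(summary_path, summary)
    except OSError:
        pass  # assumption: an unwritable path still returns the in-memory summary
    return summary, summary_path
```

With this shape, a failed LLM call still leaves a reference asset that the QC prompt can consume, just without the structured sections.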
## Excel processor pattern
`excel_processor.py` mirrors the established `pdf_processor.py` pattern:
| Aspect | PDF processor | Excel processor |
|---|---|---|
| Extraction | PyMuPDF (all pages text) | openpyxl (all sheets, tab-aligned) |
| LLM | Gemini 2.5 Pro | Gemini 2.5 Pro |
| Output | `{file_id}_summary.txt` (~2000-4000 words) | `{file_id}_summary.md` (structured sections) |
| Raw cap | n/a | 50K chars (truncation marker if exceeded) |
| Never raises | yes (degraded summary fallback) | yes (degraded summary fallback) |
Public entry point: `process_excel_file(file_path: str, file_id: str) -> Tuple[str, str]` returning `(summary_text, summary_path)`.
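The extraction column of the table (all sheets, tab-aligned, 50K raw cap with a truncation marker) can be illustrated without openpyxl by working on plain rows. This is a hedged sketch: `sheets_to_text`, the sheet banner format, and the marker text are hypothetical, and only the cap size and the tab-separated layout are taken from the table.

```python
from typing import Dict, List, Optional, Sequence

RAW_CAP_CHARS = 50_000  # the 50K raw cap from the table above
TRUNCATION_MARKER = "\n[... truncated ...]"  # hypothetical marker text


def sheets_to_text(
    sheets: Dict[str, List[Sequence[Optional[object]]]],
    cap: int = RAW_CAP_CHARS,
) -> str:
    """Flatten sheets into tab-separated text, one block per sheet, capped at `cap` chars."""
    blocks = []
    for name, rows in sheets.items():
        lines = [f"=== Sheet: {name} ==="]
        for row in rows:
            cells = ["" if cell is None else str(cell) for cell in row]
            if any(cells):  # skip rows that are entirely empty
                lines.append("\t".join(cells))
        blocks.append("\n".join(lines))
    text = "\n\n".join(blocks)
    if len(text) > cap:
        text = text[:cap] + TRUNCATION_MARKER
    return text


text = sheets_to_text({"KSPs": [["Claim", "Short"], [None, None], ["Battery", 20]]})
```

With openpyxl, the rows for each sheet could come from `worksheet.iter_rows(values_only=True)`, which yields tuples of cell values in the same shape this function expects.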
## Routing
- `client_config.py` maps HP with `default_profile: 'hp_copy_review'` and the visible profile list `['hp_copy_review', 'static_general', 'video_general']`.
- `get_client_from_profile()` maps any profile id starting with `hp_` to client `'hp'`. Without this branch, HP reports would land in `output-dev/general/` instead of `output-dev/hp/`; this was fixed in the v1.4.0 deploy on 2026-05-17.
- Media-plan integration: the standard `media_plan_processor.py` extracts `language` (case-insensitive column lookup) and surfaces it into the check prompt as `- Language: <value>`. HP's prompt is language-aware (UK English vs US English vs French etc.).
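The prefix routing and the case-insensitive `language` lookup described in the list above can be sketched as follows. Both function names are hypothetical stand-ins, and the `general` fallthrough is an assumption: the real `get_client_from_profile()` presumably has branches for other clients that are not shown here.

```python
from typing import Dict, Optional


def route_profile_to_client(profile_id: str) -> str:
    """Sketch: profile ids starting with `hp_` route to the `hp` client."""
    if profile_id.startswith("hp_"):
        return "hp"
    return "general"  # assumption: other client branches are omitted in this sketch


def language_prompt_line(media_plan_row: Dict[str, str]) -> Optional[str]:
    """Find a `language` column regardless of header case and format the prompt line."""
    for column, value in media_plan_row.items():
        if column.strip().lower() == "language" and value:
            return f"- Language: {value}"
    return None


output_dir = f"output-dev/{route_profile_to_client('hp_copy_review')}/"
line = language_prompt_line({"Asset": "banner_01", "LANGUAGE": "UK English"})
```

Matching on the `hp_` prefix rather than an explicit profile list means any future HP profile routes correctly as long as it follows the naming convention.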
## Cycle 1 — what shipped (v1.4.0, 2026-05-17)
- HP client config promoted from `_scope pending_` to real entry.
- New `hp_copy_review` profile (single weighted check, client-specific visibility).
- New `hp_copy_review` check (`backend/visual_qc_apps/hp_copy_review/app.py`) — single Gemini call, structured JSON findings output.
- New `backend/excel_processor.py` — openpyxl extraction + Gemini summarisation.
- `/api/brand_guidelines` POST dispatch: `.xlsx` tries localisation matrix first, falls back to excel_processor.
- `get_reference_asset_content` extended to read `.xlsx` `summary_path` and inject the Markdown.
- Both HTML report generators get a shared `_render_findings_table` helper rendering `findings` arrays as priority-coloured tables.
- Media-plan `language` column extracted case-insensitively, surfaced into prompt context.
- `get_client_from_profile` extended to route `hp_*` profiles to `output-dev/hp/`.
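A shared findings-table helper of the kind listed above might be sketched like this. It is not the real `_render_findings_table`: the function name, the markup, and the priority labels are assumptions, and the colour mapping is hypothetical (it borrows hex values that appear elsewhere in the report code). The only behaviours taken from the document are HTML-escaping every value and colouring rows by priority.

```python
import html
from typing import Dict, List

# Hypothetical priority-to-colour mapping; the real palette is not given in this document.
PRIORITY_COLOURS = {"high": "#dc3545", "medium": "#ffc107", "low": "#28a745"}
COLUMNS = ("priority", "category", "quote", "issue", "suggested_fix", "source_reference")


def render_findings_table(findings: List[Dict[str, str]]) -> str:
    """Render findings as an HTML table, escaping every cell value."""
    if not findings:
        return ""
    header = "".join(f"<th>{html.escape(col)}</th>" for col in COLUMNS)
    rows = []
    for finding in findings:
        priority = str(finding.get("priority", "")).lower()
        colour = PRIORITY_COLOURS.get(priority, "#6c757d")  # grey for unknown priorities
        cells = "".join(
            f"<td>{html.escape(str(finding.get(col, '')))}</td>" for col in COLUMNS
        )
        rows.append(f"<tr style='border-left: 4px solid {colour};'>{cells}</tr>")
    return f"<table><thead><tr>{header}</tr></thead><tbody>{''.join(rows)}</tbody></table>"


table_html = render_findings_table([{"priority": "High", "quote": "<b>HP®</b>"}])
```

Escaping matters here because `quote` carries text lifted from the asset and the LLM response, either of which could contain markup.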
## Known limitations carried into cycle 2
1. **Singular reference-asset selection.** A user can attach only **one** Source Messaging file per QC run. If an asset shows both Messi-Core and Messi-Mainstream variants side by side, the user has to pick one variant to grade against. Multi-reference plumbing is a future enhancement.
2. **No pre-LLM short-circuit when no Source Messaging is attached.** The "score 0" rule lives in the prompt, not in the dispatch layer, so a Source-less run still costs one Gemini call (~$0.01).
3. **Two cosmetic display issues** in the report metadata strip (pre-existing patterns, not HP-specific):
   - `Weight: 1000.0%` — weight × 100 percentage formatting bug, harmless.
   - `Reference Asset: Not required` — shows even when a reference *was* used; mismatch between metadata label and `reference_asset_used` JSON field.
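The `Weight: 1000.0%` display is consistent with a raw weight of 10.0 being multiplied by 100 as if it were a fraction. The sketch below reproduces that arithmetic and shows one possible correction; both function names are hypothetical, the real formatting code is not shown in this document, and normalising by the profile's total weight is an assumption about the intended meaning.

```python
def format_weight_naive(weight: float) -> str:
    """Reproduces the reported display: treats the raw weight as a fraction."""
    return f"Weight: {weight * 100:.1f}%"


def format_weight_share(weight: float, total_weight: float) -> str:
    """One possible fix (an assumption): show the check's share of the profile's total weight."""
    if total_weight <= 0:
        return "Weight: n/a"
    return f"Weight: {weight / total_weight * 100:.1f}%"


reported = format_weight_naive(10.0)        # "Weight: 1000.0%"
corrected = format_weight_share(10.0, 10.0)  # "Weight: 100.0%" for a single-check profile
```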
## Cycle 2 (deferred) — Word / PPT processor
Mirror the `excel_processor` pattern for `.docx` and `.pptx` reference assets. Same shape:
- python-docx / python-pptx extraction
- Gemini 2.5 Pro summarisation
- `{file_id}_summary.md` output
- Same `/api/brand_guidelines` dispatch fallback
Scope only when HP team's feedback indicates they upload Word or PowerPoint Source Messaging in practice.
## Cycle 3 (deferred) — Box picker
Browse the HP-shared Box folder tree from the UI, multi-select files, run QC against the selected asset(s). Builds on the existing Box JWT scaffolding from L'Oreal Phase 4 (see `backend/box_jwt_client.py`, `backend/BOX_CLIENT_ONBOARDING.md`).
Scope only when HP team's feedback indicates Box is the canonical asset store for their workflow.
## Key files
- `backend/profiles/hp_copy_review.json` — profile config (single check, weight 10, Gemini)
- `backend/visual_qc_apps/hp_copy_review/app.py` — check implementation + prompt template + standalone `build_prompt()` helper
- `backend/excel_processor.py` — Source Messaging Excel ingestion
- `backend/client_config.py` — HP client entry, `get_client_from_profile()` routing
- `backend/api_server.py` — `/api/brand_guidelines` dispatch, `get_reference_asset_content` xlsx branch, `_render_findings_table` helper
- `docs/superpowers/specs/2026-05-17-hp-cycle-1-onboarding-design.md` — Cycle 1 design spec
- `docs/superpowers/plans/2026-05-17-hp-cycle-1-onboarding.md` — Cycle 1 implementation plan


@@ -950,17 +950,20 @@ def save_results_to_file(report_data, filename, output_mode='html', session_id=N
def generate_html_content(report_data, filename, file_path=None):
    """Generate HTML content for report data with expandable sections"""
    from usage_tracker import estimate_cost_for_checks
    # Define a function to get color based on score
    def get_score_result(score):
        if score >= 6:
            return "Pass", "#28a745"  # Green for pass
        else:
            return "Fail", "#dc3545"  # Red for fail
    # Get reference asset information from profile selection
    profile_selection = report_data.get('profile_selection', {})
    reference_asset = profile_selection.get('reference_asset', None)
    reference_asset_used = profile_selection.get('reference_asset_used', False)
    total_tokens, total_cost = estimate_cost_for_checks(report_data.get('results') or {})
    # Build HTML for each check result with expandable sections
    check_results_html = ""
@@ -1225,6 +1228,10 @@ def generate_html_content(report_data, filename, file_path=None):
<div style="font-size: 1.2em; color: {'#28a745' if reference_asset_used else '#6c757d'}; font-weight: bold;">{'✅ Used' if reference_asset_used else ' None'}</div>
<div>Reference Asset</div>
</div>
<div class="summary-item">
<div style="font-size: 1.2em; font-weight: bold; color: #495057;">{total_tokens:,} / ${total_cost:.2f}</div>
<div>Tokens / Est. cost</div>
</div>
</div>
</div>
@@ -1376,11 +1383,13 @@ def _render_technical_section_html(report):
def generate_comprehensive_html_report(analysis_result, filename, file_path=None):
    """Generate comprehensive HTML report similar to the web UI format"""
    from usage_tracker import estimate_cost_for_checks
    summary = analysis_result.get('summary', {})
    qc_analysis = analysis_result.get('qc_analysis', {})
    profile_selection = analysis_result.get('profile_selection', {})
    check_results = qc_analysis.get('check_results', {})
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    overall_score = summary.get('overall_score', 0)
    profile_name = profile_selection.get('suggested_profile', 'Unknown Profile')
@@ -1388,6 +1397,7 @@ def generate_comprehensive_html_report(analysis_result, filename, file_path=None
    completed_checks = qc_analysis.get('completed_checks', 0)
    reference_asset = profile_selection.get('reference_asset', None)
    reference_asset_used = profile_selection.get('reference_asset_used', False)
    total_tokens, total_cost = estimate_cost_for_checks(check_results)
    # Generate check results HTML
    check_results_html = ''
@@ -1539,6 +1549,10 @@ def generate_comprehensive_html_report(analysis_result, filename, file_path=None
<div style="font-size: 1.2em; color: {'#28a745' if reference_asset_used else '#6c757d'}; font-weight: bold;">{'✅ Used' if reference_asset_used else ' None'}</div>
<div>Reference Asset</div>
</div>
<div class="summary-item">
<div style="font-size: 1.2em; font-weight: bold; color: #495057;">{total_tokens:,} / ${total_cost:.2f}</div>
<div>Tokens / Est. cost</div>
</div>
</div>
</div>


@@ -16,6 +16,8 @@ import json
import os
from typing import Dict, List, Optional
from usage_tracker import estimate_cost
def _slug(name: str) -> str:
    base = os.path.splitext(os.path.basename(name))[0]
@@ -321,7 +323,7 @@ def _render_html(result: Dict) -> str:
<span class='grade-badge'>{html.escape(grade)}</span>
</div>
<div class='cost-line muted'>
-Tokens: {result.get('token_usage', {}).get('total_tokens', 0):,}
+Tokens: {result.get('token_usage', {}).get('total_tokens', 0):,} &middot; Est. cost: ${estimate_cost(result.get('token_usage', {}).get('prompt_tokens', 0), result.get('token_usage', {}).get('completion_tokens', 0)):.2f}
</div>
</div>


@@ -17,6 +17,8 @@ import json
import os
from typing import Dict, List, Optional
from usage_tracker import estimate_cost_for_checks
def _slugify_filename(name: str) -> str:
    base = os.path.splitext(os.path.basename(name))[0]
@@ -511,6 +513,7 @@ def _render_html(result: Dict, original_filename: str) -> str:
    check_results = result.get('check_results', {})
    pages = result.get('pages', [])
    fonts_inventory = (result.get('ingest_metadata') or {}).get('fonts_inventory', [])
    total_tokens, total_cost = estimate_cost_for_checks(check_results)
    truncated_banner = ''
    if result.get('truncated'):
@@ -653,6 +656,9 @@ def _render_html(result: Dict, original_filename: str) -> str:
<div>
<span class='grade-badge grade-{grade}'>{grade}</span>
</div>
<div style='font-size:12px;color:#666;margin-left:auto;'>
Tokens: {total_tokens:,} &middot; Est. cost: ${total_cost:.2f}
</div>
</div>
<h2>Findings at a glance</h2>


@@ -198,6 +198,51 @@ def log_access_request(entry):
    return log_entry


def estimate_cost(prompt_tokens, completion_tokens, provider='Gemini'):
    """Estimated USD cost for a single LLM call given its token usage.

    Used by per-analysis report renderers (diff, document, standard) to
    surface a cost figure on the downloaded HTML alongside the existing
    score / grade summary.
    """
    pricing = COST_PER_1K_TOKENS.get(provider)
    if not pricing:
        return 0.0
    return (prompt_tokens / 1000) * pricing['input'] + \
           (completion_tokens / 1000) * pricing['output']


def estimate_cost_for_checks(check_results):
    """Aggregate (total_tokens, total_cost_usd) across a dict-or-list of
    per-check result dicts. Each dict is expected to carry a `token_usage`
    sub-dict with prompt_tokens / completion_tokens / total_tokens, and
    optionally `model_used.provider` (defaults to 'Gemini').

    Deterministic checks (no LLM, no token_usage) contribute 0/0 and are
    silently skipped.
    """
    if isinstance(check_results, dict):
        check_results = check_results.values()
    total_tokens = 0
    total_cost = 0.0
    for cr in check_results:
        if not isinstance(cr, dict):
            continue
        tu = cr.get('token_usage') or {}
        prompt = tu.get('prompt_tokens') or 0
        completion = tu.get('completion_tokens') or 0
        total_tokens += tu.get('total_tokens') or (prompt + completion)
        model_used = cr.get('model_used') or {}
        provider = (
            model_used.get('provider', 'Gemini')
            if isinstance(model_used, dict)
            else 'Gemini'
        )
        total_cost += estimate_cost(prompt, completion, provider)
    return total_tokens, total_cost
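
A worked example may help show how the two helpers combine into the "Tokens / Est. cost" card. The sketch below repeats compact copies of both functions so it runs standalone. The rates in `COST_PER_1K_TOKENS` are placeholder numbers chosen for easy arithmetic (the real table is not shown in this diff), and the check names and token counts are invented.

```python
# Hypothetical rates in USD per 1K tokens; the real COST_PER_1K_TOKENS values are not shown here.
COST_PER_1K_TOKENS = {
    'Gemini': {'input': 0.001, 'output': 0.004},
    'OpenAI': {'input': 0.002, 'output': 0.006},
}


def estimate_cost(prompt_tokens, completion_tokens, provider='Gemini'):
    pricing = COST_PER_1K_TOKENS.get(provider)
    if not pricing:
        return 0.0
    return (prompt_tokens / 1000) * pricing['input'] + (completion_tokens / 1000) * pricing['output']


def estimate_cost_for_checks(check_results):
    if isinstance(check_results, dict):
        check_results = check_results.values()
    total_tokens, total_cost = 0, 0.0
    for cr in check_results:
        if not isinstance(cr, dict):
            continue
        tu = cr.get('token_usage') or {}
        prompt = tu.get('prompt_tokens') or 0
        completion = tu.get('completion_tokens') or 0
        total_tokens += tu.get('total_tokens') or (prompt + completion)
        model_used = cr.get('model_used') or {}
        provider = model_used.get('provider', 'Gemini') if isinstance(model_used, dict) else 'Gemini'
        total_cost += estimate_cost(prompt, completion, provider)
    return total_tokens, total_cost


checks = {
    # 4.0 * 0.001 + 0.5 * 0.004 = 0.006
    'copy_review': {'token_usage': {'prompt_tokens': 4000, 'completion_tokens': 500, 'total_tokens': 4500}},
    # Deterministic check: no token_usage, contributes 0 tokens and 0 cost.
    'file_size': {'score': 10},
    # No total_tokens, so the sum of prompt + completion is used; 1.0 * 0.002 + 1.0 * 0.006 = 0.008
    'tone': {
        'token_usage': {'prompt_tokens': 1000, 'completion_tokens': 1000},
        'model_used': {'provider': 'OpenAI'},
    },
}
total_tokens, total_cost = estimate_cost_for_checks(checks)
summary_card = f"{total_tokens:,} / ${total_cost:.2f}"  # same format string as the summary-grid card
```

Two properties follow directly from the code: the provider lookup is a case-sensitive `dict.get`, so a provider string such as `'gemini'` is priced at 0.0 rather than raising, and the two-decimal format rounds sub-cent totals down to `$0.00`.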
def _calculate_analysis_cost(results):
    """
    Calculate cost based on actual token usage from LLM responses