docs(hp): add CLAUDE_HP.md client doc + link from main CLAUDE.md

Documents the hp_copy_review profile, the Source Messaging Excel reference-asset flow, the excel_processor pattern, cycle-1 shipped state on v1.4.0, known limitations carried into cycle 2, and the cycle 2/3 roadmap (Word/PPT processor and Box picker, both deferred until HP team feedback indicates they're needed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(reports): surface est. cost + token usage on every per-analysis HTML
2026-05-19 13:28:17 +02:00 · 2026-05-19 13:26:38 +02:00
6 changed files with 179 additions and 4 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -102,7 +102,7 @@ Profiles define check sets, weights, and LLM assignments. Profiles can be marked
 | Honda | generic only | [CLAUDE_HONDA.md](CLAUDE_HONDA.md) |
 | Rank | generic only | [CLAUDE_RANK.md](CLAUDE_RANK.md) |
 | Google | generic only | _scope pending_ |
-| HP | generic only | _scope pending_ |
+| HP | `hp_copy_review` (1, copy review vs canonical Source Messaging) | [CLAUDE_HP.md](CLAUDE_HP.md) |
 | Ferrero | generic only | _scope pending_ |
 | General | generic only | [CLAUDE_GENERAL.md](CLAUDE_GENERAL.md) |

--- a/CLAUDE_HP.md
+++ b/CLAUDE_HP.md
@@ -0,0 +1,108 @@
+# HP Client Documentation
+
+> Referenced from main CLAUDE.md. Detailed HP QC profile, Source Messaging reference-asset flow, Excel processor notes, and roadmap.
+
+## Overview
+
+HP QC is built around **copy review against canonical Source Messaging**. HP supplies an Excel workbook per product variant (e.g. Messi-Core, Messi-Mainstream) containing the approved KSPs, body copy, disclaimers, spec call-outs, and brand-name forms. The QC compares all visible copy on a marketing asset against that workbook and reports every discrepancy.
+
+**Status (2026-05-17):** HP cycle 1 has been live on **prod (`v1.4.0`)** since 2026-05-17. It ships a single profile (`hp_copy_review`) with a single check, Source Messaging Excel ingestion via `excel_processor.py`, and a structured-findings table in the report. Cycle 2 (Word/PPT processor) and cycle 3 (Box picker) are deferred; scope them only when the HP team's first-week feedback indicates they are needed.
+
+## HP Profiles
+
+### `hp_copy_review` — copy QC against Source Messaging (1 check)
+
+`mode: asset`, `visibility: client_specific`, weight 10.0. Default profile for the HP client.
+
+| Check | What it does | Weight |
+|------|--------------|--------|
+| `hp_copy_review` | Single Gemini 2.5 Pro call comparing every visible claim/headline/body/disclaimer/spec call-out/brand mention against the canonical Source Messaging Markdown. Returns 0–10 score + `findings[]` array (priority/category/quote/issue/suggested_fix/source_reference). | 10.0 |
+
+If no Source Messaging reference asset is attached at QC time, the LLM is instructed to return score 0 with an explanatory message rather than grade blind. The score-zero rule is baked into the prompt, not enforced as a pre-LLM short-circuit — so it still costs one Gemini call.
+
+## Source Messaging reference-asset flow
+
+HP's workflow centres on uploading the product's Source Messaging `.xlsx` as a Reference Asset (Settings → Reference Assets → HP) **before** running QC. The pipeline:
+
+1. **Upload** — user uploads `messi_core.xlsx` (or similar) via the Reference Assets UI.
+2. **Dispatch** (`api_server.py`, `/api/brand_guidelines` POST handler) — the `.xlsx` extension triggers a two-step dispatch:
+   - First tries `localization_processor.parse_localization_matrix` (preserves existing localization-matrix workflow for other clients).
+   - Falls back to `excel_processor.process_excel_file` for HP Source Messaging (`asset_type='source_messaging'`).
+3. **Extraction + summarisation** (`backend/excel_processor.py`) — openpyxl reads every sheet, Gemini 2.5 Pro summarises the raw cell content into structured Markdown under `brand_guidelines/files/{file_id}_summary.md`. Output sections:
+   - `## Product / Variant`
+   - `## Key Selling Points (KSPs)` — with ultra-short / short / medium / long variants
+   - `## Disclaimers / Footnotes` — numbered, with anchor claim
+   - `## Approved Brand and Product Names` — exact trademark glyphs (™, ®, ©)
+   - `## Variant Notes / Watch-outs`
+   - `## Verboten Phrasing`
+4. **Injection at QC time** — `get_reference_asset_content()` in `api_server.py` reads the `summary_path` from the file record and prepends the Markdown summary to the `hp_copy_review` check prompt as `Source Messaging Summary (extracted from <filename>):`.
+5. **LLM call** — Gemini 2.5 Pro evaluates the marketing asset against the canonical Source Messaging Markdown, returning a structured JSON response.
+6. **Rendering** — the `findings[]` array renders as a priority-coloured table via `_render_findings_table()` (HTML-escaped, used by both HTML report generators).
+
+`backend/excel_processor.py` **never raises** — on extraction or summarisation failure it writes a degraded summary embedding the raw extracted text so the reference asset stays usable.
+
+## Excel processor pattern
+
+`excel_processor.py` mirrors the established `pdf_processor.py` pattern:
+
+| Aspect | PDF processor | Excel processor |
+|---|---|---|
+| Extraction | PyMuPDF (all pages text) | openpyxl (all sheets, tab-aligned) |
+| LLM | Gemini 2.5 Pro | Gemini 2.5 Pro |
+| Output | `{file_id}_summary.txt` (~2000-4000 words) | `{file_id}_summary.md` (structured sections) |
+| Raw cap | n/a | 50K chars (truncation marker if exceeded) |
+| Never raises | yes (degraded summary fallback) | yes (degraded summary fallback) |
+
+Public entry point: `process_excel_file(file_path: str, file_id: str) -> Tuple[str, str]` returning `(summary_text, summary_path)`.
+
+## Routing
+
+- `client_config.py` maps HP with `default_profile: 'hp_copy_review'` and the visible profile list `['hp_copy_review', 'static_general', 'video_general']`.
+- `get_client_from_profile()` maps any profile id starting with `hp_` to client `'hp'`. Without this branch, HP reports would land in `output-dev/general/` instead of `output-dev/hp/`; this was fixed in the v1.4.0 deploy on 2026-05-17.
+- Media-plan integration: the standard `media_plan_processor.py` extracts `language` (case-insensitive column lookup) and surfaces it into the check prompt as `- Language: <value>`. HP's prompt is language-aware (UK English vs US English vs French etc.).
+
+## Cycle 1 — what shipped (v1.4.0, 2026-05-17)
+
+- HP client config promoted from `_scope pending_` to a real entry.
+- New `hp_copy_review` profile (single weighted check, client-specific visibility).
+- New `hp_copy_review` check (`backend/visual_qc_apps/hp_copy_review/app.py`) — single Gemini call, structured JSON findings output.
+- New `backend/excel_processor.py` — openpyxl extraction + Gemini summarisation.
+- `/api/brand_guidelines` POST dispatch: `.xlsx` tries localisation matrix first, falls back to excel_processor.
+- `get_reference_asset_content` extended to read `.xlsx` `summary_path` and inject the Markdown.
+- Both HTML report generators get a shared `_render_findings_table` helper rendering `findings` arrays as priority-coloured tables.
+- Media-plan `language` column extracted case-insensitively, surfaced into prompt context.
+- `get_client_from_profile` extended to route `hp_*` profiles to `output-dev/hp/`.
+
+## Known limitations carried into cycle 2
+
+1. **Singular reference-asset selection.** A user can attach only **one** Source Messaging file per QC run. If an asset shows both Messi-Core and Messi-Mainstream variants side by side, the user has to pick one variant to grade against. Multi-reference plumbing is a future enhancement.
+2. **No pre-LLM short-circuit when no Source Messaging is attached.** The "score 0" rule lives in the prompt, not in the dispatch layer, so a Source-less run still costs one Gemini call (~$0.01).
+3. **Two cosmetic display issues** in the report metadata strip (pre-existing patterns, not HP-specific):
+   - `Weight: 1000.0%`: the raw weight (10.0) is multiplied by 100 as if it were a fraction. Formatting bug only, harmless.
+   - `Reference Asset: ➖ Not required`: shown even when a reference *was* used; the metadata label is out of sync with the `reference_asset_used` JSON field.
+
+## Cycle 2 (deferred) — Word / PPT processor
+
+Mirror the `excel_processor` pattern for `.docx` and `.pptx` reference assets. Same shape:
+- python-docx / python-pptx extraction
+- Gemini 2.5 Pro summarisation
+- `{file_id}_summary.md` output
+- Same `/api/brand_guidelines` dispatch fallback
+
+Scope only when the HP team's feedback indicates they upload Word or PowerPoint Source Messaging in practice.
+
+## Cycle 3 (deferred) — Box picker
+
+Browse the HP-shared Box folder tree from the UI, multi-select files, run QC against the selected asset(s). Builds on the existing Box JWT scaffolding from L'Oreal Phase 4 (see `backend/box_jwt_client.py`, `backend/BOX_CLIENT_ONBOARDING.md`).
+
+Scope only when the HP team's feedback indicates Box is the canonical asset store for their workflow.
+
+## Key files
+
+- `backend/profiles/hp_copy_review.json` — profile config (single check, weight 10, Gemini)
+- `backend/visual_qc_apps/hp_copy_review/app.py` — check implementation + prompt template + standalone `build_prompt()` helper
+- `backend/excel_processor.py` — Source Messaging Excel ingestion
+- `backend/client_config.py` — HP client entry, `get_client_from_profile()` routing
+- `backend/api_server.py` — `/api/brand_guidelines` dispatch, `get_reference_asset_content` xlsx branch, `_render_findings_table` helper
+- `docs/superpowers/specs/2026-05-17-hp-cycle-1-onboarding-design.md` — Cycle 1 design spec
+- `docs/superpowers/plans/2026-05-17-hp-cycle-1-onboarding.md` — Cycle 1 implementation plan
--- a/backend/api_server.py
+++ b/backend/api_server.py
@@ -950,17 +950,20 @@ def save_results_to_file(report_data, filename, output_mode='html', session_id=N

 def generate_html_content(report_data, filename, file_path=None):
    """Generate HTML content for report data with expandable sections"""
+    from usage_tracker import estimate_cost_for_checks
+
    # Define a function to get color based on score
    def get_score_result(score):
        if score >= 6:
            return "Pass", "#28a745"  # Green for pass
        else:
            return "Fail", "#dc3545"  # Red for fail
-    
+
    # Get reference asset information from profile selection
    profile_selection = report_data.get('profile_selection', {})
    reference_asset = profile_selection.get('reference_asset', None)
    reference_asset_used = profile_selection.get('reference_asset_used', False)
+    total_tokens, total_cost = estimate_cost_for_checks(report_data.get('results') or {})
    
    # Build HTML for each check result with expandable sections
    check_results_html = ""
@@ -1225,6 +1228,10 @@ def generate_html_content(report_data, filename, file_path=None):
                        <div style="font-size: 1.2em; color: {'#28a745' if reference_asset_used else '#6c757d'}; font-weight: bold;">{'✅ Used' if reference_asset_used else '➖ None'}</div>
                        <div>Reference Asset</div>
                    </div>
+                    <div class="summary-item">
+                        <div style="font-size: 1.2em; font-weight: bold; color: #495057;">{total_tokens:,} / ${total_cost:.2f}</div>
+                        <div>Tokens / Est. cost</div>
+                    </div>
                </div>
            </div>

@@ -1376,11 +1383,13 @@ def _render_technical_section_html(report):

 def generate_comprehensive_html_report(analysis_result, filename, file_path=None):
    """Generate comprehensive HTML report similar to the web UI format"""
+    from usage_tracker import estimate_cost_for_checks
+
    summary = analysis_result.get('summary', {})
    qc_analysis = analysis_result.get('qc_analysis', {})
    profile_selection = analysis_result.get('profile_selection', {})
    check_results = qc_analysis.get('check_results', {})
-    
+
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    overall_score = summary.get('overall_score', 0)
    profile_name = profile_selection.get('suggested_profile', 'Unknown Profile')
@@ -1388,6 +1397,7 @@ def generate_comprehensive_html_report(analysis_result, filename, file_path=None
    completed_checks = qc_analysis.get('completed_checks', 0)
    reference_asset = profile_selection.get('reference_asset', None)
    reference_asset_used = profile_selection.get('reference_asset_used', False)
+    total_tokens, total_cost = estimate_cost_for_checks(check_results)
    
    # Generate check results HTML
    check_results_html = ''
@@ -1539,6 +1549,10 @@ def generate_comprehensive_html_report(analysis_result, filename, file_path=None
                    <div style="font-size: 1.2em; color: {'#28a745' if reference_asset_used else '#6c757d'}; font-weight: bold;">{'✅ Used' if reference_asset_used else '➖ None'}</div>
                    <div>Reference Asset</div>
                </div>
+                <div class="summary-item">
+                    <div style="font-size: 1.2em; font-weight: bold; color: #495057;">{total_tokens:,} / ${total_cost:.2f}</div>
+                    <div>Tokens / Est. cost</div>
+                </div>
            </div>
        </div>
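Reviewer note on the new summary tile in both generators: the format spec `{total_tokens:,} / ${total_cost:.2f}` rounds to whole cents, so a run costing a fraction of a cent will display as `$0.00`. The sketch below reproduces that formatting in isolation; `format_usage_tile` and `format_cost_adaptive` are hypothetical names, and the adaptive variant is only one possible alternative, not something this commit does.

```python
def format_usage_tile(total_tokens: int, total_cost: float) -> str:
    """Same format spec as the tile added in this diff."""
    return f"{total_tokens:,} / ${total_cost:.2f}"


def format_cost_adaptive(total_cost: float) -> str:
    """Hypothetical alternative: keep four decimals for sub-cent totals."""
    if 0 < total_cost < 0.01:
        return f"${total_cost:.4f}"
    return f"${total_cost:.2f}"


print(format_usage_tile(12_345, 0.0123))  # 12,345 / $0.01
print(format_usage_tile(1_800, 0.004))    # 1,800 / $0.00
print(format_cost_adaptive(0.004))        # $0.0040
```

Whether sub-cent precision matters depends on how the figure is used; for a rough per-report estimate, two decimals may be enough.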

--- a/backend/document_mode/diff_report_writer.py
+++ b/backend/document_mode/diff_report_writer.py
@@ -16,6 +16,8 @@ import json
 import os
 from typing import Dict, List, Optional

+from usage_tracker import estimate_cost
+

 def _slug(name: str) -> str:
    base = os.path.splitext(os.path.basename(name))[0]
@@ -321,7 +323,7 @@ def _render_html(result: Dict) -> str:
        <span class='grade-badge'>{html.escape(grade)}</span>
      </div>
      <div class='cost-line muted'>
-        Tokens: {result.get('token_usage', {}).get('total_tokens', 0):,}
+        Tokens: {result.get('token_usage', {}).get('total_tokens', 0):,} &middot; Est. cost: ${estimate_cost(result.get('token_usage', {}).get('prompt_tokens', 0), result.get('token_usage', {}).get('completion_tokens', 0)):.2f}
      </div>
    </div>
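Reviewer note on the changed cost line: it looks up `result.get('token_usage', {})` three times inside one f-string, and `.get(key, {})` still returns `None` if the key is present with a `None` value. A possible tidy-up is sketched below, under stated assumptions: the `PRICING` values are made up for illustration (the real `COST_PER_1K_TOKENS` table is not shown in this diff), and `cost_line` is a hypothetical helper name. The `estimate_cost` body follows the formula added in `usage_tracker.py`.

```python
# Hypothetical per-1K-token USD prices, for illustration only.
PRICING = {"Gemini": {"input": 0.00125, "output": 0.01}}


def estimate_cost(prompt_tokens, completion_tokens, provider="Gemini"):
    """Same formula as the helper in this commit, against the sample table."""
    pricing = PRICING.get(provider)
    if not pricing:
        return 0.0
    return (prompt_tokens / 1000) * pricing["input"] + \
        (completion_tokens / 1000) * pricing["output"]


def cost_line(result: dict) -> str:
    """Build the 'Tokens / Est. cost' line with a single token_usage lookup."""
    usage = result.get("token_usage") or {}  # also covers an explicit None
    total = usage.get("total_tokens") or 0
    cost = estimate_cost(
        usage.get("prompt_tokens") or 0,
        usage.get("completion_tokens") or 0,
    )
    return f"Tokens: {total:,} &middot; Est. cost: ${cost:.2f}"


sample = {"token_usage": {"prompt_tokens": 8000, "completion_tokens": 1000, "total_tokens": 9000}}
print(cost_line(sample))
print(cost_line({"token_usage": None}))
```

With the sample prices, 8,000 prompt tokens and 1,000 completion tokens come to 8 × 0.00125 + 1 × 0.01 = 0.02 USD.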

--- a/backend/document_mode/result_writer.py
+++ b/backend/document_mode/result_writer.py
@@ -17,6 +17,8 @@ import json
 import os
 from typing import Dict, List, Optional

+from usage_tracker import estimate_cost_for_checks
+

 def _slugify_filename(name: str) -> str:
    base = os.path.splitext(os.path.basename(name))[0]
@@ -511,6 +513,7 @@ def _render_html(result: Dict, original_filename: str) -> str:
    check_results = result.get('check_results', {})
    pages = result.get('pages', [])
    fonts_inventory = (result.get('ingest_metadata') or {}).get('fonts_inventory', [])
+    total_tokens, total_cost = estimate_cost_for_checks(check_results)

    truncated_banner = ''
    if result.get('truncated'):
@@ -653,6 +656,9 @@ def _render_html(result: Dict, original_filename: str) -> str:
      <div>
        <span class='grade-badge grade-{grade}'>{grade}</span>
      </div>
+      <div style='font-size:12px;color:#666;margin-left:auto;'>
+        Tokens: {total_tokens:,} &middot; Est. cost: ${total_cost:.2f}
+      </div>
    </div>

    <h2>Findings at a glance</h2>
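Reviewer note on the `estimate_cost_for_checks(check_results)` call used here: the aggregation accepts either a dict or a list of per-check results, skips entries without `token_usage`, and falls back to `'Gemini'` when `model_used` is not a dict. The standalone sketch below mirrors that behaviour so it can be run without the repo. `aggregate_usage`, the pricing values, and the check names `file_size` and `legacy` are hypothetical.

```python
# Hypothetical per-1K-token USD prices, for illustration only.
HYPOTHETICAL_PRICING = {"Gemini": {"input": 0.00125, "output": 0.01}}


def aggregate_usage(check_results, pricing=HYPOTHETICAL_PRICING):
    """Return (total_tokens, total_cost_usd) across per-check result dicts."""
    items = check_results.values() if isinstance(check_results, dict) else check_results
    total_tokens, total_cost = 0, 0.0
    for cr in items:
        if not isinstance(cr, dict):
            continue
        usage = cr.get("token_usage") or {}
        prompt = usage.get("prompt_tokens") or 0
        completion = usage.get("completion_tokens") or 0
        # Prefer the reported total; otherwise derive it from the two parts.
        total_tokens += usage.get("total_tokens") or (prompt + completion)
        model_used = cr.get("model_used")
        provider = model_used.get("provider", "Gemini") if isinstance(model_used, dict) else "Gemini"
        rates = pricing.get(provider)
        if rates:  # unknown providers add tokens but no cost
            total_cost += prompt / 1000 * rates["input"] + completion / 1000 * rates["output"]
    return total_tokens, total_cost


checks = {
    "hp_copy_review": {
        "token_usage": {"prompt_tokens": 8000, "completion_tokens": 1000, "total_tokens": 9000},
        "model_used": {"provider": "Gemini"},
    },
    "file_size": {"score": 10},  # deterministic check, no token_usage
    "legacy": {
        "token_usage": {"prompt_tokens": 400, "completion_tokens": 100},
        "model_used": "gemini-2.5-pro",  # not a dict, so provider defaults to Gemini
    },
}
tokens, cost = aggregate_usage(checks)
print(f"Tokens: {tokens:,} &middot; Est. cost: ${cost:.2f}")
```

One consequence worth noting: a provider missing from the pricing table still contributes to the token total but adds nothing to the cost, so the displayed estimate can under-report for such checks.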
--- a/backend/usage_tracker.py
+++ b/backend/usage_tracker.py
@@ -198,6 +198,51 @@ def log_access_request(entry):
    return log_entry


+def estimate_cost(prompt_tokens, completion_tokens, provider='Gemini'):
+    """Estimated USD cost for a single LLM call given its token usage.
+
+    Used by per-analysis report renderers (diff, document, standard) to
+    surface a cost figure on the downloaded HTML alongside the existing
+    score / grade summary.
+    """
+    pricing = COST_PER_1K_TOKENS.get(provider)
+    if not pricing:
+        return 0.0
+    return (prompt_tokens / 1000) * pricing['input'] + \
+        (completion_tokens / 1000) * pricing['output']
+
+
+def estimate_cost_for_checks(check_results):
+    """Aggregate (total_tokens, total_cost_usd) across a dict-or-list of
+    per-check result dicts. Each dict is expected to carry a `token_usage`
+    sub-dict with prompt_tokens / completion_tokens / total_tokens, and
+    optionally `model_used.provider` (defaults to 'Gemini').
+
+    Deterministic checks (no LLM, no token_usage) contribute 0/0 and are
+    silently skipped.
+    """
+    if isinstance(check_results, dict):
+        check_results = check_results.values()
+
+    total_tokens = 0
+    total_cost = 0.0
+    for cr in check_results:
+        if not isinstance(cr, dict):
+            continue
+        tu = cr.get('token_usage') or {}
+        prompt = tu.get('prompt_tokens') or 0
+        completion = tu.get('completion_tokens') or 0
+        total_tokens += tu.get('total_tokens') or (prompt + completion)
+        model_used = cr.get('model_used') or {}
+        provider = (
+            model_used.get('provider', 'Gemini')
+            if isinstance(model_used, dict)
+            else 'Gemini'
+        )
+        total_cost += estimate_cost(prompt, completion, provider)
+    return total_tokens, total_cost
+
+
 def _calculate_analysis_cost(results):
    """
    Calculate cost based on actual token usage from LLM responses