diff --git a/docs/superpowers/specs/2026-05-17-hp-cycle-1-onboarding-design.md b/docs/superpowers/specs/2026-05-17-hp-cycle-1-onboarding-design.md
new file mode 100644
index 0000000..91ceae9
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-17-hp-cycle-1-onboarding-design.md
@@ -0,0 +1,280 @@
+# HP Onboarding — Cycle 1: `hp_copy_review` Check
+
+**Goal:** Onboard HP onto the AI QC platform with a Source-Messaging-grounded copy review check, replacing the existing `hp-copy` PHP/Make.com POC tool.
+
+**Architecture:** Single new QC check `hp_copy_review` grades an HP marketing asset's on-asset copy against canonical Source Messaging Excel files uploaded as reference assets. A new `excel_processor.py` mirrors `pdf_processor.py`: openpyxl extracts raw cell content at upload time, Gemini summarises into structured Markdown, saved alongside the file under `brand_guidelines/files/`. At QC time the check prompt assembles the Markdown summaries + media-plan language metadata + the asset image and returns a structured findings list. HP gets a real client config entry plus the generic profiles it already has visibility into.
+
+**Tech stack:** openpyxl 3.x (already a project dep — used by `media_plan_processor.py`), existing `llm_config.py` Gemini integration, existing brand-guidelines flow, existing media-plan processor. **No new external dependencies.**
+
+**Status:** Cycle 1 of 3 in HP onboarding. Cycles 2 (Word/PPT ingestion) and 3 (Box file picker) are independent and ship later. This cycle is independently shippable.
+
+---
+
+## Context
+
+HP's existing `hp-copy` is a PHP UI wrapping a Make.com webhook (opaque). The PM raised seven concerns; Dave's decision is to deprecate the POC and migrate HP onto AI QC.
+Of the seven concerns:
+
+- **Solved natively by AI QC today:** stability, configurable rule sets, accuracy (LLM + reference assets eliminate the false-positives-on-brand-names class of bugs because the canonical source list comes from the Excels), bulk processing (local upload supports multi-file out of the box).
+- **Cycle 1 (this spec) addresses:** the HP-specific check, the Source-Messaging Excel ingestion pipeline, and multilingual via a media-plan `language` field.
+- **Other cycles:** Word/PPT support (Cycle 2), Box file picker (Cycle 3).
+
+The user-visible flow on day 1 after this cycle ships:
+1. HP user uploads Source Messaging `.xlsx` files (Messi-Core, Messi-Mainstream, Gaston) once via Settings → Reference Assets.
+2. HP user uploads marketing asset(s) via local upload — same UX as Boots/AXA/LOREAL.
+3. HP user selects the `hp_copy_review` profile and attaches the relevant Source Messaging reference(s).
+4. The check returns a structured findings table matching the Messi Copy Review document format (priority, quote, issue, suggested fix, source citation).
+
+## Scope
+
+### In scope (this cycle)
+
+1. **HP client config** promoted from `_scope pending_` to a real entry with `hp_copy_review` as the default profile.
+2. **`hp_copy_review` profile JSON** — single weighted check, client-specific visibility.
+3. **`hp_copy_review` QC check** at `backend/visual_qc_apps/hp_copy_review/app.py`.
+4. **`backend/excel_processor.py`** — new module mirroring `pdf_processor.py`. openpyxl extraction → Gemini summary → Markdown saved as `{file_id}_summary.md`.
+5. **Reference-asset upload routing** — `.xlsx` uploads route to `excel_processor.process_excel_file`. Existing endpoints (`POST /api/brand_guidelines`, `GET /api/brand_guidelines//status`, `POST .../reprocess`) work without modification beyond the dispatch line.
+6. **Media plan `language` field** — free-form text column; surfaced in matched-row metadata; included in the check prompt when present; absent → graceful no-op.
+7. **Report rendering** — small case in the two HTML report generators so the findings JSON renders as a priority-coloured table instead of a wall of text.
+8. **Unit + smoke tests** as listed under Testing.
+
+### Out of scope (other cycles or deferred)
+
+- Word / PPT ingestion as reference assets — Cycle 2.
+- Box file picker UI — Cycle 3.
+- HP master brand guidelines reference — HP hasn't provided one yet.
+- Briefs (`.pptx`) as reference assets — depends on Cycle 2.
+- Multi-language Source Messaging variants — HP currently has English-only files. If they later provide Spanish / Dutch versions, no code change is needed; they upload as separate reference assets.
+- Strict-grade enforcement — the HP Copy Review is a nuanced priority-tiered (High / Medium / Low) review, not pass/fail. Standard 0–100 weighted scoring.
+- Replacing or modifying the existing `hp-copy` PHP tool. We leave it running; HP migrates traffic at their own pace.
+
+---
+
+## Components
+
+### `backend/client_config.py` — HP entry
+
+Promote HP from placeholder to a real entry. Add `hp_copy_review` to the profile list, set as default:
+
+```python
+'hp': {
+    'name': 'HP',
+    'profiles': ['hp_copy_review', 'static_general', 'video_general'],
+    'display_name': 'HP',
+    'description': 'HP marketing copy QC graded against canonical Source Messaging',
+    'default_profile': 'hp_copy_review',
+},
+```
+
+`box_folder_id` / `box_reports_folder_id` deferred to Cycle 3.
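As a rough illustration of how the HP entry would satisfy the two accessor checks named in the Definition of Done, the sketch below wires the entry into a small lookup. The `CLIENTS` dict name and both function bodies are assumptions made for illustration; only the entry contents and the names `get_client_profiles` / `get_default_profile` come from this spec, and the real `client_config.py` may be organised differently.

```python
from typing import Dict, List, Optional

# Hypothetical container name; the spec only shows the 'hp' entry itself.
CLIENTS: Dict[str, dict] = {
    'hp': {
        'name': 'HP',
        'profiles': ['hp_copy_review', 'static_general', 'video_general'],
        'display_name': 'HP',
        'description': 'HP marketing copy QC graded against canonical Source Messaging',
        'default_profile': 'hp_copy_review',
    },
}


def get_client_profiles(client_id: str) -> List[str]:
    """Return a copy of the client's profile list, or [] for an unknown client."""
    return list(CLIENTS.get(client_id, {}).get('profiles', []))


def get_default_profile(client_id: str) -> Optional[str]:
    """Return the configured default, else the first listed profile, else None."""
    entry = CLIENTS.get(client_id, {})
    profiles = entry.get('profiles', [])
    return entry.get('default_profile') or (profiles[0] if profiles else None)
```

Returning a copy of the list keeps callers from mutating the shared config by accident; falling back to the first profile is one possible default policy, not something this spec prescribes.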
+
+### `backend/profiles/hp_copy_review.json` — new profile
+
+```json
+{
+  "name": "HP Copy Review",
+  "description": "Marketing copy graded against canonical HP Source Messaging",
+  "mode": "asset",
+  "visibility": "client_specific",
+  "visible_to_clients": ["hp"],
+  "checks": {
+    "hp_copy_review": {
+      "weight": 10.0,
+      "llm": "gemini",
+      "enabled": true
+    }
+  }
+}
+```
+
+Total weight = 10.0 → scoring uses the `weighted_score × 10` path, max 100. Single check carries the whole score. No `strict_grade`.
+
+### `backend/visual_qc_apps/hp_copy_review/app.py` — new check
+
+Standard QC app module following `flask_app_template.py`. Single Gemini call. Returns: `score` (0–10), `summary` (one-paragraph headline), and `findings` (JSON list).
+
+**Prompt structure** (starting point — expect tuning during smoke testing):
+
+```
+You are a copy reviewer for HP marketing materials. Compare the
+marketing asset against the canonical Source Messaging provided.
+
+PRODUCT LANGUAGE:
+
+CANONICAL SOURCE MESSAGING:
+
+
+MARKETING ASSET:
+
+
+For every claim, headline, body line, disclaimer, footnote, spec
+call-out, and brand mention visible on the asset, evaluate against
+the canonical source. Output a structured findings array:
+
+[
+  {
+    "priority": "high" | "medium" | "low",
+    "category": "ksp" | "disclaimer" | "spec" | "variant" |
+                "tone" | "brand-name" | "language" | "other",
+    "quote": "",
+    "issue": "",
+    "suggested_fix": "",
+    "source_reference": ""
+  },
+  ...
+]
+
+Then provide a score from 0–10 reflecting overall copy quality
+(10 = no issues, 0 = severe and pervasive issues). Score should
+weight high-priority issues most heavily.
+
+If no Source Messaging is attached, return score 0 with a clear
+summary explaining that no canonical source was provided.
+```
+
+**Empty-findings case** (clean asset): valid result — score 9–10, `findings: []`, summary "no issues identified".
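A minimal sketch of how the check's result shape (`score`, `summary`, `findings`) might be normalised before it reaches the report renderer. It assumes the model reply is a single JSON object carrying those three keys, which this spec does not pin down. The function name `parse_check_response`, the constants, and the choice to coerce an unrecognised priority to `low` are all hypothetical.

```python
import json
from typing import Any, Dict

FINDING_FIELDS = ("priority", "category", "quote", "issue",
                  "suggested_fix", "source_reference")
PRIORITIES = {"high", "medium", "low"}


def parse_check_response(raw: str) -> Dict[str, Any]:
    """Normalise a (hypothetical) JSON reply into score, summary and findings.

    findings is None when the payload is malformed, so a caller can fall
    back to plain-text rendering; an empty list means a clean asset.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return {"score": 0.0, "summary": raw.strip(), "findings": None}
    if not isinstance(payload, dict):
        return {"score": 0.0, "summary": raw.strip(), "findings": None}

    try:
        score = float(payload.get("score", 0))
    except (TypeError, ValueError):
        score = 0.0
    score = max(0.0, min(10.0, score))  # clamp to the 0-10 range

    findings = None
    raw_findings = payload.get("findings")
    if isinstance(raw_findings, list):
        findings = []
        for item in raw_findings:
            if not isinstance(item, dict):
                continue  # skip entries that are not objects
            finding = {field: str(item.get(field, "")) for field in FINDING_FIELDS}
            finding["priority"] = finding["priority"].lower()
            if finding["priority"] not in PRIORITIES:
                finding["priority"] = "low"  # assumption: unknown tiers are demoted
            findings.append(finding)

    return {"score": score,
            "summary": str(payload.get("summary", "")),
            "findings": findings}
```

Keeping `None` distinct from `[]` lets the renderer tell a malformed reply apart from a clean asset, which matches the two cases this section describes.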
+
+**No-reference-attached case**: check returns score 0 with the explanatory message, rather than running blind against an empty source.
+
+### `backend/excel_processor.py` — new module
+
+Mirrors `pdf_processor.py`. Public surface:
+
+- `process_excel_file(file_path, file_id) -> tuple[str, str]` — reads `.xlsx`, returns `(summary_text, summary_path)`. Saves `{file_id}_summary.md` under `brand_guidelines/files/`.
+
+Internal helpers:
+
+- `_extract_workbook_text(path) -> str` — openpyxl, iterates all sheets, dumps as `"Sheet: \n\n\n"`. Skips empty rows. Caps at a reasonable cell budget (e.g. 50K chars) to bound prompt size.
+- `_summarise_with_gemini(raw_text, source_filename) -> str` — Gemini 2.5 Pro call with HP-tuned system prompt (below) producing a structured Markdown summary, ~1500–3000 words.
+
+**Summary prompt** (Excel-specific):
+
+```
+You're processing an HP Source Messaging Excel into a structured
+Markdown reference. Output these sections:
+
+## Product / Variant
+(brand, product line, variant if any — e.g. "HP OmniDesk Mini — Core")
+
+## Key Selling Points (KSPs)
+For each KSP: heading, value proposition, supporting body copy,
+message-length variants (ultra-short / short / medium / long if
+present in the source).
+
+## Disclaimers / Footnotes
+Numbered list, exact wording, what claim each footnote anchors to.
+
+## Approved Brand and Product Names
+Exact spellings, including trademark glyphs (™, ®, ©).
+
+## Variant Notes / Watch-outs
+Anything explicitly marked variant-specific (e.g. "Mainstream only",
+"Core only", "must not appear in entry tier").
+
+## Verboten Phrasing
+Any explicitly disallowed or deprecated phrasing called out in the source.
+
+Be exhaustive but concise. Quote exactly where the source is explicit.
+```
+
+No cover image (Excel has no analogous concept). The reference-asset DB record schema already permits a null `cover_path`.
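The extraction helper described in this section could look roughly like the sketch below. The per-sheet text layout (a `Sheet:` header followed by tab-separated rows), the helper names `format_sheet` / `extract_workbook_text`, and the character-based truncation are assumptions; the spec only states that all sheets are dumped, empty rows are skipped, and the output is capped (50K chars is its example). The openpyxl calls used (`load_workbook` with `read_only` and `data_only`, `worksheets`, `iter_rows(values_only=True)`) are standard.

```python
from typing import Any, Iterable, Sequence

MAX_CHARS = 50_000  # example budget taken from the spec


def format_sheet(name: str, rows: Iterable[Sequence[Any]]) -> str:
    """Render one sheet as a header line plus tab-separated rows, skipping empty rows."""
    lines = []
    for row in rows:
        cells = ["" if value is None else str(value).strip() for value in row]
        if any(cells):  # at least one non-empty cell
            lines.append("\t".join(cells).rstrip())
    return f"Sheet: {name}\n" + "\n".join(lines) + "\n\n"


def extract_workbook_text(path: str, max_chars: int = MAX_CHARS) -> str:
    """Dump every worksheet of an .xlsx file to text, truncated to max_chars."""
    from openpyxl import load_workbook  # third-party; imported lazily

    # data_only=True reads cached formula results instead of formula strings.
    workbook = load_workbook(path, read_only=True, data_only=True)
    try:
        parts = [format_sheet(ws.title, ws.iter_rows(values_only=True))
                 for ws in workbook.worksheets]
    finally:
        workbook.close()
    return "".join(parts)[:max_chars]
```

Splitting the pure formatting step from the file I/O keeps the row-skipping logic testable without a real workbook, which may suit the corrupt-file and empty-workbook unit tests listed under Testing.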
+
+### `backend/media_plan_processor.py` — `language` column
+
+When parsing media-plan Excel sheets, extract `language` (case-insensitive header match: `language`, `Language`, `LANGUAGE`) into the matched-row metadata dict. The existing media-plan-context block injected into prompts gains a `Language: ` line when the field is present; if absent, the line is omitted entirely (graceful no-op for clients whose media plans don't include language).
+
+### `api_server.py` — reference asset upload routing
+
+Existing `/api/brand_guidelines` POST routes `.pdf` → `pdf_processor.process_pdf_file`. Extend the dispatch: `.xlsx` → `excel_processor.process_excel_file`. Reuse the existing DB-record shape and the existing `GET ...//status` and `POST ...//reprocess` endpoints unchanged — they're agnostic to processor type.
+
+### Report rendering — findings table
+
+Per the [[feedback_multi_html_generators]] memory, there are two HTML generators (`generate_html_content` and `generate_comprehensive_html_report`). Both need a small case for `hp_copy_review`: when the check response contains a `findings` array, render as a table with columns for **Priority** (red/amber/green pill), **Category** (pill), **Quote** (monospace), **Issue**, **Suggested fix**, **Source**. Falls back to the existing plain-text response renderer if `findings` is absent (e.g. malformed LLM response).
+
+---
+
+## Data Flow
+
+**Reference asset upload (one-time per Source Messaging file):**
+
+1. HP user uploads `.xlsx` via Settings → Reference Assets.
+2. `api_server.py` routes by extension to `excel_processor.process_excel_file`.
+3. openpyxl extracts raw cell content from all sheets.
+4. Gemini summarises into structured Markdown via the HP-specific summary prompt.
+5. Summary saved at `brand_guidelines/files/{file_id}_summary.md`.
+6. DB record updated; status flips to `ready`.
+
+**QC run (per analysis):**
+
+1. HP user uploads marketing asset (image).
+2. Selects `hp_copy_review` profile.
+3. Selects one or more Source Messaging reference assets (Core / Mainstream / Gaston as applicable).
+4. (Optional) The asset's filename matches a media plan row containing a `language` value.
+5. `process_single_check` for `hp_copy_review` assembles the prompt: system instructions + concatenated Markdown summaries + media-plan context (with language if present) + asset image.
+6. Single Gemini call returns score + summary + findings JSON.
+7. Report renderer presents findings as a Messi-Review-style table.
+
+---
+
+## Error Handling
+
+- **Excel parse failure** (corrupt file, password-protected, etc.) — processor returns an error; DB status = `failed`; user sees the error in the reference-assets list. No app crash.
+- **Gemini summarisation failure at upload** — retry once with exponential backoff; if still failing, save the raw extraction as the summary and mark status = `degraded`. The check can still use a degraded summary (lower fidelity) rather than blocking.
+- **Check-time LLM failure or malformed findings JSON** — existing `process_single_check` exception handling captures and records a score-0 result with the error in the response. Standard pattern, no new surface.
+- **Empty findings** (clean asset) — valid result; score 9–10, `findings: []`, summary "no issues identified".
+- **No reference asset attached** — check returns score 0 with a clear message ("No HP Source Messaging reference selected — attach a Source Messaging Excel to compare against"). Doesn't run blind.
+- **Excel processing concurrency** — uploads are independent files; `pdf_processor.py` already handles concurrent uploads safely (per-file_id artefact paths). Same pattern applies.
+
+---
+
+## Testing
+
+Tests run against the project's existing pytest setup. Real Source Messaging Excels live under `tests/fixtures/hp/` (copied from the user-provided originals).
+
+- **Unit tests** — `excel_processor`:
+  - Happy path: Messi-Core / Messi-Mainstream / Gaston Excels each yield a non-empty `.md` summary containing the expected section headers (`## Key Selling Points`, `## Disclaimers / Footnotes`, etc.) and at least one KSP-level content snippet.
+  - Corrupt file: error returned, no crash.
+  - Empty workbook: graceful degradation with a sensible message.
+- **Unit tests** — `hp_copy_review/app.py`:
+  - Prompt assembly: given mock reference summaries and a mock media-plan row with `language: "UK English"`, assert the assembled prompt contains the language line, the source-messaging block delimiter, and the findings-format instructions.
+  - Response parsing: given a known Gemini-shape JSON response (fixture), assert findings list extracted correctly with all six fields per finding.
+  - Empty references: score 0 + the explanatory message.
+- **Integration smoke test**: end-to-end with a real Messi asset (sample PNG of an OmniDesk eTail tile) + the Messi-Core Source Messaging reference attached. Assert the check runs to completion, returns a valid score, and returns at least one finding (the Messi Copy Review found 34; Gemini should surface at least 3 of the deterministic ones).
+- **Profile load** in the pre-session checklist: add `hp_copy_review` to the loader test.
+
+---
+
+## Deployment
+
+Code-only changes — no infrastructure work, no requirements changes (openpyxl already installed).
+
+1. PR `feature/hp-cycle-1-onboarding → develop`. Deploy to dev via `deploy.sh dev`.
+2. **One-time data step on dev:** HP team (or Nick on their behalf) uploads the three Source Messaging Excel files (Messi-Core, Messi-Mainstream, Gaston-v2) via the UI. These land in `brand_guidelines/files/` on dev only — uploads are not synced between dev and prod; the prod uploads happen separately.
+3. Dev smoke test: run an HP marketing image through `hp_copy_review` with the Messi-Core reference attached. Verify output structure mirrors the Messi Copy Review doc.
+4. PR `develop → main`. Tag `v1.4.0` (minor — new client capability). Deploy to prod via `deploy.sh prod v1.4.0`.
+5. HP team uploads Source Messaging files on prod, runs first real QC, provides feedback. Prompt tuning iterations are post-deploy LLM-prompt changes — small follow-up PRs as needed, no spec changes.
+
+---
+
+## Definition of Done
+
+- `hp_copy_review` profile loads cleanly (pre-session checklist passes with the new profile in the loader script).
+- `client_config.get_client_profiles('hp')` returns `['hp_copy_review', 'static_general', 'video_general']`.
+- `client_config.get_default_profile('hp')` returns `'hp_copy_review'`.
+- Uploading a Source Messaging `.xlsx` produces a non-empty `_summary.md` within 60s of upload.
+- Running `hp_copy_review` on a known Messi asset with the Messi-Core reference attached returns findings overlapping with at least 3 of the 34 issues in the HP-provided Messi Copy Review doc (rough qualitative bar — Gemini scoring varies run-to-run, but the major issues should be detected).
+- Report renders the findings as a structured table, not free-text.
+- Media plan parsing extracts `language` when present; the check prompt includes a `Language:` line in that case.
+- Standard pre-session checklist all green on develop tip.
+
+---
+
+## Deferred decisions (worth surfacing at follow-up)
+
+- **Strict-grade for HP?** Not in V1. If HP wants any High-priority finding to force overall Fail, add `strict_grade: true` to the profile and extend the scoring path (small retrofit).
+- **HP master brand guidelines** — none today. Whenever HP provides a master brand guide PDF (colour palette, logo usage, typography), it can be attached as an additional reference asset alongside Source Messaging. No code change.
+- **Prompt template tuning** — the templates above are starting points. Live HP usage will surface what to refine. Iterate via small prompt-only PRs.
+- **Non-English Source Messaging** — if HP later provides Spanish / Dutch versions, they upload as separate reference assets and select the relevant one(s) per QC run. Works without code change.
+- **Findings-output schema versioning** — if HP wants additional fields per finding (e.g. screenshot crop region, suggested approval routing), add to the JSON shape and bump renderer.
+- **Briefs as reference assets** — depends on Cycle 2 (Word/PPT ingestion). Once that lands, HP can attach Gaston/Messi `.pptx` briefs alongside the Excel sources.