feat(hp_copy_review): single-check LLM grader against Source Messaging

Single Gemini call per asset. Prompt assembles attached Source Messaging
summaries + media-plan language context + the asset image. Returns
structured JSON with score, summary, and a findings array (priority,
category, quote, issue, suggested fix, source reference). Empty findings =
clean asset; missing reference -> score 0 with a clear message rather than
running blind.

Mirrors the boots_tandc_wording pattern: subclass FlaskAppTemplate, expose
a static prompt template, let process_single_check inject reference-asset
content and media-plan context at runtime. A standalone build_prompt()
helper mirrors that assembly for unit-style smoke tests and ad-hoc prompt
inspection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
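
The response shape described above (score, summary, findings) can be sanity-checked with a few lines of Python. This is a minimal sketch, not the parsing used in production, which lives in `api_server.py`. The helper name `parse_review_response` and its validation rules are assumptions made for illustration; only the field names and the allowed priority values come from the prompt in this commit.

```python
import json
import re
from typing import Any, Dict

# Hypothetical helper for illustration only; it is not part of this commit.
FENCE = "`" * 3  # built indirectly so this snippet can itself sit inside a fenced block
FENCED_JSON_RE = re.compile(FENCE + r"json\s*(\{.*\})\s*" + FENCE, re.DOTALL)
ALLOWED_PRIORITIES = {"high", "medium", "low"}
FINDING_KEYS = {
    "priority", "category", "quote", "issue", "suggested_fix", "source_reference",
}


def parse_review_response(text: str) -> Dict[str, Any]:
    """Extract the JSON object from a model reply and check its basic shape."""
    match = FENCED_JSON_RE.search(text)
    # The score-0 fallback in the prompt is given as bare JSON, so accept that too.
    payload = match.group(1) if match else text.strip()
    result = json.loads(payload)

    score = result.get("score")
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError(f"score must be a number, got {score!r}")
    if not 0 <= score <= 10:
        raise ValueError(f"score out of range: {score!r}")
    if not isinstance(result.get("summary"), str):
        raise ValueError("summary must be a string")

    findings = result.get("findings")
    if not isinstance(findings, list):
        raise ValueError("findings must be a list")
    for finding in findings:
        if not isinstance(finding, dict):
            raise ValueError("each finding must be an object")
        missing = FINDING_KEYS - set(finding)
        if missing:
            raise ValueError(f"finding is missing keys: {sorted(missing)}")
        if finding["priority"] not in ALLOWED_PRIORITIES:
            raise ValueError(f"unknown priority: {finding['priority']!r}")
    return result


reply = FENCE + 'json\n{"score": 9, "summary": "Copy matches the source.", "findings": []}\n' + FENCE
parsed = parse_review_response(reply)
print(parsed["score"], len(parsed["findings"]))
```

An empty `findings` list passes validation, which matches the rule that a clean asset is a valid result.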
parent 014a9cb8ff
commit 4c19a0fb9d
2 changed files with 179 additions and 0 deletions
0
backend/visual_qc_apps/hp_copy_review/__init__.py
Normal file
179
backend/visual_qc_apps/hp_copy_review/app.py
Normal file
@@ -0,0 +1,179 @@
"""HP Copy Review — single-call LLM grader against canonical Source Messaging.

This check compares all visible copy on an HP marketing asset (claims,
headlines, body, disclaimers, footnotes, spec call-outs, brand mentions)
against the canonical Source Messaging summaries attached as reference
assets (.xlsx → Markdown summary via excel_processor).

It returns a structured JSON object with a 0-10 score, a one-paragraph
summary, and a `findings` array (priority / category / quote / issue /
suggested_fix / source_reference). Empty findings on a clean asset is a
valid result (score 9-10). When no Source Messaging is attached, the
LLM is instructed to return score 0 with an explanatory message rather
than grade blind.

Reference assets and media-plan context (including `language`) are
injected by `process_single_check` in `api_server.py` — this module
exposes only the static prompt template. A standalone `build_prompt()`
helper is provided for unit-style smoke tests and for any future caller
that wants to assemble the full prompt outside the production path.
"""

import os
import sys
from typing import Iterable, Mapping, Optional, Sequence, Tuple

# Add parent directory to path so we can import shared template
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))

from visual_qc_apps.flask_app_template import FlaskAppTemplate


# --- Canonical prompt template ------------------------------------------------
#
# The reference-asset summary block ("CANONICAL SOURCE MESSAGING") is
# prepended by `process_single_check` in `api_server.py` via
# `get_reference_asset_content()`. Likewise the media-plan context block
# ("=== MEDIA PLAN CONTEXT ===" with `- Language: <value>`) is appended
# by `process_single_check`. We embed instructions that *reference* both
# blocks so the LLM knows where to look.

HP_COPY_REVIEW_PROMPT = """You are a copy reviewer for HP marketing materials. Your job is to compare the marketing asset against the canonical Source Messaging that has been attached as a reference asset, and report every copy discrepancy as a structured finding.

WHAT YOU WILL BE GIVEN:
1. One or more canonical Source Messaging summaries, attached above as REFERENCE ASSET GUIDELINES. Each Source Messaging file (e.g. `messi_core.xlsx`, `messi_mainstream.xlsx`) has been pre-summarised into Markdown and is the single source of truth for product claims, KSPs, disclaimers, spec call-outs, variant naming, and approved tone.
2. A media-plan context block (appended below the prompt) which may include `- Language: <value>` and `- Country: <value>`. Treat the language value as the PRODUCT LANGUAGE the asset should be using (e.g. "UK English", "US English", "French (France)").
3. The marketing asset image itself.

WHAT TO DO:
For every claim, headline, body line, disclaimer, footnote, spec call-out, and brand mention visible on the asset, evaluate it against the canonical Source Messaging. Flag:
- Wording that disagrees with an approved KSP or claim.
- Missing or incorrect mandatory disclaimers / legal footnotes / asterisked notes.
- Spec call-outs that contradict the canonical spec (wrong number, wrong unit, wrong product variant).
- Variant / product-name errors (e.g. "OmniDesk Mini" vs "OmniDesk Mini Core").
- Tone / phrasing drift from the approved brand voice described in the source.
- Brand-name misuse (HP, sub-brand capitalisation, trademark glyph misuse).
- Language / locale mismatch against the media-plan PRODUCT LANGUAGE (e.g. "color" appearing in a UK English asset, or French copy on an asset specified as US English).

OUTPUT — return ONE JSON object, and nothing else (no prose, no markdown fences outside the JSON code block). The shape:

```json
{
  "score": <number 0-10>,
  "summary": "<one-paragraph headline finding>",
  "findings": [
    {
      "priority": "high" | "medium" | "low",
      "category": "ksp" | "disclaimer" | "spec" | "variant" | "tone" | "brand-name" | "language" | "other",
      "quote": "<exact quote from the asset>",
      "issue": "<what's wrong>",
      "suggested_fix": "<what it should say, citing the canonical source>",
      "source_reference": "<where in the source messaging this comes from, e.g. file name + section heading>"
    }
  ]
}
```

RULES:
- If no Source Messaging reference asset is attached (i.e. there is no "REFERENCE ASSET GUIDELINES" block above describing canonical HP messaging), return EXACTLY:
  {"score": 0, "summary": "No HP Source Messaging reference was attached — cannot grade copy without a canonical source.", "findings": []}
  Do not attempt to grade copy from prior knowledge.
- High-priority findings (factually-wrong claims, missing mandatory disclaimers, wrong product variant, wrong language) weight the score most heavily. A single high-priority finding should typically pull the score below 6.
- Medium-priority findings are wording drift that changes nuance but not meaning, or missing optional supporting copy.
- Low-priority findings are tone / style nits.
- An empty `findings` array is a valid and expected result for a clean asset — in that case score 9 or 10 and write a short, positive summary.
- The `quote` field must be the EXACT visible text from the asset, including punctuation. If you can read it, quote it.
- `source_reference` should make it easy for a reviewer to verify the finding — name the Source Messaging file and the section/heading you matched against.
- Return ONLY the JSON object inside a single ```json ... ``` code block. No surrounding prose, no explanations outside the JSON.
"""


def build_prompt(
    reference_summaries: Optional[Sequence[Tuple[str, str]]] = None,
    media_plan_row: Optional[Mapping[str, str]] = None,
    base_prompt: str = HP_COPY_REVIEW_PROMPT,
) -> str:
    """Assemble a fully-rendered HP copy-review prompt for testing / inspection.

    In production, `process_single_check` (api_server.py) does this
    assembly itself: it prepends `get_reference_asset_content(...)` and
    appends `build_media_plan_context(...)`. This helper mirrors that
    flow so we can smoke-test the prompt assembly without running the
    full server, and so callers that want to render the exact prompt
    text for logging / debugging have a single entry point.

    Args:
        reference_summaries: List of (filename, markdown_summary) tuples,
            one per attached Source Messaging .xlsx. Each summary is
            already a Markdown string produced by `excel_processor`.
            None or [] means "no canonical source attached" — in that
            case we still build the prompt but omit the canonical block,
            and the LLM will fall back to the score-0 rule.
        media_plan_row: Mapping with optional `language`, `country`,
            `placement`, etc. Only `language` and `country` are
            rendered into the prompt here; the production flow uses
            `build_media_plan_context` and includes more fields.
        base_prompt: Override for the canonical prompt template (used
            in tests where we want to inject a shorter stub).

    Returns:
        The fully-assembled prompt string, with the canonical source
        messaging block (if any) prepended, the media-plan language /
        country line(s) appended, and the base template in between.
    """
    parts = []

    # 1. Canonical source messaging block — mirrors the shape of
    #    `get_reference_asset_content` so the LLM sees a consistent
    #    "REFERENCE ASSET GUIDELINES" heading whether it's running in
    #    production or via this helper.
    if reference_summaries:
        ref_lines = ["\n\n=== REFERENCE ASSET GUIDELINES ===",
                     "CANONICAL SOURCE MESSAGING:"]
        for filename, summary in reference_summaries:
            ref_lines.append(f"\n--- File: {filename} ---\n{summary}")
        ref_lines.append("=== END REFERENCE ASSET GUIDELINES ===\n")
        parts.append("\n".join(ref_lines))

    # 2. The static prompt template itself.
    parts.append(base_prompt)

    # 3. Media-plan context (language / country). Production appends
    #    the full `build_media_plan_context` block; here we render just
    #    the language + country fields, which is what Step 5.6 asserts.
    if media_plan_row:
        mp_lines = ["\n=== MEDIA PLAN CONTEXT ==="]
        if media_plan_row.get('language'):
            mp_lines.append(f"- Language: {media_plan_row['language']}")
        if media_plan_row.get('country'):
            mp_lines.append(f"- Country: {media_plan_row['country']}")
        mp_lines.append("=== END MEDIA PLAN CONTEXT ===")
        parts.append("\n".join(mp_lines))

    return "\n".join(parts)


class HpCopyReviewApp(FlaskAppTemplate):
    """HP Copy Review — single-call LLM copy grader against Source Messaging.

    Subclasses `FlaskAppTemplate` so the check is auto-discovered by
    `load_qc_apps()` in `api_server.py`. The class instance exposes
    `self.prompt` (the canonical template plus the standard scoring
    instructions appended by the template base class).

    Reference asset summaries and media-plan context are injected at
    runtime by `process_single_check` — this class does NOT call Gemini
    directly. Response parsing is handled by
    `extract_json_from_response` / `extract_score_from_result` in
    api_server.py, which will lift `score`, `summary`, and `findings`
    out of the JSON code block returned by the LLM.
    """

    def __init__(self):
        super().__init__(__name__, HP_COPY_REVIEW_PROMPT)


# Allow running this check standalone for ad-hoc testing
if __name__ == "__main__":
    app_instance = HpCopyReviewApp()
    app_instance.run()
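
A smoke test for the assembly order that `build_prompt()` documents (reference block first, base template in the middle, media-plan block last) might look like the sketch below. The checker name `layout_problems` and the file name `example.xlsx` are made up for illustration, and the sample string is hand-built so the snippet runs without importing the module. Only the two block headings and the `- Language:` line format come from the diff above.

```python
from typing import List, Optional

# Headings copied from build_prompt(); everything else here is a hypothetical helper.
REF_HEADING = "=== REFERENCE ASSET GUIDELINES ==="
MEDIA_HEADING = "=== MEDIA PLAN CONTEXT ==="


def layout_problems(prompt: str, base_prompt: str,
                    language: Optional[str] = None) -> List[str]:
    """Return a list of ordering problems; an empty list means the layout is as documented."""
    problems: List[str] = []
    base_at = prompt.find(base_prompt)
    if base_at == -1:
        return ["base template missing from prompt"]
    base_end = base_at + len(base_prompt)

    ref_at = prompt.find(REF_HEADING)
    if ref_at != -1 and ref_at > base_at:
        problems.append("reference block appears after the base template")

    media_at = prompt.find(MEDIA_HEADING)
    if media_at != -1 and media_at < base_end:
        problems.append("media-plan block appears before the end of the base template")

    if language is not None:
        line = f"- Language: {language}"
        if media_at == -1 or prompt.find(line, media_at) == -1:
            problems.append("language line missing from media-plan block")
    return problems


# Hand-built sample that mirrors the documented shape, using a short stub template.
stub = "STUB TEMPLATE"
sample = "\n".join([
    "\n\n" + REF_HEADING + "\nCANONICAL SOURCE MESSAGING:\n"
    "\n--- File: example.xlsx ---\n# Example summary\n"
    "=== END REFERENCE ASSET GUIDELINES ===\n",
    stub,
    "\n" + MEDIA_HEADING + "\n- Language: UK English\n=== END MEDIA PLAN CONTEXT ===",
])
print(layout_problems(sample, stub, language="UK English"))
```

With the module importable, the same check could presumably be run as `layout_problems(build_prompt([...], {...}, base_prompt=stub), stub, ...)`, which is the kind of stub injection the `base_prompt` argument is described as supporting.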