Three rounds of prompt tuning against the Remington (4p), Easter Overlay (18p), and Grenade (7p) sample packs. Easter Overlay (the noisiest) climbed 72.38 → 78.97 → 80.04 across iterations, with strict-grade violations dropping 27 → 18 → 14. Remaining violations are now genuine compliance issues; the noise patterns are cleared.

boots_caveat_compliance:
- Superscript guard: vision LLM was flagging every roundel asterisk as superscript because the * glyph naturally sits high in its line. Strict two-feature rule now required (raised baseline AND visibly shrunk ~50-60% of body). Borderline cases → "needs_manual_check" with new superscript_caveat field. Caveat avg 4.4 → 7.27.
- Same vision-LLM caveat applied to weight_matching (Light vs Regular at small sizes is below detection threshold) and sizing_compliant (1-2pt size differences below detection threshold). New weight_caveat and sizing_caveat fields. Reserved 1-2 score band for unambiguous critical violations only.
- Explicit scoring principle: "when in doubt, prefer 7-8 with manual_check flags over a lower confident-violation score".

boots_brand_name_accuracy:
- ALL CAPS retail convention now explicitly acceptable. L'OREAL, ESTEE LAUDER, MAYBELLINE etc. no longer flagged as casing errors; only structural element mismatches (accents, hyphens, apostrophes, special chars) count.
- Stylised brand logotype exception: known logomarks like `17` for SEVENTEEN, &SISTERS ampersand styling, e.l.f. dot rendering are Pass, surfaced via new logotype_observations field.
- Brand name avg 5.53 → 7.47 → 6.67 (LLM run-to-run variability).

Strongest real catch in dataset: Easter Overlay page 14 is labelled for the ROI market in production notes but uses £ instead of € on the artwork. Exactly the pre-press error worth surfacing. Caught consistently across all runs by boots_currency_locale.

CLAUDE_BOOTS.md updated with three-pack smoke-test table, vision-LLM limitations summary, and the four reusable prompt-tuning patterns that worked on this build.

Local-only: feature/boots-ppack remains unmerged until after Boots show-and-tell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Boots Client Documentation
Referenced from main CLAUDE.md. This file contains detailed Boots QC check descriptions, guidance document sources, and known limitations.
Overview
Boots is a retail client with promotional artwork compliance checks. Unlike other clients that focus on brand identity or marketing creative quality, Boots checks are strictly compliance and technical specs -- no creative/aesthetic assessment. The checks are derived from 7 thematic guidance documents the client's team previously used with a LibreChat agent.
Scoring override: Same as L'Oreal -- any individual check scoring below 6 forces an overall Fail.
Guidance documents location: /Users/nickviljoen/Desktop/AI_QC_Bitbucket/boots/Recieved Docs/
Original agent prompt: /Users/nickviljoen/Desktop/AI_QC_Bitbucket/boots/System Instruction Existing Agent.rtf
Boots QC Tools
Seven checks for Boots retail promotional artwork compliance. Profiles: boots_static (single-asset, 5 checks) and boots_ppack (multi-page production-pack document mode, all 7 checks).
| Tool | Source Document(s) | What it checks |
|---|---|---|
| `boots_caveat_compliance` | ASTERISK RULES | Caveat ordering (* -> dagger -> ** -> double dagger -> triangle -> clover), sizing per context (1/3 headline, 1/2 sub-headline, same as body), NO SUPERSCRIPT (critical), font weight matching, plus orphan-asterisk detection (smoke test caught a * in T&Cs with no matching marker in main copy) |
| `boots_brand_name_accuracy` | BRAND NAMES (3 pages) | Exact spelling of ~170 brand + product names including accents, apostrophes, hyphens, casing. Closed-world list: brands not on the list are surfaced in names_not_on_list for manual review and DO NOT cause a Fail; only spelling errors against listed brands fail. |
| `boots_offer_mechanics` | OFFER ROUNDELS + VALUE MECHANICS | Offer roundel format matches approved categories (price reductions, multibuys, threshold spend, FREE/GWP, points), spaced-caps styling, "Our best..." approved phrases |
| `boots_tandc_wording` | OFFER T&Cs + CLICK AND COLLECT + LOCK-UP T&Cs | Standard offer T&C wording (3FOR2, BOGOF, etc.), C&C exact text + font weight + hierarchy, lock-up T&Cs (Advantage Card, Parenting Club, Price Advantage, Pyramid), offer date formatting. Font weight is best-effort, flagged via the font_weight_caveat field for manual verification. |
| `boots_currency_locale` | Agent prompt cross-cutting rules | Currency: GBP for UK / EUR for ROI, URLs: boots.com / boots.ie, consistent locale throughout asset |
| `boots_logo_compliance` | Built from PPack observation (no formal Boots logo guideline supplied) | Three-path scoring: A) master wordmark (strict: typeface, colour, orientation, distortion, clear space), B) partner / production lock-up (lenient: "OLIVER x BOOTS" footers etc. follow lock-up conventions, NOT master wordmark rules), C) no Boots branding (N/A neutral). |
| `boots_colour_palette` | Boots canonical palette derived from creative-guidance pages | Two modes: A) creative-guidance pages verify CMYK/RGB/Hex spec values match Boots Blue (#05054b), Health Primary Blue (#5dc4e9), Offer Red (#d3072a); B) artwork pages sanity-check dominant brand colours visually. |
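The currency and URL pairing in the boots_currency_locale row can be illustrated with a small lookup. This is only a sketch under assumptions: the real check is prompt-driven, and `LOCALE_RULES` and `locale_mismatches` are hypothetical names that operate on already-extracted page text.

```python
import re

# Hypothetical lookup: expected currency symbol and site domain per market.
LOCALE_RULES = {
    "UK": {"currency": "£", "domain": "boots.com"},
    "ROI": {"currency": "€", "domain": "boots.ie"},
}


def locale_mismatches(market: str, text: str) -> list[str]:
    """Return human-readable mismatches between a page's text and its market."""
    expected = LOCALE_RULES[market]
    issues = []
    # Any currency symbol other than the expected one is a mismatch.
    for symbol in sorted(set(re.findall(r"[£€]", text))):
        if symbol != expected["currency"]:
            issues.append(f"currency {symbol} found, expected {expected['currency']}")
    # Any Boots domain other than the expected one is a mismatch.
    for domain in sorted(set(re.findall(r"boots\.(?:com|ie)\b", text))):
        if domain != expected["domain"]:
            issues.append(f"URL {domain} found, expected {expected['domain']}")
    return issues
```

A page tagged ROI that shows a £ price would return one currency issue, which mirrors the kind of mismatch the check is meant to surface.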
Boots Production Pack (boots_ppack) profile — multi-page document mode
For multi-page production packs (4-18 pages each, exported from PowerPoint as PDF). Built on top of AXA's document-mode infrastructure; all 7 checks run at scope: page_each with strict-grade override.
Page classifier (backend/document_mode/page_classifier.py): heuristic tags every page as cover / checklist / palette / notes / artwork. Decision order:
- Strong palette (≥3 of CMYK/RGB/Hexadecimal headings + ≥2 hex colours) → palette
- Strong checklist (≥3 of "Asset suitable", "Fonts present", "Resolution fine", etc.) → checklist
- Artwork signals (T&Cs, offer mechanics, prices, GSL barcode) → artwork
- Yellow Notes / Client Queries with no artwork signals → notes
- Sparse Production Pack title block → cover (doubles as brief / context page)
- Default → artwork (fail-safe: false positives on artwork are recoverable)
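The decision order above could be sketched as a first-match-wins function. The code in backend/document_mode/page_classifier.py is not shown in this document, so every pattern, the 40-word "sparse" cut-off, and the name `classify_page` are assumptions made for illustration. The checklist phrase list is limited to the three phrases quoted above.

```python
import re

# Assumed signal patterns; the real classifier's keyword lists are not shown here.
PALETTE_HEADING = re.compile(r"\b(?:CMYK|RGB|Hexadecimal)\b")
HEX_COLOUR = re.compile(r"#[0-9a-fA-F]{6}\b")
CHECKLIST_PHRASES = ("asset suitable", "fonts present", "resolution fine")
ARTWORK_SIGNAL = re.compile(r"T&Cs|[£€]\s?\d|barcode", re.IGNORECASE)
NOTES_SIGNAL = re.compile(r"\bNotes\b|Client Queries", re.IGNORECASE)


def classify_page(text: str) -> str:
    """Tag a page by applying the documented decision order, first match wins."""
    lowered = text.lower()
    if len(PALETTE_HEADING.findall(text)) >= 3 and len(HEX_COLOUR.findall(text)) >= 2:
        return "palette"
    if sum(phrase in lowered for phrase in CHECKLIST_PHRASES) >= 3:
        return "checklist"
    if ARTWORK_SIGNAL.search(text):
        return "artwork"
    if NOTES_SIGNAL.search(text):
        return "notes"
    # "Sparse" is assumed here to mean fewer than 40 words.
    if "production pack" in lowered and len(text.split()) < 40:
        return "cover"
    return "artwork"  # fail-safe default
```

Checking artwork signals before notes means a page carrying both yellow notes and a price is still treated as artwork, which matches the fail-safe intent of the ordering.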
Strict-grade exemption (Profile.strict_grade=True in profile_config.py): only artwork-classified pages count towards Pass/Fail. Cover, checklist, palette, and notes pages are scored and surfaced in the report as informational but cannot trigger a Fail. The strict-grade banner in the HTML report lists exactly which artwork-page checks fell below 6.
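As a rough illustration of the exemption, the filter below counts only artwork-page scores below 6 towards the verdict. It is a minimal sketch: `CheckResult`, `strict_grade_violations`, and `overall_verdict` are hypothetical names, not the structures in profile_config.py.

```python
from dataclasses import dataclass

PASS_THRESHOLD = 6  # scores below this count as violations


@dataclass
class CheckResult:
    page: int
    page_type: str  # cover / checklist / palette / notes / artwork
    check: str
    score: float


def strict_grade_violations(results: list[CheckResult]) -> list[CheckResult]:
    """Return only the artwork-page results that fall below the threshold."""
    return [r for r in results if r.page_type == "artwork" and r.score < PASS_THRESHOLD]


def overall_verdict(results: list[CheckResult]) -> str:
    """Fail if any artwork-page check is a violation; other pages are informational."""
    return "Fail" if strict_grade_violations(results) else "Pass"
```

The list returned by `strict_grade_violations` is the same information the report banner needs: which artwork-page checks fell below 6.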
Cost per pack: 7 checks × pages = roughly £0.05-0.30 per pack. 4-page packs ~£0.10, 18-page packs ~£0.30.
Smoke-test results (2026-05-05): all three test packs Fail by strict-grade — but the remaining violations are genuine compliance issues, not noise. Across three rounds of prompt tuning, Easter Overlay (the noisiest 18-page pack) climbed from 72.38 → 78.97 → 80.04. Strict-grade violations dropped from 27 → 18 → 14 across 10 pages.
| Pack | Pages | Final overall | Strict-grade violations |
|---|---|---|---|
| Remington (1.8MB, 4 pages) | 4 | 70.75 | 3 (orphan asterisk, T&C wording deviations) |
| Easter Overlay (3MB, 18 pages) | 18 | 80.04 | 14 (real catches across brand_name / T&C / offer_mechanics / currency_locale) |
| Grenade (5.9MB, 7 pages) | 7 | 78.0 | 3 (caveat orphan, meal-deal format) |
The strongest real catch in the dataset: Easter Overlay page 14 is labelled for the ROI market in production notes but uses £ instead of € on the artwork — caught by boots_currency_locale. That's exactly the kind of pre-press error worth surfacing.
Vision-LLM limitations explicitly handled in prompts (so the Boots team understands what's reliable vs best-effort):
- Font weight (Boots Sharp Regular vs Light) at small sizes: surfaced via `font_weight_caveat` (T&C check) and `weight_caveat` (caveat check)
- Asterisk superscript at small sizes: surfaced via `superscript_caveat` (asterisk glyph naturally sits high; only flag when raised AND shrunk)
- Caveat size comparison at small sizes: surfaced via `sizing_caveat` (1-2pt differences below detection threshold)
- Subtle accent marks on brand names: `accent_marks_verifiable` flag
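The "raised AND shrunk" rule for superscript detection lives in the prompt, but its logic can be written out as a small decision function. This is a sketch under assumptions: the three-valued inputs and the name `superscript_verdict` are hypothetical, and "borderline" is taken to mean that at least one feature is uncertain while neither is clearly absent.

```python
def superscript_verdict(raised: str, shrunk: str) -> str:
    """Combine two observations, each 'yes', 'no' or 'unsure', into a verdict."""
    valid = {"yes", "no", "unsure"}
    if raised not in valid or shrunk not in valid:
        raise ValueError("observations must be 'yes', 'no' or 'unsure'")
    # Both features are required, so one clear "no" rules out superscript.
    if "no" in (raised, shrunk):
        return "not_superscript"
    if raised == "yes" and shrunk == "yes":
        return "superscript_violation"
    # Anything left has at least one uncertain feature: defer to a human.
    return "needs_manual_check"
```

An asterisk that sits high but is full size gives `superscript_verdict("yes", "no")`, which returns "not_superscript" and so avoids the false positives seen on roundel asterisks.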
Tuning patterns that worked (worth knowing for future client onboards):
- "Closed-world list" semantics: when an approved-list reference is incomplete (third-party brands, font lists, etc.), absence from list ≠ failure. Surface for manual review at neutral 7/10, and flag misspellings of listed items as Fail.
- "ALL CAPS retail convention" exception: brand names rendered in caps (L'OREAL, ESTEE LAUDER) are typographic choices, not spelling errors.
- "Stylised brand logotype" exception: known logomarks like `17` for SEVENTEEN are Pass.
- "Best-effort with manual_check flag" pattern: for vision-LLM limitations, score 7-8 with explicit caveat field rather than confident-but-wrong Fail.
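The closed-world pattern can be sketched as a three-way split: exact matches pass, near-misses of listed names fail, and everything else is surfaced for review. This is an illustration only. The use of `difflib` and the 0.8 similarity cut-off are assumptions (the actual check relies on the LLM's judgement), and `closed_world_check` is a hypothetical name.

```python
import difflib


def closed_world_check(found_names: list[str], approved: list[str]) -> dict:
    """Split names into exact matches, likely misspellings, and unlisted names."""
    misspellings = []
    names_not_on_list = []
    for name in found_names:
        if name in approved:
            continue  # exact match: nothing to report
        # A near-miss of a listed name is treated as a misspelling (assumed cut-off 0.8).
        close = difflib.get_close_matches(name, approved, n=1, cutoff=0.8)
        if close:
            misspellings.append((name, close[0]))
        else:
            names_not_on_list.append(name)
    return {
        "misspellings": misspellings,
        "names_not_on_list": names_not_on_list,  # manual review, neutral score
        "verdict": "Fail" if misspellings else "Pass",
    }
```

Only the `misspellings` bucket affects the verdict, so an incomplete approved list cannot by itself cause a Fail.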
Guidance Document Summary
1. ASTERISK RULES (1 page)
- Mandatory ordering: * -> dagger -> ** -> double dagger -> triangle -> clover
- Sizing depends on context (headline = 1/3, sub-headline/roundels = 1/2, body = same size)
- NO SUPERSCRIPT ever
- Font weight must match between caveat and its T&Cs reference
- Caveat in main copy must not be smaller than in T&Cs
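The ordering rule and the orphan check can be sketched as list comparisons. This assumes markers have already been extracted in reading order and named as strings, and it reads "mandatory ordering" as "the nth distinct caveat uses the nth marker". `ordering_violations` and `orphan_markers` are hypothetical helpers, not the check's actual code.

```python
CAVEAT_ORDER = ["*", "dagger", "**", "double dagger", "triangle", "clover"]


def ordering_violations(markers_in_reading_order: list[str]) -> list[str]:
    """Compare first appearances of markers against the mandated sequence."""
    first_seen = list(dict.fromkeys(markers_in_reading_order))  # dedupe, keep order
    issues = []
    for position, marker in enumerate(first_seen):
        expected = CAVEAT_ORDER[position] if position < len(CAVEAT_ORDER) else None
        if marker != expected:
            issues.append(f"caveat {position + 1}: found {marker!r}, expected {expected!r}")
    return issues


def orphan_markers(main_copy: list[str], tandcs: list[str]) -> list[str]:
    """Markers that appear in the T&Cs but have no counterpart in main copy."""
    return [m for m in dict.fromkeys(tandcs) if m not in main_copy]
```

For example, `orphan_markers(["*"], ["*", "dagger"])` returns `["dagger"]`, the same shape of problem as the orphan asterisk found in the smoke test.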
2. BRAND NAMES (3 pages)
- ~100 brand names with exact spelling requirements
- ~70 product/range names with exact spelling requirements
- Key patterns: accent marks (Lancome, Tresemme), internal caps (BaByliss, SkinActive), hyphens (Bio-Oil, La Roche-Posay), apostrophes (Burt's Bees), special chars (e.l.f., So...?)
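One way to reconcile exact internal capitals with an ALL CAPS rendering is to compare against the uppercased approved name only when the artwork text is fully capitalised. This is a minimal sketch, and `brand_name_matches` is a hypothetical helper. Accents are treated as structural here, which is an assumption: this document does not settle how unaccented capitals should be scored.

```python
def brand_name_matches(rendered: str, approved: str) -> bool:
    """True if the rendered name matches exactly, or is an ALL CAPS rendering
    that preserves every hyphen, apostrophe, dot and space of the approved name."""
    if rendered == approved:
        return True
    # str.isupper() is True only when all cased characters are uppercase.
    if rendered.isupper():
        return rendered == approved.upper()
    return False
```

With this rule, `BIO-OIL` and `BABYLISS` match their approved forms, while `Babyliss` (wrong internal capital) and `BIO OIL` (missing hyphen) do not.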
3. CLICK AND COLLECT (1 page)
- One standard wording: "Available through Click & Collect, but may not be stocked in all stores. Charges may apply"
- Set in Boots Sharp Regular (not Light)
- Comes first in T&C hierarchy
- Preferred line break after "not"
4. LOCK-UP T&Cs (1 page)
- Advantage Card, Parenting Club, Price Advantage, Pyramid
- Full vs condensed copy versions
- UK vs ROI URL tailoring (boots.com / boots.ie)
5. OFFER ROUNDELS (2 pages)
- Visual templates for all approved offer types
- Spaced capitals in roundels
- Categories: Everyday Low, Save/Price Reductions, Multibuys, Threshold Spend, FREE/GWP
6. OFFER TERMS AND CONDITIONS (1 page)
- Standard T&Cs per offer type (exact wording mandated)
- Offer date format rules (same month vs different months)
- Qualifying lines
7. VALUE MECHANICS (1 page)
- Client-approved offer mechanics and messages
- "Our best..." approved phrases (10 variants)
- Points-based offers (Double/Triple points, etc.)
Known Limitations
- Accent marks on brand names: Vision LLMs may struggle with subtle accent differences (e vs e with acute, o vs o with circumflex). The check is most reliable for casing, hyphens, apostrophes, and spacing. Accent accuracy should be verified manually for critical assets.
- Font weight distinction: LLMs cannot reliably distinguish Boots Sharp Regular from Boots Sharp Light in the T&C wording check. This rule is documented but may require manual verification.
- Caveat sizing ratios: LLMs can assess relative sizing (larger/smaller/similar) but cannot measure exact point size ratios (1/3 vs 1/2). The check focuses on visually obvious violations.
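- Limited test assets: Checks were built from guidance documents and have so far been tuned against three sample packs only (Remington, Easter Overlay, Grenade). Further prompt tuning will be needed as more test assets become available from the client.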
Client Contacts (from original agent prompt)
- QC Lead: Lee Hammond (leehammond@oliver.agency) -- for guidance ambiguity/clarification
- Agent Owner: George Colesmith (georgecolesmith@oliver.agency) -- for missing information