ai_qc/backend/document_mode
nickviljoen 50d0063b37 Add Boots Production Pack profile (multi-page document mode)
New profile boots_ppack for QCing multi-page Boots production packs
(PowerPoint-exported PDFs, 4-18 pages each). Built on top of AXA's
document-mode infrastructure — branched off feature/axa-document-mode
because it reuses the dispatcher, ingest, and result writer.

New checks:
- boots_logo_compliance — three-path scoring (master wordmark / partner
  lock-up / no branding) so OLIVER x BOOTS-style footer lock-ups aren't
  scored against master wordmark rules. Conservative without a formal
  Boots logo guideline.
- boots_colour_palette — verifies CMYK/RGB/Hex spec values on creative-
  guidance pages against canonical Boots Blue / Health Primary Blue /
  Offer Red, plus visual sanity-check on artwork pages.

Existing checks tuned:
- boots_brand_name_accuracy: closed-world list semantics. Brands not on
  the approved list now go to names_not_on_list (manual review) instead
  of failing — the list is sourced from the original 7 docs and is known
  incomplete (Remington, Imodium, Maybelline etc. are legitimate Boots-
  stocked brands not on it).
- boots_tandc_wording: explicit font-weight caveat — Boots Sharp Regular
  vs Light isn't reliably distinguishable by vision LLM at small sizes.
  Surfaced via font_weight_caveat field + needs_manual_check value.

Page classifier (document_mode/page_classifier.py):
Heuristic tags each page as cover / checklist / palette / notes /
artwork. Validated on all 10 sample packs.

Strict-grade exemption (Profile.strict_grade flag):
Only artwork-classified pages count towards Pass/Fail. Cover, checklist,
palette, and notes pages are still QC'd and reported as Informational
but cannot trigger a Fail. Banner shows exactly which artwork-page
checks fell below 6.

Result writer extended:
- Per-page table with score + page_type pill for any page_each-scope
  check (auto-applied as fallback)
- Strict-grade banner (red on violation, green when clean)
- Page_type pills throughout the per-page strip

Smoke-test result (Remington 4-page pack, 2026-05-05):
Overall 70.75/100, strict-grade Fail. After two iterations of prompt
tuning, all three remaining strict-grade violations are real catches:
orphan asterisk in T&Cs, "they may not be stocked" wording deviation,
missing "Charges may apply". brand_name_accuracy 7.0 (was 3.0 before
list fix), logo_compliance 9.5 (was 1.5 before lock-up path fix).

Local-only — not pushed to dev or merged to develop until after Boots
show-and-tell. Same posture as feature/axa-document-mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:47:13 +02:00
..
data Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5) 2026-05-01 18:38:14 +02:00
__init__.py Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5) 2026-05-01 18:38:14 +02:00
accessibility_checks.py Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5) 2026-05-01 18:38:14 +02:00
checks.py Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5) 2026-05-01 18:38:14 +02:00
diff_engine.py Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5) 2026-05-01 18:38:14 +02:00
diff_report_writer.py Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5) 2026-05-01 18:38:14 +02:00
dispatcher.py Add Boots Production Pack profile (multi-page document mode) 2026-05-05 12:47:13 +02:00
ingest.py Add Boots Production Pack profile (multi-page document mode) 2026-05-05 12:47:13 +02:00
page_classifier.py Add Boots Production Pack profile (multi-page document mode) 2026-05-05 12:47:13 +02:00
print_preflight_checks.py Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5) 2026-05-01 18:38:14 +02:00
result_writer.py Add Boots Production Pack profile (multi-page document mode) 2026-05-05 12:47:13 +02:00