ai_qc

nickviljoen 29ee941037	refactor(formatting_diff): narrow scope to bold + italic only	2026-05-19 12:37:19 +02:00

First real-data test against the AXA car-insurance PDFs surfaced a noise problem: the new document is a brand refresh, so every page flips font (PublicoBanner-Bold→PublicoHeadline-Bold) and colour (#893f4a→#2e3092). At medium-per-finding that crashed the diff score to 0.0 and drowned the bold-regression signal AXA actually flagged.

Drop font, size, colour comparators. Keep bold + italic, the attributes the vision-LLM consistently misses on dense layouts. The LLM already narrates colour-scheme rebrands and font swaps in its Modified / Style-changes blocks; running both layers on the same visual change just double-counts it.

Tests inverted from "X change is flagged" to "X change is NOT flagged" to lock the scope decision in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
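
The scope decision described in this commit can be illustrated with a minimal sketch. It is not the code in formatting_diff.py: the `Span` record, the `COMPARED_ATTRS` tuple, the `diff_span_formatting` function, and the sample text are all hypothetical names chosen for illustration. The sketch only shows the idea of comparing bold and italic while ignoring font, size, and colour.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Span:
    """Hypothetical span record; the real shape produced by ingest.py may differ."""
    text: str
    font: str
    size: float
    colour: str
    bold: bool
    italic: bool


# Only these attributes are compared. Font, size and colour are left out on
# purpose, so a brand refresh alone produces no findings.
COMPARED_ATTRS = ("bold", "italic")


def diff_span_formatting(old: Span, new: Span) -> list[tuple[str, bool, bool]]:
    """Return (attribute, old_value, new_value) for each bold/italic flip."""
    return [
        (attr, getattr(old, attr), getattr(new, attr))
        for attr in COMPARED_ATTRS
        if getattr(old, attr) != getattr(new, attr)
    ]


# Sample data: the text is made up; the font and colour values are the ones
# quoted in the commit message.
old = Span("Sample heading", "PublicoBanner-Bold", 12.0, "#893f4a", bold=True, italic=False)
rebranded = replace(old, font="PublicoHeadline-Bold", colour="#2e3092")
regressed = replace(rebranded, bold=False)  # same rebrand, but bold was lost

print(diff_span_formatting(old, rebranded))  # [] because only font and colour changed
print(diff_span_formatting(old, regressed))  # [('bold', True, False)]
```

Under these assumptions, the inverted tests mentioned in the commit would amount to asserting that the first call returns an empty list, while a bold flip still yields a finding.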
data	Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5)	2026-05-01 18:38:14 +02:00
__init__.py	Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5)	2026-05-01 18:38:14 +02:00
accessibility_checks.py	Wire veraPDF into axa_pdf_accessibility for PAC-equivalent PDF/UA-1 validation	2026-05-10 10:36:03 +02:00
checks.py	Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5)	2026-05-01 18:38:14 +02:00
diff_engine.py	fix(diff_engine): guard compute_formatting_diff against per-pair failure	2026-05-19 10:31:16 +02:00
diff_report_writer.py	fix(diff_report): _fmt_value labels italic flips correctly	2026-05-19 10:22:39 +02:00
dispatcher.py	perf(document-mode): parallelize per-page check dispatch in stages 3c/3d	2026-05-17 18:14:27 +02:00
formatting_diff.py	refactor(formatting_diff): narrow scope to bold + italic only	2026-05-19 12:37:19 +02:00
ingest.py	feat(document_mode): add deterministic span formatting diff	2026-05-19 09:56:34 +02:00
page_classifier.py	Add Boots Production Pack profile (multi-page document mode)	2026-05-05 12:47:13 +02:00
print_preflight_checks.py	Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5)	2026-05-01 18:38:14 +02:00
result_writer.py	Wire veraPDF into axa_pdf_accessibility for PAC-equivalent PDF/UA-1 validation	2026-05-10 10:36:03 +02:00