Changed "Success: 2" to "Processed 2 of 2" so the popup clearly
reports processing status, not QC results. Processing errors are
shown only when they occur.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LLM was autocompleting partial words and reading them as full text, missing
the hidden overlap. New approach: explicit "DO NOT AUTOCOMPLETE" instruction,
character-level boundary check (what background is each character on),
spatial proximity check (text touching product = fail regardless), and
concrete example using the actual test case ("para", where the "p" is
hidden on the dark purple product, so the word appears as "ara").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds Step 4 (Hidden Overlap Detection) that catches text-product overlap
where text colour matches product colour, making overlapping letters
invisible. Instead of trying to see the hidden letter, the LLM detects
that a word is truncated/incomplete near the product edge, proving
overlap exists. E.g. "ragrância" instead of "Fragrância" where the
missing "F" is black on dark purple.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New boots_static profile (5 checks, 2.0 weight each) for retail promotional
artwork compliance: caveat rules, brand name accuracy (~170 names), offer
mechanics, T&C wording, and currency/locale. Strict grading override (any
check <6 = Fail). Guidelines embedded from 7 thematic guidance documents.
Also splits client-specific documentation out of CLAUDE.md into separate
CLAUDE_LOREAL.md, CLAUDE_AMAZON.md, CLAUDE_BOOTS.md, and CLAUDE_DOW_JONES.md
files to reduce main file size.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
L'Oreal Static profile: 3 checks → 4 checks (added text_product_overlap).
Total QC checks: 65 → 66. Documents prompt tuning decisions, detection
accuracy across both test sets, and known gaps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Callout text was fully exempt from overlap checks, so the perfume
image text crammed against the product scored 10/10. Now callouts
are still checked for direct spatial overlap — being near the product
in clean space is OK (shampoo), but overlapping or touching the
product imagery is still a fail (perfume). Only the headline width
check (Step 4) remains display-text-only.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Marketing copy near the product was being misclassified as exempt
callout text. Now callout exemption requires a visible connector/
pointer line drawn from the text to a product feature. Text without
a connector line is always classified as display text regardless
of size or proximity to the product.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Small feature callouts with connector lines (e.g. "Plástico
reciclado" pointing to bottle cap) are standard cosmetics layout,
not an overlap problem. Shampoo image was false-positive failing
because callout text near the product was flagged.
Now classifies text as DISPLAY (headlines, titles, body — checked
strictly) vs CALLOUT (feature annotations with pointer lines —
exempt). Only display text triggers overlap failures.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LLM kept passing the mask image because the headline is technically
above the densest part of the translucent shape. Added Step 4
(Headline Width Check) that catches wide single-line headlines
extending across the product's horizontal space — even if vertically
above. Includes exact bad/good examples matching the L'Oreal
mask vs shampoo assets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shampoo image was incorrectly failing because:
1. Thin connector/callout lines (pointing from text to product features)
were being treated as text overlap — now explicitly excluded
2. Hero zone was too wide — small scattered droplets/bubbles were
extending the zone to cover the full image. Now clarified that
hero zone is the compact cluster around the product only.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert text_readability to original (overlap is a layout issue, not a
readability one — LLM kept scoring it Pass because text was readable).
New text_product_overlap check uses a step-by-step approach:
1. Define the product hero zone (including translucent/glass elements)
2. Identify all marketing text
3. Check spatial overlap between text and hero zone
4. Compare good vs bad layout patterns
L'Oreal Static profile now has 4 checks at 2.5 weight each (was 3
checks at 3.33). Total check count: 66.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LLM was still dismissing translucent 3D shapes as background. Rewrote
the check as a 3-step process (define hero zone, check overlap, score)
with explicit warning not to dismiss transparent elements. Added
concrete example matching the L'Oreal Absolut Repair Molecular mask
asset where the headline crosses the translucent sculpted shape.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LLM was not recognising translucent/glass decorative elements around
products as part of the product area. Expanded the definition to
explicitly include semi-transparent shapes, artistic renders, and
silhouettes. Added boundary-drawing instruction and concrete
pass/fail examples matching L'Oreal Absolut Repair Molecular assets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds critical check for marketing text overlapping product or product
packaging (e.g., translucent containers, decorative elements). Text
overlapping product area caps score at 4-5/10, triggering Fail under
L'Oreal's strict grading override.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document text_readability neutral scoring for product-only shots,
background_contrast visibility-focused tuning, and test results
for 5 new L'Oreal Absolut Repair Molecular assets.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The prompt was too aggressive about light products on light backgrounds,
causing professional product photography on white backgrounds to fail
(e.g. L'Oreal cream jar on white). Now evaluates whether the product
is actually VISIBLE and DISTINGUISHABLE (via shadows, edges, texture,
contrasting elements) rather than failing on theoretical colour match.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- web_ui: add overflow:hidden and min-width:0 to .form-section to
prevent long filenames breaking the CSS grid layout
- web_ui: add overflow-x:hidden to queue list container
- api_server: add word-break:break-all to .filename in individual reports
- api_server: add table-layout:fixed and word-break to consolidated
report table cells for proper text wrapping
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- File queue: add text-overflow ellipsis, min-width:0, flex-shrink:0 to
prevent long filenames from pushing Pending/Remove buttons out of view
- text_readability: product-only images (no marketing text layout) now
score 7/10 neutral instead of 1-2/10 critical fail. Hidden/invisible
text in marketing layouts still scores 1-2.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New client with embedded brand guidelines for Dow Jones Corporate,
MarketWatch, and Wall Street Journal sub-brands. Guidelines sourced
from live.standards.site scrapes and baked into check prompts.
- dow_jones_static: 5 checks (logo, color, typography, square motif, photography)
- marketwatch_static: 6 checks (logo, color, typography, image treatment, layout, art direction)
- wsj_static: 6 checks (logo, color tiers, typography, imagery, capitalization, layout)
- System now has 7 clients, 12 profiles, 65 QC checks
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The consolidated report was recalculating grades with determine_grade()
which doesn't account for profile-specific grading overrides (e.g.
Amazon checks where individual check failures force overall Fail).
Now reads the actual grade stored in the report's summary.
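
A minimal sketch of the lookup described above, assuming each report is a
dict with a `summary` mapping that holds a `grade` key. Both key names and
the "Unknown" fallback are hypothetical, not taken from the codebase.

```python
from typing import Any, Dict


def grade_for_consolidated_report(report: Dict[str, Any]) -> str:
    """Return the grade already stored in the report instead of recomputing it.

    Recomputing from the numeric score would ignore profile-specific
    overrides, so the stored value is treated as the source of truth.
    """
    summary = report.get("summary") or {}
    grade = summary.get("grade")
    if grade:
        return str(grade)
    # Hypothetical fallback for reports that have no stored grade.
    return "Unknown"
```

With this shape, a report whose score would normally pass but whose stored
grade is "Fail" stays "Fail" in the consolidated view.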
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OCR calibration needs more work: correct files were failing due to
Tesseract bounding box inaccuracies on different server versions.
Code is commented out and ready to re-enable after proper tuning.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OCR measurements were causing the LLM to over-rely on bounding box numbers
and fail correct assets on minor measurement inaccuracies. Changes:
- All prompts now say "supplementary data" not "authoritative/primary source"
- LLM instructed to prioritise visual assessment, use OCR to confirm/question
- Alignment tolerance widened from 1.5% to 3% of width
- OCR context footer softened with accuracy caveat (~5-10px margin of error)
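
The widened tolerance can be illustrated with a small helper. This is a
sketch only: the function name and arguments are hypothetical, and the
real check feeds measurements to the LLM rather than deciding pass/fail
in code.

```python
def is_left_aligned(left_a_px: float, left_b_px: float,
                    image_width_px: float, tolerance_pct: float = 3.0) -> bool:
    """Treat two left edges as aligned if they differ by at most
    tolerance_pct percent of the image width."""
    if image_width_px <= 0:
        raise ValueError("image_width_px must be positive")
    offset_pct = abs(left_a_px - left_b_px) / image_width_px * 100
    return offset_pct <= tolerance_pct
```

On a 1920px-wide asset, a 40px offset is about 2.1% of the width, which
the old 1.5% tolerance would flag and the new 3% tolerance accepts.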
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Issue Summary column was showing raw LLM JSON output. Now extracts
the explanation and recommendations from parsed json_data. Also uses
display_name for check names (e.g. "Amazon Margins" not "Amazon_margins").
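
A rough sketch of the extraction, assuming the parsed json_data is a dict
with `explanation` and `recommendations` fields. The output layout and the
handling of a string-valued `recommendations` are assumptions.

```python
from typing import Any, Dict, List


def issue_summary(json_data: Dict[str, Any]) -> str:
    """Build a short human-readable summary from a parsed check response."""
    parts: List[str] = []
    explanation = json_data.get("explanation")
    if explanation:
        parts.append(str(explanation).strip())
    recommendations = json_data.get("recommendations") or []
    if isinstance(recommendations, str):
        # Some responses may carry a single string instead of a list.
        recommendations = [recommendations]
    if recommendations:
        joined = "; ".join(str(item) for item in recommendations)
        parts.append("Recommendations: " + joined)
    return " ".join(parts)
```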
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tesseract was grouping date + logo text into one block (e.g. "8-11 luglio
amazon prime day"), inflating the date char_height and causing false
typography failures. Now groups by (block_num, line_num) so each text
line becomes a separate element, enabling correct identification of
date, logo, and legal as distinct elements.
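
The regrouping can be sketched as below. The input is assumed to be a
dict shaped like the output of pytesseract's image_to_data with
Output.DICT (parallel lists under `text`, `block_num`, `line_num`, `left`,
`top`, `width`, `height`). The function name and the returned element
layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_words_into_lines(data: Dict[str, list]) -> List[dict]:
    """Merge OCR words into one element per (block_num, line_num)."""
    lines = defaultdict(list)
    for i, text in enumerate(data["text"]):
        if not str(text).strip():
            continue  # skip empty entries that carry no word
        lines[(data["block_num"][i], data["line_num"][i])].append(i)

    elements = []
    for key in sorted(lines):
        idxs = lines[key]
        left = min(data["left"][i] for i in idxs)
        top = min(data["top"][i] for i in idxs)
        right = max(data["left"][i] + data["width"][i] for i in idxs)
        bottom = max(data["top"][i] + data["height"][i] for i in idxs)
        elements.append({
            "text": " ".join(str(data["text"][i]) for i in idxs),
            "left": left,
            "top": top,
            "width": right - left,
            "height": bottom - top,  # per-line height, not per-block
        })
    return elements
```

Because each line gets its own bounding box, a small date line no longer
inherits the height of a larger logo line in the same block.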
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Amazon guidelines define margins as 7% of shortest side, but OCR was only
reporting % of width, giving misleadingly small numbers on wide formats
(e.g. a 50px margin is 2.6% of a 1920px width but 6.9% of the 720px
shortest side).
Now includes shortest-side percentage prominently in OCR context, plus the
7% target in pixels so the LLM can compare directly.
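
The arithmetic behind the new OCR context can be sketched as follows. The
function name and returned keys are hypothetical; only the 7% of
shortest side rule comes from the guidelines described above.

```python
def margin_context(margin_px: float, width_px: int, height_px: int,
                   target_pct: float = 7.0) -> dict:
    """Express a measured margin relative to width and to the shortest side."""
    shortest = min(width_px, height_px)
    return {
        "pct_of_width": round(margin_px / width_px * 100, 1),
        "pct_of_shortest_side": round(margin_px / shortest * 100, 1),
        # Target margin in pixels so the LLM can compare directly.
        "target_px": round(shortest * target_pct / 100),
    }
```

For a 50px margin on a 1920x720 asset this reports 2.6% of width, 6.9% of
the shortest side, and a 50px target.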
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New API endpoint POST /api/delete_output_files (auth required)
- Delete Selected button in saved files controls bar
- Confirmation prompt before deletion
- Auto-refreshes file list after successful delete
- Path traversal protection via os.path.basename sanitization
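
A minimal sketch of basename-based sanitization, assuming deletions are
restricted to a single output directory. The function name and the
rejection of empty or dot names are assumptions added for illustration.

```python
import os


def safe_output_path(output_dir: str, requested_name: str) -> str:
    """Strip any directory components from a client-supplied filename."""
    name = os.path.basename(requested_name)
    if not name or name in (".", ".."):
        raise ValueError("invalid filename")
    return os.path.join(output_dir, name)
```

A request for "../../etc/passwd" resolves to a file named "passwd" inside
the output directory, so it cannot reach files elsewhere on disk.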
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wrap OCR_RELEVANT_CHECKS import in try/except so analysis continues
gracefully if ocr_measurement module fails to load.
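
The guarded import might look like the sketch below. Catching the broad
Exception class, the empty-set fallback, and the helper function are
assumptions; the commit only states that the import is wrapped in
try/except.

```python
try:
    from ocr_measurement import OCR_RELEVANT_CHECKS
except Exception:
    # Assumption: with no OCR module, no check receives OCR context.
    OCR_RELEVANT_CHECKS = set()


def should_attach_ocr(check_name: str) -> bool:
    """Hypothetical helper: True if this check should get OCR measurements."""
    return check_name in OCR_RELEVANT_CHECKS
```

When the module cannot be loaded, every check simply runs without OCR
measurements instead of aborting the analysis.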
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds Tesseract-based OCR pre-processing that computes pixel-level text
positions, margins, spacing, and alignment before LLM analysis. This
enables detection of subtle layout differences that vision models miss
(e.g. 2.8% vs 6.4% headline margin, 83px vs 39px date gap).
OCR measurements injected into 10 checks across all client profiles:
- Amazon: margins, typography, headline_layout
- Static General: element_alignment, safety_area, visual_hierarchy_general,
text_readability_general, text_edge_clearance
- L'Oreal: text_readability
- Diageo/Unilever KV: visual_hierarchy
Non-blocking: if Tesseract is unavailable, checks run with visual
estimation only. Production requires: sudo apt install tesseract-ocr
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add silent auth check every 5 minutes to detect expired sessions proactively,
showing a "Session Expired" prompt instead of failing silently on next action
- Rename amazon_box_placement to amazon_element_placement across module directory,
profile config, class name, and documentation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename "Upload Brand Guidelines" to "Upload a Reference Asset"
- Add info box explaining how each file type is used (PDF, images, Excel)
- Add "Remove" button on each uploaded reference asset
- Accept .xlsx/.xls files in the file picker
- Show "Localisation Matrix" badge with market count for parsed Excel files
- Update all labels and placeholder text to reference asset terminology
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Urgency check now runs before dates check in label classifier
("urgency messaging (to replace dates)" was matching as "dates")
- Handle unlabeled dates rows (empty column A) by inferring from position
after tagline rows
- Stop parsing at "Print Day 4" sub-section boundary to prevent
overwriting main message data
Message B / DE now correctly returns "8. - 11. Juli" instead of
"Nur noch heute" (urgency text from Print Day 4 section).
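
The ordering fix can be illustrated with a simplified classifier. The
substring matching and the returned category names are assumptions; the
point is only that the more specific urgency test must run first.

```python
def classify_label(label: str) -> str:
    """Classify a matrix row label, testing urgency before dates."""
    text = label.strip().lower()
    if "urgency" in text:
        # Must come first: urgency labels can also mention "dates".
        return "urgency"
    if "dates" in text:
        return "dates"
    return "other"
```

With the checks in this order, "urgency messaging (to replace dates)" is
classified as urgency rather than dates.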
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New localization_processor.py: parses Excel localization matrices with
MESSAGE A/B sections, extracting expected headline, dates, logo, legal
per country
- Excel files uploaded as reference assets are auto-detected and parsed
as localization matrices if they contain MESSAGE A/B structure
- During analysis, cross-references media plan creative_name (Message A/B)
and country with parsed matrix to inject expected copy into QC prompts
- LLM checks can now verify asset text matches the correct message version
and market localization
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Typography: soften date spacing rule for tall/portrait formats (half cap-height OK)
to fix false positive on DE_Ströer correct file
- Margins: add percentage-based gap estimation to force LLMs to quantify left alignment
differences, improving detection of subtle headline oversizing (IE_BrightSide case)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Box placement: tape cut off by asset edge is now a fail (was passing if tape visible on other edges)
- Margins: left alignment consistency elevated to critical check with step-by-step comparison method
- Typography: headline-to-date spacing uses cap-height reference instead of lenient "only if clearly too tight"
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix process queue bug: reset queue items to pending after completion so
reprocessing works without page refresh
- Show report details for 10/10 scores by extracting explanation and
elements_found fields from Amazon check JSON responses, with a fallback
that renders all JSON data as a structured summary
- Add Amazon Static grade override: any individual check scoring below 6
forces overall grade to Fail (same logic as L'Oreal)
- Box placement prompt: relax tape visibility rule from "all edges" to
"at least one edge visible", prevent false positives on landscape formats
where box sits near right edge
- Headline layout prompt: fix LLM misreading one-word-per-line as combined
lines in tall/portrait formats, score multi-sentence headlines in 9:16
as 6/10 (pass with recommendation) instead of fail
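
The grade override described above amounts to a small rule. This sketch
assumes check scores are on a 0-10 scale keyed by check name; the function
name and the sample check names are hypothetical.

```python
from typing import Dict


def apply_strict_override(base_grade: str, check_scores: Dict[str, float],
                          threshold: float = 6.0) -> str:
    """Force an overall Fail if any individual check scores below threshold."""
    if any(score < threshold for score in check_scores.values()):
        return "Fail"
    return base_grade
```

A score of exactly 6 does not trigger the override, which matches the
6/10 "pass with recommendation" case for multi-sentence headlines.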
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the L'Oreal Static profile revert to 3 checks, prompt
refinements for hidden/invisible text detection, test results from
both Gemini and OpenAI, and known gaps for future tuning.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The background_contrast check now returns recommended_adjustments as a
list instead of a string. The HTML generation called .lower() on it,
causing an AttributeError that silently prevented reports from being
saved to disk. Analysis completed fine but no output file was written.
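
One way to make the HTML generation tolerate both shapes is to normalise
the field before calling string methods on it. This is a sketch of that
idea, not necessarily the fix that was applied; the function name and the
"; " separator are assumptions.

```python
from typing import Any


def normalise_adjustments(value: Any) -> str:
    """Return recommended_adjustments as a single string, whatever its type."""
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        return "; ".join(str(item) for item in value)
    return str(value)
```

The result can then be lowercased or embedded in HTML without raising an
AttributeError when the check returns a list.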
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverted loreal_static from 2-check (visual_readability_contrast) to
3-check setup (language_consistency, text_readability, background_contrast)
to avoid score dilution. Updated text_readability and background_contrast
prompts from POS-focused to digital marketing, and added critical hidden/
invisible text detection (black-on-black, white-on-white scanning).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the full Amazon QC prompt refinement process: test file
pairs and their specific defects, all prompt changes made, current
detection accuracy per LLM, and known gaps for future work.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Profiles with weights totalling slightly over 10.0 (e.g. 6 x 1.67 =
10.02) could produce scores like 100.2. Added min() caps to all 4
score calculation locations to prevent scores exceeding the maximum.
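
A sketch of the capped calculation, assuming per-check scores on a 0-10
scale multiplied by per-check weights. The function name and signature
are hypothetical.

```python
from typing import Dict


def weighted_score(check_scores: Dict[str, float], weights: Dict[str, float],
                   max_score: float = 100.0) -> float:
    """Sum score x weight per check and cap the total at max_score."""
    raw = sum(check_scores[name] * weights[name] for name in check_scores)
    return min(raw, max_score)
```

Six checks at 10/10 with a weight of 1.67 each give a raw total of about
100.2, which the cap brings back to 100.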
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Headline layout: prepositions at end of line (di, of, de) are now
acceptable in display typography. Tall format one-word-per-line is
standard. Only fail for sentence endings mid-line or genuinely
confusing breaks.
Typography: spacing check now only flags genuinely cramped/touching
elements, not moderate gaps that provide visible separation.
Box placement: improved tape description to match actual campaign
assets (branded coloured strips on box edges, not plain packing tape).
Only flag tape as missing if cropped by asset edge or genuinely absent.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The margins check was giving 10/10 when headline text was too close to
the left edge and misaligned with the logo/branding below. Added left
alignment consistency section checking that headline, date, and logo
share the same left margin position, with misalignment flagged as a
margin/sizing issue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The typography check was giving 10/10 when the date text was squeezed
up against the headline with no breathing room. Added explicit element
spacing assessment section with guidance that the gap between headline
and next element should be at least the height of a capital letter in
the date text, and included it in evaluation steps and output JSON.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The LLM was giving 10/10 Pass to boxes with tape completely cut off,
treating aggressive right-side crops as acceptable. Added explicit
step-by-step tape visibility checks, clarified that right-side crop
must not remove tape/flaps, and added guidance to not assume tape is
visible just because the box body is present.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adjusted all 6 Amazon check weights to equal 1.67 each based on test
results showing incorrect scoring. Refined prompts for box placement
(format-aware positioning, better tape description), required elements
(subhead now optional for OOH), logo country (country match as primary
factor), margins (visual assessment over pixel estimates), and headline
layout (natural language break detection, tall format awareness).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Long filenames no longer push View/Download buttons out of line. Added Media Plan
dropdown in QC Configuration so users can opt in to using a media plan per analysis.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document pdf_processor.py and media_plan_processor.py in main
components. Add detailed sections for PDF reference asset processing
and media plan system. Add production permissions fix to common
issues table.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Upload Excel media plans per client. On QC analysis, automatically
match the uploaded file's name against the media plan's Asset IDs
to validate dimensions and file type, and include the media plan
context (country, language, placement, vendor) in QC check prompts.
- New backend/media_plan_processor.py: Excel parsing, fuzzy filename
matching, dimension/file-type validation, prompt context builder
- New backend/media_plans/ directory for storage
- API endpoints: POST/GET/DELETE /api/media_plan
- Settings modal: new "Media Plan" tab for upload/manage
- Analysis flow: auto-match + validation in response + context in prompts
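
The fuzzy filename matching could be approximated with the standard
library's difflib, as in the sketch below. This is an illustration only:
the actual matching logic in media_plan_processor.py is not shown here,
and the function name, the 0.6 cutoff, and the case-insensitive
comparison are assumptions.

```python
import difflib
import os
from typing import List, Optional


def match_asset_id(filename: str, asset_ids: List[str],
                   cutoff: float = 0.6) -> Optional[str]:
    """Return the Asset ID closest to the uploaded filename, or None."""
    stem = os.path.splitext(os.path.basename(filename))[0].lower()
    lookup = {asset_id.lower(): asset_id for asset_id in asset_ids}
    if stem in lookup:
        return lookup[stem]  # exact match, ignoring case and extension
    close = difflib.get_close_matches(stem, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None
```

A file with a version suffix appended to its Asset ID still matches,
while an unrelated filename returns None and skips media plan validation.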
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PDF brand guidelines were previously ignored: QC checks received no
content from uploaded PDFs. Now on upload, all pages are text-extracted,
summarized by Gemini into a structured brand guidelines summary, and
a cover image is extracted. QC checks receive the full summary in their
prompt and the cover image as visual reference.
- New backend/pdf_processor.py: text extraction, cover image, LLM summary
- brand_guidelines_db.py: summary/cover path tracking, cleanup on delete
- api_server.py: background processing on upload, summary-aware content
retrieval, PDF cover image support, status/reprocess endpoints, startup
backfill for existing unprocessed PDFs
- web_ui.html: processing status badges and upload feedback for PDFs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Settings modal was only created once per page load (if !modal).
Now recreates the modal each time it opens, ensuring the latest
tabs and content are always rendered. Prevents stale cached HTML.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>