43 commits

Author SHA1 Message Date
nickviljoen
4c19a0fb9d feat(hp_copy_review): single-check LLM grader against Source Messaging
Single Gemini call per asset. Prompt assembles attached Source
Messaging summaries + media-plan language context + the asset image.
Returns structured JSON with score, summary, and a findings array
(priority, category, quote, issue, suggested fix, source reference).
Empty findings = clean asset; missing reference -> score 0 with a
clear message rather than running blind.
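
A minimal sketch of how the grader's response could be consumed, assuming the JSON keys are named score, summary, and findings, with suggested_fix and source_reference as the finding keys. The exact key names, the Finding and CopyReviewResult classes, and parse_copy_review() are illustrative assumptions, not the module's actual code.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Finding:
    # Field names are assumed; the commit only lists the concepts.
    priority: str
    category: str
    quote: str
    issue: str
    suggested_fix: str
    source_reference: str


@dataclass
class CopyReviewResult:
    score: float
    summary: str
    findings: list = field(default_factory=list)
    graded: bool = True

    @property
    def is_clean(self) -> bool:
        # Empty findings means a clean asset, but only if it was actually graded.
        return self.graded and not self.findings


def parse_copy_review(raw_json: str, has_reference: bool) -> CopyReviewResult:
    if not has_reference:
        # Missing reference: score 0 with a clear message instead of grading blind.
        return CopyReviewResult(
            score=0,
            summary="No Source Messaging reference attached; asset was not graded.",
            graded=False,
        )
    data = json.loads(raw_json)
    findings = [Finding(**item) for item in data.get("findings", [])]
    return CopyReviewResult(data["score"], data["summary"], findings)
```

Keeping the missing-reference case out of the LLM call entirely means a score of 0 can never be confused with a graded asset that had no findings.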

Mirrors the boots_tandc_wording pattern: subclass FlaskAppTemplate,
expose a static prompt template, let process_single_check inject
reference-asset content and media-plan context at runtime. A
standalone build_prompt() helper mirrors that assembly for unit-
style smoke tests and ad-hoc prompt inspection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 21:25:30 +02:00
nickviljoen
b23b7f2e17 chore(dow-jones): archive profiles, checks, and per-client doc
Moves the Dow Jones / MarketWatch / WSJ profile JSONs (4), check apps
(22), and CLAUDE_DOW_JONES.md into backend/_archive/dow_jones/. All
moves use git mv so history follows. Adds a restore-instructions
README. No loader changes needed — the archive lives outside the
scanned directories.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:11:54 +02:00
nickviljoen
f5aaf8da24 Merge feature/dow-jones-tuning into develop: WSJ Static prompt tuning 2026-05-06 12:03:56 +02:00
nickviljoen
3b76bf2c9c Tune WSJ Static prompts: cap whitelist, graphic headline, split-layout logo, 30% sizing cap
- wsj_capitalization_punctuation: explicit complete-sentence whitelist + soft-flag pattern for Rule 5 price formatting (price_spacing_correct / price_bolded_correct accept needs_manual_check, new price_formatting_caveat field)
- wsj_typography_hierarchy: graphic/illustrative headline awareness — large stylised serif price/number graphics are recognised as the display headline; surrounding sans-serif copy is correctly classified as subhead/body. Stylised price headlines exempt from the period rule.
- wsj_logo_compliance: horizontal logo placement allows anchoring to the copy block on split/asymmetric layouts; mandatory sizing assessment block with worked examples, score capped at 6/10 for logos exceeding 30% of longest side.
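
The sizing cap itself lives in the prompt, so the LLM applies it. As plain arithmetic it amounts to roughly the following sketch. cap_logo_score() and its parameters are hypothetical, and measuring the logo by its own longest dimension is an assumption.

```python
def cap_logo_score(raw_score, logo_longest_px, asset_width_px, asset_height_px,
                   max_ratio=0.30, capped_score=6.0):
    """Cap the score when the logo exceeds max_ratio of the asset's longest side."""
    longest_side = max(asset_width_px, asset_height_px)
    ratio = logo_longest_px / longest_side
    if ratio > max_ratio:
        # Oversized logo: the score may not exceed the cap, but can still be lower.
        return min(raw_score, capped_score)
    return raw_score
```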

Validated on 3 WSJ-NY test assets across 3 iterations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 12:01:59 +02:00
nickviljoen
cec11f1f6a Tune Boots PPack prompts: superscript guard, ALL CAPS / logotype exceptions, weight/sizing limits
Three rounds of prompt tuning against the Remington (4p), Easter Overlay
(18p), and Grenade (7p) sample packs. Easter Overlay (the noisiest)
climbed 72.38 → 78.97 → 80.04 across iterations, with strict-grade
violations dropping 27 → 18 → 14. Remaining violations are now genuine
compliance issues — the noise patterns are cleared.

boots_caveat_compliance:
- Superscript guard: vision LLM was flagging every roundel asterisk as
  superscript because the * glyph naturally sits high in its line.
  Strict two-feature rule now required (raised baseline AND visibly
  shrunk ~50-60% of body). Borderline cases → "needs_manual_check"
  with new superscript_caveat field. Caveat avg 4.4 → 7.27.
- Same vision-LLM caveat applied to weight_matching (Light vs Regular
  at small sizes is below detection threshold) and sizing_compliant
  (1-2pt size differences below detection threshold). New weight_caveat
  and sizing_caveat fields. Reserved 1-2 score band for unambiguous
  critical violations only.
- Explicit scoring principle: "when in doubt, prefer 7-8 with
  manual_check flags over a lower confident-violation score".
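
For illustration only, the two-feature superscript rule can be written as a small decision function. The rule is enforced in the prompt, not in Python; classify_asterisk(), the 0.60 and 0.90 thresholds, and the return labels other than "needs_manual_check" are assumptions.

```python
def classify_asterisk(baseline_raised: bool, size_ratio: float) -> str:
    """size_ratio is glyph height divided by body text height (assumed measure)."""
    clearly_shrunk = size_ratio <= 0.60      # roughly the 50-60% band from the rule
    clearly_body_size = size_ratio >= 0.90   # assumed cut-off for "not shrunk at all"
    if baseline_raised and clearly_shrunk:
        # Both features present: genuine superscript.
        return "superscript"
    if not baseline_raised or clearly_body_size:
        # A full-size asterisk sits high naturally; one feature alone is not enough.
        return "not_superscript"
    # Raised but only somewhat smaller: below what a vision LLM can call reliably.
    return "needs_manual_check"
```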

boots_brand_name_accuracy:
- ALL CAPS retail convention now explicitly acceptable. L'OREAL,
  ESTEE LAUDER, MAYBELLINE etc. no longer flagged as casing errors —
  only structural element mismatches (accents, hyphens, apostrophes,
  special chars) count.
- Stylised brand logotype exception: known logomarks like `17` for
  SEVENTEEN, &SISTERS ampersand styling, e.l.f. dot rendering are
  Pass — surfaced via new logotype_observations field.
- Brand name avg 5.53 → 7.47 → 6.67 (LLM run-to-run variability).

Strongest real catch in dataset: Easter Overlay page 14 is labelled
for the ROI market in production notes but uses £ instead of € on
the artwork. Exactly the pre-press error worth surfacing. Caught
consistently across all runs by boots_currency_locale.

CLAUDE_BOOTS.md updated with three-pack smoke-test table, vision-LLM
limitations summary, and the four reusable prompt-tuning patterns
that worked on this build.

Local-only — feature/boots-ppack remains unmerged until after Boots
show-and-tell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:26:11 +02:00
nickviljoen
50d0063b37 Add Boots Production Pack profile (multi-page document mode)
New profile boots_ppack for QCing multi-page Boots production packs
(PowerPoint-exported PDFs, 4-18 pages each). Built on top of AXA's
document-mode infrastructure — branched off feature/axa-document-mode
because it reuses the dispatcher, ingest, and result writer.

New checks:
- boots_logo_compliance — three-path scoring (master wordmark / partner
  lock-up / no branding) so OLIVER x BOOTS-style footer lock-ups aren't
  scored against master wordmark rules. Conservative without a formal
  Boots logo guideline.
- boots_colour_palette — verifies CMYK/RGB/Hex spec values on creative-
  guidance pages against canonical Boots Blue / Health Primary Blue /
  Offer Red, plus visual sanity-check on artwork pages.

Existing checks tuned:
- boots_brand_name_accuracy: closed-world list semantics. Brands not on
  the approved list now go to names_not_on_list (manual review) instead
  of failing — the list is sourced from the original 7 docs and is known
  incomplete (Remington, Imodium, Maybelline etc. are legitimate Boots-
  stocked brands not on it).
- boots_tandc_wording: explicit font-weight caveat — Boots Sharp Regular
  vs Light isn't reliably distinguishable by vision LLM at small sizes.
  Surfaced via font_weight_caveat field + needs_manual_check value.
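
A rough sketch of the boots_brand_name_accuracy triage described above, assuming a case-insensitive comparison. triage_brand_names() and the names_matched key are hypothetical; only names_not_on_list comes from the check itself.

```python
def triage_brand_names(found_names, approved_names):
    """Route names missing from the approved list to manual review, not to a fail."""
    approved = {name.casefold() for name in approved_names}
    result = {"names_matched": [], "names_not_on_list": []}
    for name in found_names:
        key = "names_matched" if name.casefold() in approved else "names_not_on_list"
        result[key].append(name)
    # Unknown names flag the asset for review; they do not lower the score here.
    result["needs_manual_check"] = bool(result["names_not_on_list"])
    return result
```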

Page classifier (document_mode/page_classifier.py):
Heuristic tags each page as cover / checklist / palette / notes /
artwork. Validated on all 10 sample packs.

Strict-grade exemption (Profile.strict_grade flag):
Only artwork-classified pages count towards Pass/Fail. Cover, checklist,
palette, and notes pages are still QC'd and reported as Informational
but cannot trigger a Fail. Banner shows exactly which artwork-page
checks fell below 6.
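
The exemption could be sketched as below, assuming each page result is a dict carrying page, page_type, and a scores mapping. That shape and strict_grade() are illustrative, not the result writer's real interface.

```python
STRICT_THRESHOLD = 6.0


def strict_grade(page_results, threshold=STRICT_THRESHOLD):
    """Return (grade, violations); only artwork pages can trigger a Fail."""
    violations = []
    for page in page_results:
        if page["page_type"] != "artwork":
            continue  # cover/checklist/palette/notes pages are informational only
        for check, score in page["scores"].items():
            if score < threshold:
                violations.append((page["page"], check, score))
    return ("Fail" if violations else "Pass"), violations
```

The violations list is what a banner would need in order to name the exact artwork-page checks that fell below the threshold.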

Result writer extended:
- Per-page table with score + page_type pill for any page_each-scope
  check (auto-applied as fallback)
- Strict-grade banner (red on violation, green when clean)
- Page_type pills throughout the per-page strip

Smoke-test result (Remington 4-page pack, 2026-05-05):
Overall 70.75/100, strict-grade Fail. After two iterations of prompt
tuning, all three remaining strict-grade violations are real catches:
orphan asterisk in T&Cs, "they may not be stocked" wording deviation,
missing "Charges may apply". brand_name_accuracy 7.0 (was 3.0 before
list fix), logo_compliance 9.5 (was 1.5 before lock-up path fix).

Local-only — not pushed to dev or merged to develop until after Boots
show-and-tell. Same posture as feature/axa-document-mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:47:13 +02:00
nickviljoen
67ed7fdd9d Add WSJ podcast profile to Dow Jones client; add file naming check to all profiles 2026-04-29 18:17:36 +02:00
nickviljoen
b32e8f0c8b Add WSJ podcast profile to Dow Jones client; add file naming check to all profiles 2026-04-29 18:09:58 +02:00
nickviljoen
20259dcad0 Add Honda client, video QC, session refresh, Amazon check tuning
- Add Honda client with static_general and video_general profiles
- Add video QC capability using Gemini native video analysis (4 checks:
  visual_quality, brand_consistency, text_legibility, pacing_flow)
- Add video_general profile assigned to all 8 clients
- Extend session lifetime with MSAL silent token refresh (proactive
  every 45min + reactive on expiry), switch cache to localStorage
- Re-enable OCR layout measurements for Amazon checks
- Add scope boundary notes to all 6 Amazon checks to prevent cross-
  check penalization (locale errors isolated to logo_country only)
- Relax margins left-alignment tolerance from 1% to 4% to account
  for logo lockup internal padding
- Update brand guidelines DB with Amazon localization matrix and
  processed Dove PDF summary

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:53:52 +02:00
nickviljoen
ce13213b51 Tune all 6 WSJ check prompts based on campaign asset testing
Fixes based on testing 12 WSJ subscription campaign assets across 4
concepts and multiple formats (320x50 to 1080x1920):

- wsj_color_usage: Clarify Pop-on-Jewel is the primary approved
  combination, not a tier-mixing violation
- wsj_logo_compliance: Marketing assets use 30% longest side rule
  (not 60% standalone rule); fix stacked logotype sizing too
- wsj_capitalization_punctuation: Add explicit decision tree for
  Title Case vs Sentence Case — complete sentences use sentence case
- wsj_layout_composition: Add graphic/illustrative as valid design
  variation; add format awareness for small banners
- wsj_imagery_expression: Broaden neutral category to explicitly
  cover graphic/illustrative campaign assets
- wsj_typography_hierarchy: Add format awareness so small formats
  aren't penalised for fewer hierarchy levels

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:33:13 +02:00
nickviljoen
512e5ecb8b Strengthen hidden overlap detection with anti-autocomplete and proximity checks
LLM was autocompleting partial words and reading them as full text, missing
the hidden overlap. New approach: explicit "DO NOT AUTOCOMPLETE" instruction,
character-level boundary check (what background is each character on),
spatial proximity check (text touching product = fail regardless), and
concrete example using the actual test case ("para" where "p" is hidden
on dark purple product appears as "ara").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:10:47 +02:00
nickviljoen
487b2b046b Add incomplete word detection to text_product_overlap check
Adds Step 4 (Hidden Overlap Detection) that catches text-product overlap
where text colour matches product colour, making overlapping letters
invisible. Instead of trying to see the hidden letter, the LLM detects
that a word is truncated/incomplete near the product edge, proving
overlap exists. E.g. "ragrância" instead of "Fragrância" where the
missing "F" is black on dark purple.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:03:29 +02:00
nickviljoen
6f3528b54f Add Boots client QC profile with 5 compliance checks and split CLAUDE.md client docs
New boots_static profile (5 checks, 2.0 weight each) for retail promotional
artwork compliance: caveat rules, brand name accuracy (~170 names), offer
mechanics, T&C wording, and currency/locale. Strict grading override (any
check <6 = Fail). Guidelines embedded from 7 thematic guidance documents.

Also splits client-specific documentation out of CLAUDE.md into separate
CLAUDE_LOREAL.md, CLAUDE_AMAZON.md, CLAUDE_BOOTS.md, and CLAUDE_DOW_JONES.md
files to reduce main file size.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 09:25:58 +02:00
nickviljoen
0e36998359 Check callout text for direct overlap, not just display text
Callout text was fully exempt from overlap checks, so the perfume
image text crammed against the product scored 10/10. Now callouts
are still checked for direct spatial overlap — being near the product
in clean space is OK (shampoo), but overlapping or touching the
product imagery is still a fail (perfume). Only the headline width
check (Step 4) remains display-text-only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:33:22 +02:00
nickviljoen
60a043fcb6 Tighten callout exemption: require visible connector line
Marketing copy near the product was being misclassified as exempt
callout text. Now callout exemption requires a visible connector/
pointer line drawn from the text to a product feature. Text without
a connector line is always classified as display text regardless
of size or proximity to the product.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:27:31 +02:00
nickviljoen
3b202fe1b1 Exempt callout/annotation text from text_product_overlap check
Small feature callouts with connector lines (e.g. "Plástico
reciclado" pointing to bottle cap) are standard cosmetics layout,
not an overlap problem. Shampoo image was false-positive failing
because callout text near the product was flagged.

Now classifies text as DISPLAY (headlines, titles, body — checked
strictly) vs CALLOUT (feature annotations with pointer lines —
exempt). Only display text triggers overlap failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:08:56 +02:00
nickviljoen
560ee8e85c Add headline width check to text_product_overlap prompt
LLM kept passing the mask image because headline is technically
above the densest part of the translucent shape. Added Step 4
(Headline Width Check) that catches wide single-line headlines
extending across the product's horizontal space — even if vertically
above. Includes exact bad/good examples matching the L'Oreal
mask vs shampoo assets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:02:00 +02:00
nickviljoen
a66fea7295 Fix text_product_overlap false positives: exclude callout lines and tighten hero zone
Shampoo image was incorrectly failing because:
1. Thin connector/callout lines (pointing from text to product features)
   were being treated as text overlap — now explicitly excluded
2. Hero zone was too wide — small scattered droplets/bubbles were
   extending the zone to cover the full image. Now clarified that
   hero zone is the compact cluster around the product only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 10:53:55 +02:00
nickviljoen
66f1d1480b Add dedicated text_product_overlap check for L'Oreal profile
Revert text_readability to original (overlap is a layout issue, not a
readability one — LLM kept scoring it Pass because text was readable).

New text_product_overlap check uses a step-by-step approach:
1. Define the product hero zone (including translucent/glass elements)
2. Identify all marketing text
3. Check spatial overlap between text and hero zone
4. Compare good vs bad layout patterns

L'Oreal Static profile now has 4 checks at 2.5 weight each (was 3
checks at 3.33). Total check count: 66.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 10:46:13 +02:00
nickviljoen
bff2739604 Rewrite text-product overlap as step-by-step hero zone evaluation
LLM was still dismissing translucent 3D shapes as background. Rewrote
the check as a 3-step process (define hero zone, check overlap, score)
with explicit warning not to dismiss transparent elements. Added
concrete example matching the L'Oreal Absolut Repair Molecular mask
asset where the headline crosses the translucent sculpted shape.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 10:34:33 +02:00
nickviljoen
638692ff9a Strengthen text-product overlap prompt to catch translucent packaging
LLM was not recognising translucent/glass decorative elements around
products as part of the product area. Expanded the definition to
explicitly include semi-transparent shapes, artistic renders, and
silhouettes. Added boundary-drawing instruction and concrete
pass/fail examples matching L'Oreal Absolut Repair Molecular assets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 10:28:19 +02:00
nickviljoen
fe22f0e7b9 Add text-over-product overlap detection to text_readability prompt
Adds critical check for marketing text overlapping product or product
packaging (e.g., translucent containers, decorative elements). Text
overlapping product area caps score at 4-5/10, triggering Fail under
L'Oreal's strict grading override.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 10:19:48 +02:00
nickviljoen
b6f5f7471e Tune background_contrast prompt: focus on actual visibility not colour similarity
The prompt was too aggressive about light products on light backgrounds,
causing professional product photography on white backgrounds to fail
(e.g. L'Oreal cream jar on white). Now evaluates whether the product
is actually VISIBLE and DISTINGUISHABLE (via shadows, edges, texture,
contrasting elements) rather than failing on theoretical colour match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 19:13:49 +02:00
nickviljoen
eeaeb96b4e Fix file queue overflow for long filenames + score product-only shots neutrally
- File queue: add text-overflow ellipsis, min-width:0, flex-shrink:0 to
  prevent long filenames from pushing Pending/Remove buttons out of view
- text_readability: product-only images (no marketing text layout) now
  score 7/10 neutral instead of 1-2/10 critical fail. Hidden/invisible
  text in marketing layouts still scores 1-2.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:57:01 +02:00
nickviljoen
5a57a8a064 Add Dow Jones client with 3 sub-brand QC profiles (17 new checks)
New client with embedded brand guidelines for Dow Jones Corporate,
MarketWatch, and Wall Street Journal sub-brands. Guidelines sourced
from live.standards.site scrapes and baked into check prompts.

- dow_jones_static: 5 checks (logo, color, typography, square motif, photography)
- marketwatch_static: 6 checks (logo, color, typography, image treatment, layout, art direction)
- wsj_static: 6 checks (logo, color tiers, typography, imagery, capitalization, layout)
- System now has 7 clients, 12 profiles, 65 QC checks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:33:14 +02:00
nickviljoen
9eed569587 Tone down OCR from authoritative to supplementary to reduce false positives
OCR measurements were causing the LLM to over-rely on bounding box numbers
and fail correct assets on minor measurement inaccuracies. Changes:
- All prompts now say "supplementary data" not "authoritative/primary source"
- LLM instructed to prioritise visual assessment, use OCR to confirm/question
- Alignment tolerance widened from 1.5% to 3% of width
- OCR context footer softened with accuracy caveat (~5-10px margin of error)
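
To show what the wider tolerance changes in practice, here is a small sketch of a left-alignment test expressed as a fraction of asset width. left_edges_aligned() is a hypothetical helper; in the checks themselves the tolerance is stated in the prompt text.

```python
def left_edges_aligned(left_edges_px, asset_width_px, tolerance=0.03):
    """True when all left edges fall within tolerance * width of each other."""
    spread = max(left_edges_px) - min(left_edges_px)
    return spread <= tolerance * asset_width_px


# Headline, date, and logo left edges on a 1000px-wide asset (made-up values).
edges = [100, 120, 110]
old_rule = left_edges_aligned(edges, 1000, tolerance=0.015)  # 20px spread vs 15px allowed
new_rule = left_edges_aligned(edges, 1000, tolerance=0.03)   # 20px spread vs 30px allowed
```

With a stated OCR error margin of about 5-10px, a 15px allowance on a 1000px asset leaves little headroom, which is consistent with the false positives described above.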

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 13:54:00 +02:00
nickviljoen
9f9777240a Add OCR layout measurement module for precise spatial QC checks
Adds Tesseract-based OCR pre-processing that computes pixel-level text
positions, margins, spacing, and alignment before LLM analysis. This
enables detection of subtle layout differences that vision models miss
(e.g. 2.8% vs 6.4% headline margin, 83px vs 39px date gap).

OCR measurements injected into 10 checks across all client profiles:
- Amazon: margins, typography, headline_layout
- Static General: element_alignment, safety_area, visual_hierarchy_general,
  text_readability_general, text_edge_clearance
- L'Oreal: text_readability
- Diageo/Unilever KV: visual_hierarchy

Non-blocking: if Tesseract is unavailable, checks run with visual
estimation only. Production requires: sudo apt install tesseract-ocr
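
A sketch of the non-blocking behaviour and the margin arithmetic, under these assumptions: boxes are (left, top, width, height) tuples in pixels, and extract_boxes is a caller-supplied function that performs the actual OCR. All names here are hypothetical and the module's real interface may differ.

```python
import shutil


def tesseract_available() -> bool:
    # Looks for the tesseract binary on PATH without importing any OCR library.
    return shutil.which("tesseract") is not None


def margin_percentages(box, asset_width, asset_height):
    """Convert a pixel bounding box into margins as percentages of the asset size."""
    left, top, width, height = box
    return {
        "left_pct": round(100 * left / asset_width, 1),
        "right_pct": round(100 * (asset_width - left - width) / asset_width, 1),
        "top_pct": round(100 * top / asset_height, 1),
        "bottom_pct": round(100 * (asset_height - top - height) / asset_height, 1),
    }


def ocr_context(image_path, asset_width, asset_height, extract_boxes):
    """Return per-box margins, or None so the check runs on visual estimation only."""
    if not tesseract_available():
        return None
    boxes = extract_boxes(image_path)
    return [margin_percentages(box, asset_width, asset_height) for box in boxes]
```

On a 1080px-wide asset, a headline starting 30px from the left edge gives a left_pct of 2.8, and one starting at 69px gives 6.4, which is the scale of difference the commit describes.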

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 11:00:07 +02:00
nickviljoen
08b17508ba Add periodic auth session check and rename Amazon Box Placement to Element Placement
- Add silent auth check every 5 minutes to detect expired sessions proactively,
  showing a "Session Expired" prompt instead of failing silently on next action
- Rename amazon_box_placement to amazon_element_placement across module directory,
  profile config, class name, and documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 10:01:27 +02:00
nickviljoen
d5d54a6a3d Refine Amazon typography and margins prompts to reduce false positives
- Typography: soften date spacing rule for tall/portrait formats (half cap-height OK)
  to fix false positive on DE_Ströer correct file
- Margins: add percentage-based gap estimation to force LLMs to quantify left alignment
  differences, improving detection of subtle headline oversizing (IE_BrightSide case)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:36:17 +02:00
nickviljoen
1df80e4664 Tighten Amazon prompt detection for box tape cropping, margin alignment, and date spacing
- Box placement: tape cut off by asset edge is now a fail (was passing if tape visible on other edges)
- Margins: left alignment consistency elevated to critical check with step-by-step comparison method
- Typography: headline-to-date spacing uses cap-height reference instead of lenient "only if clearly too tight"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:56:20 +02:00
nickviljoen
34a03be6cb Amazon prompt tuning, report details for passing checks, and queue fix
- Fix process queue bug: reset queue items to pending after completion so
  reprocessing works without page refresh
- Show report details for 10/10 scores by extracting explanation and
  elements_found fields from Amazon check JSON responses, with a fallback
  that renders all JSON data as a structured summary
- Add Amazon Static grade override: any individual check scoring below 6
  forces overall grade to Fail (same logic as L'Oreal)
- Box placement prompt: relax tape visibility rule from "all edges" to
  "at least one edge visible", prevent false positives on landscape formats
  where box sits near right edge
- Headline layout prompt: fix LLM misreading one-word-per-line as combined
  lines in tall/portrait formats, score multi-sentence headlines in 9:16
  as 6/10 (pass with recommendation) instead of fail

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:19:42 +02:00
nickviljoen
a580d4bb0f Revert L'Oreal profile to 3 checks and add hidden text detection
Reverted loreal_static from 2-check (visual_readability_contrast) to
3-check setup (language_consistency, text_readability, background_contrast)
to avoid score dilution. Updated text_readability and background_contrast
prompts from POS-focused to digital marketing, and added critical hidden/
invisible text detection (black-on-black, white-on-white scanning).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 12:28:57 +02:00
nickviljoen
ee9f7ed2be Reduce false positives in headline, typography, and box checks
Headline layout: prepositions at end of line (di, of, de) are now
acceptable in display typography. Tall format one-word-per-line is
standard. Only fail for sentence endings mid-line or genuinely
confusing breaks.

Typography: spacing check now only flags genuinely cramped/touching
elements, not moderate gaps that provide visible separation.

Box placement: improved tape description to match actual campaign
assets (branded coloured strips on box edges, not plain packing tape).
Only flag tape as missing if cropped by asset edge or genuinely absent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 11:07:35 +02:00
nickviljoen
345eff99e3 Add left alignment consistency check to margins prompt
The margins check was giving 10/10 when headline text was too close to
the left edge and misaligned with the logo/branding below. Added left
alignment consistency section checking that headline, date, and logo
share the same left margin position, with misalignment flagged as a
margin/sizing issue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 10:33:45 +02:00
nickviljoen
9631477250 Add headline-to-date spacing check in typography prompt
The typography check was giving 10/10 when the date text was squeezed
up against the headline with no breathing room. Added explicit element
spacing assessment section with guidance that the gap between headline
and next element should be at least the height of a capital letter in
the date text, and included it in evaluation steps and output JSON.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 10:24:26 +02:00
nickviljoen
0f0a953b06 Strengthen box placement tape/flap visibility check
The LLM was giving 10/10 Pass to boxes with tape completely cut off,
treating aggressive right-side crops as acceptable. Added explicit
step-by-step tape visibility checks, clarified that right-side crop
must not remove tape/flaps, and added guidance to not assume tape is
visible just because the box body is present.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 10:19:41 +02:00
nickviljoen
9b40ad0e77 Equalize Amazon profile weights and refine QC check prompts
Adjusted all 6 Amazon check weights to equal 1.67 each based on test
results showing incorrect scoring. Refined prompts for box placement
(format-aware positioning, better tape description), required elements
(subhead now optional for OOH), logo country (country match as primary
factor), margins (visual assessment over pixel estimates), and headline
layout (natural language break detection, tall format awareness).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 10:07:06 +02:00
nickviljoen
24df09aa9f Add Amazon and Boots clients with Amazon ASD 2025 QC profile
Add Boots client with static_general profile and Amazon client with
6 new brand-specific QC checks based on ASD 2025 design guidelines:
amazon_required_elements, amazon_logo_country, amazon_typography,
amazon_headline_layout, amazon_margins, and amazon_box_placement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 11:52:03 +02:00
nickviljoen
3411c1028b Strengthen product-background blending detection in visual readability check
- Add explicit per-product contrast evaluation (each product checked individually)
- Dark product on dark background or light on light = automatic fail
- Single product blending fails the entire check even if others pass
- Require per-product breakdown in JSON output (product_1, product_2, etc.)
- Add edge/silhouette distinguishability criteria

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 19:48:57 +02:00
nickviljoen
8c9401bf3c Add combined check, client-scoped assets, UI restructure, reporting, and report consolidation
- Create visual_readability_contrast combined check merging text readability
  and background contrast into a single LLM call for L'Oreal Static profile
- Update loreal_static.json to use combined check (2 checks, 100-point scale)
- Add client_id filtering to brand guidelines (upload, fetch, backfill migration)
- Restructure settings modal from 5 tabs to 4: Profile, Create New Profile,
  Reference Assets, Reporting (removed Model Selection, merged Tools into Profile)
- Add GET /api/profile_usage_stats endpoint with summary cards and recent analyses
- Add POST /api/consolidate_reports endpoint generating HTML summary with
  pass/fail highlighting from multiple selected reports
- Add report selection checkboxes and consolidation controls to saved files list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 19:10:32 +02:00
nickviljoen
70c2563521 Create digital-focused general check modules and update Static General profile
Created 4 new "_general" QC check modules optimized for digital static assets:
- visual_hierarchy_general: Digital hierarchy assessment (removed POS/physical viewing distances)
- product_visibility_general: Digital product presentation (removed POS terminology)
- logo_visibility_general: Digital logo prominence (removed 3m/1m viewing distance requirements)
- call_to_action_general: Digital CTA effectiveness (added clickability and mobile considerations)

Updated Static General profile (static_general.json):
- Now includes 10 AI vision-focused checks
- Even weighting: 1.0 per check for 100-point scale
- Total weight: 10.0 for proper scoring calculation
- All checks assigned to Gemini LLM
- Updated description to clarify focus on AI vision capabilities

Profile focuses exclusively on checks that only AI vision models can perform,
excluding physical file properties that Twist system handles (file size, format,
resolution, naming, aspect ratio, bleed, crop marks, etc.).

10 checks in Static General profile:
1. text_readability_general
2. background_contrast_general
3. language_consistency
4. visual_hierarchy_general (NEW)
5. element_alignment
6. product_visibility_general (NEW)
7. logo_visibility_general (NEW)
8. call_to_action_general (NEW)
9. accessibility
10. inclusive

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-02 11:55:23 +02:00
nickviljoen
16741a96d6 Add L'Oréal Static General profile with multi-file queue and enhanced reporting
## New Features

### L'Oréal Static General Profile
- Created new profile with 3 checks optimized for digital marketing assets
- Even weighting (33.3% each) for 100-point scoring scale
- Removed print-specific requirements (3m viewing distance)
- Focus on marketing text vs product packaging distinction

### Multi-File Queue System (web_ui.html)
- Added file queue functionality for batch processing
- Users can now upload and process multiple files simultaneously
- Queue displays file status (pending, analyzing, complete, error)
- Individual file removal and queue clearing options
- Progress tracking for batch operations

### New General QC Checks
1. background_contrast_general
   - Optimized for digital assets (no distance requirements)
   - Checks logo, product, and marketing text contrast
   - Detects overlapping and blending issues
   - Provides element-by-element breakdown

2. text_readability_general
   - Focus on marketing text only (excludes product packaging)
   - Checks for overlapping elements
   - Digital readability optimization
   - Specific issue identification

3. language_consistency (enhanced)
   - Better distinction between marketing and packaging text
   - Detailed language detection and reporting
   - Lists specific text analyzed

### Usage Tracking System
- Added usage_tracker.py for analysis logging
- Tracks user activity, profile usage, and costs
- Daily log files in JSONL format
- Cost estimation per LLM provider
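
A minimal sketch of daily JSONL logging, assuming one file per UTC day and one JSON object per line. log_analysis(), the file name pattern, and the record fields are illustrative and not taken from usage_tracker.py.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_analysis(log_dir, user, profile, estimated_cost, now=None):
    """Append one analysis record to the current day's JSONL file."""
    now = now or datetime.now(timezone.utc)
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"usage_{now:%Y-%m-%d}.jsonl"
    record = {
        "timestamp": now.isoformat(),
        "user": user,
        "profile": profile,
        "estimated_cost": estimated_cost,
    }
    # Append mode: each call adds exactly one line, so files stay easy to tail and parse.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return path
```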

## Bug Fixes

### Authentication & User Management
- Fixed missing Flask 'g' import
- Fixed user info access in background threads
- Pass user_info to threads instead of accessing g.user
- Improved error handling for usage logging
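
The thread fix follows a common pattern: copy what the worker needs while still inside the request, then pass it in as an argument. The sketch below uses only the standard library to stay runnable. run_analysis(), start_analysis(), and the user_info keys are hypothetical.

```python
import threading


def run_analysis(job_id, user_info, results):
    # Runs in a worker thread. There is no Flask request context here, so g.user
    # cannot be read; the function relies only on the arguments it was given.
    results[job_id] = {"job_id": job_id, "requested_by": user_info["email"]}


def start_analysis(job_id, user_info, results):
    # Called from the request handler, where user_info would be copied out of g.user.
    # dict(...) takes a snapshot so later changes to the original cannot leak in.
    worker = threading.Thread(
        target=run_analysis, args=(job_id, dict(user_info), results)
    )
    worker.start()
    return worker
```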

### HTML Report Generation
- Fixed missing analysis details in reports
- Now extracts and displays all JSON fields properly
- Shows comprehensive breakdowns:
  - Analysis details
  - Elements checked (logo, product, text)
  - Marketing text found
  - Issues identified
  - Specific recommendations
- No more blank "Pass/Fail" results

### Scoring System
- Fixed usage_tracker to handle dict of check results (not list)
- Better handling of model_used field variations
- Skip non-dict check results gracefully
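
A sketch of the defensive iteration described above, assuming check results arrive as a dict keyed by check name. summarise_checks() and the fallback "model" key are assumptions; only model_used is named in this commit.

```python
def summarise_checks(check_results):
    """Collect (check, model) pairs, skipping entries that are not dicts."""
    summary = []
    for name, result in check_results.items():
        if not isinstance(result, dict):
            continue  # e.g. an error string or None stored in place of a result
        # Tolerate a missing or differently named model field.
        model = result.get("model_used") or result.get("model") or "unknown"
        summary.append({"check": name, "model": model})
    return summary
```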

## Configuration Changes

### Model Versions (llm_config.py)
- Replaced invalid GPT-4.1 model ID with gpt-4o
- Added Gemini 3 Pro beta model option
- AVAILABLE_MODELS dict for UI selection
- Model version override support

### Profile Updates
- Static General: 3 checks, total weight 10.0
- Each check: text_readability_general (3.33), background_contrast_general (3.33), language_consistency (3.34)
- Maximum score: 100 points
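
The weighting arithmetic can be sketched as follows, assuming each check returns a score from 0 to 10. overall_score() is a hypothetical helper; it normalises by the total weight so the maximum stays at 100 even if the weights do not sum to exactly 10.0.

```python
import math


def overall_score(check_scores, weights):
    """Weighted total on a 100-point scale, for per-check scores in the range 0-10."""
    total_weight = sum(weights.values())
    weighted_sum = sum(check_scores[name] * weight for name, weight in weights.items())
    return 100.0 * weighted_sum / (10.0 * total_weight)


weights = {
    "text_readability_general": 3.33,
    "background_contrast_general": 3.33,
    "language_consistency": 3.34,
}
perfect = overall_score({name: 10 for name in weights}, weights)
```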

## Technical Improvements

- Enhanced prompt engineering for consistent LLM outputs
- Mandatory detailed explanations in all checks
- Structured JSON responses with comprehensive fields
- Better error messages and fallback handling
- Client configuration support (client_config.py)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-02 10:58:39 +02:00
nickviljoen
3fec052c12 Create frontend and backend folder structure for deployment
Organized the application into separate frontend and backend directories for cleaner deployment and better separation of concerns.

Frontend Directory (frontend/):
- index.html: Single-page web interface (renamed from web_ui.html)
- README.md: Frontend deployment guide
- Total size: ~113 KB (self-contained)
- Smart base path detection (works at / or /ai_qc/)
- No configuration changes required

Backend Directory (backend/):
- All Python files (api_server.py, llm_config.py, etc.)
- visual_qc_apps/: 33 QC check modules
- profiles/: 6 QC profile configurations
- brand_guidelines/: Reference asset storage
- config/: Environment configurations
- scripts/: Deployment automation
- uploads/, output/: Data directories
- requirements.txt, ai_qc.service, apache_config.conf
- Complete documentation

New Documentation:
- FOLDER_STRUCTURE.md: Comprehensive guide to new structure
- frontend/README.md: Frontend deployment instructions
- backend/BACKEND_README.md: Backend deployment guide

Deployment Mapping:
- frontend/ → /var/www/html/ai_qc/ (web root)
- backend/ → /opt/ai_qc/ (application directory)

Benefits:
- Clear separation of concerns
- Backend code not in web-accessible directory
- Independent frontend/backend updates
- Matches server's existing patterns (/opt/veo3, /opt/voice2text)
- Industry-standard architecture
- Easy to deploy and maintain

Original files preserved in root directory for reference.
Ready for production deployment following MIGRATION_GUIDE.md.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-06 11:55:53 +02:00