ai_qc/CLAUDE_LOREAL.md

# L'Oreal Client Documentation

> Referenced from main CLAUDE.md. This file contains detailed L'Oreal QC check descriptions, prompt tuning history, test file locations, and known gaps.

## L'Oreal QC Tools

Four checks for L'Oreal digital static marketing material quality. All checks use equal weight (2.5 each). Profile includes a strict grading override: any individual check scoring below 6 forces an overall Fail regardless of the total score.

| Tool | What it checks |
|------|---------------|
| `language_consistency` | All marketing text uses a single consistent language. Excludes product packaging text, brand names, legal disclaimers. Mixed languages in headlines/body copy = Fail |
| `text_readability` | Text size, font clarity, spacing, contrast, placement. **Hidden/invisible text detection**: actively scans for text that may be same colour as background (black-on-black, white-on-white). Checks message completeness (missing expected text = critical fail) |
| `background_contrast` | Logo presence and visibility, per-product contrast against background, marketing text contrast. **Hidden element detection**: scans dark areas for dark elements and light areas for light elements. Missing brand logo = critical fail (score 1-2) |
| `text_product_overlap` | Checks whether marketing text overlaps the product hero zone. Classifies text as **display** (headlines, titles, body -- checked strictly) vs **callout** (feature annotations with visible connector lines -- checked leniently). **Headline width check**: single-line headlines extending >50% image width across product's horizontal space = Fail. Callout text in clean negative space = Pass, callout text crammed against/overlapping product = Fail. Product-only images with no marketing text = Not Applicable (7/10) |
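
The equal weighting and the strict override described above can be sketched as follows. This is a minimal illustration, not the code in `api_server.py`: the function names, the constants, and the normalisation formula (weighted sum scaled to 0-100) are assumptions, and the overall pass threshold is deliberately left out because it is not stated here.

```python
from typing import Mapping

MAX_CHECK_SCORE = 10   # each check is scored out of 10
STRICT_MIN_SCORE = 6   # any check below this forces an overall Fail


def weighted_total(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted total on a 0-100 scale (assumed formula, not taken from api_server.py)."""
    total_weight = sum(weights[name] for name in scores)
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return 100.0 * weighted_sum / (MAX_CHECK_SCORE * total_weight)


def strict_override_triggered(scores: Mapping[str, float]) -> bool:
    """True if any individual check scores below the strict minimum."""
    return any(score < STRICT_MIN_SCORE for score in scores.values())


weights = {
    "language_consistency": 2.5,
    "text_readability": 2.5,
    "background_contrast": 2.5,
    "text_product_overlap": 2.5,
}
# Illustrative per-check scores: three strong checks and one weak one.
scores = {
    "language_consistency": 10,
    "text_readability": 9,
    "background_contrast": 9,
    "text_product_overlap": 4,
}

total = weighted_total(scores, weights)
forced_fail = strict_override_triggered(scores)
print(f"total={total:.1f}, forced_fail={forced_fail}")  # total=80.0, forced_fail=True
```

With these illustrative scores the weighted total is 80.0, yet the single check at 4/10 triggers the override, which is the situation the override exists to catch.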

## Prompt Tuning History

### Round 1 (2026-03-30)
Prompts were refined based on testing with 4 assets, each with a known single defect category:

- **Test files location**: `/Users/nickviljoen/Desktop/AI_QC_Bitbucket/loreal/Test Files/`
- **Reports location**: `/Users/nickviljoen/Desktop/AI_QC_Bitbucket/loreal/latest_reports/` (Google and Open AI subfolders)

**What each test file has wrong**:

| Test File | Known Defect |
|---|---|
| Missing text or in black.png | Text missing or rendered in black on dark background (invisible) |
| Text in is both English and German and overlayed.png | Marketing text in both German and English, overlaid on top of each other |
| Text is not readable.png | Purple/magenta text on red background -- nearly invisible contrast |
| Text on top of the image and not readable.png | Text placed over complex imagery, brand logo missing |

**Key prompt changes made (from POS-focused to digital marketing)**:
1. **Text Readability**: Rewritten from POS/viewing-distance focus to digital marketing context. Added Step 3: "Detect Hidden or Invisible Text" -- instructs LLM to scan dark areas for dark text, light areas for light text, look for faint outlines. Added message completeness check (missing expected text = score 1-2).
2. **Background Contrast**: Rewritten from POS to digital marketing. Added Step 3: "Detect Hidden or Invisible Elements" -- same colour-matching scan for logos, products, and text. Added per-product individual evaluation (one blending product = fail). Missing logo = critical fail.
3. **Profile reverted from 2 checks to 3**: Previously used a combined `visual_readability_contrast` check. Reverted to separate `text_readability` + `background_contrast` to avoid score dilution -- the combined check was masking individual failures.
4. **L'Oreal-specific grading override** (`api_server.py`): If ANY individual check scores below 6, the overall grade is forced to Fail. This prevents high-scoring checks from pulling up the average when a critical defect exists.

**Detection accuracy (2026-03-30)**:

| File | Gemini Score | Gemini Grade | OpenAI Score | OpenAI Grade |
|---|---|---|---|---|
| Missing text or in black | 73.3 | Fail | 80.0 | Fail |
| English+German overlayed | 20.0 | Fail | 20.0 | Fail |
| Text is not readable | 46.6 | Fail | 46.6 | Fail |
| Text on top of image | 60.0 | Fail | 63.3 | Fail |

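All 4 defective assets were correctly detected as Fail by both models. OpenAI's 80.0 for "Missing text or in black" is a Fail only because of the strict override (background_contrast scored 5/10, below the minimum of 6). The same asset previously scored 80/100 Pass with the old 2-check profile.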

**Known gaps**:
- "Missing text or in black" text_readability scores 9/10 on both models -- they don't penalise the hidden text as a readability issue, only as a contrast issue. Consider whether text_readability prompt needs further tuning for this case.
- OpenAI scores background_contrast at 5/10 for "Missing text or in black" vs Gemini at 3/10 -- OpenAI is more lenient on dark-on-dark product contrast.
- "Text on top of image" text_readability scores 6-7 (borderline pass) -- the text is actually reasonably legible (black on white), so this may be accurate. Main issue is the missing brand logo caught by background_contrast.

### Round 2 (2026-04-07)
Prompts refined based on testing with 5 new assets: four from the L'Oreal Professionnel Absolut Repair Molecular product range plus one Kerastase asset (test.png).

- **Test files location**: `/Users/nickviljoen/Desktop/AI_QC_Bitbucket/loreal/New Assets/AI QC test/`
- **Reports location**: `/Users/nickviljoen/Desktop/AI_QC_Bitbucket/loreal/New Assets/reports3/`

**Test files**:

| Test File | Type | Expected Result |
|---|---|---|
| LP-ARM_MASK_BELOWTHEFOLD_MODULE4_CARROUSEL4_COMMITMENTS...pt-PT.jpg | Marketing layout (Portuguese) with product -- headline over product | Fail |
| LP-ARM_SHAMPOO_BELOWTHEFOLD_MODULE4_CARROUSEL4_COMMITMENTS...pt-PT.jpg | Marketing layout (Portuguese) with product -- headline wrapped to side | Pass |
| LP-ARM_MASK_BELOWTHEFOLD_MODULE6_COMPARISONTABLE1_488X700_MULTIPLE_EAN.jpg | Product-only shot (mask jar) | Pass |
| LP-ARM_MASK_BELOWTHEFOLD_MODULE6_COMPARISONTABLE2_488X700_MULTIPLE_EAN.jpg | Product-only shot (leave-in bottle) | Pass |
| test.png | Kerastase products blending into light background (Hungarian) | Fail |

**Issues found and fixes applied**:
1. **text_readability**: Product-only shots (no marketing text) were scoring 1-2/10 as "critical failure -- text missing entirely." Fixed by adding a carve-out: product-only images with no marketing layout score 7/10 neutral with explanation "no marketing text to evaluate." Assets where marketing text SHOULD exist but is hidden/invisible still score 1-2.
2. **background_contrast**: Cream/light products on white backgrounds (standard cosmetics photography) were failing because the prompt had a rigid rule: "light-coloured products on light backgrounds = FAIL." Rewritten to focus on **actual visibility** -- products pass if they have visual separation cues (shadows, edge definition, texture differences, contrasting elements like dark lids) even on similar-coloured backgrounds. Only fails if the product genuinely disappears into the background.
3. **UI fix**: Long filenames (common in L'Oreal asset naming) were overflowing the file queue container. Fixed with `overflow: hidden`, `text-overflow: ellipsis`, `min-width: 0` on queue items, and `word-break` on report filenames.

**Detection accuracy (2026-04-07, Gemini 3 Pro)**:

| File | Score | Grade | Notes |
|---|---|---|---|
| MASK_COMMITMENTS (pt-PT) | 96.7 | Pass | Language 10, Contrast 10, Readability 9 |
| SHAMPOO_COMMITMENTS (pt-PT) | 93.3 | Pass | Language 10, Contrast 10, Readability 8 |
| COMPARISONTABLE2 (bottle) | 90.0 | Pass | Language 10, Contrast 10, Readability 7 (neutral) |
| COMPARISONTABLE1 (jar) | 86.7 | Pass | Language 10, Contrast 9, Readability 7 (neutral) |
| test.png (Kerastase) | 73.3 | Fail | Contrast 4 -- products genuinely blend into background |

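All 4 LP-ARM assets pass, and the defective asset (test.png) correctly fails due to poor product-background contrast. Note that MASK_COMMITMENTS passes here although its expected result is Fail: its defect (headline over product) is a layout issue that none of the three checks in this round evaluates.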

### Round 3 (2026-04-09)
Added new `text_product_overlap` check to detect marketing text overlapping the product hero zone. Created as a separate dedicated check because text_readability couldn't catch layout/positioning issues -- the text was readable, just poorly positioned.

**New check**: `text_product_overlap` -- evaluates spatial relationship between marketing text and the product hero zone (product + surrounding translucent/decorative elements).

**Key design decisions**:
1. **Separate check vs modifying text_readability**: Text overlapping a product is a layout/composition issue, not a readability issue. LLMs consistently scored readable-but-overlapping text as Pass on readability. A dedicated check with focused prompt works reliably.
2. **Display vs callout text classification**: Feature callout text with visible connector lines (e.g., "Plastico reciclado.1" -> bottle cap) is standard cosmetics layout. Display text (headlines, body copy) is checked strictly. Callout text is checked leniently (OK in clean space, fail if crammed against product).
3. **Headline width check**: Single-line headlines spanning >50% image width across the product's horizontal space = Fail. This catches the pattern where a headline should be wrapped to multiple lines on one side.
4. **Hero zone definition**: Compact cluster around product only -- not every scattered decorative element. Small droplets/bubbles at edges are background, not hero zone.
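
The decision rules above can be written down as a small rule table. The sketch below is only a way to make the rubric explicit; the actual check is performed by an LLM prompt, and every name here (`TextElement`, its fields, `overlap_verdict`) is hypothetical. It assumes the spatial facts have already been extracted for each piece of marketing text.

```python
from dataclasses import dataclass
from typing import List

HEADLINE_WIDTH_LIMIT = 0.5  # fraction of image width, from the headline width check


@dataclass
class TextElement:
    kind: str                              # "display" or "callout"
    overlaps_hero_zone: bool = False       # display text intruding into the hero zone
    crammed_against_product: bool = False  # callout text touching/overlapping the product
    single_line: bool = False
    width_fraction: float = 0.0            # text width divided by image width
    crosses_product_span: bool = False     # runs across the product's horizontal space


def overlap_verdict(elements: List[TextElement]) -> str:
    """Return "Pass", "Fail", or "Not Applicable" for one image."""
    if not elements:
        # Product-only image with no marketing text (scored 7/10 neutral).
        return "Not Applicable"
    for el in elements:
        if el.kind == "callout":
            # Lenient: fine in clean negative space, fails only when crammed.
            if el.crammed_against_product:
                return "Fail"
            continue
        # Display text is checked strictly.
        if el.overlaps_hero_zone:
            return "Fail"
        too_wide = el.single_line and el.width_fraction > HEADLINE_WIDTH_LIMIT
        if too_wide and el.crosses_product_span:
            return "Fail"
    return "Pass"


wide_headline = TextElement("display", single_line=True,
                            width_fraction=0.8, crosses_product_span=True)
clean_callout = TextElement("callout")
print(overlap_verdict([wide_headline, clean_callout]))  # Fail
print(overlap_verdict([clean_callout]))                 # Pass
print(overlap_verdict([]))                              # Not Applicable
```

Keeping callouts and display text on separate branches mirrors the classification step: a callout never fails merely for sitting near the product, while a headline fails as soon as it intrudes into the hero zone or spans the product on a single line.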

**Profile updated**: L'Oreal Static now has 4 checks at 2.5 weight each (was 3 checks at 3.33). Total check count: 66.

**Detection accuracy (2026-04-09, Gemini)**:

*New assets (reports10):*

| File | Score | Grade | Notes |
|---|---|---|---|
| MASK_COMMITMENTS (pt-PT) | - | Fail | Overlap 4, Language 10, Readability 9, Contrast 9 -- headline extends across product |
| SHAMPOO_COMMITMENTS (pt-PT) | - | Pass | Overlap 10, Language 10, Readability 9, Contrast 10 -- headline wrapped to side |
| COMPARISONTABLE1 (jar) | - | Pass | All 10s except Readability 7 (neutral, product-only) |
| COMPARISONTABLE2 (bottle) | - | Pass | All 10s except Readability 7 (neutral, product-only) |
| test.png (Kerastase) | - | Fail | Overlap 4, Contrast 4, Readability 4 -- text over flowers, products blend |

*Original test files (reports4):*

| File | Score | Grade | Notes |
|---|---|---|---|
| Missing text or in black | - | Pass | **Regression**: background_contrast scores 10 (was 5) due to LLM variability. Known gap -- text_readability doesn't penalise hidden text as readability issue |
| English+German overlayed | - | Fail | Language 1, Readability 1, Overlap 4 |
| Text is not readable | - | Fail | Readability 2, Contrast 3 |
| Text on top of image | - | Fail | Contrast 1 (missing logo). Overlap 10 -- text has connector lines and single-character overlap blends into product colour; at limit of LLM detection |
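
Because LLM variability can flip a grade between runs, it may help to compare each run against the expected results mechanically. The sketch below hard-codes the expected grades and the grades from the two tables above. `EXPECTED`, `OBSERVED`, and `find_regressions` are illustrative names, not part of the existing tooling, and a real harness would read the grades from the generated reports instead.

```python
from typing import Dict, List

# Expected grades: the original test files all carry a known defect.
EXPECTED: Dict[str, str] = {
    "MASK_COMMITMENTS (pt-PT)": "Fail",
    "SHAMPOO_COMMITMENTS (pt-PT)": "Pass",
    "COMPARISONTABLE1 (jar)": "Pass",
    "COMPARISONTABLE2 (bottle)": "Pass",
    "test.png (Kerastase)": "Fail",
    "Missing text or in black": "Fail",
    "English+German overlayed": "Fail",
    "Text is not readable": "Fail",
    "Text on top of image": "Fail",
}

# Grades observed in the 2026-04-09 Gemini run (reports10 and reports4).
OBSERVED: Dict[str, str] = {
    "MASK_COMMITMENTS (pt-PT)": "Fail",
    "SHAMPOO_COMMITMENTS (pt-PT)": "Pass",
    "COMPARISONTABLE1 (jar)": "Pass",
    "COMPARISONTABLE2 (bottle)": "Pass",
    "test.png (Kerastase)": "Fail",
    "Missing text or in black": "Pass",
    "English+German overlayed": "Fail",
    "Text is not readable": "Fail",
    "Text on top of image": "Fail",
}


def find_regressions(expected: Dict[str, str], observed: Dict[str, str]) -> List[str]:
    """Return the assets whose observed grade is missing or differs from the expected one."""
    return sorted(name for name, grade in expected.items()
                  if observed.get(name) != grade)


regressions = find_regressions(EXPECTED, OBSERVED)
print(regressions)  # ['Missing text or in black']
```

Running the same comparison over several repeated runs would show how often a variable asset such as "Missing text or in black" flips, rather than relying on a single report.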

**Known gaps**:
- "Missing text or in black" occasionally passes due to LLM variability on background_contrast scoring
- Single-character overlaps where the letter blends into the product colour (same hue) are at the detection limit
- text_product_overlap cannot catch very subtle proximity issues where text is millimetres from the product edge but technically in white space