Video QC:
* _extract_locale_from_filename now also handles the suffix form
..._XX-yy.ext (case-insensitive both sides), so DOOH/OOH-style
adapt filenames like ..._ES-es.mp4 unblock the price_currency
check instead of skipping with "could not extract locale".
* Batch results page expires the SQLAlchemy session at the top of
the route so the post-completion reload sees committed reports
even when it lands on a different gunicorn worker than the one
that wrote them. Reload delay bumped 1s → 2s for margin.
* visual_quality prompt now passes the filename's market+language
to the LLM and tells it the on-screen copy should be in the
localized language, not the source-language guideline copy.
Stops Spanish-market videos being flagged as "language mismatch
with English campaign guidelines".
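
A minimal sketch of the suffix-form match described in the first bullet
above. The function name, the regex, and the 'lang-MARKET' output casing
are assumptions for illustration, not the actual body of
_extract_locale_from_filename.

```python
import re
from typing import Optional

# Hypothetical pattern: a trailing "_XX-yy" token right before the extension.
# Both sides accept either case, as the commit describes.
_SUFFIX_LOCALE = re.compile(r"_([A-Za-z]{2})-([A-Za-z]{2})\.[A-Za-z0-9]+$")


def extract_suffix_locale(filename: str) -> Optional[str]:
    """Return 'yy-XX' (language-MARKET) for names ending in _XX-yy.ext."""
    match = _SUFFIX_LOCALE.search(filename)
    if match is None:
        return None
    market, language = match.groups()
    # Assumed output casing: lowercase language, uppercase market.
    return f"{language.lower()}-{market.upper()}"
```

With this sketch a name such as "HM_DOOH_1080x1920_ES-es.mp4" yields
"es-ES", and a name without the suffix yields None so the caller can
still skip.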
Printer Check:
* regions.json rewritten to cover all 10 H&M regions (AME, CEU,
NEU, GCN, IND, SHE, SEU, EEU, EAS, Franchise) with default-all
groups. Two judgement calls vs the screenshot: kept TR for
Turkey (TK is Tokelau in ISO 3166-1 and would break filename
matching) and BR for Brazil (consistent with every other code,
which is 2-letter ISO).
Campaign codes:
* New core/utils/campaign_code.py is the single source of truth.
Matches both the legacy format of four digits plus an optional
letter (1013A, 4116) and the new 11-character alphanumeric format
with the two-digit year at positions 5-6 (CFUL263C01D). All four
prior parser sites now import from this helper.
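
A rough sketch of the two code shapes described above. The patterns and
the classify_campaign_code name are assumptions (in particular, that
letters are uppercase); they are not the contents of
core/utils/campaign_code.py.

```python
import re
from typing import Optional

# Legacy: four digits plus an optional uppercase letter, e.g. 1013A or 4116.
LEGACY_CODE = re.compile(r"\d{4}[A-Z]?")
# New: 11 alphanumerics with a two-digit year at positions 5-6 (1-indexed).
NEW_CODE = re.compile(r"[A-Z0-9]{4}\d{2}[A-Z0-9]{5}")


def classify_campaign_code(code: str) -> Optional[str]:
    """Return 'legacy', 'new', or None when the code fits neither shape."""
    if LEGACY_CODE.fullmatch(code):
        return "legacy"
    if NEW_CODE.fullmatch(code):
        return "new"
    return None
```

Using fullmatch rather than search keeps a legacy code embedded in a
longer token from matching by accident.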
Video Master:
* BOX_CAMPAIGNS_FOLDER_ID switched 156182880490 → 133295752718
(same root the Reporting tool uses). Updated config.py default
and all three .env example files.
* Match page now shows which Box folder the search runs against
(with a clickable link), and on a not-found error explains what
was searched for so missing-campaign cases are self-diagnosable.
A. Excel upload — /campaigns/pricing/upload now accepts .xlsx/.xls
alongside .pdf. File picker in the campaigns UI matches.
B. Deterministic Excel parser (openpyxl, no LLM) — looks for H&M-style
mastersheets:
- 'MPC Prices' sheet -> flat list of {product_id, language, country,
price, currency, product_name} entries (this is the gold mine).
- Regional sheets (AME/CEU/EEU/...) -> formatted prices per locale
used to derive currency symbol, position, decimal/thousands
separators. Skips OLD/COPY sheets.
Verified against the attached 1013A mastersheet: 448 price entries
across 7 products x 74 locales, 139 locale format entries.
Parser lives in modules/campaigns/pricing_parser.py alongside the
existing PDF path (which now also returns the structured form with
empty _prices).
New lookup shape stored in PricingReference.parsed_data_json:
{"_format": {"en-US": {currency_code, symbol, position, ...}, ...},
"_prices": [{product_id, language, country, price, currency,
product_name}, ...]}
Legacy flat {"<code>": {...}} is still recognised (treated as _format
only) for backwards compatibility with the legacy global JSON import.
Model helpers added:
- PricingReference.get_format_map()
- PricingReference.get_prices()
to_dict() now reports price_count alongside entry_count.
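
A sketch of how the two stored shapes could be told apart. split_lookup
is a hypothetical helper, not the model's get_format_map()/get_prices()
implementation; it only illustrates the rule that a legacy flat dict is
treated as _format with no prices.

```python
import json
from typing import Any, Dict, List, Tuple


def split_lookup(parsed_data_json: str) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Return (format_map, prices) for either the new or the legacy shape."""
    data = json.loads(parsed_data_json) if parsed_data_json else {}
    if "_format" in data or "_prices" in data:
        # New structured shape: either key may be absent (e.g. PDF path).
        return data.get("_format", {}), data.get("_prices", [])
    # Legacy flat {"<code>": {...}} shape: format information only.
    return data, []
```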
C. Upgraded price_currency_check.py — when a pricing reference with
_prices is attached, the check runs a deterministic comparison:
detected price(s) -> normalize (_normalize_price handles '$49.99',
'39,99 €', 'CHF 49.95', '1.234,56', 'Rs. 2,799', '13 995 Ft', '349,-',
'0.999.000'...) -> compare with tol=0.005 against the expected
per-locale rows. LLM-based campaign-sheet fallback only runs if no
_prices are present (legacy PDF reference or has_pricing campaign
presentation).
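
The normalize-then-compare step can be sketched as below. This is an
illustrative approximation, not the real _normalize_price: the
separator heuristics (last separator wins when both appear; a lone
separator followed by exactly three digits is read as thousands) are
assumptions.

```python
import re
from typing import Optional

# First run of digits, allowing '.', ',' and whitespace between digits.
_NUMERIC_TOKEN = re.compile(r"\d(?:[\d.,\s]*\d)?")


def normalize_price(text: str) -> Optional[float]:
    """Best-effort conversion of a displayed price string to a float."""
    match = _NUMERIC_TOKEN.search(text)
    if match is None:
        return None
    token = re.sub(r"\s", "", match.group())  # '13 995' -> '13995'
    has_dot, has_comma = "." in token, "," in token
    if has_dot and has_comma:
        # Whichever separator appears last is taken as the decimal mark.
        decimal_sep = "." if token.rfind(".") > token.rfind(",") else ","
        thousands_sep = "," if decimal_sep == "." else "."
        token = token.replace(thousands_sep, "").replace(decimal_sep, ".")
    elif has_dot or has_comma:
        sep = "." if has_dot else ","
        head, _, tail = token.rpartition(sep)
        if token.count(sep) > 1 or len(tail) == 3:
            token = token.replace(sep, "")  # thousands grouping
        else:
            token = head + "." + tail  # decimal mark
    try:
        return float(token)
    except ValueError:
        return None


def prices_match(detected: float, expected: float, tol: float = 0.005) -> bool:
    """Compare two normalized prices within an absolute tolerance."""
    return abs(detected - expected) <= tol
```

The three-digit rule means a value like '1.234' is read as 1234, which
suits displayed prices but would be wrong for quantities with three
decimal places.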
D. Video QC price check — new _run_price_check step in the executor.
Parses the filename (Market_lang_CampaignNum_... -> 'lang-Market'
locale), detects prices across frames via the same Gemini/GPT-4o path
the other checks use, then validates them deterministically against
the attached pricing reference. Skipped when there is no pricing ref,
the locale is unknown, the market is GEN/CEN, or no price is visible
in the video.
Overall video score now uses a weighted mean of active (non-skipped)
checks (visual_quality w=50, censorship w=50, price_currency w=30)
instead of the hardcoded 50/50 split, so a skipped check drops out
and the remaining weights are renormalized.
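
A small sketch of the weighted mean over active checks. The weights
come from this commit; the overall_score name and the convention that a
skipped check is represented as None are assumptions.

```python
from typing import Dict, Optional

WEIGHTS = {"visual_quality": 50, "censorship": 50, "price_currency": 30}


def overall_score(scores: Dict[str, Optional[float]]) -> Optional[float]:
    """Weighted mean of the checks that ran; None marks a skipped check."""
    active = {
        name: score
        for name, score in scores.items()
        if score is not None and name in WEIGHTS
    }
    total_weight = sum(WEIGHTS[name] for name in active)
    if total_weight == 0:
        return None  # every check was skipped
    weighted_sum = sum(WEIGHTS[name] * score for name, score in active.items())
    return weighted_sum / total_weight
```

With visual_quality=80, censorship=100 and price_currency skipped, the
result is (50*80 + 50*100) / 100 = 90.0, the same as the old 50/50
split.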
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "Global Pricing Reference" is no longer a single file at
storage/reference/global_pricing.json. Pricing references are now
first-class DB rows (PricingReference model), uploadable as a library
in the Campaigns tab and selectable per-run alongside the campaign
presentation dropdown on the HM QC and Video QC configure pages.
New:
- core/models/pricing_reference.py — PricingReference model: id, name,
pdf_filename, pdf_path, parsed_content, parsed_data_json, status,
created_at/by. get_lookup() deserializes parsed_data_json; to_dict()
powers the dropdown API.
- /campaigns/pricing/upload — creates a PricingReference row, saves PDF
under storage/pricing_references/<id>/, kicks off background parse.
- /campaigns/pricing/<id> DELETE, /campaigns/api/pricing/list,
/campaigns/api/pricing/status/<id>.
- Campaigns index: "Pricing References" table card (mirrors the
presentations card) + upload form with optional name field.
Changed:
- pricing_parser: parse_pricing_pdf_to_dict returns (dict, raw_text);
new parse_pricing_reference(id) runs the parse against a DB row and
sets status to ready/error. Legacy file-based path removed.
- QCExecutor and VideoQCExecutor accept pricing_reference_id; load the
row into context['pricing_reference']={id, name, lookup}.
- BatchQCExecutor and BatchVideoQCExecutor thread pricing_reference_id
through to per-file executors.
- price_currency_check._validate_currency reads context instead of the
disk file; returns 'skipped_no_reference' if no ref attached.
- HM QC + Video QC /execute and /execute/batch routes pass
pricing_reference_id from the JSON payload.
- Configure templates for HM QC and Video QC add a second dropdown
"Pricing Reference (Optional)" loaded from /campaigns/api/pricing/list.
Backwards compatibility:
- app.py: on startup, if storage/reference/global_pricing.json exists
and the pricing_references table is empty, import it as a
"Default (legacy global)" PricingReference row so existing installs
keep a valid reference attached (user can pick it at configure time).
- config.py: retains GLOBAL_PRICING_{PDF,JSON}_PATH for the legacy
importer; adds PRICING_REF_STORAGE_PATH for the new per-row storage.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Video QC: Switch to Google Gemini direct video analysis as default (OpenAI frame grid fallback)
- HM QC: Group reports by batch with collapsible sections, ZIP download per batch
- HM QC: Generate asset thumbnails (150px) displayed in report listings
- Speed: Remove artificial delays, add ThreadPoolExecutor(2) for parallel batch processing
- Price detection: Improved prompt with country context, detect all prices, increased text limit
- New Printer Check module: CSV-to-PDF cross-referencing ported from CrossMatch Rust app
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Strip markdown code fences from LLM response before JSON parsing
- Log raw response and parsed result for debugging
- Show warning with provider/model info when detection fails (instead of silent skip)
- Separate "detection failed" (warning, 70) from "no price found" (skipped, 100)
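
A sketch of the fence-stripping step from the first bullet above.
parse_llm_json is a hypothetical name, and the pattern assumes the
whole response is wrapped in a single fenced block.

```python
import json
import re
from typing import Any

_TICKS = "`" * 3  # built at runtime to avoid a literal fence in this example
_FENCED = re.compile(rf"^{_TICKS}[\w-]*[ \t]*\n(.*?)\n?{_TICKS}$", re.DOTALL)


def parse_llm_json(raw: str) -> Any:
    """Strip one surrounding markdown code fence, then parse as JSON."""
    text = raw.strip()
    match = _FENCED.match(text)
    if match:
        text = match.group(1)
    # json.JSONDecodeError propagates so the caller can report
    # "detection failed" instead of silently skipping.
    return json.loads(text)
```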
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Global pricing parser now explicitly extracts format only (symbol,
position, separators) — ignores actual price values in the reference doc
- Executors load ALL ready documents for a campaign (not just the latest),
combining their content — supports guidelines + media plan side by side
- Campaign context now separates pricing_content (from has_pricing docs)
from general parsed_content (all docs combined)
- Price check uses pricing_content specifically for actual price validation
- Report header shows document count (e.g., "1022B - AW25 Display (2 docs) + pricing")
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Filename check:
- Rewritten to flexibly parse multiple H&M naming conventions
(Display, DOOH, OOH, SOME STATIC, Social, POS, DS)
- Extracts country code, language code, dimensions, campaign number
- Scores based on how much metadata was extracted (not rigid pattern)
- Tested against real filenames: BG_bg, ES_es, NO-no formats
Price/currency check (new):
- Detects prices in images via LLM vision API
- Validates currency against global pricing reference (deterministic)
- Falls back to LLM validation for unknown countries
- Optional campaign pricing sheet validation when has_pricing=True
- Added to profile with weight 30
Profile weights rebalanced: filename 30, quality 40, price 30
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces a new Campaigns module for uploading campaign presentation PDFs
that QC checks reference to validate assets against campaign-specific
guidelines (typography, layout, copy, pricing format). Also adds a global
pricing reference system that maps country codes to currency symbols and
formats for deterministic price/currency validation.
- New CampaignPresentation model + campaigns blueprint with CRUD routes
- PDF parsing via LlamaParse (text + multimodal page images)
- Global pricing PDF parsed into structured JSON lookup
- Campaign context injected into both image and video QC executors
- Quality checks enhanced with campaign guidelines in LLM prompts
- Price/currency check uses global pricing lookup (saves an LLM call)
- Campaign dropdown added to HM QC and Video QC configure pages
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New UsageLog model tracking every LLM API call (provider, model,
tokens, estimated cost, user, module, check name)
- Instrument LLMConfig.call_vision_api() to auto-log each call
- New /usage tab in nav bar with dashboard showing:
- Summary cards (total calls, tokens, estimated cost)
- Breakdowns by provider, model, tool, and user
- Recent API calls table
- Time filters (All Time, 30 Days, 7 Days, Today)
- Cost estimates based on per-model token pricing
- Pass logged-in user through executor context for tracking
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace dimension_check with filename_parse in H&M Image Check profile
- Rewrite quality check prompt to be much stricter on text legibility:
- Text legibility is now the #1 priority (CRITICAL check)
- Any illegible text forces score below 70 (FAILED)
- Explicit instructions to check ALL text including small overlays
- Low contrast text on dark/busy backgrounds flagged as common failure
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Update image quality prompt to evaluate text/title legibility
- Add Google Gemini (generativeai) as LLM provider in LLMConfig
- Add AI Provider dropdown on configure page (OpenAI GPT-4o / Google Gemini)
- Pass selected provider through execute routes to override profile defaults
- Add google-generativeai to requirements.txt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace 3 profiles with single "H&M Image Check" (dimension_check + image_quality)
- Remove filename_parse check (pattern didn't match actual filenames)
- Create DimensionCheck class for image dimension validation
- Fix configure page to route multi-file uploads to batch endpoint
- Auto-select single profile, show file list on configure page
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merge original CLI check implementations from hm_qc/ and
hm_qc_video/ repos into modules/*/checks/legacy/ directories.
Includes profiles, launchers, utils, orchestrators, and the
standalone video Flask web app. Reference files (test data,
results, cheat sheets) copied to gitignored reference/ directory.
Censorship trainset copied to gitignored data/supporting/.
The legacy/ naming convention separates original run_check()
function-based implementations from the new BaseCheck class
architecture.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New blueprint-based module system (hm_qc, video_qc, video_master,
reporting), core framework (database, config, templates), and
unified web interface with progress tracking and tab navigation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>