- /api/search-campaign now kicks off a background thread and returns
immediately. The browser polls /api/progress/<session_id> and fetches
the cached result via the new /api/search-campaign-result/<session_id>
endpoint when complete. Box folder enumeration for a not-found campaign
was taking >30s, which exceeded the GCP load balancer's response timeout
and reached the user as a 'stream timeout' response (not valid JSON).
- Result cached for 10 min via the existing reporting result_cache
(filesystem-backed → safe across gunicorn workers).
- Form label/placeholder/hint updated: tool accepts a campaign NUMBER,
not a campaign name. Placeholder shows '1993857' instead of
'1011A Spring SS2025'.
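
The kick-off-then-poll flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the in-memory dictionaries and the names start_search, get_progress, get_result and wait_for_result are hypothetical, and a multi-worker deployment would need a shared store such as the filesystem-backed result_cache mentioned above.

```python
import threading
import time
import uuid

# Hypothetical in-memory stores (assumption). They are only visible to one
# process, so they stand in for a cache shared across gunicorn workers.
_progress = {}
_results = {}
_lock = threading.Lock()


def start_search(campaign_number, search_fn):
    """Run search_fn in a background thread and return a session id at once."""
    session_id = uuid.uuid4().hex
    with _lock:
        _progress[session_id] = "running"

    def worker():
        try:
            result = {"ok": True, "data": search_fn(campaign_number)}
        except Exception as exc:  # keep failures JSON-friendly
            result = {"ok": False, "error": str(exc)}
        with _lock:
            _results[session_id] = result
            _progress[session_id] = "complete"

    threading.Thread(target=worker, daemon=True).start()
    return session_id


def get_progress(session_id):
    """What a progress endpoint could report for this session."""
    with _lock:
        return _progress.get(session_id, "unknown")


def get_result(session_id):
    """What a result endpoint could return once the job is complete."""
    with _lock:
        return _results.get(session_id)


def wait_for_result(session_id, timeout=5.0, interval=0.05):
    """Poll the way the browser would, then fetch the stored result."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_progress(session_id) == "complete":
            return get_result(session_id)
        time.sleep(interval)
    return None
```

Because the initial request only starts the thread, it returns well inside any proxy timeout; the slow work is reported through the short polling requests instead.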
Video QC:
* _extract_locale_from_filename now also handles the suffix form
..._XX-yy.ext (case-insensitive on both sides of the hyphen), so
DOOH/OOH-style adapt filenames like ..._ES-es.mp4 now run the
price_currency check instead of skipping it with "could not extract locale".
* Batch results page expires the SQLAlchemy session at the top of
the route so the post-completion reload sees committed reports
even when it lands on a different gunicorn worker than the one
that wrote them. Reload delay bumped 1s → 2s for margin.
* visual_quality prompt now passes the filename's market+language
to the LLM and tells it the on-screen copy should be in the
localized language, not the source-language guideline copy.
Stops Spanish-market videos being flagged as "language mismatch
with English campaign guidelines".
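
As an illustration of the suffix form handled by the locale extraction above, a minimal sketch could look like this. The function name extract_locale_suffix, the regex, and the choice to return an upper-case market with a lower-case language are assumptions for the example; the real _extract_locale_from_filename may normalise differently.

```python
import re

# Assumed pattern: "_XX-yy" directly before the file extension, where both
# halves are two letters in either case.
_SUFFIX_LOCALE = re.compile(r"_([A-Za-z]{2})-([A-Za-z]{2})\.[A-Za-z0-9]+$")


def extract_locale_suffix(filename):
    """Return (market, language) from a ..._XX-yy.ext filename, else None."""
    match = _SUFFIX_LOCALE.search(filename)
    if match is None:
        return None
    # Normalising case here is an assumption made for this sketch.
    return match.group(1).upper(), match.group(2).lower()
```

A filename without the suffix returns None, which is the case that previously produced "could not extract locale".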
Printer Check:
* regions.json rewritten to cover all 10 H&M regions (AME, CEU,
NEU, GCN, IND, SHE, SEU, EEU, EAS, Franchise) with default-all
groups. Two judgement calls vs the screenshot: kept TR for Turkey
(TK is Tokelau in ISO 3166-1 and would break filename matching)
and BR for Brazil (every other code is a 2-letter ISO code).
Campaign codes:
* New core/utils/campaign_code.py is the single source of truth.
Matches both the legacy format of 4 digits plus an optional letter
(1013A, 4116) and the new 11-character alphanumeric format with the
year at positions 5-6 (CFUL263C01D). All four prior parser sites now
import from this helper.
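
A minimal sketch of how the two campaign code formats could be recognised is shown below. It is not the contents of core/utils/campaign_code.py: the function name parse_campaign_code, the returned dictionary, and the assumption that the year at positions 5-6 is always two digits are all hypothetical.

```python
import re

# Legacy format: 4 digits plus an optional letter, e.g. 1013A or 4116.
_LEGACY = re.compile(r"\d{4}[A-Za-z]?")
# New format: 11 alphanumeric characters. Positions 5-6 (1-indexed) are
# assumed here to be a two-digit year, e.g. "26" in CFUL263C01D.
_NEW = re.compile(r"[A-Za-z0-9]{4}(\d{2})[A-Za-z0-9]{5}")


def parse_campaign_code(text):
    """Classify a campaign code, or return None if it fits neither format."""
    code = text.strip().upper()
    if _LEGACY.fullmatch(code):
        return {"code": code, "format": "legacy", "year": None}
    match = _NEW.fullmatch(code)
    if match:
        return {"code": code, "format": "new", "year": match.group(1)}
    return None
```

Keeping both patterns in one helper means the call sites only need to agree on the returned shape, not on the regexes themselves.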
Video Master:
* BOX_CAMPAIGNS_FOLDER_ID switched 156182880490 → 133295752718
(same root the Reporting tool uses). Updated config.py default
and all three .env example files.
* Match page now shows which Box folder the search runs against
(with a clickable link), and on a not-found error explains what
was searched for so missing-campaign cases are self-diagnosable.
Lifted JWT-cookie auth pattern from the AI QC sibling project:
core/auth/middleware.py validates Azure AD JWTs and stores them in
an httpOnly cookie (hm_aiqc_auth_token). Tenant membership is
enforced by JWTValidator's tid check, which is sufficient for the
tenant-wide access policy chosen for this project.
templates/login.html now drives an MSAL.js popup that POSTs the
ID token to /auth/login. base.html exposes Azure config to all
pages so the logout button can also clear the MSAL session.
app.py's @before_request now checks the JWT cookie and exposes
g.user; modules read user identity via core.auth.current_user_email
so usage logs and created_by columns now record the signed-in
user's email rather than a session value.
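
The tenant check and the user lookup can be pictured with a small sketch. It assumes the token's signature, issuer and expiry have already been verified elsewhere and only shows the tid comparison; the function name user_from_claims and the fallback from preferred_username to email are assumptions, not the behaviour of JWTValidator or core.auth.current_user_email.

```python
def user_from_claims(claims, allowed_tenant_id):
    """Return the user's email if verified claims belong to the tenant.

    `claims` is assumed to be the payload of a token whose signature and
    expiry were already validated; this sketch does no cryptography.
    """
    if claims.get("tid") != allowed_tenant_id:
        return None  # wrong tenant or missing tid claim: treat as unauthenticated
    # Which claim carries the address is an assumption made for this sketch.
    return claims.get("preferred_username") or claims.get("email")
```

A request hook could store the returned value on g.user and redirect to the login page when it is None.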
Legacy username/password code removed: top-level auth_middleware.py,
jwt_validator.py, deploy/generate_password.py.
- Folder discovery groups files by version (V1, V2, ...); only the highest
version per master/adapt is matched. Lower versions are reported as
"superseded" so users can see what was skipped.
- Matching is now an asymmetric 3-pass cascade per adaptation:
Pass 1: masters of same duration (±0.5s) — pHash + AKAZE
Pass 2: masters strictly longer than the adapt — pHash + AKAZE
(shorter masters can't have produced the adapt; never compared)
Pass 3: AI Vision on same-duration / different-resolution masters,
triggered only when Passes 1 and 2 find nothing (covers crops).
- AI Vision default switched from gpt-4o to gemini-2.5-flash (~10x cheaper)
and re-enabled in CampaignMatcher.
- Master temp files now persist for the whole run so Pass 3 can re-read
frames; cleanup still happens via shutil.rmtree at end of run.
- Report shows a "Resolved at" badge per match (Pass 1/2/3) and a new
Superseded Files section.
- New /video-master/report/<id>/download endpoint serves the saved HTML
with attachment headers; Download buttons added to results.html and
view_report.html.
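
The version grouping and the pass ordering above can be sketched as follows. Everything here is illustrative: the `_V<n>` filename convention, the dictionary fields, the function names, and the visual_match / ai_match callables (stand-ins for pHash + AKAZE and AI Vision) are assumptions. Treating "strictly longer" as longer than the adapt by more than the tolerance, and returning the first hit rather than a best score, are also simplifications.

```python
import re
from collections import defaultdict

# Assumed convention: a "_V<n>" token immediately before the extension.
_VERSION = re.compile(r"_V(\d+)(?=\.[^.]+$)", re.IGNORECASE)


def split_superseded(filenames):
    """Return (latest, superseded): only the highest version per file is kept."""
    groups = defaultdict(list)
    for name in filenames:
        match = _VERSION.search(name)
        version = int(match.group(1)) if match else 0
        key = _VERSION.sub("", name).lower()  # same file, version removed
        groups[key].append((version, name))
    latest, superseded = [], []
    for entries in groups.values():
        entries.sort()
        latest.append(entries[-1][1])
        superseded.extend(name for _, name in entries[:-1])
    return sorted(latest), sorted(superseded)


def match_adapt(adapt, masters, visual_match, ai_match, tolerance=0.5):
    """Return (master_name, pass_number) or (None, None) for one adaptation."""
    same = [m for m in masters
            if abs(m["duration"] - adapt["duration"]) <= tolerance]
    longer = [m for m in masters
              if m["duration"] > adapt["duration"] + tolerance]
    # Passes 1 and 2: visual fingerprint comparison. Shorter masters are in
    # neither pool, so they are never compared.
    for pass_number, pool in ((1, same), (2, longer)):
        for master in pool:
            if visual_match(adapt, master):
                return master["name"], pass_number
    # Pass 3: only reached when both visual passes found nothing.
    for master in same:
        if master["resolution"] != adapt["resolution"] and ai_match(adapt, master):
            return master["name"], 3
    return None, None
```

The returned pass number is the kind of value a "Resolved at" badge could display, and the superseded list maps onto the Superseded Files section.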
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add BOX_CAMPAIGNS_FOLDER_ID config (156182880490), separate from
BOX_REPORT_FOLDER_ID, which is for QC reports
- Update search_subfolder() to try the Box search API first (fast for
large folders with 1000+ campaigns) and fall back to folder listing
- Increase folder listing limit from 200 to 500
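
The search-first, list-as-fallback order can be sketched like this. The two callables are hypothetical stand-ins for the Box client calls, not real SDK signatures, and paging through the listing in chunks of 500 is an assumption added for the example.

```python
def find_subfolder(name, search_api, list_children, page_size=500):
    """Find a child folder by case-insensitive name.

    search_api(name) -> iterable of {"id", "name"} dicts (assumed shape).
    list_children(offset, limit) -> list of {"id", "name"} dicts (assumed shape).
    """
    wanted = name.casefold()
    # Fast path: ask the search index first.
    for folder in search_api(name):
        if folder["name"].casefold() == wanted:
            return folder
    # Slow path: walk the folder listing page by page.
    offset = 0
    while True:
        page = list_children(offset, page_size)
        for folder in page:
            if folder["name"].casefold() == wanted:
                return folder
        if len(page) < page_size:
            return None  # last page reached without a match
        offset += len(page)
```

Comparing names exactly after the search keeps a fuzzy search hit from being mistaken for the requested folder.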
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Full workflow:
- Enter campaign name → search Box for campaign folder
- Auto-discover Global Masters and Regional Masters subfolders
- Preview: shows master count, countries, adaptation count
- Phase 1: Download each master to temp, fingerprint, delete video
- Phase 2: Download each adaptation to temp, match against masters, delete
- Results: per-master adaptation mapping, unmatched items, match rate
- HTML report with detailed breakdown
- Previous Matching Jobs table with View/Delete
Box client additions:
- search_subfolder() - case-insensitive subfolder search
- list_subfolders() - enumerate child folders
- list_video_files() - list video files in folder
- download_file_to_disk() - streaming download for large files (ProRes)
Storage: only fingerprints (~50KB) and key frames are stored permanently.
Videos are deleted immediately after processing.
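
The download, fingerprint, delete cycle can be sketched as below. The names fingerprint_and_discard and sha256_file are hypothetical, and the SHA-256 digest is only a stand-in: it is not a perceptual fingerprint and would not match re-encoded video.

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=1 << 20):
    """Stand-in fingerprint: a byte-level hash read in chunks (assumption)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fingerprint_and_discard(download_to, fingerprint=sha256_file):
    """Download to a temp file, fingerprint it, and always delete the video.

    download_to(path) is a hypothetical callable that writes the video to
    `path`. The temp path is returned only so callers can confirm deletion.
    """
    fd, path = tempfile.mkstemp(suffix=".mp4")
    os.close(fd)
    try:
        download_to(path)
        return fingerprint(path), path
    finally:
        if os.path.exists(path):
            os.remove(path)
```

The finally block removes the file even when the download or fingerprint step raises, so a failed job does not leave large video files on disk.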
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New blueprint-based module system (hm_qc, video_qc, video_master,
reporting), core framework (database, config, templates), and
unified web interface with progress tracking and tab navigation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>