nickviljoen 8a7d477c86 Fix batch QC: add Flask app context to ThreadPoolExecutor child threads
ThreadPoolExecutor workers don't inherit the parent thread's Flask app
context, causing "Working outside of application context" errors during
batch QC execution. Pass the app instance into BatchQCExecutor and wrap
each child thread's work with app.app_context(). Also ensure the
progress_sessions table is created on fresh databases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:20:56 +02:00
core Fix batch QC: add Flask app context to ThreadPoolExecutor child threads 2026-04-16 15:20:56 +02:00
database Add modular architecture, core framework, and web UI 2026-02-25 11:39:04 +02:00
deploy v1.2.0: Add Docker deployment, simplify auth to local login, production config 2026-03-21 14:37:53 +02:00
modules Fix batch QC: add Flask app context to ThreadPoolExecutor child threads 2026-04-16 15:20:56 +02:00
static v1.2.0: Add Docker deployment, simplify auth to local login, production config 2026-03-21 14:37:53 +02:00
templates v2.2.0: Gemini video, batch grouping, thumbnails, speed, price fix, printer check 2026-04-16 13:56:07 +02:00
.dockerignore v1.2.0: Add Docker deployment, simplify auth to local login, production config 2026-03-21 14:37:53 +02:00
.env.example v2.0.0: Update all documentation for major release 2026-03-21 22:13:27 +02:00
.gitignore Consolidate legacy hm_qc and video_qc tools into main project 2026-02-25 11:40:53 +02:00
app.py v2.2.0: Gemini video, batch grouping, thumbnails, speed, price fix, printer check 2026-04-16 13:56:07 +02:00
auth_middleware.py Initial Commit 2025-12-30 16:47:56 +02:00
box_client.py Initial Commit 2025-12-30 16:47:56 +02:00
CHANGELOG.md v2.1.0: Update README and CHANGELOG for campaigns & pricing features 2026-03-26 19:34:44 +02:00
config.py Add campaign presentation management and global pricing reference 2026-03-26 16:12:22 +02:00
DEPLOYMENT_CHECKLIST.md Add modular architecture, core framework, and web UI 2026-02-25 11:39:04 +02:00
docker-compose.yml v1.2.0: Add Docker deployment, simplify auth to local login, production config 2026-03-21 14:37:53 +02:00
Dockerfile Add storage/campaigns and storage/reference dirs to Dockerfile 2026-03-26 18:06:57 +02:00
DOCUMENTATION_SUMMARY.txt Update documentation for unified platform consolidation 2026-02-25 13:51:21 +02:00
gunicorn_config.py v1.2.0: Add Docker deployment, simplify auth to local login, production config 2026-03-21 14:37:53 +02:00
INTEGRATION_TEST_REPORT.md Add modular architecture, core framework, and web UI 2026-02-25 11:39:04 +02:00
jwt_validator.py Initial Commit 2025-12-30 16:47:56 +02:00
MIGRATION_GUIDE.md Update documentation for unified platform consolidation 2026-02-25 13:51:21 +02:00
README.md v2.2.0: Update README for all new features 2026-04-16 15:05:54 +02:00
report_parser.py Initial Commit 2025-12-30 16:47:56 +02:00
requirements.txt Add Excel (.xlsx) support for campaign media plans / price sheets 2026-03-26 18:54:59 +02:00
run.sh Reporting updated. 2026-01-14 09:14:00 +02:00
run_prod.sh Initial Commit 2025-12-30 16:47:56 +02:00
setup.sh Initial Commit 2025-12-30 16:47:56 +02:00
test_integration.py Add modular architecture, core framework, and web UI 2026-02-25 11:39:04 +02:00
test_local.sh Initial Commit 2025-12-30 16:47:56 +02:00
wsgi.py v1.2.0: Add Docker deployment, simplify auth to local login, production config 2026-03-21 14:37:53 +02:00

Unified HM QC Platform

Version: 2.2.0
Status: Production (Deployed)
Deployed at: https://ai-sandbox.oliver.solutions/hm-ai-qc-report

A comprehensive quality control platform for H&M marketing assets with AI-powered validation, video matching, and consolidated reporting.


Overview

The platform integrates seven tools into a single web application:

  1. Reporting - Consolidated QC reports from Box.com with search history
  2. HM QC - AI-powered image quality control (text legibility, language, quality, pricing)
  3. Video QC - AI-powered video quality control (direct video analysis via Gemini)
  4. Video Master Adot - Campaign-based master-to-adaptation video matching via Box
  5. Printer Check - CSV-to-PDF cross-referencing for print order validation
  6. Campaigns - Campaign presentation and media plan management for QC reference
  7. Usage Dashboard - API usage tracking, token counts, and cost estimates

Key Features

  • Unified tabbed interface with H&M branding
  • Local username/password authentication
  • Multi-provider AI: OpenAI GPT-4o and Google Gemini 2.5 Flash
  • Google Gemini direct video analysis (no frame extraction needed)
  • Campaign presentation & media plan upload for guideline-based QC validation
  • Global pricing reference for currency symbol/format validation per country
  • Batch report grouping with collapsible sections and ZIP download
  • Asset thumbnails in report listings
  • Parallel batch processing (2 concurrent files) for improved speed
  • Real-time progress tracking (SSE + polling)
  • Docker deployment with Apache reverse proxy
  • Usage tracking with estimated costs per API call
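
As a rough illustration of the SSE side of progress tracking, the sketch below formats progress updates in the Server-Sent Events wire format. The event names (`progress`, `done`) and payload fields are assumptions made for this example, not the platform's actual schema.

```python
import json
from typing import Iterator, Optional


def format_sse(data: dict, event: Optional[str] = None) -> str:
    """Serialize one Server-Sent Events message (terminated by a blank line)."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def progress_stream(total: int) -> Iterator[str]:
    """Yield one 'progress' message per finished item, then a final 'done' message."""
    for done in range(1, total + 1):
        payload = {"done": done, "total": total, "percent": round(100 * done / total)}
        yield format_sse(payload, event="progress")
    yield format_sse({"status": "complete"}, event="done")


if __name__ == "__main__":
    for message in progress_stream(2):
        print(message, end="")
```

A generator like this could be returned from a Flask view with the `text/event-stream` content type; clients that cannot hold the stream open would fall back to polling.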

Deployment

Docker (Production)

# Clone from Bitbucket
git clone git@bitbucket.org:zlalani/hm_ai_qc_report_tool.git /opt/hm-qc-app
cd /opt/hm-qc-app

# Configure environment
cp .env.example .env
# Edit .env with production values (see .env.example)
# Generate password: python3 deploy/generate_password.py

# Build and start
docker compose build
docker compose up -d

# Create database tables
docker exec hm-qc-app python3 -c "from app import app; from core.models.database import db; app.app_context().push(); db.create_all(); print('Tables created')"

The app runs on 127.0.0.1:5050 inside Docker. Configure Apache or Nginx as a reverse proxy in front of it; see deploy/ for config snippets.

Common Commands

docker compose logs -f              # Tail logs
docker compose restart              # Quick restart
docker compose down && docker compose up -d --build  # Rebuild after code changes
git pull && docker compose down && docker compose up -d --build  # Deploy update

Modules

1. Reporting

Consolidated QC report search from Box.com and local database.

Features:

  • Job number search (single or comma-separated for multi-job)
  • Async search with real-time progress bar
  • Box reports saved locally for instant re-viewing (no re-fetch)
  • Previous Box Reports section with View/Delete
  • Dashboard with designer-friendly error display
  • Export: HTML and CSV (full or errors-only)
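
A minimal sketch of how a comma-separated job number field might be normalized before searching. The function name is hypothetical, and dropping duplicates is an assumption; the real search may behave differently.

```python
def parse_job_numbers(raw: str) -> list:
    """Split a comma-separated field into unique job numbers, keeping input order."""
    seen = set()
    jobs = []
    for part in raw.split(","):
        job = part.strip()
        if job and job not in seen:  # skip empty entries and repeats
            seen.add(job)
            jobs.append(job)
    return jobs


if __name__ == "__main__":
    print(parse_job_numbers(" 12345, 67890 ,,12345 "))
```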

Workflow: Search job number -> Progress bar -> Dashboard with aggregated results

2. HM QC

AI-powered image quality control for marketing assets.

Profile: H&M Image Check (3 checks)

  • Filename Parse (30%) - Flexibly extracts country code, language, dimensions, campaign number from multiple H&M naming conventions (Display, DOOH, OOH, SOME STATIC, Social, POS)
  • Image Quality (40%) - AI visual assessment with strict text legibility rules; validates against campaign presentation guidelines when available
  • Price/Currency (30%) - Validates currency symbol/format against global pricing reference; validates actual prices against campaign media plan when available
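
The weights above suggest a weighted overall score. The sketch below shows one way to combine per-check scores. It assumes each check yields a score from 0 to 100 and that the overall result is a plain weighted sum. The check keys are hypothetical, and the automatic-fail rule for illegible text is not modeled here.

```python
# Weights taken from the profile above; the dictionary keys are hypothetical names.
WEIGHTS = {"filename_parse": 0.30, "image_quality": 0.40, "price_currency": 0.30}


def overall_score(check_scores: dict, weights: dict = WEIGHTS) -> float:
    """Combine per-check scores (0-100) into one weighted score."""
    missing = set(weights) - set(check_scores)
    if missing:
        raise ValueError(f"missing check scores: {sorted(missing)}")
    return round(sum(weights[name] * check_scores[name] for name in weights), 2)


if __name__ == "__main__":
    scores = {"filename_parse": 100, "image_quality": 50, "price_currency": 80}
    print(overall_score(scores))
```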

AI Quality Check evaluates:

  • Text & title legibility (CRITICAL - illegible text = automatic fail)
  • Language word validation (avoids false positives, e.g. "Rock" is the German word for "skirt", not stray English)
  • Campaign guideline compliance (typography, layout, copy, logo placement)
  • Image quality, color, composition
  • Logo and branding clarity

Features:

  • Single and batch file upload (up to 100 files)
  • Batch report grouping: reports grouped by upload batch with collapsible sections, batch stats, and "Download All" ZIP
  • Asset thumbnails in report listings for quick visual identification
  • Parallel processing: 2 files processed concurrently within each batch for improved speed
  • LLM provider choice: OpenAI GPT-4o or Google Gemini 2.5 Flash
  • Campaign presentation dropdown to validate against campaign guidelines
  • Previous QC Reports with View/Delete and Download
  • HTML report generation with per-check scoring
  • Usage tracking (tokens + estimated cost)

Workflow: Upload -> Configure (provider + campaign + job number) -> Execute -> Results
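
The parallel processing described above can be sketched with a two-worker thread pool. Because pool workers do not inherit the parent thread's Flask application context, each unit of work is wrapped in a context manager supplied by the caller; in a Flask app, `app.app_context` could be passed as `context_factory`. The function and parameter names here are hypothetical, and `nullcontext` stands in so the sketch runs without Flask.

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import nullcontext


def run_batch(files, check_file, context_factory=nullcontext, max_workers=2):
    """Run check_file on every file, max_workers at a time, in input order."""

    def worker(path):
        # Enter a fresh context inside the worker thread itself.
        with context_factory():
            return check_file(path)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(worker, files))


if __name__ == "__main__":
    print(run_batch(["a.jpg", "b.jpg", "c.jpg"], lambda p: f"checked {p}"))
```

Keeping the worker count at 2 matches the concurrency stated above; raising it would mainly be limited by LLM provider rate limits.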

3. Video QC

AI-powered video quality control with direct video analysis.

Checks:

  • Visual Quality (50%) - Language consistency + text legibility throughout the video
  • Censorship (50%) - Body coverage compliance (only for _CEN market files, skipped otherwise)
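
A minimal sketch of CEN detection and check selection. It assumes the market token appears in the filename as `_CEN` followed by a non-alphanumeric character or the end of the name; the example filenames and check names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: "_CEN" counts only as a standalone token, so "_CENTRAL" does not match.
_CEN_TOKEN = re.compile(r"_CEN(?![A-Za-z0-9])", re.IGNORECASE)


def is_cen_market(filename: str) -> bool:
    """Return True when the filename (without extension) carries a _CEN token."""
    return _CEN_TOKEN.search(Path(filename).stem) is not None


def checks_to_run(filename: str) -> list:
    """Visual quality always runs; censorship is added only for _CEN files."""
    checks = ["visual_quality"]
    if is_cen_market(filename):
        checks.append("censorship")
    return checks


if __name__ == "__main__":
    print(checks_to_run("1022B_OLV_DE_CEN_16x9.mp4"))
    print(checks_to_run("1022B_OLV_DE_16x9.mp4"))
```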

How it works (Google Gemini — default):

  1. Uploads the video file directly to Google Gemini via genai.upload_file()
  2. Gemini processes the full video with temporal context (motion, transitions, audio)
  3. AI analyzes language consistency, text legibility, and branding in a single pass
  4. Language check includes false-positive prevention (e.g., "Rock" = skirt in German)

How it works (OpenAI — fallback):

  1. Extracts 1 frame per second from the video
  2. Stitches frames into a labeled grid image
  3. Sends grid to GPT-4o for analysis (1 API call per check)
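
The frame-grid fallback can be sketched in two parts: an FFmpeg command that samples one frame per second (the `fps=1` video filter), and a layout helper that decides where each frame sits in the grid. The near-square layout, output filename pattern, and label format are assumptions for illustration; the actual stitching code is not shown.

```python
import math


def frame_extract_command(video_path: str, out_pattern: str = "frame_%04d.png") -> list:
    """Build an FFmpeg argument list that writes one frame per second of video."""
    return ["ffmpeg", "-i", video_path, "-vf", "fps=1", out_pattern]


def grid_shape(n_frames: int) -> tuple:
    """Pick a near-square (rows, cols) grid large enough for n_frames cells."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    cols = math.ceil(math.sqrt(n_frames))
    rows = math.ceil(n_frames / cols)
    return rows, cols


def cell_labels(n_frames: int) -> list:
    """Return (row, col, label) per frame; frame i is labeled with its approximate second."""
    _, cols = grid_shape(n_frames)
    return [(i // cols, i % cols, f"{i}s") for i in range(n_frames)]


if __name__ == "__main__":
    print(frame_extract_command("input.mp4"))
    print(grid_shape(10), cell_labels(5))
```

The command list can be passed to `subprocess.run` on a machine with FFmpeg installed.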

Features:

  • Default: Google Gemini direct video analysis (no frame extraction)
  • Fallback: OpenAI GPT-4o frame grid method
  • CEN market auto-detection from filename
  • Previous Video QC Reports with View/Delete
  • Usage tracking

Workflow: Upload video -> Configure (provider + campaign) -> Execute -> Results

4. Video Master Adot

Campaign-based master-to-adaptation video matching using Box.com integration.

How it works:

  1. User enters campaign name
  2. System searches Box for campaign folder, finds Global Masters and Regional Masters
  3. Preview shows: master count, countries, adaptation count
  4. Phase 1: Downloads each master temporarily, fingerprints it (~50KB), deletes video
  5. Phase 2: Downloads each adaptation temporarily, matches against fingerprints, deletes video
  6. Results: per-master adaptation mapping, unmatched items, match rate

Matching Engine (4-stage cascade: a metadata pre-filter followed by three matching tiers):

  • Stage 0: Metadata filtering (removes 80-95% of candidates before matching)
  • Tier 1: Perceptual hash matching
  • Tier 2: AKAZE feature verification
  • Tier 3: AI Vision fallback (smart triggering)
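
The first two steps of the cascade can be sketched as a metadata filter followed by a Hamming-distance comparison of perceptual hashes. This is an illustration under assumptions, not the engine's implementation: it assumes 64-bit integer hashes, filters on duration only, and uses made-up thresholds. A result of `None` would mean the pair is handed to the later tiers (AKAZE, AI Vision).

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def best_master(adaptation: dict, masters: list, max_distance: int = 10,
                duration_tolerance: float = 0.5):
    """Return (master_name, distance) for the closest master, or None."""
    # Stage 0 (assumed criterion): keep only masters with a similar duration.
    candidates = [
        m for m in masters
        if abs(m["duration"] - adaptation["duration"]) <= duration_tolerance
    ]
    # Tier 1: smallest Hamming distance within the threshold wins.
    best = None
    for master in candidates:
        distance = hamming(master["phash"], adaptation["phash"])
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (master["name"], distance)
    return best


if __name__ == "__main__":
    masters = [
        {"name": "A", "duration": 15.0, "phash": 0xFFFF0000FFFF0000},
        {"name": "B", "duration": 15.0, "phash": 0x0F0F0F0F0F0F0F0F},
    ]
    print(best_master({"duration": 15.2, "phash": 0xFFFF0000FFFF0003}, masters))
```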

Storage: Only fingerprints (~50KB/master) stored permanently. Videos deleted after processing.

Box Folder Structure:

CAMPAIGNS/{campaign_name}/
├── Global Masters/          (various casing)
│   ├── DOOH/
│   ├── DS/
│   ├── OLV/
│   └── ... (video files with MASTER in name)
└── Regional Masters/        (various casing)
    ├── DE/ (country code folders)
    ├── FR/
    └── ...

5. Printer Check

CSV-to-PDF cross-referencing for print order validation. Ported from the CrossMatch desktop application.

What it does:

  1. User uploads a CSV order sheet and a ZIP file containing the PDF folder structure
  2. Filters CSV rows by selected geographic region and country groups
  3. Scans the PDF folder structure (multi-region or country-level layouts)
  4. Matches CSV filenames against actual PDF files
  5. Reports: matched, missing, and extra files with structural warnings
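
The matching and reporting steps boil down to set operations on filenames. The sketch below assumes case-insensitive comparison of bare filenames, which may differ from the tool's actual normalization; structural warnings are not covered.

```python
def cross_match(expected_names, found_names) -> dict:
    """Classify filenames as matched, missing (CSV only), or extra (PDF folder only)."""
    expected = {name.strip().lower() for name in expected_names if name.strip()}
    found = {name.strip().lower() for name in found_names if name.strip()}
    return {
        "matched": sorted(expected & found),
        "missing": sorted(expected - found),
        "extra": sorted(found - expected),
    }


if __name__ == "__main__":
    csv_files = ["Poster_PL.pdf", "Poster_DE.pdf"]
    pdf_files = ["poster_pl.pdf", "Poster_FR.pdf"]
    print(cross_match(csv_files, pdf_files))
```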

Features:

  • Auto-detects CSV delimiter (tab or comma)
  • Region and country group selection (EEU, CEU, etc.)
  • Campaign detection and filtering from filenames
  • Language column normalization (GEN files, KZ/MK locale handling)
  • Folder structure validation: misplaced GEN files, duplicate GEN, wrong country folders, files at wrong level
  • Results filtering by status (All, Matched, Missing, Extra)
  • XLSX export of filtered data
  • GEN asset priority: special handling for Root/GEN folder validation
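
Delimiter auto-detection can be as simple as counting tabs and commas in the header row. This is a minimal sketch under that assumption, and the column names in the example are hypothetical. The standard library's `csv.Sniffer` is an alternative when more delimiters need to be supported.

```python
import csv
import io


def detect_delimiter(text: str) -> str:
    """Return a tab if the header row has more tabs than commas, else a comma."""
    lines = text.splitlines()
    header = lines[0] if lines else ""
    return "\t" if header.count("\t") > header.count(",") else ","


def read_rows(text: str) -> list:
    """Parse CSV text into dictionaries using the detected delimiter."""
    reader = csv.DictReader(io.StringIO(text), delimiter=detect_delimiter(text))
    return list(reader)


if __name__ == "__main__":
    print(read_rows("filename\tcountry\na.pdf\tPL\n"))
    print(read_rows("filename,country\nb.pdf,DE\n"))
```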

Folder Layouts Supported:

  • Multi-Region: Root/EEU/PL/, Root/CEU/DE/, Root/GEN/
  • Country-Level: Root/PL/, Root/DE/, Root/GEN/

Workflow: Select region -> Upload CSV + PDF ZIP -> Process -> View results -> Export XLSX

6. Campaigns

Campaign presentation and pricing reference management.

Purpose: Upload campaign-specific documents that QC checks reference when validating assets.

Document Types:

  • Campaign Presentation (PDF) - Creative guidelines with typography specs, layout rules, copy text, ratio-specific mockups. Parsed via LlamaParse (text + page images).
  • Media Plan / Price Sheet (Excel .xlsx) - Product names, prices, and currency per country/language. Parsed via openpyxl into structured text.
  • Global Pricing Reference (PDF) - Single document mapping all countries to currency symbol, position, and format. Parsed once into storage/reference/global_pricing.json.

Workflow:

  1. Upload campaign presentation PDF for a campaign (e.g., 1022B) — leave "has pricing" unchecked
  2. Upload media plan Excel for the same campaign ID — check "Contains campaign-specific pricing"
  3. Both documents are linked by Campaign ID and loaded together during QC
  4. On HM QC or Video QC configure page, select the campaign from the dropdown

Features:

  • Multiple documents per campaign (guidelines + media plan)
  • Auto-polling: status updates in-place when parsing completes
  • View parsed content and page images
  • Global pricing reference upload (format-only, not actual prices)
  • API endpoints for QC modules: /campaigns/api/list, /campaigns/api/<campaign_id>

7. Usage Dashboard

API usage tracking across all tools.

Displays:

  • Summary cards: total API calls, tokens used, estimated cost (USD)
  • Breakdowns: by provider, model, tool, user
  • Recent API calls table with full details
  • Time filters: All Time, 30 Days, 7 Days, Today

Cost estimates are based on per-model token pricing (GPT-4o, Gemini 2.5 Flash, etc.).
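
A minimal sketch of a per-call cost estimate from token counts. The model names and prices in the table are placeholders, not real provider pricing, and the per-million-token unit is an assumption.

```python
# Placeholder prices in USD per one million tokens; NOT real provider pricing.
PRICES_PER_MILLION = {
    "example-model-a": {"input": 1.00, "output": 2.00},
    "example-model-b": {"input": 0.10, "output": 0.40},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  prices: dict = PRICES_PER_MILLION) -> float:
    """Estimate the USD cost of one API call from its token counts."""
    if model not in prices:
        raise KeyError(f"no pricing configured for model: {model}")
    rate = prices[model]
    cost = (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000
    return round(cost, 6)


if __name__ == "__main__":
    print(estimate_cost("example-model-a", 1_000_000, 500_000))
```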


Configuration

Environment Variables (.env)

# Authentication
AUTH_USERS=admin:pbkdf2:sha256:600000$$salt$$hash

# Session
SESSION_COOKIE_PATH=/hm-ai-qc-report

# Box
BOX_CONFIG_PATH=config/box_config.json
BOX_REPORT_FOLDER_ID=133295752718
BOX_CAMPAIGNS_FOLDER_ID=156182880490

# Flask
SECRET_KEY=<generate-random-key>
FLASK_ENV=production

# Database (use absolute path for Docker)
DATABASE_URI=sqlite:////app/database/qc_platform.db

# LLM Providers
OPENAI_API_KEY=<your-key>
GOOGLE_API_KEY=<your-key>

Note: Docker Compose treats $ as the start of a variable reference, so each literal $ in the AUTH_USERS hash must be written as $$.
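
To illustrate the AUTH_USERS format, the sketch below builds a `user:method$salt$hash` entry with PBKDF2-SHA256 from the standard library and then doubles each `$` for Docker Compose. The layout follows the example value above and resembles what Werkzeug's `generate_password_hash` produces. Whether the app verifies hashes in exactly this way is an assumption, so deploy/generate_password.py remains the authoritative tool.

```python
import hashlib
import secrets


def make_auth_entry(username: str, password: str, iterations: int = 600_000,
                    salt: str = "") -> str:
    """Build 'user:pbkdf2:sha256:<iterations>$<salt>$<hex digest>'."""
    salt = salt or secrets.token_hex(8)  # random salt unless one is supplied
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("utf-8"), iterations
    ).hex()
    return f"{username}:pbkdf2:sha256:{iterations}${salt}${digest}"


def escape_for_compose(value: str) -> str:
    """Double every $ so Docker Compose does not treat it as a variable reference."""
    return value.replace("$", "$$")


if __name__ == "__main__":
    entry = make_auth_entry("admin", "change-me")
    print("AUTH_USERS=" + escape_for_compose(entry))
```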


Architecture

Tech Stack

  • Backend: Flask 3.0, SQLAlchemy, Gunicorn
  • Frontend: Bootstrap 5, Vanilla JS, Server-Sent Events
  • AI: OpenAI GPT-4o, Google Gemini 2.5 Flash (via google-generativeai)
  • Video: FFmpeg, OpenCV (AKAZE), Chromaprint
  • Storage: Box.com (JWT auth), SQLite
  • Deployment: Docker, Apache reverse proxy

Directory Structure

hm_ai_qc_report_tool/
├── app.py                    # Application factory
├── config.py                 # Configuration
├── Dockerfile                # Docker image
├── docker-compose.yml        # Docker services
├── deploy/                   # Deployment scripts & configs
│
├── core/                     # Shared infrastructure
│   ├── auth/                 # Session-based authentication
│   ├── models/               # Database models (QCReport, UsageLog, CampaignPresentation)
│   ├── services/             # LLM config, Box client
│   └── utils/                # Progress tracker, report parser
│
├── modules/
│   ├── hm_qc/               # HM QC (checks, executor, profiles, batch grouping)
│   ├── video_qc/            # Video QC (executor, Gemini direct video + frame fallback)
│   ├── video_master/         # Video Master (matching engine, campaign matcher)
│   ├── printer_check/        # Printer Check (CSV parser, folder scanner, matcher)
│   ├── campaigns/            # Campaign presentations & pricing reference
│   ├── reporting/            # Reporting (aggregator, Box search, cache)
│   └── usage/                # Usage dashboard
│
├── templates/                # Shared templates (base.html, login.html)
├── static/                   # CSS, JavaScript
├── database/                 # SQLite database
└── storage/                  # Reports, fingerprints, campaigns, reference

Security

  • Local username/password auth with PBKDF2/scrypt hashing
  • Session-based with before_request login enforcement
  • No hardcoded API keys (all from environment)
  • Docker container binds to 127.0.0.1 only (not exposed to internet)
  • HTTPS via Apache with wildcard SSL certificate
  • httpOnly, Secure, SameSite=Lax cookies
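
The cookie flags above correspond to standard Flask configuration keys. The sketch below lists them alongside a hypothetical helper showing the kind of decision a `before_request` hook might make. The endpoint names and the `user` session key are assumptions.

```python
# Standard Flask config keys; the cookie path matches SESSION_COOKIE_PATH in .env.
SESSION_COOKIE_SETTINGS = {
    "SESSION_COOKIE_HTTPONLY": True,
    "SESSION_COOKIE_SECURE": True,
    "SESSION_COOKIE_SAMESITE": "Lax",
    "SESSION_COOKIE_PATH": "/hm-ai-qc-report",
}

# Hypothetical endpoints that must stay reachable without a login.
PUBLIC_ENDPOINTS = {"login", "static"}


def needs_login_redirect(endpoint, session: dict) -> bool:
    """Return True when a request should be redirected to the login page."""
    if endpoint in PUBLIC_ENDPOINTS:
        return False
    return "user" not in session  # "user" is an assumed session key


if __name__ == "__main__":
    print(needs_login_redirect("hm_qc.upload", {}))
    print(needs_login_redirect("hm_qc.upload", {"user": "admin"}))
```

In a Flask app the settings would typically be applied with `app.config.update(SESSION_COOKIE_SETTINGS)`.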

License

Proprietary - H&M Hennes & Mauritz AB