Latest commit: nickviljoen fc15a2dda3	2026-03-26 18:39:54 +02:00

Rewrite filename check + add price/currency check to image QC

Filename check:
- Rewritten to flexibly parse multiple H&M naming conventions (Display, DOOH, OOH, SOME STATIC, Social, POS, DS)
- Extracts country code, language code, dimensions, campaign number
- Scores based on how much metadata was extracted (not rigid pattern)
- Tested against real filenames: BG_bg, ES_es, NO-no formats

Price/currency check (new):
- Detects prices in images via LLM vision API
- Validates currency against global pricing reference (deterministic)
- Falls back to LLM validation for unknown countries
- Optional campaign pricing sheet validation when has_pricing=True
- Added to profile with weight 30

Profile weights rebalanced: filename 30, quality 40, price 30

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

core	Add campaign presentation management and global pricing reference	2026-03-26 16:12:22 +02:00
database	Add modular architecture, core framework, and web UI	2026-02-25 11:39:04 +02:00
deploy	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
modules	Rewrite filename check + add price/currency check to image QC	2026-03-26 18:39:54 +02:00
static	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
templates	Add campaign presentation management and global pricing reference	2026-03-26 16:12:22 +02:00
.dockerignore	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
.env.example	v2.0.0: Update all documentation for major release	2026-03-21 22:13:27 +02:00
.gitignore	Consolidate legacy hm_qc and video_qc tools into main project	2026-02-25 11:40:53 +02:00
app.py	Add campaign presentation management and global pricing reference	2026-03-26 16:12:22 +02:00
auth_middleware.py	Initial Commit	2025-12-30 16:47:56 +02:00
box_client.py	Initial Commit	2025-12-30 16:47:56 +02:00
CHANGELOG.md	v2.0.0: Update all documentation for major release	2026-03-21 22:13:27 +02:00
config.py	Add campaign presentation management and global pricing reference	2026-03-26 16:12:22 +02:00
DEPLOYMENT_CHECKLIST.md	Add modular architecture, core framework, and web UI	2026-02-25 11:39:04 +02:00
docker-compose.yml	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
Dockerfile	Add storage/campaigns and storage/reference dirs to Dockerfile	2026-03-26 18:06:57 +02:00
DOCUMENTATION_SUMMARY.txt	Update documentation for unified platform consolidation	2026-02-25 13:51:21 +02:00
gunicorn_config.py	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
INTEGRATION_TEST_REPORT.md	Add modular architecture, core framework, and web UI	2026-02-25 11:39:04 +02:00
jwt_validator.py	Initial Commit	2025-12-30 16:47:56 +02:00
MIGRATION_GUIDE.md	Update documentation for unified platform consolidation	2026-02-25 13:51:21 +02:00
README.md	v2.0.0: Update all documentation for major release	2026-03-21 22:13:27 +02:00
report_parser.py	Initial Commit	2025-12-30 16:47:56 +02:00
requirements.txt	Add llama-parse and nest_asyncio to requirements.txt	2026-03-26 18:11:36 +02:00
run.sh	Reporting updated.	2026-01-14 09:14:00 +02:00
run_prod.sh	Initial Commit	2025-12-30 16:47:56 +02:00
setup.sh	Initial Commit	2025-12-30 16:47:56 +02:00
test_integration.py	Add modular architecture, core framework, and web UI	2026-02-25 11:39:04 +02:00
test_local.sh	Initial Commit	2025-12-30 16:47:56 +02:00
wsgi.py	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00

README.md

Unified HM QC Platform

Version: 2.0.0
Status: Production (Deployed)
Deployed at: https://ai-sandbox.oliver.solutions/hm-ai-qc-report

A comprehensive quality control platform for H&M marketing assets with AI-powered validation, video matching, and consolidated reporting.

Overview

The platform integrates five tools into a single web application:

Reporting - Consolidated QC reports from Box.com with search history
HM QC - AI-powered image quality control (text legibility, language, quality)
Video QC - AI-powered video quality control (frame-by-frame analysis)
Video Master Adot - Campaign-based master-to-adaptation video matching via Box
Usage Dashboard - API usage tracking, token counts, and cost estimates

Key Features

Unified tabbed interface with H&M branding
Local username/password authentication
Multi-provider AI: OpenAI GPT-4o and Google Gemini 2.5 Flash
Real-time progress tracking (SSE + polling)
Docker deployment with Apache reverse proxy
Usage tracking with estimated costs per API call
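
As a rough illustration of the SSE half of the progress tracking, the sketch below frames progress updates as Server-Sent Events messages. The function name and the payload fields (percent, message) are assumptions made for this example and are not taken from the project's code; only the "data:" line followed by a blank line comes from the SSE format itself.

```python
import json
from typing import Iterable, Iterator


def sse_events(updates: Iterable[dict]) -> Iterator[str]:
    """Yield each progress update as one Server-Sent Events message."""
    for update in updates:
        # One SSE message is a "data:" line terminated by a blank line.
        yield f"data: {json.dumps(update)}\n\n"


if __name__ == "__main__":
    # Hypothetical payloads; the real field names may differ.
    demo = [
        {"percent": 50, "message": "Searching Box"},
        {"percent": 100, "message": "Done"},
    ]
    for chunk in sse_events(demo):
        print(chunk, end="")
```

In Flask, a generator like this would typically be wrapped in a Response with mimetype "text/event-stream"; clients that cannot hold a stream open can fall back to polling.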

Deployment

Docker (Production)

# Clone from Bitbucket
git clone git@bitbucket.org:zlalani/hm_ai_qc_report_tool.git /opt/hm-qc-app
cd /opt/hm-qc-app

# Configure environment
cp .env.example .env
# Edit .env with production values (see .env.example)
# Generate password: python3 deploy/generate_password.py

# Build and start
docker compose build
docker compose up -d

# Create database tables
docker exec hm-qc-app python3 -c "from app import app; from core.models.database import db; app.app_context().push(); db.create_all(); print('Tables created')"

The app runs in Docker and listens on 127.0.0.1:5050. Configure Apache or Nginx as a reverse proxy in front of it (see deploy/ for config snippets).

Common Commands

docker compose logs -f              # Tail logs
docker compose restart              # Quick restart
docker compose down && docker compose up -d --build  # Rebuild after code changes
git pull && docker compose down && docker compose up -d --build  # Deploy update

Modules

1. Reporting

Consolidated QC report search from Box.com and local database.

Features:

Job number search (single or comma-separated for multi-job)
Async search with real-time progress bar
Box reports saved locally for instant re-viewing (no re-fetch)
Previous Box Reports section with View/Delete
Dashboard with designer-friendly error display
Export: HTML and CSV (full or errors-only)
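
The single or comma-separated job number input could be normalised along the following lines. This is a minimal sketch: the function name is hypothetical, and trimming whitespace, dropping empty entries, and de-duplicating are assumptions about sensible behaviour rather than documented behaviour of the search.

```python
def parse_job_numbers(raw: str) -> list[str]:
    """Split a comma-separated search string into unique, trimmed job numbers."""
    seen: set[str] = set()
    jobs: list[str] = []
    for part in raw.split(","):
        job = part.strip()
        # Skip empty entries (e.g. trailing commas) and repeats, keeping input order.
        if job and job not in seen:
            seen.add(job)
            jobs.append(job)
    return jobs


if __name__ == "__main__":
    print(parse_job_numbers(" 1001, 1002,,1001 "))
```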

Workflow: Search job number -> Progress bar -> Dashboard with aggregated results

2. HM QC

AI-powered image quality control for marketing assets.

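Profile: H&M Image Check (3 checks)

Filename Parse (30%) - Validates H&M filename conventions (country code, language code, dimensions, campaign number)
Image Quality (40%) - AI visual assessment with strict text legibility rules
Price/Currency (30%) - Detects prices via the LLM vision API and validates the currency against the global pricing reference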
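
To give a feel for metadata-based filename checking, here is a hedged sketch that pulls a country/language pair (in the BG_bg or NO-no shape) and pixel dimensions out of a filename and scores by how many fields were found. The regular expressions, the example filenames, and the scoring formula are all assumptions for illustration; the real check in modules/hm_qc supports more conventions and may work quite differently. Campaign numbers are left out because their format is not described here.

```python
import re

# Assumed shapes: "BG_bg" / "NO-no" for locale, "300x250" for dimensions.
LOCALE_RE = re.compile(r"(?<![A-Za-z])([A-Z]{2})[_-]([a-z]{2})(?![A-Za-z])")
DIMENSIONS_RE = re.compile(r"(?<!\d)(\d{2,5})x(\d{2,5})(?!\d)")
FIELD_COUNT = 4  # country, language, width, height


def parse_filename(name: str) -> dict:
    """Extract whatever metadata is present and score by how much was found."""
    locale = LOCALE_RE.search(name)
    dims = DIMENSIONS_RE.search(name)
    fields = {
        "country": locale.group(1) if locale else None,
        "language": locale.group(2) if locale else None,
        "width": int(dims.group(1)) if dims else None,
        "height": int(dims.group(2)) if dims else None,
    }
    found = sum(value is not None for value in fields.values())
    # Hypothetical scoring: percentage of fields that could be extracted.
    fields["score"] = round(100 * found / FIELD_COUNT)
    return fields


if __name__ == "__main__":
    # Made-up filename, not a real H&M asset name.
    print(parse_filename("HM_1234_Display_300x250_BG_bg.jpg"))
```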

AI Quality Check evaluates:

Text & title legibility (CRITICAL - illegible text = automatic fail)
Language word validation (avoids false positives like "Rock" = German for skirt)
Image quality, color, composition
Logo and branding clarity

Features:

Single and batch file upload (up to 100 files)
LLM provider choice: OpenAI GPT-4o or Google Gemini 2.5 Flash
Previous QC Reports with View/Delete
HTML report generation with per-check scoring
Usage tracking (tokens + estimated cost)

Workflow: Upload -> Configure (provider + job number) -> Execute -> Results

3. Video QC

AI-powered video quality control with frame-by-frame analysis.

Checks:

Visual Quality (50%) - Language consistency + text legibility across all frames
Censorship (50%) - Body coverage compliance (only for _CEN market files, skipped otherwise)
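
One way to combine weighted checks when a check may be skipped is to average only over the checks that actually ran. The sketch below assumes that a skipped check's weight is redistributed to the remaining checks; how the platform really treats a skipped Censorship check is not stated here, and the function name is hypothetical.

```python
from typing import Optional


def overall_score(checks: dict[str, tuple[float, Optional[float]]]) -> float:
    """Weighted mean of check scores (0-100); a score of None marks a skipped check."""
    active = [(weight, score) for weight, score in checks.values() if score is not None]
    total_weight = sum(weight for weight, _ in active)
    if total_weight == 0:
        raise ValueError("no checks were executed")
    # Dividing by the weight of the executed checks renormalises after a skip.
    return sum(weight * score for weight, score in active) / total_weight


if __name__ == "__main__":
    # Both checks ran (e.g. a _CEN file).
    print(overall_score({"visual_quality": (50, 80.0), "censorship": (50, 60.0)}))
    # Censorship skipped (non-CEN file).
    print(overall_score({"visual_quality": (50, 80.0), "censorship": (50, None)}))
```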

How it works:

Extracts 1 frame per second from the video
Stitches frames into a labeled grid image
Sends grid to AI for analysis (1 API call per check)
Language check includes false-positive prevention (e.g., "Rock" = skirt in German)
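
The sampling and grid steps can be planned with a few lines of arithmetic. This sketch only computes which timestamps to sample and where each labeled frame would sit in the grid; it does not decode video (the stack lists FFmpeg and OpenCV for that). The column count of 5, the label format, and the function names are assumptions.

```python
import math


def frame_timestamps(duration_s: float) -> list[int]:
    """One sample per whole second: 0, 1, 2, ... within the video's duration."""
    return list(range(max(1, math.ceil(duration_s))))


def grid_layout(n_frames: int, columns: int = 5) -> list[tuple[int, int, str]]:
    """Return (row, column, label) for each frame, filling the grid row by row."""
    return [(i // columns, i % columns, f"{i}s") for i in range(n_frames)]


if __name__ == "__main__":
    stamps = frame_timestamps(7.4)  # hypothetical 7.4 second clip
    for row, col, label in grid_layout(len(stamps)):
        print(f"frame {label}: row {row}, column {col}")
```

Sending one stitched grid per check keeps the number of API calls independent of video length, at the cost of lower per-frame resolution in long videos.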

Features:

LLM provider choice (OpenAI / Google Gemini)
CEN market auto-detection from filename
Previous Video QC Reports with View/Delete
Usage tracking
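
CEN market auto-detection from the filename might look something like the sketch below. The exact rule is an assumption: it treats a file as a CEN market file when the name (without extension) contains "_CEN" as a complete token, so that names such as "CENTRAL" do not match. The example filenames are made up.

```python
import re
from pathlib import PurePath

# "_CEN" must not be followed by another letter or digit.
CEN_RE = re.compile(r"_CEN(?![A-Za-z0-9])")


def is_cen_market(filename: str) -> bool:
    """Return True when the filename carries a standalone _CEN marker."""
    return CEN_RE.search(PurePath(filename).stem) is not None


if __name__ == "__main__":
    for name in ("HM_Spring_SA_CEN_15s.mp4", "HM_CENTRAL_15s.mp4"):
        check = "run" if is_cen_market(name) else "skip"
        print(f"{name}: {check} censorship check")
```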

Workflow: Upload video -> Configure -> Execute (frame extraction + AI) -> Results

4. Video Master Adot

Campaign-based master-to-adaptation video matching using Box.com integration.

How it works:

User enters campaign name
System searches Box for campaign folder, finds Global Masters and Regional Masters
Preview shows: master count, countries, adaptation count
Phase 1: Downloads each master temporarily, fingerprints it (~50KB), deletes video
Phase 2: Downloads each adaptation temporarily, matches against fingerprints, deletes video
Results: per-master adaptation mapping, unmatched items, match rate
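
The download, fingerprint, delete pattern used in both phases can be sketched with a temporary file that is removed even if fingerprinting fails. The download and fingerprint callables are hypothetical placeholders for the Box client and the matching engine, and the SHA-256 digest in the demo merely stands in for a real video fingerprint.

```python
import hashlib
import os
import tempfile
from typing import Callable


def fingerprint_then_delete(
    download: Callable[[str], None],
    fingerprint: Callable[[str], bytes],
) -> bytes:
    """Download to a temp file, fingerprint it, and always remove the video."""
    fd, path = tempfile.mkstemp(suffix=".mp4")
    os.close(fd)
    try:
        download(path)            # hypothetical: writes the video bytes to `path`
        return fingerprint(path)  # hypothetical: returns the compact fingerprint
    finally:
        os.remove(path)           # the video never outlives this call


if __name__ == "__main__":
    def fake_download(path: str) -> None:
        with open(path, "wb") as handle:
            handle.write(b"not a real video")

    def fake_fingerprint(path: str) -> bytes:
        with open(path, "rb") as handle:
            return hashlib.sha256(handle.read()).digest()

    print(fingerprint_then_delete(fake_download, fake_fingerprint).hex())
```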

Matching Engine (4-stage cascade: a metadata pre-filter followed by three matching tiers):

Stage 0: Metadata filtering (80-95% reduction)
Tier 1: Perceptual hash matching
Tier 2: AKAZE feature verification
Tier 3: AI Vision fallback (smart triggering)
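
To make the perceptual hash tier more concrete, here is a toy average hash with Hamming-distance matching. It is a generic illustration of the technique, not the project's algorithm: the real engine works on video frames and its hash type, thresholds, and function names are not documented here. Returning None models "no confident match", which in a cascade would hand the item to the next tier.

```python
from typing import Optional


def average_hash(pixels: list[list[int]]) -> int:
    """Average hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def best_match(query: int, masters: dict[str, int], max_distance: int) -> Optional[str]:
    """Return the closest master within max_distance, or None to escalate."""
    if not masters:
        return None
    name = min(masters, key=lambda key: hamming(query, masters[key]))
    return name if hamming(query, masters[name]) <= max_distance else None


if __name__ == "__main__":
    # Tiny 2x2 grayscale "frames" with made-up values.
    masters = {
        "master_a": average_hash([[0, 255], [255, 0]]),
        "master_b": average_hash([[255, 0], [0, 255]]),
    }
    adaptation = average_hash([[10, 250], [240, 5]])
    print(best_match(adaptation, masters, max_distance=1))
```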

Storage: Only fingerprints (~50KB/master) stored permanently. Videos deleted after processing.

Box Folder Structure:

CAMPAIGNS/{campaign_name}/
├── Global Masters/          (various casing)
│   ├── DOOH/
│   ├── DS/
│   ├── OLV/
│   └── ... (video files with MASTER in name)
└── Regional Masters/        (various casing)
    ├── DE/ (country code folders)
    ├── FR/
    └── ...
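
Because the master folders appear with various casing, a lookup would need to compare names case-insensitively. A minimal sketch, assuming folder names differ only by letter case and surrounding whitespace (the function name is hypothetical, and real folder listings would come from the Box client):

```python
from typing import Optional


def find_folder(names: list[str], wanted: str) -> Optional[str]:
    """Return the first folder name that equals `wanted`, ignoring case."""
    target = wanted.casefold()
    for name in names:
        if name.strip().casefold() == target:
            return name
    return None


if __name__ == "__main__":
    listing = ["GLOBAL MASTERS", "regional masters", "Archive"]
    print(find_folder(listing, "Global Masters"))
    print(find_folder(listing, "Regional Masters"))
```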

5. Usage Dashboard

API usage tracking across all tools.

Displays:

Summary cards: total API calls, tokens used, estimated cost (USD)
Breakdowns: by provider, model, tool, user
Recent API calls table with full details
Time filters: All Time, 30 Days, 7 Days, Today

Cost estimates based on per-model token pricing (GPT-4o, Gemini 2.5 Flash, etc.)
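
A cost estimate of this kind usually reduces to token counts multiplied by a per-model rate. The sketch below shows the arithmetic only; the model name and the prices in the table are placeholders, not real provider rates, and the function name is hypothetical.

```python
# Placeholder prices in USD per 1M tokens; NOT real provider rates.
PRICES = {"example-model": {"input": 1.00, "output": 2.00}}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  prices: dict = PRICES) -> float:
    """Estimated USD cost of one API call from its token counts."""
    rate = prices[model]
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000


if __name__ == "__main__":
    print(f"${estimate_cost('example-model', 12_000, 800):.4f}")
```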

Configuration

Environment Variables (.env)

# Authentication
AUTH_USERS=admin:pbkdf2:sha256:600000$$salt$$hash

# Session
SESSION_COOKIE_PATH=/hm-ai-qc-report

# Box
BOX_CONFIG_PATH=config/box_config.json
BOX_REPORT_FOLDER_ID=133295752718
BOX_CAMPAIGNS_FOLDER_ID=156182880490

# Flask
SECRET_KEY=<generate-random-key>
FLASK_ENV=production

# Database (use absolute path for Docker)
DATABASE_URI=sqlite:////app/database/qc_platform.db

# LLM Providers
OPENAI_API_KEY=<your-key>
GOOGLE_API_KEY=<your-key>

Note: The $$ in the AUTH_USERS hash is required under Docker Compose, which treats a single $ as the start of a variable substitution; $$ produces a literal $.
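
The escaping step can be illustrated with a short sketch that builds a hash string in the pbkdf2:sha256:600000$salt$hash shape and doubles every $ for Docker Compose. The hashing helper uses only the standard library and is an assumption about the format; for real credentials, use deploy/generate_password.py, whose output may differ from this sketch.

```python
import hashlib
import secrets


def pbkdf2_hash(password: str, iterations: int = 600_000) -> str:
    """Build a 'pbkdf2:sha256:<iterations>$<salt>$<hex digest>' string."""
    salt = secrets.token_hex(8)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("utf-8"), iterations
    ).hex()
    return f"pbkdf2:sha256:{iterations}${salt}${digest}"


def compose_escape(value: str) -> str:
    """Double every $ so Docker Compose does not treat it as a variable."""
    return value.replace("$", "$$")


if __name__ == "__main__":
    hashed = pbkdf2_hash("change-me")  # placeholder password
    print(f"AUTH_USERS=admin:{compose_escape(hashed)}")
```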

Architecture

Tech Stack

Backend: Flask 3.0, SQLAlchemy, Gunicorn
Frontend: Bootstrap 5, Vanilla JS, Server-Sent Events
AI: OpenAI GPT-4o, Google Gemini 2.5 Flash (via google-generativeai)
Video: FFmpeg, OpenCV (AKAZE), Chromaprint
Storage: Box.com (JWT auth), SQLite
Deployment: Docker, Apache reverse proxy

Directory Structure

hm_ai_qc_report_tool/
├── app.py                    # Application factory
├── config.py                 # Configuration
├── Dockerfile                # Docker image
├── docker-compose.yml        # Docker services
├── deploy/                   # Deployment scripts & configs
│
├── core/                     # Shared infrastructure
│   ├── auth/                 # Session-based authentication
│   ├── models/               # Database models (QCReport, UsageLog)
│   ├── services/             # LLM config, Box client
│   └── utils/                # Progress tracker, report parser
│
├── modules/
│   ├── hm_qc/               # HM QC (checks, executor, profiles)
│   ├── video_qc/            # Video QC (executor, frame extraction)
│   ├── video_master/         # Video Master (matching engine, campaign matcher)
│   ├── reporting/            # Reporting (aggregator, Box search, cache)
│   └── usage/                # Usage dashboard
│
├── templates/                # Shared templates (base.html, login.html)
├── static/                   # CSS, JavaScript
├── database/                 # SQLite database
└── storage/                  # Reports, fingerprints

Security

Local username/password auth with PBKDF2/scrypt hashing
Session-based with before_request login enforcement
No hardcoded API keys (all from environment)
Docker container binds to 127.0.0.1 only (not exposed to internet)
HTTPS via Apache with wildcard SSL certificate
httpOnly, Secure, SameSite=Lax cookies

License

Proprietary - H&M Hennes & Mauritz AB