nickviljoen b4e94ad4eb Update default Google model to gemini-2.5-flash Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>		2026-03-21 18:59:00 +02:00
core	Update default Google model to gemini-2.5-flash	2026-03-21 18:59:00 +02:00
database	Add modular architecture, core framework, and web UI	2026-02-25 11:39:04 +02:00
deploy	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
modules	Update default Google model to gemini-2.5-flash	2026-03-21 18:59:00 +02:00
static	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
templates	Add Usage Dashboard with token tracking, cost estimates, and filters	2026-03-21 18:17:21 +02:00
.dockerignore	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
.env.example	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
.gitignore	Consolidate legacy hm_qc and video_qc tools into main project	2026-02-25 11:40:53 +02:00
app.py	Add Usage Dashboard with token tracking, cost estimates, and filters	2026-03-21 18:17:21 +02:00
auth_middleware.py	Initial Commit	2025-12-30 16:47:56 +02:00
box_client.py	Initial Commit	2025-12-30 16:47:56 +02:00
CHANGELOG.md	v1.1.0: Add progress tracking, CSV export, multi-job support, batch processing, and security fixes	2026-03-13 09:43:20 +02:00
config.py	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
DEPLOYMENT_CHECKLIST.md	Add modular architecture, core framework, and web UI	2026-02-25 11:39:04 +02:00
docker-compose.yml	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
Dockerfile	Fix Dockerfile: update package names for Debian Trixie	2026-03-21 14:43:11 +02:00
DOCUMENTATION_SUMMARY.txt	Update documentation for unified platform consolidation	2026-02-25 13:51:21 +02:00
gunicorn_config.py	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00
INTEGRATION_TEST_REPORT.md	Add modular architecture, core framework, and web UI	2026-02-25 11:39:04 +02:00
jwt_validator.py	Initial Commit	2025-12-30 16:47:56 +02:00
MIGRATION_GUIDE.md	Update documentation for unified platform consolidation	2026-02-25 13:51:21 +02:00
README.md	v1.1.0: Add progress tracking, CSV export, multi-job support, batch processing, and security fixes	2026-03-13 09:43:20 +02:00
report_parser.py	Initial Commit	2025-12-30 16:47:56 +02:00
requirements.txt	Batch 3: Add title legibility check, Google Gemini support, LLM provider selector	2026-03-21 16:53:07 +02:00
run.sh	Reporting updated.	2026-01-14 09:14:00 +02:00
run_prod.sh	Initial Commit	2025-12-30 16:47:56 +02:00
setup.sh	Initial Commit	2025-12-30 16:47:56 +02:00
test_integration.py	Add modular architecture, core framework, and web UI	2026-02-25 11:39:04 +02:00
test_local.sh	Initial Commit	2025-12-30 16:47:56 +02:00
wsgi.py	v1.2.0: Add Docker deployment, simplify auth to local login, production config	2026-03-21 14:37:53 +02:00

README.md

Unified HM QC Platform

Version: 1.1.0
Status: ✅ Production Ready

A quality control platform that merges four QC tools into a single web application with AI-powered validation, 0-100 scoring, and consolidated reporting.

Overview

The Unified HM QC Platform integrates four distinct quality control tools into a single, cohesive web application:

HM QC - PDF/Image quality control with AI-powered validation
Video QC (BETA) - Video quality control with technical specification checks
Video Master Adaptation Detection (BETA) - Intelligent 4-tier video matching system
Reporting - Consolidated reports from Box.com and internal QC modules

Key Features

🎯 Unified Interface - Single platform with tabbed navigation
🔒 Secure Authentication - Azure AD with JWT validation
🤖 AI-Powered - GPT-4o and Claude integration for intelligent validation
📊 Advanced Scoring - 0-100 confidence scoring with detailed breakdowns
🎨 H&M Branding - Consistent black (#000000) and yellow (#FFDD00) theme
🔄 Real-time Progress - SSE and polling support for long-running operations
📦 Modular Architecture - Flask blueprints for clean separation of concerns
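
The real-time progress feature relies on Server-Sent Events. As a rough illustration of what an SSE progress stream looks like on the wire, here is a minimal sketch in plain Python. The helper names `format_sse` and `progress_stream` and the payload fields are assumptions made for this example and are not taken from the platform's code.

```python
import json


def format_sse(data, event=None):
    """Serialize one Server-Sent Events message (hypothetical helper)."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload fits on one data line.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def progress_stream(total_steps):
    """Yield one 'progress' event per step, then a final 'done' event."""
    for step in range(1, total_steps + 1):
        percent = round(100 * step / total_steps)
        yield format_sse({"step": step, "percent": percent}, event="progress")
    yield format_sse({"status": "complete"}, event="done")


if __name__ == "__main__":
    for message in progress_stream(4):
        print(message, end="")
```

In a web framework, a generator like this would typically be returned as a streaming response with the `text/event-stream` content type, and a polling endpoint could serve the same payloads to clients without SSE support.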

Quick Start

Prerequisites

Python 3.9+
ffmpeg (for video processing)
chromaprint (optional, for audio fingerprinting)

Installation

# 1. Navigate to project directory (adjust the path to your checkout)
cd /path/to/hm_ai_qc_report_tool

# 2. Create virtual environment
python3 -m venv venv
source venv/bin/activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# 4. Create .env file (see Configuration section)
cp .env.example .env
# Edit .env with your credentials

# 5. Initialize database
python3 -c "from app import create_app; from core.models.database import db; app = create_app(); app.app_context().push(); db.create_all()"

# 6. Run application
python3 app.py

Access the platform at: http://localhost:7183

Architecture

Tech Stack

Backend:

Flask 3.0 (Python web framework)
SQLAlchemy (ORM with SQLite database)
OpenAI / Anthropic (LLM providers)
Box SDK (Cloud storage integration)

Frontend:

Bootstrap 5 (Responsive UI)
Vanilla JavaScript (Tab management, progress tracking)
Server-Sent Events (Real-time updates)

Video Processing:

ffmpeg-python (Video manipulation)
OpenCV (AKAZE feature matching)
Chromaprint (Audio fingerprinting)

Directory Structure

hm_ai_qc_report_tool/
├── app.py                    # Application factory
├── config.py                 # Configuration
├── requirements.txt          # Dependencies
│
├── core/                     # Shared infrastructure
│   ├── auth/                 # Authentication
│   ├── models/               # Database models
│   ├── services/             # LLM, Box.com
│   └── utils/                # Progress, parsing, error sanitization
│
├── modules/                  # Feature modules
│   ├── hm_qc/               # HM QC
│   ├── video_qc/            # Video QC (BETA)
│   ├── video_master/        # Video Master (BETA)
│   └── reporting/           # Reporting
│
├── templates/               # Shared templates
├── static/                  # CSS, JavaScript
├── database/                # SQLite database
├── data/                    # Video Master data
└── storage/                 # Report storage

Modules

1. HM QC Module ✅

Status: Complete (Demo with 2 sample checks)

Automated quality control for PDF and image marketing materials.

Features:

Drag-and-drop file upload (single or batch up to 100 files)
Profile-based configuration
Weighted check execution
0-100 scoring (Pass: 90+, Warning: 70-89, Fail: <70)
Real-time progress tracking (SSE + polling)
HTML report generation
Batch processing with rate limiting (batches of 10, 2s cooldown)
Batch results summary (total/passed/failed/warnings/average score)
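
The rate-limited batch execution listed above (batches of 10 with a 2s cooldown) can be sketched as follows. This is an illustrative outline only: `run_in_batches`, `iter_batches`, and the `process_file` callback are hypothetical names, and the module's actual implementation may differ.

```python
import time


def iter_batches(items, batch_size=10):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_in_batches(files, process_file, batch_size=10, cooldown_s=2.0,
                   sleep=time.sleep):
    """Process files batch by batch, pausing between batches.

    `sleep` is injectable so the cooldown can be skipped in tests.
    """
    results = []
    batches = list(iter_batches(files, batch_size))
    for index, batch in enumerate(batches):
        results.extend(process_file(f) for f in batch)
        # Cool down after every batch except the last one.
        if index < len(batches) - 1:
            sleep(cooldown_s)
    return results
```

With 25 files this runs three batches (10, 10, 5) and pauses twice, which keeps bursts of LLM calls bounded.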

Workflow:

Single file: Upload → Select profile → Execute QC → View results with score → Download report
Batch: Upload multiple files → Select profile → Execute batch → View per-file summary with scores

2. Video QC Module (BETA) 🔶

Status: Basic Structure

Video quality control with technical validation.

Planned Features:

Filename convention validation
Technical specs (codec, resolution, FPS, bitrate)
Audio quality validation
AI-powered censorship detection
Duration validation

3. Video Master Module (BETA) ✅

Status: Complete (Full 4-tier engine)

Intelligent video matching using a cascading 4-tier approach.

Matching System:

Stage 0: Metadata filtering (80-95% reduction, <1s)
Tier 1: Perceptual hash (5-10s for 50 masters)
Tier 2: AKAZE verification (5-10s per candidate)
Tier 3: AI Vision fallback (smart triggering, ~$0.006/comparison)
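
To give a feel for the Tier 1 idea, here is a minimal sketch of one simple perceptual hash, the average hash, written in plain Python. The README does not say which hashing algorithm the module uses, so this variant, the function names, and the tiny example frames are assumptions chosen for illustration.

```python
def average_hash(gray):
    """Hash a small grayscale frame given as a 2D list of pixel values.

    Each pixel contributes one bit: 1 if brighter than the frame mean.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


# Hypothetical 2x2 frames: a master, a slightly brighter copy, an inverted one.
master = [[10, 200], [10, 200]]
brighter = [[20, 210], [20, 210]]
inverted = [[200, 10], [200, 10]]

same = hamming_distance(average_hash(master), average_hash(brighter))
different = hamming_distance(average_hash(master), average_hash(inverted))
```

A small Hamming distance suggests the frames are visually similar, so a cheap hash comparison can shortlist candidates before the more expensive AKAZE and AI Vision tiers run.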

Performance:

Smart AI Vision triggering: ~97% cost reduction
Typical batch: $0.30 instead of $15
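
As a quick sanity check on these figures, the arithmetic below works backwards from the quoted ~$0.006 per comparison. It assumes the whole batch cost comes from Tier 3 comparisons, which the README does not state explicitly.

```python
COST_PER_COMPARISON = 0.006  # USD, the Tier 3 figure quoted above


def comparisons_for(budget_usd):
    """Number of AI Vision comparisons a given spend would cover."""
    return round(budget_usd / COST_PER_COMPARISON)


baseline = comparisons_for(15.00)   # comparisons without smart triggering
triggered = comparisons_for(0.30)   # comparisons with smart triggering
reduction = 1 - 0.30 / 15.00        # fraction of cost avoided

print(baseline, triggered, f"{reduction:.0%}")
```

Under that assumption, $15 corresponds to about 2,500 comparisons and $0.30 to about 50, a reduction of roughly 98%, which is in line with the ~97% figure.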

Features:

0-100 confidence scoring
Match quality analysis
Cost tracking
Master library management

4. Reporting Module ✅

Status: Complete

Consolidated reporting from Box.com and internal QC.

Features:

Job number search with real-time progress bar (async with SSE)
Multi-job search (comma-separated job numbers)
Dual-source aggregation (Box + Database)
Source badges ("HM QC", "Box")
Dashboard with parsed data (single-job and multi-job views)
Designer-friendly error display with human-readable check names and action guidance
"Show Technical Details" toggle for advanced users
Export to HTML (full and errors-only)
Export to CSV (full and errors-only) with Box file hyperlinks
Multi-job combined CSV export
In-memory result caching (30-min TTL) to avoid re-downloading
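
The in-memory cache with a 30-minute TTL could look roughly like the sketch below. The class name `TTLCache` and its methods are hypothetical, and the real cache may be structured differently; the point is only to show how expiry on read avoids re-downloading recent results.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_s seconds."""

    def __init__(self, ttl_s=30 * 60, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock  # injectable so tests can control time
        self._entries = {}

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl_s, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it so the caller fetches fresh data.
            del self._entries[key]
            return default
        return value
```

A search handler would call `get(job_number)` first and only contact Box when the result is missing or expired.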

AI_QC Improvements

1. Centralized LLM Configuration

✅ NO hardcoded API keys - All from environment
✅ Multi-provider support (OpenAI, Anthropic, Azure, Google)
✅ Retry logic with exponential backoff
✅ Token tracking and cost monitoring
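
Retry with exponential backoff is a common pattern for transient LLM API failures. The sketch below shows the general shape; the function name, the default of four attempts, and the one-second base delay are assumptions for illustration, not values taken from the platform.

```python
import time


def call_with_retry(fn, max_attempts=4, base_delay_s=1.0, sleep=time.sleep):
    """Call fn(), retrying on any exception with doubling delays.

    Delays follow base_delay_s * 2**attempt: 1s, 2s, 4s, ...
    The last failure is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)
```

In practice it is usually better to catch only the provider's rate-limit and timeout errors rather than every exception, and to add a little random jitter to the delay.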

2. Scoring System (0-100 Scale)

Thresholds:

90-100: Pass (Excellent)
70-89: Warning (Minor issues)
0-69: Fail (Significant issues)
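
The thresholds above map directly onto a small classification function. The name `classify_score` is hypothetical; only the boundaries come from this README.

```python
def classify_score(score):
    """Map a 0-100 score to Pass / Warning / Fail using the thresholds above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "Pass"
    if score >= 70:
        return "Warning"
    return "Fail"
```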

3. Sophisticated Prompt Engineering

Structured prompts with:

Evaluation criteria
Scoring guidance
Step-by-step instructions
Decision criteria
Required JSON output

4. Profile-Based Configuration

YAML profiles with:

Per-check weights
LLM provider/model selection
Enable/disable flags
Easy customization
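
To illustrate how per-check weights and enable flags might combine into a single score, here is a sketch that uses a plain dictionary in the shape a loaded YAML profile could take. The profile layout, the check names, and the weighted-average formula are all assumptions for this example; the actual profile schema is not shown in this README.

```python
# Hypothetical profile, as it might look after loading a YAML file.
PROFILE = {
    "checks": {
        "filename_convention": {"enabled": True, "weight": 1.0},
        "image_quality": {"enabled": True, "weight": 3.0},
        "legacy_check": {"enabled": False, "weight": 2.0},
    }
}


def weighted_score(profile, check_scores):
    """Weighted average of the 0-100 scores of all enabled checks."""
    total_weight = 0.0
    weighted_sum = 0.0
    for name, cfg in profile["checks"].items():
        # Skip disabled checks and checks that produced no score.
        if not cfg.get("enabled", True) or name not in check_scores:
            continue
        total_weight += cfg["weight"]
        weighted_sum += cfg["weight"] * check_scores[name]
    if total_weight == 0:
        raise ValueError("no enabled checks with scores")
    return weighted_sum / total_weight


overall = weighted_score(
    PROFILE,
    {"filename_convention": 100, "image_quality": 80, "legacy_check": 0},
)
```

Here the disabled check is ignored, so the result is (100 × 1 + 80 × 3) / 4 = 85, which would fall in the Warning band.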

Configuration

Environment Variables (.env)

# Flask
SECRET_KEY=<generate-random-key>
FLASK_ENV=development
HOST=0.0.0.0
PORT=7183

# Azure AD
AZURE_TENANT_ID=e519c2e6-bc6d-4fdf-8d9c-923c2f002385
AZURE_CLIENT_ID=9079054c-9620-4757-a256-23413042f1ef

# Box.com
BOX_CONFIG_PATH=config/box_config.json
BOX_REPORT_FOLDER_ID=133295752718

# LLM Providers (NO HARDCODED KEYS!)
OPENAI_API_KEY=<your-key>
ANTHROPIC_API_KEY=<your-key>

Generate SECRET_KEY:

python3 -c "import secrets; print(secrets.token_hex(32))"
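
For orientation, reading these variables at startup might look like the sketch below. It is not the project's `config.py`: the function name, the choice of which variables are required, and the returned dictionary are assumptions, while the `HOST` and `PORT` defaults mirror the example values above.

```python
import os

REQUIRED = ("SECRET_KEY",)  # assumption: only SECRET_KEY is mandatory here


def load_settings(env=os.environ):
    """Read settings from an environment-like mapping and fail fast."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {
        "SECRET_KEY": env["SECRET_KEY"],
        "HOST": env.get("HOST", "0.0.0.0"),
        "PORT": int(env.get("PORT", "7183")),
        # API keys stay optional so a single provider can be configured.
        "OPENAI_API_KEY": env.get("OPENAI_API_KEY"),
        "ANTHROPIC_API_KEY": env.get("ANTHROPIC_API_KEY"),
    }
```

Failing at startup when a required value is absent is usually easier to debug than a missing key surfacing in the middle of a QC run.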

Usage

HM QC Workflow

Single File:

Navigate to HM QC tab
Upload PDF/image file (drag & drop or browse)
Select profile and configure checks
Enter job number
Execute QC with real-time progress
View results with 0-100 score
Download HTML report

Batch Processing:

Navigate to HM QC tab
Upload multiple files (up to 100, drag & drop or browse)
Review file list, remove unwanted files
Select profile → Execute batch
Monitor batch progress (files processed per batch)
View summary: total/passed/failed/warnings with per-file scores
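
The batch summary in the last step can be derived from the per-file scores. The sketch below assumes the Pass/Warning/Fail thresholds documented earlier (90+ and 70-89) and uses a hypothetical `summarize_batch` helper with a hypothetical input shape of filename to score.

```python
def summarize_batch(scores):
    """Summarize per-file 0-100 scores into the batch totals shown in the UI."""
    passed = sum(1 for s in scores.values() if s >= 90)
    warnings = sum(1 for s in scores.values() if 70 <= s < 90)
    failed = sum(1 for s in scores.values() if s < 70)
    average = sum(scores.values()) / len(scores) if scores else None
    return {
        "total": len(scores),
        "passed": passed,
        "warnings": warnings,
        "failed": failed,
        "average_score": average,
    }


summary = summarize_batch({"a.pdf": 95, "b.pdf": 75, "c.png": 40})
```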

Video Master Matching (Python API)

from modules.video_master.matching import VideoMatcher

matcher = VideoMatcher(
    enable_ai_vision=True,
    use_akaze=True
)

matcher.add_master("/path/to/master.mp4", "master_1")
matches = matcher.match_adaptation("/path/to/adaptation.mp4")

for match in matches:
    print(f"Master: {match['master_id']}")
    print(f"Confidence: {match['confidence_score']}/100")
    print(f"Method: {match['matching_method']}")

Reporting Search

Navigate to Reporting tab
Enter job number(s) — comma-separated for multiple
Watch progress bar as reports are fetched and parsed
View consolidated dashboard with designer-friendly error descriptions
Toggle "Show Technical Details" for full check data
Export: HTML (full or errors-only) or CSV (full or errors-only)
Multi-job: Combined summary + per-job sections + combined CSV export
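
Comma-separated job input needs a little normalization before searching. A minimal sketch, assuming job numbers are treated as opaque strings and duplicates should be searched only once (the function name is hypothetical):

```python
def parse_job_numbers(raw):
    """Split comma-separated input into unique, trimmed job numbers.

    Order of first appearance is preserved and empty entries are dropped.
    """
    seen = set()
    jobs = []
    for part in raw.split(","):
        job = part.strip()
        if job and job not in seen:
            seen.add(job)
            jobs.append(job)
    return jobs
```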

Testing

Integration Tests

python3 test_integration.py

Tests:

App initialization
Blueprint registration
Database connectivity
Core services
Route accessibility

See INTEGRATION_TEST_REPORT.md for detailed results.

Security

✅ NO hardcoded API keys
✅ Azure AD JWT validation
✅ httpOnly cookies
✅ Input validation
✅ Secure file handling

Performance

Operation	Time	Cost
HM QC - Filename check	<1s	$0
HM QC - AI quality check	2-5s	~$0.01
Video Master - Stage 0	<1s	$0
Video Master - Tier 1	5-10s	$0
Video Master - Tier 2	5-10s	$0
Video Master - Tier 3	3-5s	~$0.006

Documentation

DEPLOYMENT_CHECKLIST.md - Comprehensive deployment guide
INTEGRATION_TEST_REPORT.md - Test results and verification
Inline code documentation

Troubleshooting

Common Issues

1. Module Not Found

source venv/bin/activate
which python  # Should show venv path

2. ffmpeg Not Found

brew install ffmpeg  # macOS
sudo apt-get install ffmpeg  # Debian/Ubuntu

3. LLM API Errors

[ -n "$OPENAI_API_KEY" ] && echo "set" || echo "missing"  # Verify key is set without printing it

4. Database Locked

lsof database/qc_platform.db  # Check for other processes

Production Deployment

Gunicorn (Recommended)

gunicorn -c gunicorn_config.py wsgi:app

Systemd Service

See DEPLOYMENT_CHECKLIST.md for complete systemd configuration.

Changelog

Version 1.1.0 (2026-03-13)

Reporting & Batch Processing Enhancements

✅ Async search with real-time progress bar (SSE + polling)
✅ CSV export with Box file hyperlinks and designer-friendly columns
✅ Error code cleanup: human-readable check names + action guidance
✅ "Show Technical Details" toggle on dashboard
✅ Multi-job search (comma-separated) with combined dashboard and CSV
✅ In-memory result cache (30-min TTL)
✅ HM QC batch file upload (up to 100 files)
✅ Batch processing with rate-limited execution (batches of 10)
✅ Batch results summary with per-file scores

Version 1.0.0 (2026-02-02)

Initial Release

✅ Core Infrastructure (Flask, SQLAlchemy, Auth)
✅ HM QC Module (Complete workflow with 2 checks)
✅ Video QC Module (Basic structure - BETA)
✅ Video Master Module (Complete 4-tier engine)
✅ Reporting Module (Box + Database consolidation)
✅ AI_QC Improvements (Scoring, prompts, LLM config)
✅ Unified UI (Tabbed interface with H&M branding)

License

Proprietary - H&M Hennes & Mauritz AB

Support

For issues or questions, contact the development team.

Built with ❤️ for H&M Quality Control