ai_qc/CLAUDE.md
nickviljoen bcd318a7b1 docs: update CLAUDE.md after Phases 1+2 (Dow Jones removed, demos added)
Updates the intro count (9 → 12 clients), adds Google/HP/Ferrero to
the client name list, and adds three table rows for the new demo
clients (Doc column marked _scope pending_ until per-client docs land).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:39:40 +02:00

CLAUDE.md

This file provides project-wide guidance to Claude Code. Per-client documentation lives in CLAUDE_<CLIENT>.md files at the repo root and is not auto-loaded — read the relevant client file when working on that client's code.

Working on a specific client?

When the user tells you the work is for a specific client (or you can infer it from the files being touched), read that client's CLAUDE_<CLIENT>.md immediately, before doing anything else. Don't rely on remembered context: the client files hold the up-to-date check inventories, tuning history, test asset locations, and known limitations.

Project Overview

Visual AI QC is a Flask-based AI-powered quality control platform for analyzing marketing materials and design assets using OpenAI GPT-4o and Google Gemini 2.5 Pro. It evaluates visual and video content against brand guidelines through 60+ specialized QC checks across 15 profiles, serving 12 clients (Diageo, Unilever, L'Oreal, Amazon, Boots, Honda, AXA, Rank, Google, HP, Ferrero, General).

Core Architecture

Main components

  • api_server.py — Flask server, async processing, parallel batch execution
  • visual_qc_apps/ — Modular QC check system (one directory per check)
  • document_mode/ — Multi-page PDF QC pipeline (built for AXA, reused by Boots PPack)
  • profiles/ — JSON profile configs (which checks run, weights, LLM assignments)
  • brand_guidelines/ — Reference asset storage and metadata
  • llm_config.py — Centralized LLM configuration / API interactions
  • profile_config.py — Profile loading and check discovery
  • client_config.py — Client ↔ profile mapping with visibility control
  • pdf_processor.py — PDF text extraction and LLM summarization for brand guidelines
  • media_plan_processor.py — Excel media plan parsing, filename matching, spec validation
  • usage_tracker.py — Usage tracking and cost estimation
  • web_ui.html — Single-page web interface

Key design patterns

  • Modular QC checks: Each check is visual_qc_apps/{check_name}/app.py with a standardized interface
  • Profile-based config: Profiles define which checks run, weights, and LLM assignments
  • Mode field on profiles: asset (default) | document | document_diff — document modes use the document_mode/ pipeline instead of the standard visual flow
  • Parallel batch processing: ThreadPoolExecutor, batches of 15
  • Reference asset integration: Brand guidelines augment check prompts
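
A minimal sketch of the parallel batch pattern, assuming a batch is a slice of up to 15 work items handed to one ThreadPoolExecutor. The names run_check and run_in_batches are hypothetical and are not the project's real interface:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 15  # batch size named in the design patterns above


def run_check(check_name):
    """Hypothetical stand-in for one QC check; a real check would call an LLM."""
    return {"check": check_name, "score": 10}


def run_in_batches(check_names, batch_size=BATCH_SIZE):
    """Process work items batch by batch; items inside a batch run in parallel."""
    results = []
    for start in range(0, len(check_names), batch_size):
        batch = check_names[start:start + batch_size]
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            # map() yields results in input order, so output order is stable
            results.extend(pool.map(run_check, batch))
    return results


results = run_in_batches([f"check_{i}" for i in range(32)])
print(len(results))  # 32 items, processed as batches of 15, 15 and 2
```

Processing one batch at a time bounds the number of concurrent LLM calls, at the cost of waiting for the slowest item in each batch.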

Development Commands

# Start local dev server
./scripts/run-local.sh                    # http://localhost:7183

# System validation
./scripts/test-system.sh                  # syntax + imports + profile load

# Quick checks
python -m py_compile **/*.py              # ** recurses only in zsh, or in bash with globstar enabled
python -c "import api_server, llm_config, profile_config"
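
A shell-independent alternative to the glob-based syntax check, as a sketch. The function name syntax_check is hypothetical; it uses only the standard-library py_compile module and walks the tree itself, so it does not depend on how the shell expands **:

```python
import py_compile
from pathlib import Path


def syntax_check(root="."):
    """Compile every .py file under root; return (path, message) per failure.

    Side effect: py_compile writes bytecode into __pycache__ directories.
    """
    failures = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures.append((path, exc.msg))
    return failures
```

Calling syntax_check(".") from the repo root returns an empty list when every file compiles. Note that rglob also descends into virtualenv directories if they live under the root.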

Environment

config/
├── development.env    # local dev API keys + Flask config
├── production.env     # production
└── .env.template      # template

The app detects the environment from the ENVIRONMENT env var first, then by config file presence, and finally falls back to the legacy config.env at the repo root.
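
The lookup order can be sketched as follows. This is an illustration, not the app's actual loader: the function name resolve_env_file, the mapping of an ENVIRONMENT value to config/<value>.env, and the preference for development.env over production.env when both exist are all assumptions.

```python
import os
from pathlib import Path


def resolve_env_file(repo_root, environ=None):
    """Return the env file to load, following the three-step order above."""
    environ = os.environ if environ is None else environ
    root = Path(repo_root)
    config_dir = root / "config"

    # 1. An explicit ENVIRONMENT variable wins (assumed to name the file stem)
    name = environ.get("ENVIRONMENT")
    if name:
        return config_dir / f"{name}.env"

    # 2. Otherwise use whichever config file is present (assumed order)
    for candidate in ("development.env", "production.env"):
        path = config_dir / candidate
        if path.is_file():
            return path

    # 3. Legacy fallback at the repo root
    return root / "config.env"
```

Passing environ explicitly keeps the function testable without touching the real process environment.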

Adding a new QC check

  1. Create visual_qc_apps/{check_name}/app.py using flask_app_template.py
  2. Reference it in the relevant profile JSON (backend/profiles/)
  3. Restart the server

Authentication

MSAL/PKCE flow with httpOnly session cookies, JWT validated against Azure AD JWKS.

  • jwt_validator.py — token validation against Azure AD JWKS
  • auth_middleware.py — Flask middleware, session cookie management
  • Endpoints: /auth/login, /auth/logout, /auth/status
  • Frontend: MSAL Browser Library v2.38.3+ (popup flow) in web_ui.html

Required env: AZURE_TENANT_ID, AZURE_CLIENT_ID, FLASK_ENV, SECRET_KEY.

Protected endpoints include /api/start_analysis, /api/analyze, /api/process_*, /api/profiles (POST/PUT/DELETE), /api/brand_guidelines (POST/DELETE).

QC Profile System

Profiles define check sets, weights, and LLM assignments. Profiles can be marked visibility: "all" (visible to every client) or visibility: "client_specific" (only for listed clients). Profile edits auto-create new versions (my_profile_v2.json, _v3.json, ...) — originals are never overwritten. Detailed UX in backend/PROFILE_MANAGEMENT.md.
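
A sketch of the versioned-filename rule described above, assuming the version is carried as a _vN suffix on the file stem and that an unversioned file counts as version 1. The helper next_version_path is hypothetical:

```python
import re
from pathlib import Path

_VERSION_RE = re.compile(r"(?P<base>.+?)_v(?P<num>\d+)")


def next_version_path(profile_path):
    """Return the path an edited profile would be saved to (never the original)."""
    path = Path(profile_path)
    match = _VERSION_RE.fullmatch(path.stem)
    base = match["base"] if match else path.stem
    version = int(match["num"]) + 1 if match else 2
    candidate = path.with_name(f"{base}_v{version}{path.suffix}")
    while candidate.exists():  # skip versions that are already on disk
        version += 1
        candidate = path.with_name(f"{base}_v{version}{path.suffix}")
    return candidate
```

For example, next_version_path("profiles/my_profile.json") yields profiles/my_profile_v2.json when no v2 file exists yet.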

Generic profiles (visible to all clients)

  • static_general (10 checks) — baseline static asset QC
  • general_check (10 checks) — streamlined general-purpose
  • video_general (4 checks) — generic video QC
  • inclusive_accessibility (2 checks) — accessibility focus

Client-specific profiles → see per-client docs

| Client | Profiles | Doc |
| --- | --- | --- |
| Diageo | diageo_key_visual (11), diageo_packaging (13) | CLAUDE_DIAGEO.md |
| Unilever | unilever_key_visual (15, 120-pt scale + zero-score logic), unilever_packaging (17) | CLAUDE_UNILEVER.md |
| L'Oreal | loreal_static (4, strict-grade) | CLAUDE_LOREAL.md |
| Amazon | amazon_static (6) | CLAUDE_AMAZON.md |
| Boots | boots_static (5, strict-grade), boots_ppack (7, document-mode, strict-grade w/ artwork-page exemption) | CLAUDE_BOOTS.md |
| AXA | axa_policy_document (7, document-mode), axa_accessibility (1, document-mode, strict-grade), axa_policy_document_diff (1, document_diff) | CLAUDE_AXA.md |
| Honda | generic only | CLAUDE_HONDA.md |
| Rank | generic only | CLAUDE_RANK.md |
| Google | generic only | scope pending |
| HP | generic only | scope pending |
| Ferrero | generic only | scope pending |
| General | generic only | CLAUDE_GENERAL.md |

Scoring

  • 100-point scale by default. Profiles with total weight ≥ 10.0 use direct weighted scores; profiles with lower weight use weighted_score × 10. All score-calculation paths cap at 100 (or 120 for Unilever Key Visual).
  • Strict grading (strict_grade: true on a profile): any individual check scoring < 6 forces overall Fail, regardless of total. Used by L'Oreal Static, Boots Static, Boots PPack.
  • Profile-specific zero-scoring (Unilever Key Visual): see CLAUDE_UNILEVER.md.
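
The scoring rules above can be sketched as below. This assumes per-check scores are on a 0-10 scale (consistent with the "< 6" strict threshold) and that the weighted score is the sum of score × weight; both function names are hypothetical and the real calculation may differ in detail.

```python
def overall_score(check_scores, weights, cap=100.0):
    """Combine 0-10 check scores into an overall score, capped at `cap`."""
    weighted = sum(check_scores[name] * weights[name] for name in weights)
    if sum(weights.values()) < 10.0:
        weighted *= 10  # low-weight profiles are scaled up by 10
    return min(weighted, cap)


def strict_grade_fails(check_scores, threshold=6):
    """Strict grading: any single check below the threshold forces a Fail."""
    return any(score < threshold for score in check_scores.values())


scores = {"logo": 10, "legibility": 5}
weights = {"logo": 0.5, "legibility": 0.5}   # total weight 1.0, so x10 applies
print(overall_score(scores, weights))         # 75.0
print(strict_grade_fails(scores))             # True: legibility scored 5
```

A profile on the 120-point scale would pass cap=120.0 instead of the default.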

Cross-cutting platform features

User access control (backend/user_access.py, backend/user_access.json)

Default-deny per-user client access. Admins grant/revoke via the admin panel's User Access tab. Enforced server-side on every client-scoped endpoint via _require_client_access(client_id) in api_server.py. Returns 403 with code: "client_access_denied" on denial. Audit trail written to daily JSONL usage logs as event: "access_change". backend/user_access.json is gitignored. Bootstrap admin: nick.viljoen@brandtech.plus.
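
A framework-free sketch of the default-deny decision. The layout of the access map (user email mapped to a list of client IDs) is an assumption about user_access.json, and check_client_access is a hypothetical name; only the 403 status and the client_access_denied code come from the description above.

```python
ACCESS_DENIED = (403, {"code": "client_access_denied"})


def check_client_access(user_email, client_id, access_map):
    """Default-deny: allow only when the user has an explicit grant for client_id."""
    granted = access_map.get(user_email, [])  # unknown users get no grants
    if client_id in granted:
        return 200, {"ok": True}
    return ACCESS_DENIED


# Hypothetical data for illustration only
access_map = {"designer@example.com": ["boots", "axa"]}

print(check_client_access("designer@example.com", "boots", access_map)[0])   # 200
print(check_client_access("designer@example.com", "diageo", access_map)[0])  # 403
print(check_client_access("stranger@example.com", "boots", access_map)[0])   # 403
```

The key property is that absence from the map and absence from a user's list both end in the same denial path.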

Self-service access requests

Client picker has a "Request Client Access" tile. Submits to POST /api/access_request, sends email to admins via backend/email_service.py (Mailgun SMTP), logs an access_request event.

Admin panel

"Admin" header button (admin-only). Tabs: Usage Overview + User Access. Endpoints: GET /api/admin/check, GET /api/admin/users, GET /api/admin/user_access*.

Reporting

Per-client "Reporting" tab. Endpoint GET /api/client_usage_stats?client={id}&start_date={}&end_date={}. Summary cards + per-analysis detail table.
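
For scripted access, the query string can be built as in this sketch. The base URL, the client ID "boots" and the ISO date format are assumptions; only the path and parameter names come from the endpoint above.

```python
from urllib.parse import urlencode


def client_usage_stats_url(base_url, client_id, start_date, end_date):
    """Build the reporting URL with properly encoded query parameters."""
    query = urlencode(
        {"client": client_id, "start_date": start_date, "end_date": end_date}
    )
    return f"{base_url.rstrip('/')}/api/client_usage_stats?{query}"


url = client_usage_stats_url("http://localhost:7183", "boots", "2026-04-01", "2026-04-30")
print(url)
# http://localhost:7183/api/client_usage_stats?client=boots&start_date=2026-04-01&end_date=2026-04-30
```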

Usage tracking

Daily JSONL files in backend/usage_logs/, one event per line. Generate reports via python backend/generate_usage_report.py --last-days 30 [--client X] [--format text|json|csv]. Details in backend/USAGE_REPORTS.md.
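
A sketch of the daily JSONL pattern: one file per day, one JSON object per line, append-only. The file naming (YYYY-MM-DD.jsonl), the timestamp field and the helper names are assumptions; the real tracker lives in usage_tracker.py and may differ.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_event(log_dir, event, **fields):
    """Append one event as a single JSON line to today's log file."""
    now = datetime.now(timezone.utc)
    path = Path(log_dir) / f"{now:%Y-%m-%d}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {"timestamp": now.isoformat(), "event": event, **fields}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return path


def read_events(path):
    """Read a JSONL log back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because each line is a complete JSON document, a report generator can stream files line by line and filter on the event field without loading a whole day into memory.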

Media plans

Excel uploads per client at Settings → Media Plan. Parses all channel sheets via openpyxl, stores in backend/media_plans/. Fuzzy filename matching against assets at QC time. Matched metadata (country, language, placement, vendor, dimensions) injected into check prompts. Endpoints: POST/GET/DELETE /api/media_plan.
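
One way to do fuzzy filename matching, sketched with the standard-library difflib. This is not necessarily the algorithm in media_plan_processor.py: the creative_name column, the 0.6 cutoff and the function names are assumptions chosen for illustration.

```python
import difflib
from pathlib import Path


def normalize(text):
    """Lowercase and collapse every non-alphanumeric run to a single space."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def match_media_plan_row(asset_filename, plan_rows, cutoff=0.6):
    """Return the plan row whose name is closest to the asset filename, or None."""
    key = normalize(Path(asset_filename).stem)
    by_key = {normalize(row["creative_name"]): row for row in plan_rows}
    hits = difflib.get_close_matches(key, list(by_key), n=1, cutoff=cutoff)
    return by_key[hits[0]] if hits else None


# Hypothetical media plan rows
rows = [
    {"creative_name": "UK_Summer_Banner_300x250", "country": "UK"},
    {"creative_name": "DE_Winter_Video_1920x1080", "country": "DE"},
]
match = match_media_plan_row("uk-summer-banner-300x250_v2.png", rows)
print(match["country"])  # UK
```

Returning None below the cutoff lets the caller run the checks without media plan metadata rather than injecting a wrong match.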

PDF reference assets

Multi-page brand guideline PDFs are processed on upload (pdf_processor.py): all pages text-extracted via PyMuPDF, summarized via Gemini 2.5 Pro (2000-4000 word structured summary), page 1 saved as cover image. Output: {file_id}_summary.txt and {file_id}_cover.png in brand_guidelines/files/. Summary text + cover image included in QC check prompts. Endpoints: GET /api/brand_guidelines/<id>/status, POST /api/brand_guidelines/<id>/reprocess.

Settings modal UX (Apr 2026)

  • Reference Assets tab: single "Name" field (was Brand Name + Tags + Description)
  • Media Plan tab: "Name" field, stored as display_name
  • Modal footer is context-aware: Save Profile + Cancel only on Profile tabs; other tabs show a single Save that closes the modal (the in-tab upload buttons are the actual save action)

Deployment

| Env | URL | Branch | Server | Service |
| --- | --- | --- | --- | --- |
| Local | http://localhost:7183 | any | laptop | none (Flask dev) |
| Dev | https://optical-dev.oliver.solutions/ai_qc/ | develop | optical-production-dev (GCP, eu-west2-b) | ai-qc.service |
| Prod | https://optical-prod.oliver.solutions/ai_qc/ | tags on main | optical-production (GCP, eu-west2-c) | ai-qc.service |
| Legacy sandbox | older URL | main | older VM (www-data) | ai_qc.service |

Both new-style envs: app at /opt/ai_qc, runs as nick.viljoen, systemd unit running Waitress on 127.0.0.1:7183, Apache reverse-proxy include at /opt/ai_qc/deploy/apache-ai-qc.conf. TLS at the GCP load balancer. Per-server SSH key for Bitbucket pulls (~/.ssh/bitbucket_ai_qc, host alias bitbucket-ai-qc).

Branch strategy

  • develop → dev server. Push, then run deploy.sh dev on optical-dev.
  • main → prod. Never push directly. Merge develop → main via PR, tag (e.g. v1.2.0), deploy with deploy.sh prod v1.2.0.
  • Feature branches (feature/<name>): branch from main, PR into develop. Keep merged branches as history or delete once main catches up.

Deploy scripts

In backend/scripts/, run on the target server:

| Script | Usage | What it does |
| --- | --- | --- |
| deploy.sh dev | backend/scripts/deploy.sh dev [--dry-run] | Fetch, diff, confirm, git reset --hard origin/develop, pip install if requirements.txt changed, restart ai-qc.service, smoke test /health, auto-rollback on failure |
| deploy.sh prod <tag> | backend/scripts/deploy.sh prod v1.2.0 [--dry-run] | Same flow against a specific tag |
| rollback.sh | backend/scripts/rollback.sh last or ... <commit-hash> | Revert to the last-deploy checkpoint or any commit |
| health-check.sh | backend/scripts/health-check.sh | One-line liveness check |

The deploy script writes pre-deploy HEAD to .last_deploy_rollback so rollback.sh last always knows where to go.

Production troubleshooting

| Issue | Check | Fix |
| --- | --- | --- |
| 404 on web UI | curl localhost:7183/ | Use absolute path in serve_web_ui() (relative path breaks under Waitress) |
| 404 on /auth/* | ls /var/www/html/ai_qc/ | Remove the static directory conflicting with Apache ProxyPass (Apache serves files before applying ProxyPass) |
| MSAL interaction_in_progress | Browser console | Clear cache; concurrent sign-in protection in signIn() (uses isSigningIn flag) |
| Backend not starting | systemctl status ai-qc | Check Python env, deps, port 7183 |
| Permission denied on uploads/output/etc. | After git pull resets ownership | sudo chown -R www-data:www-data uploads output media_plans brand_guidelines usage_logs |

Apache ProxyPass order matters — specific paths first:

ProxyPass /ai_qc/auth http://localhost:7183/auth
ProxyPassReverse /ai_qc/auth http://localhost:7183/auth
ProxyPass /ai_qc http://localhost:7183
ProxyPassReverse /ai_qc http://localhost:7183

Pre-Session Completion Checklist

Before ending any session, run:

  1. Syntax check: python -m py_compile **/*.py (the ** glob recurses only in zsh, or in bash with globstar enabled)
  2. Core imports: python -c "import api_server, llm_config, profile_config"
  3. Auth imports: python -c "import jwt_validator, auth_middleware"
  4. Modified QC modules: python -m py_compile visual_qc_apps/<modified_check>/app.py
  5. Profile load (if profiles changed):
    cd backend && python3 -c "
    from profile_config import get_profile
    for p in ['general_check','static_general','unilever_key_visual','unilever_packaging','diageo_key_visual','diageo_packaging','loreal_static','amazon_static','boots_static','boots_ppack','inclusive_accessibility','video_general','axa_policy_document','axa_policy_document_diff','axa_accessibility']:
        prof = get_profile(p); print(f'OK {prof.name} ({len(prof.get_enabled_checks())} checks)')
    "
    
  6. Client config (if client_config.py changed):
    cd backend && python3 -c "
    from client_config import get_all_clients
    for cid, c in get_all_clients().items(): print(f'OK {c[\"display_name\"]}: {c[\"profiles\"]}')
    "