# CLAUDE.md

This file provides project-wide guidance to Claude Code. **Per-client documentation lives in `CLAUDE_.md` files at the repo root** and is not auto-loaded — read the relevant client file when working on that client's code.

## Working on a specific client?

When the user tells you the work is for a specific client (or you can infer it from the files being touched), **read that client's `CLAUDE_.md` immediately** before doing anything else. Don't rely on remembered context — the client files have the up-to-date check inventories, tuning history, test asset locations, and known limitations.

## Project Overview

Visual AI QC is a Flask-based AI-powered quality control platform for analyzing marketing materials and design assets using OpenAI GPT-4o and Google Gemini 2.5 Pro. It evaluates visual and video content against brand guidelines through **60+ specialized QC checks** across **15 profiles**, serving **12 clients** (Diageo, Unilever, L'Oreal, Amazon, Boots, Honda, AXA, Rank, Google, HP, Ferrero, General).

## Core Architecture

### Main components

- **`api_server.py`** — Flask server, async processing, parallel batch execution
- **`visual_qc_apps/`** — Modular QC check system (one directory per check)
- **`document_mode/`** — Multi-page PDF QC pipeline (built for AXA, reused by Boots PPack)
- **`profiles/`** — JSON profile configs (which checks run, weights, LLM assignments)
- **`brand_guidelines/`** — Reference asset storage and metadata
- **`llm_config.py`** — Centralized LLM configuration / API interactions
- **`profile_config.py`** — Profile loading and check discovery
- **`client_config.py`** — Client ↔ profile mapping with visibility control
- **`pdf_processor.py`** — PDF text extraction and LLM summarization for brand guidelines
- **`media_plan_processor.py`** — Excel media plan parsing, filename matching, spec validation
- **`usage_tracker.py`** — Usage tracking and cost estimation
- **`web_ui.html`** — Single-page web interface

### Key design patterns

- **Modular QC checks**: Each check is `visual_qc_apps/{check_name}/app.py` with a standardized interface
- **Profile-based config**: Profiles define which checks run, weights, and LLM assignments
- **Mode field on profiles**: `asset` (default) | `document` | `document_diff` — document modes use the `document_mode/` pipeline instead of the standard visual flow
- **Parallel batch processing**: ThreadPoolExecutor, batches of 15
- **Reference asset integration**: Brand guidelines augment check prompts

## Development Commands

```bash
# Start local dev server
./scripts/run-local.sh           # http://localhost:7183

# System validation
./scripts/test-system.sh         # syntax + imports + profile load

# Quick checks
python -m py_compile **/*.py
python -c "import api_server, llm_config, profile_config"
```

### Environment

```
config/
├── development.env    # local dev API keys + Flask config
├── production.env     # production
└── .env.template      # template
```

The app detects environment via `ENVIRONMENT` env var, then by config file presence,
then falls back to legacy `config.env` at the repo root.

### Adding a new QC check

1. Create `visual_qc_apps/{check_name}/app.py` using `flask_app_template.py`
2. Reference it in the relevant profile JSON (`backend/profiles/`)
3. Restart the server

## Authentication

MSAL/PKCE flow with httpOnly session cookies, JWT validated against Azure AD JWKS.

- **`jwt_validator.py`** — token validation against Azure AD JWKS
- **`auth_middleware.py`** — Flask middleware, session cookie management
- **Endpoints**: `/auth/login`, `/auth/logout`, `/auth/status`
- **Frontend**: MSAL Browser Library v2.38.3+ (popup flow) in `web_ui.html`

Required env: `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `FLASK_ENV`, `SECRET_KEY`.

Protected endpoints include `/api/start_analysis`, `/api/analyze`, `/api/process_*`, `/api/profiles` (POST/PUT/DELETE), `/api/brand_guidelines` (POST/DELETE).

## QC Profile System

Profiles define check sets, weights, and LLM assignments. Profiles can be marked `visibility: "all"` (visible to every client) or `visibility: "client_specific"` (only for listed clients).

Profile edits auto-create new versions (`my_profile_v2.json`, `_v3.json`, ...) — originals are never overwritten. Detailed UX in `backend/PROFILE_MANAGEMENT.md`.

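The version-on-edit rule can be illustrated with a small filename helper. This is a sketch under stated assumptions, not the project's implementation: `next_version_filename` and its arguments are hypothetical, an unsuffixed file is assumed to count as version 1, and versions are assumed to carry a `_vN` suffix directly before `.json`.

```python
import re

# Hypothetical pattern: optional "_v<number>" suffix before ".json".
_VERSION_RE = re.compile(r"^(?P<base>.+?)(?:_v(?P<num>\d+))?\.json$")


def next_version_filename(edited: str, existing: set[str]) -> str:
    """Return the filename a new version of `edited` would be saved under.

    The original file is never reused: the result is always one version
    higher than the highest version already present for the same base name.
    """
    match = _VERSION_RE.match(edited)
    if match is None:
        raise ValueError(f"not a profile JSON filename: {edited!r}")
    base = match.group("base")

    # Collect the version numbers already taken for this base name.
    taken = []
    for name in existing | {edited}:
        m = _VERSION_RE.match(name)
        if m and m.group("base") == base:
            taken.append(int(m.group("num") or 1))  # no suffix is treated as v1

    return f"{base}_v{max(taken) + 1}.json"
```

Under these assumptions, editing `my_profile.json` when only the original exists would yield `my_profile_v2.json`, and editing it again after `_v2` and `_v3` exist would yield `my_profile_v4.json`.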
### Generic profiles (visible to all clients)

- **`static_general`** (10 checks) — baseline static asset QC
- **`general_check`** (10 checks) — streamlined general-purpose
- **`video_general`** (4 checks) — generic video QC
- **`inclusive_accessibility`** (2 checks) — accessibility focus

### Client-specific profiles → see per-client docs

| Client | Profiles | Doc |
|--------|----------|-----|
| Diageo | `diageo_key_visual` (11), `diageo_packaging` (13) | [CLAUDE_DIAGEO.md](CLAUDE_DIAGEO.md) |
| Unilever | `unilever_key_visual` (15, 120-pt scale + zero-score logic), `unilever_packaging` (17) | [CLAUDE_UNILEVER.md](CLAUDE_UNILEVER.md) |
| L'Oreal | `loreal_static` (4, strict-grade) | [CLAUDE_LOREAL.md](CLAUDE_LOREAL.md) |
| Amazon | `amazon_static` (6) | [CLAUDE_AMAZON.md](CLAUDE_AMAZON.md) |
| Boots | `boots_static` (5, strict-grade), `boots_ppack` (7, document-mode, strict-grade w/ artwork-page exemption) | [CLAUDE_BOOTS.md](CLAUDE_BOOTS.md) |
| AXA | `axa_policy_document` (7, document-mode), `axa_accessibility` (1, document-mode, strict-grade), `axa_policy_document_diff` (1, document_diff) | [CLAUDE_AXA.md](CLAUDE_AXA.md) |
| Honda | generic only | [CLAUDE_HONDA.md](CLAUDE_HONDA.md) |
| Rank | generic only | [CLAUDE_RANK.md](CLAUDE_RANK.md) |
| Google | generic only | _scope pending_ |
| HP | generic only | _scope pending_ |
| Ferrero | generic only | _scope pending_ |
| General | generic only | [CLAUDE_GENERAL.md](CLAUDE_GENERAL.md) |

### Scoring

- 100-point scale by default. Profiles with total weight ≥ 10.0 use direct weighted scores; profiles with lower weight use `weighted_score × 10`. All score-calculation paths cap at 100 (or 120 for Unilever Key Visual).
- **Strict grading** (`strict_grade: true` on a profile): any individual check scoring < 6 forces overall **Fail**, regardless of total. Used by L'Oreal Static, Boots Static, Boots PPack.
- **Profile-specific zero-scoring** (Unilever Key Visual): see `CLAUDE_UNILEVER.md`.

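The scoring rules above can be sketched in a few lines. This is an illustration only, not the code in `api_server.py`: it assumes each check returns a 0-10 score and that `weighted_score` means the sum of `score × weight` over enabled checks. The names `overall_score` and `strict_fail` are hypothetical.

```python
def overall_score(results, cap=100.0):
    """Combine (score, weight) pairs into an overall score.

    results: list of (score_0_to_10, weight) tuples (assumed shape).
    cap:     100 by default; 120 for a profile on the 120-point scale.
    """
    total_weight = sum(weight for _, weight in results)
    weighted = sum(score * weight for score, weight in results)
    # Total weight >= 10.0: use the weighted sum directly.
    # Lower total weight: scale the weighted sum by 10.
    raw = weighted if total_weight >= 10.0 else weighted * 10
    return min(raw, cap)


def strict_fail(results, threshold=6):
    """True if any single check scores below the strict-grade threshold."""
    return any(score < threshold for score, _ in results)
```

With these assumptions, two checks scoring 8 and 9 give 85 whether their weights are `5.0` each (direct path) or `0.5` each (scaled path), and a profile with `strict_grade: true` would be forced to Fail as soon as `strict_fail` returns `True`, whatever the total.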
## Cross-cutting platform features

### User access control (`backend/user_access.py`, `backend/user_access.json`)

Default-deny per-user client access. Admins grant/revoke via the admin panel's User Access tab. Enforced server-side on every client-scoped endpoint via `_require_client_access(client_id)` in `api_server.py`. Returns 403 with `code: "client_access_denied"` on denial. Audit trail written to daily JSONL usage logs as `event: "access_change"`. `backend/user_access.json` is gitignored. Bootstrap admin: `nick.viljoen@brandtech.plus`.

### Self-service access requests

Client picker has a "Request Client Access" tile. Submits to `POST /api/access_request`, sends email to admins via `backend/email_service.py` (Mailgun SMTP), logs an `access_request` event.

### Admin panel

"Admin" header button (admin-only). Tabs: Usage Overview + User Access. Endpoints: `GET /api/admin/check`, `GET /api/admin/users`, `GET /api/admin/user_access*`.

### Reporting

Per-client "Reporting" tab. Endpoint `GET /api/client_usage_stats?client={id}&start_date={}&end_date={}`. Summary cards + per-analysis detail table.

### Usage tracking

Daily JSONL files in `backend/usage_logs/`, one event per line. Generate reports via `python backend/generate_usage_report.py --last-days 30 [--client X] [--format text|json|csv]`. Details in `backend/USAGE_REPORTS.md`.

### Media plans

Excel uploads per client at Settings → Media Plan. Parses all channel sheets via openpyxl, stores in `backend/media_plans/`. Fuzzy filename matching against assets at QC time. Matched metadata (country, language, placement, vendor, dimensions) injected into check prompts. Endpoints: `POST/GET/DELETE /api/media_plan`.

### PDF reference assets

Multi-page brand guideline PDFs are processed on upload (`pdf_processor.py`): all pages text-extracted via PyMuPDF, summarized via Gemini 2.5 Pro (2000-4000 word structured summary), page 1 saved as cover image.
Output: `{file_id}_summary.txt` and `{file_id}_cover.png` in `brand_guidelines/files/`. Summary text + cover image included in QC check prompts. Endpoints: `GET /api/brand_guidelines//status`, `POST /api/brand_guidelines//reprocess`.

### Settings modal UX (Apr 2026)

- Reference Assets tab: single "Name" field (was Brand Name + Tags + Description)
- Media Plan tab: "Name" field, stored as `display_name`
- Modal footer is context-aware: Save Profile + Cancel only on Profile tabs; other tabs show a single Save that closes the modal (the in-tab upload buttons are the actual save action)

## Deployment

| Env | URL | Branch | Server | Service |
|---|---|---|---|---|
| Local | `http://localhost:7183` | any | laptop | none (Flask dev) |
| Dev | `https://optical-dev.oliver.solutions/ai_qc/` | `develop` | `optical-production-dev` (GCP, eu-west2-b) | `ai-qc.service` |
| Prod | `https://optical-prod.oliver.solutions/ai_qc/` | tags on `main` | `optical-production` (GCP, eu-west2-c) | `ai-qc.service` |
| Legacy sandbox | older URL | `main` | older VM (`www-data`) | `ai_qc.service` |

Both new-style envs: app at `/opt/ai_qc`, runs as `nick.viljoen`, systemd unit running Waitress on `127.0.0.1:7183`, Apache reverse-proxy include at `/opt/ai_qc/deploy/apache-ai-qc.conf`. TLS at the GCP load balancer. Per-server SSH key for Bitbucket pulls (`~/.ssh/bitbucket_ai_qc`, host alias `bitbucket-ai-qc`).

### Branch strategy

- **`develop`** → dev server. Push, then run `deploy.sh dev` on optical-dev.
- **`main`** → prod. Never push directly. Merge `develop → main` via PR, tag (e.g. `v1.2.0`), deploy with `deploy.sh prod v1.2.0`.
- **Feature branches** (`feature/`): branch from `main`, PR into `develop`. Keep merged branches as history or delete once `main` catches up.

### Deploy scripts

In `backend/scripts/`, run on the target server:

| Script | Usage | What it does |
|---|---|---|
| `deploy.sh dev` | `backend/scripts/deploy.sh dev [--dry-run]` | Fetch, diff, confirm, `git reset --hard origin/develop`, pip install if `requirements.txt` changed, restart `ai-qc.service`, smoke test `/health`, auto-rollback on failure |
| `deploy.sh prod ` | `backend/scripts/deploy.sh prod v1.2.0 [--dry-run]` | Same flow against a specific tag |
| `rollback.sh` | `backend/scripts/rollback.sh last` or `... ` | Revert to the last-deploy checkpoint or any commit |
| `health-check.sh` | `backend/scripts/health-check.sh` | One-line liveness check |

The deploy script writes pre-deploy HEAD to `.last_deploy_rollback` so `rollback.sh last` always knows where to go.

### Production troubleshooting

| Issue | Check | Fix |
|-------|-------|-----|
| 404 on web UI | `curl localhost:7183/` | Use absolute path in `serve_web_ui()` (relative path breaks under Waitress) |
| 404 on `/auth/*` | `ls /var/www/html/ai_qc/` | Remove static directory conflicting with Apache ProxyPass — Apache serves files before applying ProxyPass |
| MSAL `interaction_in_progress` | Browser console | Clear cache; concurrent sign-in protection in `signIn()` (uses `isSigningIn` flag) |
| Backend not starting | `systemctl status ai-qc` | Check Python env, deps, port 7183 |
| Permission denied on uploads/output/etc. | After `git pull` resets ownership | `sudo chown -R www-data:www-data uploads output media_plans brand_guidelines usage_logs` |

Apache ProxyPass order matters — specific paths first:

```apache
ProxyPass /ai_qc/auth http://localhost:7183/auth
ProxyPassReverse /ai_qc/auth http://localhost:7183/auth
ProxyPass /ai_qc http://localhost:7183
ProxyPassReverse /ai_qc http://localhost:7183
```

## Pre-Session Completion Checklist

Before ending any session, run:

1. **Syntax check**: `python -m py_compile **/*.py`
2. **Core imports**: `python -c "import api_server, llm_config, profile_config"`
3. **Auth imports**: `python -c "import jwt_validator, auth_middleware"`
4. **Modified QC modules**: `python -m py_compile visual_qc_apps//app.py`
5. **Profile load** (if profiles changed):
   ```bash
   cd backend && python3 -c "
   from profile_config import get_profile
   for p in ['general_check','static_general','unilever_key_visual','unilever_packaging','diageo_key_visual','diageo_packaging','loreal_static','amazon_static','boots_static','boots_ppack','inclusive_accessibility','video_general','axa_policy_document','axa_policy_document_diff','axa_accessibility']:
       prof = get_profile(p); print(f'OK {prof.name} ({len(prof.get_enabled_checks())} checks)')
   "
   ```
6. **Client config** (if `client_config.py` changed):
   ```bash
   cd backend && python3 -c "
   from client_config import get_all_clients
   for cid, c in get_all_clients().items():
       print(f'OK {c[\"display_name\"]}: {c[\"profiles\"]}')
   "
   ```
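
The syntax check in step 1 relies on the shell expanding `**` recursively, which bash only does when `globstar` is enabled. A shell-independent alternative, offered as a sketch and not part of the project's scripts, walks the tree with `pathlib` and compiles each file with the standard-library `py_compile` module. The function name `compile_tree` and the skipped directory names are assumptions.

```python
import py_compile
from pathlib import Path

# Assumed set of directories to ignore; adjust to the repo layout.
SKIP_DIRS = {".git", "venv", ".venv", "__pycache__"}


def compile_tree(root="."):
    """Byte-compile every .py file under root.

    Returns a list of (path, error message) tuples for files that fail.
    """
    root_path = Path(root)
    failures = []
    for path in sorted(root_path.rglob("*.py")):
        # Only look at path components below root when deciding what to skip.
        if SKIP_DIRS.intersection(path.relative_to(root_path).parts):
            continue
        try:
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures.append((str(path), exc.msg))
    return failures
```

An empty list from `compile_tree(".")` would correspond to a clean run of step 1; any returned tuple names a file that needs fixing before the session ends.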