Restructure CLAUDE.md docs: slim project-wide root, complete per-client coverage

Splits the monolithic CLAUDE.md (962 lines) into a slim project-wide root (211 lines)
plus per-client files. Auto-loaded context drops ~88% per session.

Changes:
- CLAUDE.md slimmed to project-wide essentials (architecture, auth, deployment, branch
  strategy, deploy scripts, prod troubleshooting, pre-session checklist). Adds explicit
  session-start convention pointing to CLAUDE_<CLIENT>.md for client-specific work.
  Updates client roster table to all 10 clients with profile counts.
- New CLAUDE_AXA.md: document-mode pipeline + axa_policy_document profiles
- New CLAUDE_DIAGEO.md: key_visual + packaging profiles, check inventories
- New CLAUDE_UNILEVER.md: profiles + zero-score logic for face/new visibility
- New CLAUDE_HONDA.md, CLAUDE_RANK.md, CLAUDE_GENERAL.md: stubs (clients use generic
  profiles only — kept for completeness and future expansion)
- backend/CLAUDE.md: stale 932-line duplicate replaced with 18-line redirect to root
  + backend-specific quick pointers

Per-client files (CLAUDE_LOREAL.md, CLAUDE_AMAZON.md, CLAUDE_BOOTS.md,
CLAUDE_DOW_JONES.md) unchanged — already had the right content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nickviljoen 2026-05-06 12:29:16 +02:00
parent f5aaf8da24
commit 59a0b2408c
8 changed files with 384 additions and 1815 deletions

1020
CLAUDE.md

File diff suppressed because it is too large

58
CLAUDE_AXA.md Normal file

@ -0,0 +1,58 @@
# AXA Client Documentation
> Referenced from main CLAUDE.md. Detailed AXA QC profile descriptions, document-mode pipeline notes, and status.
## Overview
AXA QC is built around **document-mode** — multi-page PDF analysis (policy documents, forms, brochures), not single-asset image checks. The document-mode subsystem (`backend/document_mode/`) was built for AXA and is now reused by Boots Production Pack.
**Status (2026-05-06):** Phases 1, 3, 4, 5 merged to `develop`. Not yet shown to AXA — gated on AXA show-and-tell. The full plan and remaining phases are in `backend/AXA_DOCUMENT_MODE_PLAN.md`.
## AXA Profiles
### `axa_policy_document` — single-document mode (8 checks)
Multi-page policy document QC. `mode: document`, scopes vary per check.
| Check | What it does | Weight |
|------|--------------|--------|
| `axa_font_inventory` | Per-page font extraction + brand-font compliance against AXA's approved font list | 1.0 |
| `axa_phone_inventory` | Extracts phone numbers across pages, validates format and approved-list membership | 1.0 |
| `axa_bold_words_definitions` | Bold-word inventory + definition cross-check (seed list at `backend/document_mode/data/axa_bold_words_seed.json`) | 2.0 |
| `axa_page_numbering` | Page numbering format and continuity | 1.0 |
| `axa_pdf_accessibility` | Tagged-PDF / accessibility checks | 2.0 |
| `axa_print_preflight` | Print-preflight checks (color space, embedded fonts, image resolution) | 1.0 |
| `axa_print_code` | Print code presence + format | 1.0 |
| `axa_omg_versioning` | OMG version footer/header presence and consistency | 1.0 |
### `axa_policy_document_diff` — old-vs-new diff mode (1 check)
`mode: document_diff` — compares two PDFs (old vs new policy version) and reports structured changes.
| Check | What it does | Weight |
|------|--------------|--------|
| `axa_pdf_diff` | Detects added/removed/modified pages, paragraphs, defined terms, phone numbers | 1.0 |
## Document-mode infrastructure
AXA's document-mode subsystem is the foundation for all multi-page PDF QC in this app:
- `document_mode/ingest.py` — PDF ingestion, page rendering, span/font/color extraction via PyMuPDF
- `document_mode/dispatcher.py` — Orchestrates per-check execution against pages, supports scopes: `document` / `targeted` / `page_sample` / `page_pair` / `page_each`
- `document_mode/checks.py`, `print_preflight_checks.py`, `accessibility_checks.py` — AXA check implementations
- `document_mode/diff_engine.py`, `diff_report_writer.py` — Old-vs-new diff handling
- `document_mode/result_writer.py` — HTML report rendering with per-page sections
Boots Production Pack reuses this entire spine — so any infra changes here affect both clients.
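As a rough illustration of what scope-driven dispatch could look like, the sketch below maps each scope name to groups of page indices. The selection semantics (for example, that `page_pair` means consecutive pairs and `page_sample` means an evenly spaced sample) and the `select_pages` helper are assumptions made for illustration; they are not taken from `document_mode/dispatcher.py`.

```python
from typing import List, Optional, Sequence, Tuple


def select_pages(
    scope: str,
    page_count: int,
    targets: Optional[Sequence[int]] = None,
    sample_size: int = 3,
) -> List[Tuple[int, ...]]:
    """Return groups of 0-based page indices, one group per check call.

    Hypothetical helper: the real dispatcher may interpret scopes differently.
    """
    pages = list(range(page_count))
    if scope == "document":
        # One call that sees every page.
        return [tuple(pages)] if pages else []
    if scope == "page_each":
        # One call per page.
        return [(p,) for p in pages]
    if scope == "page_pair":
        # One call per pair of consecutive pages (assumed meaning).
        return list(zip(pages, pages[1:]))
    if scope == "page_sample":
        # Evenly spaced sample, one call per sampled page (assumed meaning).
        if page_count <= sample_size:
            return [(p,) for p in pages]
        step = (page_count - 1) / (sample_size - 1) if sample_size > 1 else 0
        picked = sorted({round(i * step) for i in range(sample_size)})
        return [(p,) for p in picked]
    if scope == "targeted":
        # Only the explicitly requested pages that exist in the document.
        return [(p,) for p in (targets or []) if 0 <= p < page_count]
    raise ValueError(f"unknown scope: {scope}")
```

Keeping page selection separate from check execution would let both AXA and Boots checks share one selection routine, which matters given that infra changes affect both clients.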
## Open items
- AXA show-and-tell pending — feedback will drive the next round of tuning
- Phase 2 (any further check expansion) deferred until after show-and-tell
- Canonical AXA font list / approved phone list / OMG version reference data may need expansion as test PDFs surface gaps
## Key files
- `backend/AXA_DOCUMENT_MODE_PLAN.md` — full design plan and phase breakdown
- `backend/document_mode/` — pipeline implementation
- `backend/profiles/axa_policy_document.json`, `axa_policy_document_diff.json`
- `backend/document_mode/data/axa_bold_words_seed.json` — bold-word seed list

53
CLAUDE_DIAGEO.md Normal file

@ -0,0 +1,53 @@
# Diageo Client Documentation
> Referenced from main CLAUDE.md. Detailed Diageo QC profile descriptions and check inventories.
## Overview
Diageo has two specialised profiles for its core asset types: **Key Visual** (campaign creative) and **Packaging** (label/pack design). Both are assembled from generic visual checks shared with other CPG-style brand profiles (Unilever uses an overlapping check set).
## Diageo Profiles
### `diageo_key_visual` — 11 checks
Campaign key-visual QC. Uses generic shared visual checks at brand-tuned weights.
| Check | What it does | Weight |
|-------|--------------|--------|
| `background_contrast` | Product/text contrast against background | 0.115 |
| `brand_assets_visibility` | Brand assets clearly visible | 0.077 |
| `call_to_action` | CTA presence and clarity | 0.115 |
| `face_gaze_direction` | If a face is present, gaze direction guides toward product/CTA | 0.038 |
| `face_visibility` | Face presence and visibility | 0.077 |
| `imperative_verb` | Headline uses imperative verb | 0.077 |
| `logo_visibility` | Brand logo clearly visible | 0.115 |
| `text_readability` | Text legibility | 0.115 |
| `visual_elements_count` | Element count not overwhelming | 0.077 |
| `visual_hierarchy` | Clear visual hierarchy | 0.115 |
| `word_count` | Headline word count appropriate | 0.077 |
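To make the weights concrete, here is a minimal sketch of how an overall score could be derived from them. It assumes per-check scores on a 1-10 scale and a weighted sum scaled by 10 for profiles whose weights total less than 10.0, which is how the earlier backend documentation described the calculation. The `overall_score` helper is a hypothetical name, not the function in `api_server.py`.

```python
from typing import Dict

# Weights copied from the diageo_key_visual table above.
DIAGEO_KEY_VISUAL_WEIGHTS: Dict[str, float] = {
    "background_contrast": 0.115,
    "brand_assets_visibility": 0.077,
    "call_to_action": 0.115,
    "face_gaze_direction": 0.038,
    "face_visibility": 0.077,
    "imperative_verb": 0.077,
    "logo_visibility": 0.115,
    "text_readability": 0.115,
    "visual_elements_count": 0.077,
    "visual_hierarchy": 0.115,
    "word_count": 0.077,
}


def overall_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-check scores (assumed 1-10), scaled to roughly 0-100."""
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * scores[name] for name in weights)
    # Assumption: profiles whose weights total ~1.0 are scaled up by 10,
    # while profiles whose weights total 10.0 or more are used directly.
    return weighted if total_weight >= 10.0 else weighted * 10
```

Because the published weights are rounded, they total 0.998 rather than exactly 1.0, so a creative scoring 10 on every check lands at 99.8 under this sketch.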
### `diageo_packaging` — 13 checks
Packaging design QC. Adds packaging-specific checks (curved edges, color format) to a similar base.
| Check | What it does | Weight |
|-------|--------------|--------|
| `background_contrast` | Visibility of design elements | 0.087 |
| `brand_assets_visibility` | Brand assets visible on pack | 0.13 |
| `call_to_action` | CTA on pack (if applicable) | 0.043 |
| `color_format` | Color mode appropriate for print | 0.043 |
| `curved_edges` | Pack curve treatment | 0.087 |
| `face_gaze_direction` | Gaze direction (if face) | 0.043 |
| `face_visibility` | Face visibility | 0.043 |
| `logo_visibility` | Brand logo on pack | 0.13 |
| `lowercase_text` | Lowercase usage rules | 0.043 |
| `new_visibility` | "NEW" tag visibility (if present) | 0.087 |
| `product_visibility` | Product clearly visible | 0.13 |
| `text_readability` | Text legibility | 0.087 |
| `visual_elements_count` | Element count appropriate | 0.043 |
## Status
No formal prompt-tuning rounds have been run on Diageo profiles in this repo's history. Profiles use generic shared checks, so tuning is captured in the underlying `visual_qc_apps/<check>/app.py` prompts rather than client-specific check modules.
If Diageo-specific tuning is required (specific brand families, region rules, etc.), introduce dedicated `diageo_*` checks in `visual_qc_apps/` following the Boots / Amazon pattern.

21
CLAUDE_GENERAL.md Normal file

@ -0,0 +1,21 @@
# General / Other Client Documentation
> Referenced from main CLAUDE.md. The "General / Other" tile is the catch-all for users without a brand-specific client assignment.
## Overview
`general` is the default client. Every authenticated user is granted access to it via `default_clients: ["general"]` in `backend/user_access.json`. It's the safe sandbox where new users can run analyses without an admin granting brand-specific access.
## Profiles available
| Profile | Notes |
|---------|-------|
| `static_general` | 10-check baseline static QC profile |
| `video_general` | Generic video QC profile |
| `inclusive_accessibility` | 2-check accessibility-focused profile (accessibility + inclusive design) |
## Notes
- The `general` client is intentionally generic — no client-specific tuning happens here.
- New profiles created with `visibility: "all"` automatically appear in this client's profile list.
- For client-specific work, set up a dedicated client tile and use a `client_specific` profile rather than adding to `general`.
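The visibility rule above can be pictured with a small filter. This is only a sketch: the record fields `name` and `clients`, the `"client_specific"` visibility value, and the `profiles_for_client` helper are assumptions for illustration, and the real logic in `client_config.py` may be organised differently.

```python
from typing import Dict, List


def profiles_for_client(client: str, profiles: List[Dict]) -> List[str]:
    """Return the profile names a client tile would list.

    Hypothetical rule: a profile is listed when its visibility is "all",
    or when the client appears in the profile's own client list.
    """
    visible = []
    for profile in profiles:
        if profile.get("visibility") == "all":
            visible.append(profile["name"])
        elif client in profile.get("clients", []):
            visible.append(profile["name"])
    return visible


# Example records (illustrative only, not copied from the repository).
EXAMPLE_PROFILES = [
    {"name": "static_general", "visibility": "all"},
    {"name": "amazon_static", "visibility": "client_specific", "clients": ["amazon"]},
]
```

Under this sketch, `profiles_for_client("general", EXAMPLE_PROFILES)` lists only `static_general`, while the Amazon tile would also see `amazon_static`.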

24
CLAUDE_HONDA.md Normal file

@ -0,0 +1,24 @@
# Honda Client Documentation
> Referenced from main CLAUDE.md. Honda has no client-specific profiles or checks at present.
## Overview
Honda is set up as a client tile in the platform but uses the **generic** `static_general` and `video_general` profiles only. No client-specific QC tools, profiles, or prompt tuning have been built for Honda.
## Profiles available
| Profile | Notes |
|---------|-------|
| `static_general` | 10-check baseline static QC profile shared with all clients |
| `video_general` | Generic video QC profile |
## Adding Honda-specific work
If Honda-specific QC needs arise (brand guidelines, dealer-template compliance, etc.), follow the established client pattern:
1. Create `honda_*` check modules under `backend/visual_qc_apps/`
2. Create a `honda_static.json` (or similar) profile in `backend/profiles/`
3. Update `client_config.py` to add the profile to the Honda client's profile list
4. Capture tuning history and known limitations in this file
See `CLAUDE_AMAZON.md` and `CLAUDE_BOOTS.md` for examples of full client builds.

24
CLAUDE_RANK.md Normal file

@ -0,0 +1,24 @@
# Rank Client Documentation
> Referenced from main CLAUDE.md. Rank has no client-specific profiles or checks at present.
## Overview
Rank is set up as a client tile in the platform but uses the **generic** `static_general` and `video_general` profiles only. No client-specific QC tools, profiles, or prompt tuning have been built for Rank.
## Profiles available
| Profile | Notes |
|---------|-------|
| `static_general` | 10-check baseline static QC profile shared with all clients |
| `video_general` | Generic video QC profile |
## Adding Rank-specific work
If Rank-specific QC needs arise, follow the established client pattern:
1. Create `rank_*` check modules under `backend/visual_qc_apps/`
2. Create a `rank_static.json` (or similar) profile in `backend/profiles/`
3. Update `client_config.py` to add the profile to the Rank client's profile list
4. Capture tuning history and known limitations in this file
See `CLAUDE_AMAZON.md` and `CLAUDE_BOOTS.md` for examples of full client builds.

56
CLAUDE_UNILEVER.md Normal file

@ -0,0 +1,56 @@
# Unilever Client Documentation
> Referenced from main CLAUDE.md. Detailed Unilever QC profile descriptions, profile-specific scoring logic, and check inventories.
## Overview
Unilever has two specialised profiles: **Key Visual** (campaign creative) and **Packaging** (label/pack design). Both share most checks with Diageo (generic CPG-style visual checks) but include a small number of **bonus checks** with profile-specific zero-scoring behaviour for missing critical elements.
## Unilever Profiles
### `unilever_key_visual` — 15 checks (120-point scale)
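Campaign key-visual QC. The table lists 16 rows because the deprecated `text_readability` entry is retained with no weight; the remaining 15 checks carry the weights.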
| Check | What it does | Weight |
|-------|--------------|--------|
| `background_contrast` | Product/text contrast against background | 0.10 |
| `brand_assets_visibility` | Brand assets visible | 0.12 |
| `call_to_action` | CTA presence and clarity | 0.03 |
| `curved_edges` | Curved-edge treatment | 0.04 |
| `face_gaze_direction` | Gaze direction guides toward product/CTA *(bonus / zero-score)* | 0.06 |
| `face_visibility` | Face presence and visibility *(bonus / zero-score)* | 0.07 |
| `imperative_verb` | Headline uses imperative verb | 0.02 |
| `logo_visibility` | Brand logo visible | 0.14 |
| `lowercase_text` | Lowercase usage rules | 0.10 |
| `new_visibility` | "NEW" tag visibility *(bonus / zero-score)* | 0.07 |
| `supporting_images` | Supporting imagery quality | 0.10 |
| `text_readability` | Text legibility (deprecated, now part of inheritance) | (n/a) |
| `visual_elements_count` | Element count not overwhelming | 0.14 |
| `visual_hierarchy` | Clear visual hierarchy | 0.10 |
| `visuals_left_text_right` | Visuals left, text right composition | 0.06 |
| `word_count` | Headline word count | 0.05 |
### `unilever_packaging` — 17 checks
Packaging design QC. Same base as Key Visual plus print-related checks (`crop_marks`, `color_format`).
## Profile-specific scoring logic — bonus / zero-score
The Unilever Key Visual profile implements **zero-score behaviour** for three checks tied to the presence of specific creative elements:
| Check | Trigger | Behaviour |
|-------|---------|-----------|
| `face_visibility` | `face_present == false` | Score forced to **0** |
| `new_visibility` | `new_present == false` | Score forced to **0** |
| `face_gaze_direction` | `face_present == false` | Score forced to **0** |
This ensures that creatives missing critical brand-mandated elements (a face, the "NEW" tag) cannot pass on the back of high scores from other checks. The zero-score logic lives in `api_server.py:extract_score_from_result()` and is gated by `profile_config.get('name') == 'Unilever Key Visual'`.
The Unilever Key Visual profile uses a **120-point scale**: the weights in the table above sum to 1.20 rather than 1.0, so the bonus checks add headroom instead of being weighted equally with the core checks.
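The zero-score rule and the weight total can be sketched as follows. The weights and trigger flags are copied from the tables above; the `apply_zero_score` helper and the `ZERO_SCORE_FLAGS` mapping are hypothetical names, and the profile-name gate that the real `extract_score_from_result()` applies is left out for brevity.

```python
from typing import Any, Dict

# Weights copied from the unilever_key_visual table (text_readability has none).
UNILEVER_KEY_VISUAL_WEIGHTS: Dict[str, float] = {
    "background_contrast": 0.10,
    "brand_assets_visibility": 0.12,
    "call_to_action": 0.03,
    "curved_edges": 0.04,
    "face_gaze_direction": 0.06,
    "face_visibility": 0.07,
    "imperative_verb": 0.02,
    "logo_visibility": 0.14,
    "lowercase_text": 0.10,
    "new_visibility": 0.07,
    "supporting_images": 0.10,
    "visual_elements_count": 0.14,
    "visual_hierarchy": 0.10,
    "visuals_left_text_right": 0.06,
    "word_count": 0.05,
}

# Check name -> metadata flag whose False value forces a zero score.
ZERO_SCORE_FLAGS: Dict[str, str] = {
    "face_visibility": "face_present",
    "new_visibility": "new_present",
    "face_gaze_direction": "face_present",
}


def apply_zero_score(check_name: str, score: float, json_data: Dict[str, Any]) -> float:
    """Force the score to 0 when the element the check depends on is absent."""
    flag = ZERO_SCORE_FLAGS.get(check_name)
    if flag is not None and json_data.get(flag) is False:
        return 0
    return score
```

If the weighted sum is scaled by 10, as the earlier backend documentation described for profiles whose weights total less than 10.0, a creative scoring 10 on every check reaches 1.20 × 10 × 10 = 120, which is where the 120-point maximum comes from.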
## Status
No formal client-driven prompt-tuning rounds in this repo's history. The profile-specific scoring logic was added as a system enhancement to handle the bonus-check pattern.
If Unilever-specific tuning is required (specific brand families, regional rules, etc.), introduce dedicated `unilever_*` checks in `visual_qc_apps/` following the Boots / Amazon pattern.

backend/CLAUDE.md

@ -1,933 +1,18 @@
# CLAUDE.md
# CLAUDE.md (backend/)
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This file used to duplicate the project-wide guidance and is now stale. Read **`../CLAUDE.md`** at the repo root for current project-wide guidance, and the relevant **`../CLAUDE_<CLIENT>.md`** when working on client-specific code.
## Project Overview
## Quick pointers for backend work
Visual AI QC is a Python Flask-based AI-powered quality control platform for analyzing marketing materials and design assets using OpenAI GPT-4o and Google Gemini 2.5 Pro. It evaluates visual and video content against brand guidelines and design best practices through **75 specialized QC checks** across **14 profiles**, serving **10 clients** (Diageo, Unilever, L'Oreal, Amazon, Boots, Dow Jones, Honda, AXA, Rank, General).
- API server entry point: `api_server.py`
- QC check modules: `visual_qc_apps/{check_name}/app.py`
- Document-mode pipeline (multi-page PDF): `document_mode/`
- Profile JSONs: `profiles/`
- Profile loading + check discovery: `profile_config.py`
- Client ↔ profile mapping: `client_config.py`
- LLM config: `llm_config.py`
- User access control: `user_access.py` + `user_access.json` (gitignored)
- Usage logs: `usage_logs/<YYYY-MM-DD>.jsonl`
- Deploy scripts: `scripts/deploy.sh`, `scripts/rollback.sh`, `scripts/health-check.sh`
## Core Architecture
### Main Components
- **`api_server.py`** - Main Flask server with async processing and parallel execution
- **`visual_qc_apps/`** - Modular QC check system with 65 individual check modules
- **`profiles/`** - JSON configuration files defining QC check combinations and weights
- **`brand_guidelines/`** - Reference asset storage and brand guideline database
- **`llm_config.py`** - Centralized LLM configuration and API interaction
- **`profile_config.py`** - Profile loading and QC check discovery system
- **`usage_tracker.py`** - Usage tracking and cost estimation system
- **`generate_usage_report.py`** - Command-line tool for generating usage reports
- **`client_config.py`** - Client-profile relationship management with visibility control
- **`pdf_processor.py`** - PDF text extraction, LLM summarization for brand guidelines
- **`media_plan_processor.py`** - Excel media plan parsing, filename matching, spec validation
- **`web_ui.html`** - Single-page web interface for uploads and analysis
### Key Design Patterns
- **Modular QC Checks**: Each check lives in `visual_qc_apps/{check_name}/app.py` with standardized interface
- **Profile-Based Configuration**: QC profiles define which checks run, their weights, and LLM assignments
- **Parallel Batch Processing**: Checks execute in parallel batches of 15 for performance
- **Async Progress Tracking**: Non-blocking analysis with real-time progress updates
- **Reference Asset Integration**: Brand guidelines enhance analysis accuracy through prompt augmentation
## Development Commands
### Running the Application
#### Development Environment (Recommended)
```bash
# Quick start with development environment
./scripts/run-local.sh
# Access web interface at http://localhost:7183
```
#### Legacy/Manual Setup
```bash
# Start the Flask server directly
python api_server.py
# Or with environment variable
export ENVIRONMENT=development
python api_server.py
```
### Environment Setup
#### New Environment System (Recommended)
The application now supports separate development and production environments:
```bash
# Install dependencies
pip install -r requirements.txt
# Configure development environment
cp config/.env.template config/development.env
# Edit config/development.env with:
# OPENAI_API_KEY, GOOGLE_API_KEY, AZURE_CLIENT_ID, etc.
# Configure production environment
cp config/.env.template config/production.env
# Edit config/production.env with production settings
```
#### Environment Structure
```
config/
├── development.env # Local development settings
├── production.env # Production server settings
└── .env.template # Template for new environments
uploads-dev/ # Development uploads (separate from production)
output-dev/ # Development output (separate from production)
scripts/
├── run-local.sh # Start local development
├── deploy-to-prod.sh # Deploy to production
└── test-system.sh # Validate system before deployment
```
#### Legacy Environment Setup
```bash
# Fallback to legacy config.env (still supported)
cp config.env.example config.env
# Edit config.env with OPENAI_API_KEY and GOOGLE_API_KEY
```
### Adding New QC Checks
1. Create directory: `visual_qc_apps/{check_name}/`
2. Create `app.py` with standardized interface using `flask_app_template.py`
3. Register in profile configurations
4. Restart server to activate
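For orientation, the directory convention in steps 1 and 2 suggests that check discovery could be as simple as scanning for `app.py` files. The sketch below is an assumption about how such a scan might work; `discover_checks` is a hypothetical name, and the actual discovery code lives in `profile_config.py`.

```python
from pathlib import Path
from typing import List


def discover_checks(apps_dir: str = "visual_qc_apps") -> List[str]:
    """Return the names of sub-directories that contain an app.py file."""
    root = Path(apps_dir)
    if not root.is_dir():
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and (entry / "app.py").is_file()
    )
```

A scan like this runs once at startup, which would be consistent with step 4: a newly added directory is only picked up after the server restarts.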
### Code Quality Checks
#### Comprehensive Testing (Recommended)
```bash
# Run full system validation
./scripts/test-system.sh
# This includes:
# - Python syntax validation
# - Core module import testing
# - Profile system validation (all 14 profiles)
# - QC module testing
# - Configuration validation
# - Brand guidelines database testing
```
#### Manual Testing
```bash
# Run syntax check on all Python files
python -m py_compile **/*.py
# Import all modules to check for runtime issues
python -c "import api_server, llm_config, profile_config"
# Test authentication modules
python -c "import jwt_validator, auth_middleware; print('Authentication modules imported successfully')"
```
### Development Workflow
#### Local Development Process
1. **Start Development Server**: `./scripts/run-local.sh`
2. **Make Changes**: Edit code, profiles, or configurations
3. **Test Locally**: Verify functionality at http://localhost:7183
4. **Run Validation**: `./scripts/test-system.sh` before deployment
5. **Deploy to Production**: `./scripts/deploy-to-prod.sh` when ready
#### Environment Detection
The application automatically detects which environment to use:
1. **`ENVIRONMENT` environment variable** (development/production)
2. **Config file existence** in `config/` folder
3. **Fallback to legacy** `config.env` if new structure not found
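The detection order above could be sketched like this. It is a minimal illustration, not the application's code: `resolve_env_file` is a hypothetical helper, and preferring `development.env` over `production.env` when `ENVIRONMENT` is unset is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_env_file(
    base_dir: str = ".", environ: Optional[Mapping[str, str]] = None
) -> Optional[Path]:
    """Pick the env file to load, following the three-step order above."""
    environ = os.environ if environ is None else environ
    base = Path(base_dir)

    # 1. Explicit ENVIRONMENT variable (development/production).
    name = environ.get("ENVIRONMENT")
    if name:
        candidate = base / "config" / f"{name}.env"
        if candidate.is_file():
            return candidate

    # 2. Any known config file present in config/ (order is an assumption).
    for fallback in ("development", "production"):
        candidate = base / "config" / f"{fallback}.env"
        if candidate.is_file():
            return candidate

    # 3. Legacy config.env next to the application.
    legacy = base / "config.env"
    return legacy if legacy.is_file() else None
```

Returning `None` instead of raising leaves it to the caller to decide whether a missing configuration is fatal.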
#### Benefits of New Setup
- ✅ **Safe Testing**: Changes don't affect production
- ✅ **Separate Data**: Dev uploads/output don't mix with production
- ✅ **Easy Deployment**: One command to push to production
- ✅ **Automated Testing**: Validation before deployment
- ✅ **Quick Rollback**: Automatic backups before deployment
## File Structure
```
├── api_server.py # Main Flask application
├── visual_qc_apps/ # QC check modules
│ ├── utils.py # Shared utilities
│ ├── flask_app_template.py # Template for new checks
│ └── {check_name}/app.py # Individual QC checks
├── profiles/ # QC profile configurations (14 total)
│ ├── general_check.json # General purpose profile (10 checks)
│ ├── static_general.json # Static general profile (10 checks)
│ ├── unilever_key_visual.json # Unilever key visual profile (15 checks)
│ ├── unilever_packaging.json # Unilever packaging profile (17 checks)
│ ├── diageo_key_visual.json # Diageo key visual profile (11 checks)
│ ├── diageo_packaging.json # Diageo packaging profile (13 checks)
│ ├── loreal_static.json # L'Oreal static profile (2 checks)
│ ├── amazon_static.json # Amazon ASD 2025 profile (6 checks)
│ └── inclusive_accessibility.json # Accessibility profile (2 checks)
├── brand_guidelines/ # Reference assets
│ └── guidelines_db.json # Asset metadata
├── config/ # Environment configurations (NEW)
│ ├── development.env # Development environment settings
│ ├── production.env # Production environment settings
│ └── .env.template # Template for new environments
├── scripts/ # Deployment and testing scripts (NEW)
│ ├── run-local.sh # Start local development server
│ ├── deploy-to-prod.sh # Deploy to production server
│ └── test-system.sh # Comprehensive system validation
├── uploads/ # Production file uploads
├── uploads-dev/ # Development file uploads (NEW)
├── output/ # Production generated reports
├── output-dev/ # Development generated reports (NEW)
├── config.env # Legacy API keys and configuration (DEPRECATED)
├── DEV_PROD_SETUP.md # Development/Production setup guide (NEW)
└── web_ui.html # Web interface
```
## Important Configuration Files
### New Environment System
- **`config/development.env`** - Development environment API keys and Flask configuration
- **`config/production.env`** - Production environment API keys and Flask configuration
- **`config/.env.template`** - Template for creating new environment configurations
- **`scripts/run-local.sh`** - Local development startup script
- **`scripts/test-system.sh`** - Comprehensive system validation script
- **`scripts/deploy-to-prod.sh`** - Production deployment script
- **`DEV_PROD_SETUP.md`** - Detailed setup and deployment guide
### Core Application Files
- **`config.env`** - Legacy API keys and Flask configuration (DEPRECATED but still supported)
- **`requirements.txt`** - Python dependencies for OpenAI, Google AI, Flask, PIL, PyMuPDF
- **`profiles/*.json`** - QC check configurations with weights and LLM assignments
## Key Integration Points
### LLM Configuration (`llm_config.py`)
- Manages OpenAI GPT-4 and Google Gemini API interactions
- Handles model switching and error handling
- Converts images to base64 for API consumption
### Profile System (`profile_config.py`)
- Dynamically discovers available QC checks
- Loads profile configurations from JSON files
- Maps checks to specific LLM models
### Parallel Processing Architecture
- Uses ThreadPoolExecutor for concurrent API calls
- Batches of 15 checks for optimal performance
- Real-time progress tracking with batch indicators
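A minimal sketch of the batching pattern described above, using the standard library's `ThreadPoolExecutor`. The function name, the `run_check` callable, and the `on_batch_done` progress hook are illustrative assumptions rather than the signatures used in `api_server.py`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Optional

BATCH_SIZE = 15  # batch size stated in the documentation above


def run_checks_in_batches(
    check_names: List[str],
    run_check: Callable[[str], Any],
    batch_size: int = BATCH_SIZE,
    on_batch_done: Optional[Callable[[int, int], None]] = None,
) -> Dict[str, Any]:
    """Run checks concurrently, one batch at a time, and collect the results."""
    batches = [
        check_names[i : i + batch_size]
        for i in range(0, len(check_names), batch_size)
    ]
    results: Dict[str, Any] = {}
    for index, batch in enumerate(batches, start=1):
        # One worker per check so the whole batch runs in parallel.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            for name, result in zip(batch, pool.map(run_check, batch)):
                results[name] = result
        if on_batch_done is not None:
            # Progress hook: (batches finished, total batches).
            on_batch_done(index, len(batches))
    return results
```

Threads suit this workload because each check mostly waits on a remote LLM API call; capping the batch size limits the number of simultaneous requests.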
## Authentication System
### MSAL/PKCE Implementation
The application implements Microsoft Authentication Library (MSAL) with Proof Key for Code Exchange (PKCE) flow for secure user authentication:
- **Frontend**: MSAL Browser Library v2.38.3+ with popup-based authentication
- **Backend**: Python JWT validation using PyJWT library
- **Session Management**: httpOnly cookies with security flags
- **Token Validation**: Real-time validation against Azure AD JWKS
### Authentication Components
#### Core Files
- **`jwt_validator.py`** - Azure AD JWT token validation with JWKS verification
- **`auth_middleware.py`** - Flask authentication middleware with httpOnly cookie management
- **Authentication endpoints** in `api_server.py` - `/auth/login`, `/auth/logout`, `/auth/status`
- **Frontend integration** in `web_ui.html` - MSAL configuration and popup authentication
#### Configuration Requirements
```bash
# Required environment variables in config.env
AZURE_TENANT_ID=e519c2e6-bc6d-4fdf-8d9c-923c2f002385
AZURE_CLIENT_ID=9079054c-9620-4757-a256-23413042f1ef
FLASK_ENV=development
SECRET_KEY=your-secret-key-here-change-in-production
```
#### Dependencies
- PyJWT>=2.8.0 for JWT token validation
- cryptography>=41.0.0 for cryptographic operations
- requests for HTTPS calls to Azure AD endpoints
### Protected Endpoints
The following API endpoints require authentication:
- `/api/start_analysis` - File analysis initiation
- `/api/analyze` - Smart analysis with triage
- `/api/process_file` - Direct file processing
- `/api/process_triaged_file` - Triaged file processing
- `/api/profiles` (POST/PUT/DELETE) - Profile management
- `/api/brand_guidelines` (POST/DELETE) - Brand guidelines management
### Authentication Flow
1. **Frontend**: User clicks "Sign In with Microsoft" → MSAL popup authentication
2. **Azure AD**: User authenticates → Authorization code with PKCE validation
3. **Token Exchange**: MSAL exchanges code for ID/access tokens
4. **Server Validation**: Python validates JWT against Azure AD JWKS
5. **Session Creation**: Valid tokens stored in httpOnly cookies
6. **API Access**: Authenticated requests include cookie for validation
### Security Features
- **httpOnly Cookies**: Prevent XSS access to authentication tokens
- **PKCE Flow**: Enhanced security for single-page applications
- **Real-time Validation**: Every request validates token against Azure AD
- **Secure Headers**: Cookies use Secure, SameSite=Lax flags
- **Server-side Validation**: No client-side security dependencies
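To show what the cookie flags listed above look like on the wire, the sketch below builds a `Set-Cookie` value with the standard library. The cookie name `session_token` and the `build_session_cookie` helper are hypothetical; the application sets its cookies through Flask in `auth_middleware.py`.

```python
from http.cookies import SimpleCookie


def build_session_cookie(token: str, max_age: int = 3600) -> str:
    """Return a Set-Cookie header value carrying the security flags."""
    cookie = SimpleCookie()
    cookie["session_token"] = token  # hypothetical cookie name
    morsel = cookie["session_token"]
    morsel["httponly"] = True    # not readable from JavaScript
    morsel["secure"] = True      # only sent over HTTPS
    morsel["samesite"] = "Lax"   # withheld on most cross-site requests
    morsel["path"] = "/"
    morsel["max-age"] = max_age
    return morsel.OutputString()
```

In a Flask handler the equivalent is usually `response.set_cookie(...)` with `httponly=True`, `secure=True`, and `samesite="Lax"`.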
## QC Profile System
### Available Profiles
The system includes 14 focused QC profiles designed for different use cases:
1. **General Check** (10 checks, 100-point scale)
- Purpose: Streamlined general-purpose QC analysis
- Checks: Essential design and technical standards
- Weighting: Even distribution (10% each)
- Requirements: No reference assets needed
- Scoring: Individual scores 1-10, final score 0-100
2. **Static General** (10 checks, 100-point scale)
- Purpose: Comprehensive digital static asset QC
- Checks: Text readability, contrast, language, hierarchy, alignment, product/logo visibility, CTA, accessibility, inclusive
- Used by: All clients as a baseline profile
3. **Unilever Key Visual** (15 checks, 120-point scale)
- Purpose: Unilever brand guidelines for key visual materials
- Special Logic: Bonus checks with zero-scoring for missing elements
- Requirements: Brand guidelines recommended
- Scoring: Weighted distribution, 120-point maximum
4. **Unilever Packaging** (17 checks)
- Purpose: Unilever packaging design standards
- Requirements: Brand guidelines recommended
5. **Diageo Key Visual** (11 checks)
- Purpose: Diageo brand guidelines for key visuals
- Requirements: Brand guidelines recommended
6. **Diageo Packaging** (13 checks)
- Purpose: Diageo packaging design standards
- Requirements: Brand guidelines recommended
7. **L'Oreal Static** (3 checks, 100-point scale)
- Purpose: Focused L'Oreal QC for digital static marketing materials
- Checks: language_consistency, text_readability, background_contrast
- Scoring: Equal weight distribution (3.33 each), any individual check <6 = overall Fail
- Note: text_readability scores 7/10 neutral for product-only shots (no marketing text)
- Note: background_contrast focuses on actual visibility, not theoretical colour similarity
8. **Amazon Static** (6 checks, 100-point scale)
- Purpose: Amazon ASD 2025 design guidelines compliance
- Checks: Required elements, logo/country compliance, typography, headline layout, margins, box placement
- Requirements: Guidelines embedded in check prompts from ASD 2025 PDF
- Scoring: Weighted distribution (element/logo checks weighted higher)
9. **Inclusive Accessibility** (2 checks)
- Purpose: Focused accessibility compliance
- Checks: Accessibility and inclusive design
- Requirements: No reference assets needed
### Client Configuration
| Client | Display Name | Profiles |
|--------|-------------|----------|
| diageo | Diageo | diageo_key_visual, diageo_packaging, static_general, video_general |
| unilever | Unilever | unilever_key_visual, unilever_packaging, static_general, video_general |
| loreal | L'Oreal | loreal_static, static_general, video_general |
| amazon | Amazon | amazon_static, static_general, video_general |
| boots | Boots | boots_static, static_general, video_general |
| dow_jones | Dow Jones | dow_jones_static, marketwatch_static, wsj_static, static_general, video_general |
| honda | Honda | static_general, video_general |
| axa | AXA | static_general, video_general |
| rank | Rank | static_general, video_general |
| general | General / Other | static_general, video_general, inclusive_accessibility |
### Profile Selection Guidelines
- **General content analysis**: Use Static General or General Check
- **Brand-specific analysis**: Use appropriate brand profile
- **Amazon ASD 2025 compliance**: Use Amazon Static
- **Dow Jones corporate**: Use Dow Jones Static
- **MarketWatch assets**: Use MarketWatch Static
- **WSJ assets**: Use WSJ Static
- **Accessibility focus**: Use Inclusive Accessibility
- **Mixed requirements**: Profiles can be combined in multi-profile analysis
## Recent System Enhancements
### Unilever Profile-Specific Scoring Logic
The **Unilever Key Visual** profile now implements specialized scoring logic for enhanced quality control:
#### Zero-Score Implementation
- **Face Visibility Check**: Automatically sets score to 0 when `face_present` = false in JSON response
- **New Visibility Check**: Automatically sets score to 0 when `new_present` = false in JSON response
- **Face Gaze Direction Check**: Automatically sets score to 0 when `face_present` = false in JSON response
#### Implementation Details (`api_server.py:extract_score_from_result()`)
```python
# Unilever Key Visual profile specific logic
if (profile_config and profile_config.get('name') == 'Unilever Key Visual' and
        check_name in ['face_visibility', 'new_visibility', 'face_gaze_direction']):
    # Check for zero score conditions based on missing elements
    if check_name == 'face_visibility' and json_data.get('face_present') == False:
        return 0
    elif check_name == 'new_visibility' and json_data.get('new_present') == False:
        return 0
    elif check_name == 'face_gaze_direction' and json_data.get('face_present') == False:
        return 0
```
This ensures that missing critical elements (faces, "new" text) result in zero scores, providing more stringent quality control for Unilever key visual assets.
### Scoring System Enhancements
The scoring calculation system has been improved to handle different profile weight structures correctly:
#### Multi-Scale Scoring Support
- **100-Point Scale**: General Check profile with total weight 10.0 uses direct weighted scores
- **Other Scales**: Profiles with lower total weights use scaled scoring (weighted_score × 10)
- **Brand-Specific Scales**: Unilever Key Visual uses 120-point maximum scale
#### Fixed Calculation Logic (`api_server.py`)
```python
# Smart scoring calculation based on profile weight structure
if total_weight >= 10.0:
    overall_score = total_weighted_score  # Direct score for high-weight profiles
else:
    overall_score = total_weighted_score * 10  # Scale up for traditional profiles
```
#### JSON Response Merging
Enhanced JSON extraction to merge multiple JSON blocks from LLM responses:
- Combines metadata (face_present, new_present) with scoring data
- Enables proper bonus check logic for Unilever profiles
- Maintains backward compatibility with single JSON responses
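
One way to merge several JSON blocks from a single LLM response is to scan the text for objects and fold them into one dictionary. The sketch below is an illustration only: `merge_json_blocks()` is a hypothetical name, and the "later keys win" policy is an assumption rather than documented behaviour.

```python
import json


def merge_json_blocks(text: str) -> dict:
    """Merge every top-level JSON object found in text; later keys win."""
    decoder = json.JSONDecoder()
    merged: dict = {}
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns the index just past it.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here, keep scanning
            continue
        if isinstance(obj, dict):
            merged.update(obj)
        pos = end
    return merged
```

A response with a single JSON object passes through unchanged, which matches the backward-compatibility point above.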
### Enhanced Saved Files Management
Saved output files are now sorted by the backend and refreshed automatically in the UI after each analysis:
#### Automatic Date Sorting (`api_server.py:list_output_files()`)
- Files now automatically sorted by creation date (newest first)
- Backend sorts using file timestamps before sending to frontend
- No more manual sorting needed in the UI
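
A newest-first listing can be sketched as below. This is not the body of `list_output_files()`; the function name is hypothetical, and using the modification time (`st_mtime`) as the "file timestamp" is an assumption.

```python
import os


def list_files_newest_first(directory: str) -> list[str]:
    """Return file names in `directory`, most recently modified first."""
    with os.scandir(directory) as entries:
        files = [entry for entry in entries if entry.is_file()]
    # Assumption: modification time stands in for the creation date.
    files.sort(key=lambda entry: entry.stat().st_mtime, reverse=True)
    return [entry.name for entry in files]
```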
#### Smart Refresh System (`web_ui.html`)
- **Progressive Retry Mechanism**: Attempts refresh at 1s, 3s, and 5s intervals after analysis
- **File Count Detection**: Compares before/after file counts to detect new files
- **Early Success Exit**: Stops retrying immediately when new files are detected
- **Visual Loading Indicators**: Shows "🔄 Checking for new files..." during refresh
- **New File Highlighting**: Latest files highlighted with green background and "NEW" badge
- **Auto-cleanup**: Visual highlights fade after 5 seconds
#### Implementation Features
```javascript
// Enhanced refresh with progressive delays
const refreshAttempts = [1000, 3000, 5000]; // 1s, 3s, 5s delays
// Visual feedback for new files
displaySavedFiles(data.files, shouldHighlight);
// Smart detection logic
if (newFileCount > previousFileCount) {
    console.log('New file(s) detected, refresh complete');
    break;
}
```
### MSAL Authentication System Improvements
Enhanced the Microsoft Authentication Library implementation for better reliability:
#### Robust Error Handling (`web_ui.html`)
- **MSAL Initialization Check**: Validates MSAL library loaded before initialization
- **Authentication State Tracking**: `msalInitialized` flag prevents undefined access
- **Fallback CDN Support**: Secondary CDN source if primary fails to load
- **User-Friendly Error Messages**: Clear error messages when authentication unavailable
#### Enhanced Security
```javascript
// Safe authentication with validation
if (!msalInitialized || !myMSALObj) {
    console.error('MSAL not initialized properly');
    alert('Authentication system not available. Please check your connection.');
    return;
}
```
#### MSAL Concurrent Sign-In Protection
Fixes the `interaction_in_progress` error by preventing concurrent sign-in attempts:
- **Sign-In Flag**: `isSigningIn` flag prevents multiple simultaneous authentication attempts
- **Storage Cleanup**: Clears MSAL localStorage/sessionStorage before authentication to remove stuck state
- **Proper Reset**: Uses finally block to reset flag on both success and failure
```javascript
let isSigningIn = false; // Prevent concurrent sign-in attempts
async function signIn() {
    if (isSigningIn) {
        console.log('Sign-in already in progress, ignoring duplicate request');
        return;
    }
    try {
        isSigningIn = true;
        // Clear any pending MSAL interactions
        localStorage.removeItem('msal.interaction.status');
        sessionStorage.removeItem('msal.interaction.status');
        // ... authentication logic
    } finally {
        isSigningIn = false;
    }
}
```
### Usage Tracking and Reporting System (NEW)
The system now includes comprehensive usage tracking and report generation capabilities:
#### Usage Tracking Features
- **Automatic Logging**: All analyses automatically logged with detailed metadata
- **Cost Estimation**: Real-time cost estimates based on LLM usage (OpenAI & Gemini)
- **User Activity**: Track which users perform analyses and their usage patterns
- **Client Breakdown**: Usage statistics per client (e.g. diageo, unilever, loreal, general)
- **Profile Usage**: Track which profiles are most frequently used
- **Daily Logs**: Usage data stored in daily JSONL files for easy processing
#### Usage Report Generator (`backend/generate_usage_report.py`)
Command-line tool to generate comprehensive usage reports:
```bash
# Generate report for last 7 days
python backend/generate_usage_report.py --last-days 7
# Generate monthly report
python backend/generate_usage_report.py --last-days 30 --output monthly_report.txt
# Filter by specific client
python backend/generate_usage_report.py --client diageo --last-days 30
# Generate CSV for Excel
python backend/generate_usage_report.py --last-days 30 --format csv --output report.csv
# Generate JSON for API integration
python backend/generate_usage_report.py --last-days 30 --format json --output report.json
```
**Report Sections**:
- Summary: Total analyses, checks, estimated costs, averages
- By Client: Usage breakdown per client with top profiles
- By User: Individual user statistics and activity
- By Profile: Profile usage across clients
- By Date: Daily breakdown of activity and costs
**Output Formats**: Text (human-readable), JSON (machine-readable), CSV (spreadsheet)
**Documentation**: See `backend/USAGE_REPORTS.md` for detailed usage guide
#### Usage Log Storage
- **Location**: `backend/usage_logs/`
- **Format**: JSONL (JSON Lines) - one log entry per line
- **Naming**: `YYYY-MM-DD.jsonl` (daily files)
- **Retention**: Logs kept indefinitely (consider archiving after 1 year)
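
The storage layout above (one `YYYY-MM-DD.jsonl` file per day, one JSON object per line) can be read and written with the standard library alone. A minimal sketch; the function names and the entry fields (`timestamp`, `client`, `profile`) are assumptions, not the schema used by `usage_tracker.py`:

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def append_usage_entry(log_dir: Path, entry: dict, now: Optional[datetime] = None) -> Path:
    """Append one entry to the daily JSONL file and return that file's path."""
    now = now or datetime.now(timezone.utc)
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"{now:%Y-%m-%d}.jsonl"
    record = {"timestamp": now.isoformat(), **entry}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return path


def read_usage_entries(path: Path) -> list[dict]:
    """Load every non-empty line of a JSONL file as a dict."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Appending whole lines keeps each write independent, so a report generator can process a day's file without parsing it as one large document.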
### Profile Auto-Versioning System (NEW)
The system now implements automatic version control when profiles are edited:
#### How It Works
1. **Original Profile**: `my_profile.json` (version 1)
2. **First Edit**: Creates `my_profile_v2.json` (version 2), keeps original unchanged
3. **Second Edit**: Creates `my_profile_v3.json` (version 3), keeps v1 and v2 unchanged
4. **Client Configs**: Automatically updated to use latest version
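
The naming scheme in the steps above can be sketched as a small helper that derives the next profile ID from the current one. `next_version_id()` is a hypothetical name; the real implementation may track versions through the `version` metadata field instead of parsing the ID.

```python
import re

# Matches IDs that already carry a version suffix, e.g. "my_profile_v2".
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+)_v(?P<n>\d+)$")


def next_version_id(profile_id: str) -> str:
    """Return the ID the next edit would create for `profile_id`."""
    match = _VERSION_SUFFIX.match(profile_id)
    if match:
        return f"{match['base']}_v{int(match['n']) + 1}"
    # An unsuffixed ID is version 1, so the first edit produces _v2.
    return f"{profile_id}_v2"


print(next_version_id("my_profile"))     # my_profile_v2
print(next_version_id("my_profile_v2"))  # my_profile_v3
```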
#### Benefits
- ✅ **Safety**: Original profiles never overwritten
- ✅ **History**: Complete version history preserved
- ✅ **Rollback**: Easy to revert to previous versions
- ✅ **Audit Trail**: Track who made changes and when
- ✅ **Testing**: Test new versions without affecting production
#### Version Metadata
Each profile version includes:
- `version`: Version number (1, 2, 3, ...)
- `created_at`: ISO timestamp of creation
- `created_by`: Email of user who created profile
- `modified_at`: ISO timestamp of last modification (if edited)
- `modified_by`: Email of user who edited profile
- `previous_version`: Profile ID of previous version (if edited)
#### API Behavior
- **POST /api/profiles**: Creates new profile with version 1
- **PUT /api/profiles/<id>**: Creates new version automatically
- **DELETE /api/profiles/<id>**: Deletes specific version only
- **GET /api/profiles**: Returns all versions (filtered by client visibility)
**Documentation**: See `backend/PROFILE_MANAGEMENT.md` for detailed usage guide
### Profile Visibility Control System (NEW)
Profiles can now be configured with granular visibility settings:
#### Visibility Options
**1. All Clients (Default)**
```json
{
  "visibility": "all",
  "visible_to_clients": []
}
```
Profile visible to all clients in the system (see **Available Client IDs** below).
**2. Client-Specific**
```json
{
  "visibility": "client_specific",
  "visible_to_clients": ["diageo", "unilever"]
}
```
Profile visible only to specified clients.
#### Use Cases
- **All Clients**: General-purpose profiles, standard QC checks, accessibility compliance
- **Client-Specific**: Brand-specific profiles, custom checks, confidential QC criteria
#### Implementation
- **Profile Creation**: Set visibility during creation via API or Web UI
- **Client Filtering**: Users only see profiles available to their selected client
- **Dynamic Loading**: `client_config.py` automatically updated based on visibility
- **Backward Compatible**: Existing profiles default to "all" visibility
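
The client filtering described above can be sketched as a single pass over the profile records. `visible_profiles()` is a hypothetical helper, and the `id` field on each record is an assumption; the `visibility` and `visible_to_clients` keys come from the JSON examples above.

```python
def visible_profiles(profiles: list[dict], client_id: str) -> list[dict]:
    """Return the profiles that `client_id` is allowed to see."""
    result = []
    for profile in profiles:
        # Profiles created before the visibility field existed default to "all".
        visibility = profile.get("visibility", "all")
        if visibility == "all":
            result.append(profile)
        elif visibility == "client_specific" and client_id in profile.get("visible_to_clients", []):
            result.append(profile)
    return result
```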
#### Web UI Integration
- **Create Profile**: Checkbox for "Reveal to All Clients"
  - Checked: Visible to all clients
  - Unchecked: Show client selector for specific clients
- **Profile List**: Shows visibility status with icons
  - 🌍 All Clients
  - 🔒 Specific Clients (with client list)
**Available Client IDs**: `diageo`, `unilever`, `loreal`, `amazon`, `boots`, `general`
**Documentation**: See `backend/PROFILE_MANAGEMENT.md` for detailed configuration guide
### Amazon ASD 2025 QC Tools
Six specialized checks for Amazon Sale Day design compliance, with guidelines from the ASD 2025 PDF embedded directly in each tool's prompt:
| Tool | What it checks |
|------|---------------|
| `amazon_required_elements` | All required elements present (Headline, Box, Subhead, Date, Legal line) |
| `amazon_logo_country` | Correct Amazon/URL logo per country (established vs emerging locales) |
| `amazon_typography` | Ember Modern Standard Display font, leading/tracking, size ratios (subhead 30-60%, date 20-45%), ligatures |
| `amazon_headline_layout` | Headline left-aligned, largest element, natural line splits |
| `amazon_margins` | 7% shortest side (10% wide, 20%/10% very wide+small formats) |
| `amazon_element_placement` | Element placement (box, bag, logo), positioning rules, cropping rules (tape NEVER cropped) |
### Client-Scoped Reporting Dashboard
Reporting has been moved from the Settings modal into a dedicated "Reporting" tab within each client's main view:
- **Date range filtering**: Start/end date pickers for custom report periods
- **Summary cards**: Total Analyses, Unique Users, Total Checks Run, Estimated Cost
- **Detail table**: Per-analysis breakdown with date, user, profile, checks, score, cost
- **Client isolation**: Reports only show data for the currently selected client
- **API endpoint**: `GET /api/client_usage_stats?client={id}&start_date={}&end_date={}`
### Admin Panel
View-only administration panel for platform user management:
- **Access**: Dedicated "Admin" button in header, visible only to admin users
- **Full page**: Separate section (not a popup), with "Back to App" navigation
- **Summary stats**: Total Users, Total Platform Analyses, Total Estimated Cost
- **User table**: Name, Email, Analyses, Total Checks, Clients Used, Last Active, Est. Cost
- **Admin config**: `ADMIN_USERS` list in `backend/client_config.py`
- **API endpoints**: `GET /api/admin/check`, `GET /api/admin/users`
### User Login Tracking
All authenticated user visits are now logged:
- **Event type**: `user_login` logged on every `/auth/status` check
- **Data captured**: user_id, user_email, user_name, timestamp
- **Storage**: Same JSONL usage logs in `backend/usage_logs/`
- **Purpose**: Enables admin panel to show all users who have visited, not just those who ran analyses
### PDF Reference Asset Processing
Multi-page PDF brand guidelines are now fully processed on upload:
- **Text extraction**: All pages extracted using PyMuPDF (`pdf_processor.py`)
- **LLM summarization**: Extracted text sent to Gemini 2.5 Pro for structured brand guidelines summary (2000-4000 words covering colors, typography, layout, do's/don'ts, QC specs)
- **Cover image**: Page 1 extracted as PNG for visual reference in QC checks
- **Storage**: `{file_id}_summary.txt` and `{file_id}_cover.png` in `brand_guidelines/files/`
- **QC integration**: Summary text included in check prompts, cover image sent as visual reference
- **Fallback chain**: LLM summary → raw text (8000 chars) → inline extraction → metadata only
- **Auto-backfill**: Existing unprocessed PDFs processed on server startup
- **API endpoints**: `GET /api/brand_guidelines/<id>/status`, `POST /api/brand_guidelines/<id>/reprocess`
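
The fallback chain above can be read as "use the first tier that yields content". A minimal sketch under that reading; `guideline_context()` and the `extract_inline` callback are hypothetical, and only the 8000-character cap is taken from the list above.

```python
from typing import Callable, Optional

RAW_TEXT_LIMIT = 8000  # character cap on the raw-text fallback


def guideline_context(
    summary: Optional[str],
    raw_text: Optional[str],
    extract_inline: Callable[[], Optional[str]],
    metadata: str,
) -> tuple[str, str]:
    """Return (source, text) from the first fallback tier that has content."""
    if summary:
        return "summary", summary
    if raw_text:
        return "raw_text", raw_text[:RAW_TEXT_LIMIT]
    inline = extract_inline()  # hypothetical on-demand extraction hook
    if inline:
        return "inline", inline
    return "metadata", metadata
```

Returning the source label alongside the text makes it easy to log which tier a given check prompt actually used.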
### Media Plan System
Excel media plans can be uploaded per client for automatic asset validation:
- **Upload**: Settings → Media Plan tab, accepts .xlsx/.xls files
- **Parsing**: Extracts asset specs from all channel sheets (Display, OLV, OOH, TV, Print, Audio) using openpyxl
- **Filename matching**: Automatic fuzzy matching (exact → case-insensitive → starts-with → contains → fuzzy >70%)
- **Validation**: Checks uploaded asset dimensions and file type against media plan spec
- **QC context**: Matched asset metadata (country, language, placement, vendor, dimensions) injected into all check prompts
- **Storage**: `backend/media_plans/` directory with parsed JSON cache
- **API endpoints**: `POST /api/media_plan`, `GET /api/media_plan?client={id}`, `DELETE /api/media_plan/<client_id>`
- **Module**: `media_plan_processor.py` - `parse_media_plan()`, `find_matching_asset()`, `validate_asset_specs()`, `build_media_plan_context()`
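
The tiered filename matching above (exact, case-insensitive, starts-with, contains, then fuzzy above 70%) can be sketched with `difflib` from the standard library. This is not the body of `find_matching_asset()`: the function name is hypothetical, the direction of the starts-with and contains tests (uploaded filename against plan asset name) is an assumption, and `SequenceMatcher.ratio()` stands in for whatever similarity measure the module uses.

```python
from difflib import SequenceMatcher
from typing import Optional


def match_asset_name(filename: str, asset_names: list[str], threshold: float = 0.70) -> Optional[str]:
    """Return the first asset name matched by the strictest applicable tier."""
    lowered = filename.lower()
    tiers = (
        lambda name: name == filename,                 # exact
        lambda name: name.lower() == lowered,          # case-insensitive
        lambda name: lowered.startswith(name.lower()), # starts-with
        lambda name: name.lower() in lowered,          # contains
    )
    for tier in tiers:
        for name in asset_names:
            if tier(name):
                return name

    # Fuzzy tier: take the most similar name if it clears the threshold.
    def similarity(name: str) -> float:
        return SequenceMatcher(None, lowered, name.lower()).ratio()

    best = max(asset_names, key=similarity, default=None)
    if best is not None and similarity(best) > threshold:
        return best
    return None
```

Running the strict tiers first means a fuzzy match is only accepted when no closer match exists.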
### User Access Control System
Default-deny per-user client access, with admin grant/revoke via the admin panel's User Access tab. Enforced server-side on every client-scoped endpoint.
**Storage:** `backend/user_access.json` — auto-bootstrapped on first server start with `nick.viljoen@brandtech.plus` as the sole admin. Never commit this file (it's in `.gitignore`).
```json
{
  "version": 1,
  "default_clients": ["general"],
  "admins": ["nick.viljoen@brandtech.plus"],
  "users": {
    "alice@example.com": {
      "clients": ["general", "diageo"],
      "updated_at": "2026-04-22T14:30:00Z",
      "updated_by": "nick.viljoen@brandtech.plus"
    }
  }
}
```
**Module:** `backend/user_access.py`
- `get_user_clients(email)` — returns granted clients (admins see all)
- `set_user_clients(email, clients, actor_email)` — grant/revoke; validates against client_config
- `is_admin(email)` — used everywhere; `client_config.is_admin` now delegates here
- `promote_admin(email, actor)` / `demote_admin(email, actor)` — demote blocked if last admin
- `list_access_entries()` — for the admin panel
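
To illustrate how the JSON structure above can drive access decisions, here is a minimal sketch. `resolve_clients()` and `can_access()` are hypothetical names, not the functions in `user_access.py`, and the rule that unknown users receive `default_clients` is an assumption based on the field name.

```python
def resolve_clients(access: dict, email: str, all_clients: list[str]) -> list[str]:
    """Return the client IDs `email` may use, given the access file contents."""
    if email in access.get("admins", []):
        return list(all_clients)  # admins see every client
    entry = access.get("users", {}).get(email)
    if entry is None:
        # Assumption: users without an entry get only the default clients.
        return list(access.get("default_clients", []))
    # Drop any granted client that no longer exists in the catalogue.
    return [c for c in entry.get("clients", []) if c in all_clients]


def can_access(access: dict, email: str, client_id: str, all_clients: list[str]) -> bool:
    """Default-deny check for a single client."""
    return client_id in resolve_clients(access, email, all_clients)
```

A server-side guard would call something like `can_access()` before each client-scoped handler and return the 403 response described below on `False`.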
**Enforcement points in `api_server.py`:**
- `GET /api/clients` — returns only clients the user can see (admins see all)
- `_require_admin()` helper — gates the 4 `/api/admin/user_access*` endpoints
- `_require_client_access(client_id)` helper — applied to `start_analysis`, `output_files`, `media_plan` (GET/POST/DELETE), `client_usage_stats`, `/output/<client>/<filename>`. Returns 403 with `"code": "client_access_denied"` on denial.
**Audit trail:** `log_access_change(audit_entry)` in `usage_tracker.py` writes `event: "access_change"` records into the daily JSONL usage logs. Captures actor, target, action (grant/revoke/promote_admin/demote_admin), and clients_before/after.
**Frontend (`web_ui.html`):** Admin panel has two tabs — Usage Overview and User Access. Access tab: searchable user table, inline editor with per-client checkboxes, admin toggle, + Add User (pre-grants access before someone has signed in). `handleClientAccessDenied()` helper bounces revoked users back to the client picker with a red toast.
### Self-service Client Access Requests
A "Request Client Access" tile on the client picker lets signed-in users ask admins for additional client access without going through Slack/email side-channels.
- **Tile:** appended after the user's existing client tiles in `populateClientSelector()` (web_ui.html). Always visible — if the user already has every client, the modal short-circuits with a friendly "you already have everything" alert.
- **Modal:** auto-fills name + email from `currentUser` (read-only — identity always taken from the verified MSAL session, never the body), checkbox list of clients the user does **not** already have, optional reason textarea.
- **Endpoints (`api_server.py`):**
  - `GET /api/all_clients`: auth-required, returns the full client catalogue so the form can offer clients the user can't currently see.
  - `POST /api/access_request`: auth-required. Validates requested client IDs, looks up admin recipients via `user_access.list_access_entries()`, sends a plaintext + HTML email through `email_service.send_email()` with `Reply-To` set to the requester. Logs an `access_request` event to the daily JSONL usage log via `usage_tracker.log_access_request()`. Returns 502 if email delivery fails (request still logged with `email_sent: false`).
- **Email transport (`backend/email_service.py`):** thin SMTP wrapper using STARTTLS. Reads `SMTP_SERVER`, `SMTP_PORT`, `SMTP_USER`, `SMTP_PASSWORD`, `SENDER_EMAIL` from env. Currently wired to Mailgun via the `twist@mail.dev.oliver.solutions` SMTP user.
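
A plaintext + HTML message with `Reply-To` set to the requester, sent over SMTP with STARTTLS, can be sketched with the standard library as below. This is an illustration, not the contents of `email_service.py`: both function names, the subject line, and the body wording are assumptions. Only the environment variable names come from the description above.

```python
import html
import os
import smtplib
from email.message import EmailMessage


def build_access_request_email(sender: str, recipients: list[str], requester: str,
                               clients: list[str], reason: str = "") -> EmailMessage:
    """Build a multipart (plain + HTML) access-request message."""
    msg = EmailMessage()
    msg["Subject"] = f"Client access request: {', '.join(clients)}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Reply-To"] = requester  # replies go straight to the requester
    body = f"{requester} requests access to: {', '.join(clients)}. Reason: {reason or 'not given'}"
    msg.set_content(body)
    msg.add_alternative(f"<p>{html.escape(body)}</p>", subtype="html")
    return msg


def send_via_smtp(msg: EmailMessage) -> None:
    """Send `msg` using the SMTP settings from the environment."""
    with smtplib.SMTP(os.environ["SMTP_SERVER"], int(os.environ["SMTP_PORT"])) as smtp:
        smtp.starttls()  # upgrade the connection before authenticating
        smtp.login(os.environ["SMTP_USER"], os.environ["SMTP_PASSWORD"])
        smtp.send_message(msg)
```

Keeping message construction separate from sending lets the message be unit-tested without a live SMTP server.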
### Settings Modal UX (Apr 2026)
- **Reference Assets tab:** the Brand Name + Tags + Description form was collapsed to a single "Name" field. The user-entered name is what now drives the dropdown label on the main configuration page (falls back to `original_filename` for legacy records that pre-date the change).
- **Media Plan tab:** added a "Name" field. The backend stores `display_name` on the media plan record; both the active-plan card and the main-page dropdown prefer `display_name` and fall back to `original_filename` for old plans.
- **Modal footer is context-aware:** "Save Profile" + "Cancel" show only on the Profile / Create Profile tabs. Reference Assets / QC Tools / Media Plan tabs show a single green "Save" button that simply closes the modal — the upload buttons within those tabs are the actual save action.
## Deployment Environments
| Env | URL | Branch tracked | Server | Service | Status |
|---|---|---|---|---|---|
| Local | `http://localhost:7183` | any | your laptop | none (Flask dev) | — |
| Dev | `https://optical-dev.oliver.solutions/ai_qc/` | `develop` | `optical-production-dev` (GCP VM, europe-west2-b) | `ai-qc.service` | **Live** |
| Prod | `https://optical-prod.oliver.solutions/ai_qc/` | tags on `main` | `optical-production` (GCP VM, europe-west2-c) | `ai-qc.service` | **Live** (currently `v1.1.0`) |
| Legacy sandbox | older URL | `main` (direct) | older VM, runs as `www-data` | `ai_qc.service` | Still alive as fallback |
Both new-style envs (dev + prod):
- App lives at `/opt/ai_qc`, runs as `nick.viljoen`
- systemd unit `ai-qc.service` running Waitress on `127.0.0.1:7183`
- Apache reverse-proxy include at `/opt/ai_qc/deploy/apache-ai-qc.conf`, pulled into the main `optical-dev.oliver.solutions.conf` vhost
- TLS terminated at the GCP load balancer (no certbot on the box)
- Each server has its own SSH key for Bitbucket pulls (kept in `~/.ssh/bitbucket_ai_qc`, host alias `bitbucket-ai-qc`)
## Branch Strategy
- **`develop`** = what's deployed to the dev server. Push to `develop` → run `deploy.sh dev` on optical-dev.
- **`main`** = what's deployed to prod. Never push directly; merge `develop → main` via PR, then tag (`v1.0.0`). Deploy the tag with `deploy.sh prod v1.0.0`.
- **Feature branches** (`feature/<name>`) branch from `main`, PR into `develop`. Keep merged feature branches around as history or delete once main catches up.
## Deploy Scripts
All in `backend/scripts/`, run on the target server:
| Script | Usage | What it does |
|---|---|---|
| `deploy.sh dev` | `backend/scripts/deploy.sh dev [--dry-run]` | Fetch, show diff, confirm, `git reset --hard origin/develop`, pip install if `requirements.txt` changed, `sudo systemctl restart ai-qc.service`, smoke test via `/health`, auto-rollback on failure |
| `deploy.sh prod <tag>` | `backend/scripts/deploy.sh prod v1.2.0 [--dry-run]` | Same flow but checks out a specific tag |
| `rollback.sh` | `backend/scripts/rollback.sh last` or `... <commit-hash>` | Revert to the checkpoint written by the most recent deploy, or to any specific commit |
| `health-check.sh` | `backend/scripts/health-check.sh` | One-line "is the app alive?" — `curl /health`, exits 0/1 |
The deploy script writes the pre-deploy HEAD to `.last_deploy_rollback` in the app dir before changing anything, so `rollback.sh last` always knows where to go back to.
## Production Deployment
### Critical Production Issues and Solutions
#### Issue 1: Web UI 404 Error ("Web UI not found")
**Symptom**: Backend API runs successfully, but accessing the root URL returns `{"error":"Web UI not found"}` with 404 status.
**Root Cause**: The `serve_web_ui()` function in both `api_server.py` and `backend/api_server.py` used relative path `'web_ui.html'` which only works when Flask starts from the project root directory. Production servers (Waitress, systemd) often run from different working directories.
**Solution**: Use absolute paths relative to the script location:
```python
@app.route('/', methods=['GET'])
def serve_web_ui():
    """Serve the web UI"""
    try:
        # Root api_server.py - web_ui.html is in same directory
        base_dir = os.path.dirname(os.path.abspath(__file__))
        web_ui_path = os.path.join(base_dir, 'web_ui.html')
        # Backend api_server.py - web_ui.html is in parent directory
        # base_dir = os.path.dirname(os.path.abspath(__file__))
        # web_ui_path = os.path.join(os.path.dirname(base_dir), 'web_ui.html')
        with open(web_ui_path, 'r') as f:
            html_content = f.read()
        return Response(html_content, mimetype='text/html')
    except FileNotFoundError:
        return jsonify({'error': 'Web UI not found'}), 404
```
**Files Fixed**:
- `/api_server.py` (lines 1306-1310)
- `/backend/api_server.py` (lines 1306-1310)
#### Issue 2: Apache ProxyPass Not Working (Auth Endpoints 404)
**Symptom**: Backend accessible via localhost, but web URL returns 404 for `/auth/*` and other API endpoints. ProxyPass rules appear correct in Apache config.
**Root Cause**: Apache checks for static files/directories BEFORE applying ProxyPass rules. If a directory like `/var/www/html/ai_qc/` exists, Apache tries to serve files from that directory first and never triggers the ProxyPass rule.
**Solution**: Remove or rename static directory that matches ProxyPass path:
```bash
# Check for conflicting static directory
ls -la /var/www/html/ai_qc/
# Rename as backup (safer than deleting)
sudo mv /var/www/html/ai_qc /var/www/html/ai_qc.backup.$(date +%Y%m%d_%H%M%S)
# Test that ProxyPass now works
curl -I https://your-domain.com/ai_qc/auth/status
```
**Apache ProxyPass Order**: Place more specific paths before general paths:
```apache
# In /etc/apache2/apache2.conf or site config
# More specific paths first
ProxyPass /ai_qc/auth http://localhost:7183/auth
ProxyPassReverse /ai_qc/auth http://localhost:7183/auth
# General path last
ProxyPass /ai_qc http://localhost:7183
ProxyPassReverse /ai_qc http://localhost:7183
```
**Key Lesson**: When using Apache ProxyPass, do NOT create a static directory with the same name as the proxy path. The backend serves everything through the proxy.
#### Issue 3: MSAL Authentication "interaction_in_progress" Error
**Symptom**: Clicking "Sign In with Microsoft" throws `BrowserAuthError: interaction_in_progress` and authentication fails.
**Root Cause**:
1. Multiple sign-in buttons (header and auth-required screen) could trigger concurrent authentication
2. Previous failed authentication left MSAL state in localStorage/sessionStorage
3. No protection against double-clicks on sign-in button
**Solution**: Implement concurrent sign-in protection (see MSAL section above)
**Testing After Fix**: Clear browser cache or use incognito window to test, as old JavaScript and MSAL state may be cached.
### Production Deployment Checklist
1. **Code Deployment**
   ```bash
   cd /opt/ai_qc
   git pull origin main
   sudo systemctl restart ai_qc.service
   ```
2. **Verify No Static Directory Conflicts**
   ```bash
   # Check for conflicting directories
   ls -la /var/www/html/ | grep ai_qc
   # Should NOT exist if using ProxyPass
   ```
3. **Test Backend Directly**
   ```bash
   curl -I http://localhost:7183/
   curl -I http://localhost:7183/auth/status
   curl -I http://localhost:7183/health
   ```
4. **Test Through Apache Proxy**
   ```bash
   curl -I https://your-domain.com/ai_qc/
   curl -I https://your-domain.com/ai_qc/auth/status
   ```
5. **Test in Browser**
   - Open in incognito/private window (avoids cache issues)
   - Verify web UI loads
   - Test Microsoft authentication
   - Upload and analyze a test file
6. **Monitor Logs**
   ```bash
   # Flask application logs
   sudo journalctl -u ai_qc.service -f
   # Apache logs
   sudo tail -f /var/log/apache2/ai_qc_ssl_error.log
   sudo tail -f /var/log/apache2/ai_qc_ssl_access.log
   ```
### Common Production Issues
| Issue | Check | Solution |
|-------|-------|----------|
| 404 on web UI | `curl localhost:7183/` | Use absolute paths in serve_web_ui() |
| 404 on /auth/* | Check `/var/www/html/ai_qc/` | Remove static directory conflicting with ProxyPass |
| MSAL errors | Browser console | Clear browser cache, check concurrent sign-in protection |
| Backend not starting | `systemctl status ai_qc` | Check Python environment, dependencies, port conflicts |
| Permission errors | File ownership | Ensure www-data owns necessary directories |
| Permission denied on new dirs | `git pull` resets ownership | `sudo chown -R www-data:www-data uploads output media_plans brand_guidelines usage_logs` |
## Pre-Session Completion Checklist
Before ending any session, ALWAYS run these Python syntax and import checks:
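1. **Syntax Check**: Run `python -m py_compile **/*.py` to verify all Python files compile without syntax errors (in bash, `**` only recurses after `shopt -s globstar`; `python -m compileall -q .` is a shell-independent alternative)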
2. **Import Check**: Run `python -c "import api_server, llm_config, profile_config"` to verify core modules import successfully
3. **Authentication Check**: Run `python -c "import jwt_validator, auth_middleware; print('Authentication modules imported successfully')"` to verify authentication system
4. **QC Module Check**: Test import of any modified QC modules in `visual_qc_apps/`
5. **Profile System Check**: Verify all 14 profiles load correctly:
```bash
python -c "
from profile_config import get_profile
profiles = ['general_check', 'static_general', 'unilever_key_visual', 'unilever_packaging', 'diageo_key_visual', 'diageo_packaging', 'loreal_static', 'amazon_static', 'boots_static', 'inclusive_accessibility', 'dow_jones_static', 'marketwatch_static', 'wsj_static', 'video_general']
for p in profiles:
    profile = get_profile(p)
    print(f'✅ {profile.name} ({len(profile.get_enabled_checks())} checks)')
"
```
6. **Client Config Check**: Verify all 10 clients load correctly:
```bash
python -c "
from client_config import get_all_clients
for cid, c in get_all_clients().items():
    print(f'✅ {c[\"display_name\"]}: {c[\"profiles\"]}')
"
```
7. **Enhanced System Check**: Verify recent enhancements work correctly:
   - Test General Check profile 100-point scoring system
   - Test Unilever profile zero-scoring logic with face/new visibility checks
   - Test saved files are client-scoped (only show for selected client)
   - Test client-scoped reporting dashboard with date range filters
   - Test admin panel shows all platform users (admin users only)
   - Test MSAL authentication initialization and error handling
   - Verify scoring calculation handles different weight structures correctly
For everything else (architecture, auth, deployment, branch strategy, troubleshooting, pre-session checklist) see `../CLAUDE.md`.