ai_qc/backend/profile_config.py
nickviljoen 50d0063b37 Add Boots Production Pack profile (multi-page document mode)
New profile boots_ppack for QCing multi-page Boots production packs
(PowerPoint-exported PDFs, 4-18 pages each). Built on top of AXA's
document-mode infrastructure — branched off feature/axa-document-mode
because it reuses the dispatcher, ingest, and result writer.

New checks:
- boots_logo_compliance — three-path scoring (master wordmark / partner
  lock-up / no branding) so OLIVER x BOOTS-style footer lock-ups aren't
  scored against master wordmark rules. Conservative without a formal
  Boots logo guideline.
- boots_colour_palette — verifies CMYK/RGB/Hex spec values on creative-
  guidance pages against canonical Boots Blue / Health Primary Blue /
  Offer Red, plus visual sanity-check on artwork pages.
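
Illustrative sketch of the spec-value comparison described for
boots_colour_palette, limited to the Hex case. It is not the check's
actual code: hex_to_rgb, palette_mismatches, and the colour name and
values in the usage note are hypothetical placeholders, not the
canonical Boots palette.

```python
from typing import Dict, Tuple


def hex_to_rgb(hex_value: str) -> Tuple[int, int, int]:
    """Convert '#RRGGBB' or 'RRGGBB' to an (R, G, B) tuple."""
    digits = hex_value.strip().lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"Expected 6 hex digits, got {hex_value!r}")
    return (int(digits[0:2], 16), int(digits[2:4], 16), int(digits[4:6], 16))


def palette_mismatches(
    found: Dict[str, str], canonical: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return {colour name: (found hex, expected hex)} for every deviation."""
    mismatches = {}
    for name, expected in canonical.items():
        actual = found.get(name)
        if actual is None:
            continue  # colour not specified on this page, nothing to compare
        if hex_to_rgb(actual) != hex_to_rgb(expected):
            mismatches[name] = (actual, expected)
    return mismatches
```

For example, palette_mismatches({"Placeholder Blue": "112234"},
{"Placeholder Blue": "#112233"}) reports the one-step deviation in the
blue channel, while a matching value in a different letter case or
without the leading "#" is accepted.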

Existing checks tuned:
- boots_brand_name_accuracy: closed-world list semantics. Brands not on
  the approved list now go to names_not_on_list (manual review) instead
  of failing. The list is sourced from the original 7 docs and is known
  to be incomplete (Remington, Imodium, Maybelline etc. are legitimate
  Boots-stocked brands not on it).
- boots_tandc_wording: explicit font-weight caveat. Boots Sharp Regular
  vs Light isn't reliably distinguishable by a vision LLM at small sizes.
  Surfaced via font_weight_caveat field + needs_manual_check value.
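
Illustrative sketch of the names_not_on_list routing described for
boots_brand_name_accuracy. It is an assumption about the shape of the
logic, not the check's actual code: partition_brand_names, the
"approved" bucket name, and the sample approved list are hypothetical.
Only the names_not_on_list field name comes from this commit.

```python
from typing import Dict, Iterable, List


def partition_brand_names(
    detected: Iterable[str], approved: Iterable[str]
) -> Dict[str, List[str]]:
    """Split detected brand names into approved vs. manual-review buckets."""
    # Case-insensitive lookup so "REMINGTON" and "Remington" compare equal
    approved_lookup = {name.strip().casefold() for name in approved}
    result: Dict[str, List[str]] = {"approved": [], "names_not_on_list": []}
    for name in detected:
        if name.strip().casefold() in approved_lookup:
            result["approved"].append(name)
        else:
            # Not a failure: the approved list is known to be incomplete
            result["names_not_on_list"].append(name)
    return result
```

With a placeholder approved list of ["Brand A", "Brand B"], the detected
names ["brand a", "Remington"] yield one approved entry and one entry
queued for manual review, and nothing is failed outright.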

Page classifier (document_mode/page_classifier.py):
Heuristic tags each page as cover / checklist / palette / notes /
artwork. Validated on all 10 sample packs.

Strict-grade exemption (Profile.strict_grade flag):
Only artwork-classified pages count towards Pass/Fail. Cover, checklist,
palette, and notes pages are still QC'd and reported as Informational
but cannot trigger a Fail. Banner shows exactly which artwork-page
checks fell below 6.
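
Illustrative sketch of the exemption rule above, assuming each per-page
result carries a page_type and a score. PageResult and
strict_grade_violations are hypothetical names, not the result writer's
actual API. The threshold of 6 is taken from the rule as stated.

```python
from dataclasses import dataclass
from typing import List

STRICT_THRESHOLD = 6.0  # scores strictly below this count as violations


@dataclass
class PageResult:
    page: int
    page_type: str  # cover / checklist / palette / notes / artwork
    check: str
    score: float


def strict_grade_violations(
    results: List[PageResult], threshold: float = STRICT_THRESHOLD
) -> List[PageResult]:
    """Return only the artwork-page results that fall below the threshold."""
    return [
        r for r in results
        if r.page_type == "artwork" and r.score < threshold
    ]


results = [
    PageResult(1, "cover", "boots_logo_compliance", 2.0),    # Informational only
    PageResult(3, "artwork", "boots_tandc_wording", 5.5),    # violation
    PageResult(3, "artwork", "boots_logo_compliance", 9.5),  # clean
]
violations = strict_grade_violations(results)
is_strict_fail = bool(violations)
```

Here the low-scoring cover page is ignored, the artwork page's
boots_tandc_wording result is the only violation, and is_strict_fail is
True.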

Result writer extended:
- Per-page table with score + page_type pill for any page_each-scope
  check (auto-applied as fallback)
- Strict-grade banner (red on violation, green when clean)
- Page_type pills throughout the per-page strip

Smoke-test result (Remington 4-page pack, 2026-05-05):
Overall 70.75/100, strict-grade Fail. After two iterations of prompt
tuning, all three remaining strict-grade violations are real catches:
orphan asterisk in T&Cs, "they may not be stocked" wording deviation,
missing "Charges may apply". brand_name_accuracy 7.0 (was 3.0 before
list fix), logo_compliance 9.5 (was 1.5 before lock-up path fix).

Local-only — not pushed to dev or merged to develop until after Boots
show-and-tell. Same posture as feature/axa-document-mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:47:13 +02:00


#!/usr/bin/env python3
"""
Centralized profile configuration module for Visual AI QC.
This script manages the profiles, QC checks, weights, and LLM assignments.
"""
import os
import json
import glob
from typing import Dict, List, Optional, Any
from dataclasses import dataclass, field
# Profiles directory path
PROFILES_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'profiles')


# Dynamic QC checks discovery
def discover_qc_checks():
    """Dynamically discover all available QC checks from the visual_qc_apps directory"""
    visual_qc_apps_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'visual_qc_apps')
    qc_checks = []
    # Find every subdirectory of visual_qc_apps that contains an app.py file
    app_files = glob.glob(os.path.join(visual_qc_apps_dir, '*', 'app.py'))
    for app_file in app_files:
        # Use the directory name as the check name
        check_dir = os.path.dirname(app_file)
        check_name = os.path.basename(check_dir)
        # Skip cache, template, and utility directories
        if check_name not in ['__pycache__', 'templates', 'utils']:
            qc_checks.append(check_name)
    # Sort for consistency
    qc_checks.sort()
    print(f"Discovered {len(qc_checks)} QC checks: {qc_checks}")
    return qc_checks


# List of all available QC checks (dynamically discovered)
try:
    QC_CHECKS = discover_qc_checks()
except Exception as e:
    print(f"Error discovering QC checks, falling back to static list: {e}")
    # Fallback to static list if discovery fails
    QC_CHECKS = [
        # Original services
        'logo_visibility',
        'brand_assets_visibility',
        'visual_elements_count',
        'background_contrast',
        'face_visibility',
        'new_visibility',
        'visual_hierarchy',
        'supporting_images',
        'curved_edges',
        'visuals_left_text_right',
        'face_gaze_direction',
        'lowercase_text',
        'call_to_action',
        'word_count',
        'imperative_verb',
        # New services - added for the new client requirements
        'file_naming',
        'layer_organization',
        'color_format',
        'image_resolution',
        'safety_area',
        'element_alignment',
        'animation_transitions',
        'aspect_ratio',
        'responsiveness',
        'dark_mode_legibility',
        'print_bleed',
        'crop_marks',
        'text_readability',
        # Custom QC checks for specific requirements
        'product_visibility',
        'inclusive',
        'accessibility',
        # Format-specific checks
        'curved_edges_print',
        'curved_edges_digital'
    ]

# LLM options
LLM_OPTIONS = ["OpenAI", "Gemini"]


@dataclass
class QCCheckConfig:
    """Configuration for a QC check, including its weight and which LLM to use"""
    weight: float = 0.0
    llm: str = "Gemini"  # Default to Gemini
    enabled: bool = True
    # Document-mode only: scope determines how the dispatcher runs the check.
    # One of: "document" (run once on the whole PDF), "targeted" (specific
    # pages; see scope_args.pages), "page_sample" (N evenly-spaced pages),
    # "page_pair" (Phase 3 old-vs-new diff), "page_each" (every page; costly).
    # Ignored in asset mode. None falls back to "page_each" for backwards compat.
    scope: Optional[str] = None
    scope_args: Optional[Dict[str, Any]] = None


@dataclass
class Profile:
    """Profile configuration including name, description, and check configs"""
    name: str
    description: str
    checks: Dict[str, QCCheckConfig] = field(default_factory=dict)
    pre_analysis_instructions: Optional[str] = None
    mode: str = "asset"  # "asset" (default, single image/video) or "document" (multi-page PDF)
    # Strict-grade override: when True, ANY check scoring <6 forces an
    # overall Fail. In document mode this only applies to artwork-classified
    # pages (cover/checklist/palette/notes pages are exempt). Used by Boots
    # Production Pack profile to mirror the asset-mode strict-grade rule
    # already used by L'Oreal Static and Boots Static.
    strict_grade: bool = False

    def get_enabled_checks(self) -> List[str]:
        """Get list of enabled check names"""
        return [check_name for check_name, config in self.checks.items() if config.enabled]

    def get_check_weights(self) -> Dict[str, float]:
        """Get dictionary of check weights"""
        return {check_name: config.weight for check_name, config in self.checks.items() if config.enabled}

    def get_check_llm(self, check_name: str) -> str:
        """Get the LLM to use for a specific check"""
        if check_name in self.checks:
            return self.checks[check_name].llm
        return "Gemini"  # Default to Gemini if not specified


# Dictionary to store all loaded profiles
PROFILES = {}


def load_profiles():
    """Load all profile JSON files from the profiles directory"""
    global PROFILES
    PROFILES = {}  # Reset profiles dictionary
    # Ensure profiles directory exists
    os.makedirs(PROFILES_DIR, exist_ok=True)
    # Find all JSON files in the profiles directory
    profile_files = glob.glob(os.path.join(PROFILES_DIR, '*.json'))
    # Load each profile file
    for profile_file in profile_files:
        try:
            with open(profile_file, 'r') as f:
                profile_data = json.load(f)
            # Extract profile name, description, checks, and pre_analysis_instructions
            profile_name = profile_data.get('name', 'Unnamed Profile')
            profile_description = profile_data.get('description', '')
            profile_checks = profile_data.get('checks', {})
            pre_analysis_instructions = profile_data.get('pre_analysis_instructions', None)
            profile_mode = profile_data.get('mode', 'asset')
            profile_strict_grade = profile_data.get('strict_grade', False)
            # Create a new Profile instance
            profile = Profile(
                name=profile_name,
                description=profile_description,
                pre_analysis_instructions=pre_analysis_instructions,
                mode=profile_mode,
                strict_grade=profile_strict_grade,
            )
            # Add each check configuration
            for check_name, check_config in profile_checks.items():
                profile.checks[check_name] = QCCheckConfig(
                    weight=check_config.get('weight', 0.0),
                    llm=check_config.get('llm', 'Gemini'),
                    enabled=check_config.get('enabled', True),
                    scope=check_config.get('scope'),
                    scope_args=check_config.get('scope_args'),
                )
            # Add profile to the PROFILES dictionary
            # Use the filename (without extension) as the profile ID
            profile_id = os.path.splitext(os.path.basename(profile_file))[0].lower()
            PROFILES[profile_id] = profile
            print(f"Loaded profile '{profile_name}' from {profile_file}")
        except Exception as e:
            print(f"Error loading profile from {profile_file}: {e}")
    # If no profiles were loaded, create a default profile
    if not PROFILES:
        print("No profiles found. Creating default profile.")
        default_profile = Profile(
            name="All Checks",
            description="Run all available QC checks"
        )
        # Initialize all checks with default values
        for check in QC_CHECKS:
            default_profile.checks[check] = QCCheckConfig()
        PROFILES['default'] = default_profile
        # Save the default profile to a file
        save_profile('default', default_profile)


def save_profile(profile_id: str, profile: Profile):
    """Save a profile to a JSON file"""
    # Create the profile data dictionary
    profile_data = {
        'name': profile.name,
        'description': profile.description,
        'checks': {}
    }
    # Add pre_analysis_instructions if it exists
    if profile.pre_analysis_instructions:
        profile_data['pre_analysis_instructions'] = profile.pre_analysis_instructions
    # Persist mode only when it diverges from the default to keep existing JSONs untouched
    if profile.mode and profile.mode != 'asset':
        profile_data['mode'] = profile.mode
    if profile.strict_grade:
        profile_data['strict_grade'] = True
    # Add each check configuration
    for check_name, check_config in profile.checks.items():
        check_data = {
            'weight': check_config.weight,
            'llm': check_config.llm,
            'enabled': check_config.enabled
        }
        # Persist scope only when set, to keep existing single-asset profiles untouched
        if check_config.scope:
            check_data['scope'] = check_config.scope
        if check_config.scope_args:
            check_data['scope_args'] = check_config.scope_args
        profile_data['checks'][check_name] = check_data
    # Save to a JSON file
    profile_file = os.path.join(PROFILES_DIR, f"{profile_id.lower()}.json")
    with open(profile_file, 'w') as f:
        json.dump(profile_data, f, indent=4)
    print(f"Saved profile '{profile.name}' to {profile_file}")


def get_profile(profile_name: str) -> Optional[Profile]:
    """Get a profile by ID (case-insensitive).

    Falls back to the 'default' profile; returns None if neither exists.
    """
    # If profiles haven't been loaded yet, load them
    if not PROFILES:
        load_profiles()
    return PROFILES.get(profile_name.lower(), PROFILES.get('default'))


def add_profile(name: str, description: str, check_configs: Dict[str, Dict[str, Any]]) -> str:
    """Add a new profile and save it to a JSON file

    Returns the profile_id that was created
    """
    # Create a new Profile instance
    profile = Profile(
        name=name,
        description=description
    )
    # Add each check configuration
    for check_name, config in check_configs.items():
        profile.checks[check_name] = QCCheckConfig(
            weight=config.get('weight', 0.0),
            llm=config.get('llm', 'Gemini'),
            enabled=config.get('enabled', True)
        )
    # Generate a profile_id from the name
    profile_id = name.lower().replace(' ', '_')
    # Add to the PROFILES dictionary
    PROFILES[profile_id] = profile
    # Save to a JSON file
    save_profile(profile_id, profile)
    return profile_id


def update_profile(profile_name: str, updates: Dict[str, Any]) -> bool:
    """Update an existing profile"""
    # Profile IDs are stored lowercased, so normalise the lookup key
    # the same way get_profile does
    profile_id = profile_name.lower()
    if profile_id not in PROFILES:
        return False
    profile = PROFILES[profile_id]
    if 'name' in updates:
        profile.name = updates['name']
    if 'description' in updates:
        profile.description = updates['description']
    if 'checks' in updates:
        for check_name, check_config in updates['checks'].items():
            # Only checks already present in the profile are updated
            if check_name in profile.checks:
                if 'weight' in check_config:
                    profile.checks[check_name].weight = check_config['weight']
                if 'llm' in check_config:
                    profile.checks[check_name].llm = check_config['llm']
                if 'enabled' in check_config:
                    profile.checks[check_name].enabled = check_config['enabled']
    # Save the updated profile
    save_profile(profile_id, profile)
    return True


def delete_profile(profile_name: str) -> bool:
    """Delete a profile file and remove it from memory"""
    # Profile IDs are stored lowercased, so normalise the lookup key
    profile_id = profile_name.lower()
    # The default profile can never be deleted
    if profile_id not in PROFILES or profile_id == 'default':
        return False
    # Remove from memory
    PROFILES.pop(profile_id)
    # Delete the file
    profile_file = os.path.join(PROFILES_DIR, f"{profile_id}.json")
    if os.path.exists(profile_file):
        os.remove(profile_file)
        print(f"Deleted profile file: {profile_file}")
    return True


def get_profile_summary() -> Dict[str, Dict[str, Any]]:
    """Get a summary of all profiles"""
    # If profiles haven't been loaded yet, load them
    if not PROFILES:
        load_profiles()
    summary = {}
    for profile_name, profile in PROFILES.items():
        summary[profile_name] = {
            'name': profile.name,
            'description': profile.description,
            'enabled_checks': profile.get_enabled_checks(),
            'total_checks': len(profile.checks),
            'enabled_count': len(profile.get_enabled_checks())
        }
    return summary


def get_check_llm_map(profile_name: str) -> Dict[str, str]:
    """Get a mapping of check names to LLM names for a profile"""
    profile = get_profile(profile_name)
    if profile is None:
        # Neither the requested profile nor a 'default' profile exists
        return {}
    return {check_name: config.llm for check_name, config in profile.checks.items()}


# Load profiles when the module is imported
load_profiles()
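
For orientation, here is a hedged sketch of what a document-mode profile file could look like, based only on the keys that load_profiles reads (name, description, mode, strict_grade, checks, and per-check weight, llm, enabled, scope, scope_args). The display name, the weights, the chosen scopes, and the shape of scope_args.pages as a list of page numbers are illustrative assumptions, not the contents of the real boots_ppack profile.

```python
import json

# Hypothetical profile data; only the key names mirror what load_profiles reads.
example_profile = {
    "name": "Boots Production Pack",          # assumed display name
    "description": "Illustrative document-mode profile",
    "mode": "document",                        # multi-page PDF instead of a single asset
    "strict_grade": True,                      # any artwork-page check below 6 forces a Fail
    "checks": {
        "boots_logo_compliance": {
            "weight": 1.0,                     # placeholder weight
            "llm": "Gemini",
            "enabled": True,
            "scope": "page_each",              # run on every page
        },
        "boots_colour_palette": {
            "weight": 1.0,                     # placeholder weight
            "llm": "Gemini",
            "enabled": True,
            "scope": "targeted",               # run on specific pages only
            "scope_args": {"pages": [2]},      # assumed shape: list of page numbers
        },
    },
}

# Serialise the same way save_profile does (indent=4), then read it back.
profile_json = json.dumps(example_profile, indent=4)
round_tripped = json.loads(profile_json)
```

Written to the profiles directory as boots_ppack.json, a file of this shape would be registered under the profile ID boots_ppack, since load_profiles derives the ID from the lowercased filename rather than from the name field.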