Visual AI QC - AI-Powered Quality Control Platform

AI-driven visual quality control system for marketing materials, advertisements, and design assets

🚀 Overview

Visual AI QC is an intelligent quality control platform that uses advanced AI (OpenAI GPT-4 and Google Gemini) to automatically analyze visual content against brand guidelines, technical specifications, and design best practices. The system provides comprehensive feedback, scoring, and recommendations for improving visual assets.

✨ Key Features

🤖 AI-Powered Analysis: Leverages both OpenAI GPT-4 and Google Gemini for comprehensive visual analysis
📊 Real-Time Progress Tracking: Watch your analysis progress live with detailed step-by-step updates
🎯 Brand-Specific Profiles: Customizable analysis profiles for different brands and use cases
📋 Comprehensive Scoring: Weighted scoring system with detailed breakdowns and recommendations
📄 Multiple Output Formats: Generate results in HTML (visual reports) or JSON (data format)
🔄 Async Processing: Non-blocking analysis with progress tracking for better user experience
⚡ Parallel Processing: Batch-based parallel execution of QC checks for improved performance
📁 Reference Asset Integration: Upload and use brand guidelines to enhance analysis accuracy
🎨 Consumer-Focused Analysis: AI automatically focuses on consumer-facing visuals, ignoring production elements
⚡ Streamlined Workflow: Direct profile selection without unnecessary triage steps
🌐 Web Interface: User-friendly web UI for easy file uploads and result viewing
🔌 REST API: Full API access for programmatic integration

🆕 Recent Improvements

Parallel Processing Architecture ⚡

Batch-Based Execution: QC checks now run in parallel batches of 15 rather than one at a time, which shortens total analysis time
Smart Batching Logic: Automatically organizes checks based on profile size (e.g., 11 checks = 1 batch, 25 checks = 2 batches)
Enhanced Progress Tracking: Real-time batch progress with "Batch X of Y" indicators and individual check status
Optimized Resource Usage: ThreadPoolExecutor manages concurrent API calls while respecting rate limits

Enhanced Analysis Workflow

Removed Triage Steps: Analysis now uses your selected profile directly, reducing processing time
Consumer-Focused Instructions: AI receives specific guidance to focus only on customer-facing visuals
Production Element Filtering: Automatically ignores cut lines, registration marks, and technical elements

Reference Asset Integration

Functional Reference Assets: Reference assets now feed into the analysis (previously they were stored but ignored)
Brand-Aware Analysis: Selected reference assets provide AI with brand-specific context and guidelines
Visual Feedback: HTML reports clearly show when reference assets were used in analysis

Improved User Experience

Streamlined Process: Reduced from 5 steps to 3 steps for faster analysis
Better Progress Tracking: More accurate progress indicators throughout analysis
Enhanced Reports: HTML reports include reference asset usage and more detailed metadata

🏗️ System Architecture

Visual AI QC Platform
├── Web Interface (web_ui.html)
├── API Server (api_server.py) - Now with parallel processing
├── QC Analysis Engine (visual_qc_apps/) - Batch execution
├── Profile Management (profiles/)
├── Brand Guidelines DB (brand_guidelines/)
└── Output Management (output/)

🛠️ Quick Start

Prerequisites

Python 3.8+
OpenAI API Key
Google AI API Key
Required Python packages (see requirements.txt)

Installation

Clone and Setup:

cd Visual_AI_QC
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Configure Environment:

cp config.env.example config.env
# Edit config.env with your API keys:
# OPENAI_API_KEY=your_openai_key_here
# GOOGLE_API_KEY=your_google_key_here

Start the Server:
```
python api_server.py
```
Access Web Interface: Open http://localhost:7183 in your browser

📝 How to Perform an Analysis

Using the Web Interface

Upload Your File:
- Click the upload area or drag & drop your image
- Supported formats: PNG, JPG, JPEG, GIF, WebP
Configure Analysis Settings:
- Profile: Choose from available QC profiles (General, Brand-specific, etc.)
- Output Mode: Select HTML Report (visual) or JSON Data (raw data)
- AI Model: Choose between profile settings, OpenAI, or Gemini
- Reference Asset: Optionally select a brand guideline for comparison
Start Analysis:
- Click "Analyze File"
- Watch real-time progress updates showing batch processing in action
- Progress shows "Batch X of Y" with individual check completion status
View Results:
- Detailed analysis results appear in the interface
- Overall score, grade, and individual check results
- Download HTML report or access JSON data
- Files saved automatically to /output folder

Analysis Steps Breakdown

When you start an analysis, the system performs these streamlined steps:

🎯 Quality Control Analysis (5-95% complete)
- Direct Profile Usage: Uses your selected QC profile immediately (no triage needed)
- Consumer-Focused Analysis: AI focuses only on consumer-facing visuals, ignoring production elements
- Reference Asset Integration: Incorporates selected reference assets into analysis prompts
- Parallel Batch Execution: Runs QC checks in parallel batches of 15 with real-time progress updates
- Comprehensive Evaluation: Each check analyzes specific aspects (logo visibility, text readability, etc.)
- Brand-Aware Analysis: Uses profile-specific instructions and reference guidelines
📊 Score Calculation (95% complete)
- Calculates weighted overall score from individual check results
- Determines letter grade (A, B, C, D, F)
- Compiles comprehensive feedback and recommendations
💾 Report Generation (100% complete)
- Creates detailed HTML report (if HTML mode selected)
- Saves results to output folder with reference asset tracking
- Makes downloadable report available with full analysis details
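
The progress bands above can be sketched as a simple mapping from completed checks to a percentage. This is only an illustration, assuming the QC analysis stage fills the 5-95% band linearly; the function name `qc_progress_percent` is hypothetical and not taken from `api_server.py`.

```python
def qc_progress_percent(completed_checks: int, total_checks: int) -> int:
    """Map check completion onto the 5-95% band of the QC analysis stage.

    Assumption: progress grows linearly with the number of finished checks.
    """
    if total_checks <= 0:
        return 5  # nothing to run, stay at the start of the band
    # Clamp so out-of-range counts never leave the 5-95% band.
    fraction = min(max(completed_checks / total_checks, 0.0), 1.0)
    return round(5 + 90 * fraction)
```

Score calculation and report generation would then account for the remaining jump from 95% to 100%.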

🎯 Available QC Profiles

Brand-Specific Profiles

Diageo Key Visual: Optimized for Diageo brand guidelines and key visual requirements
Diageo Packaging: Tailored for Diageo packaging design requirements
Unilever Key Visual: Focused on Unilever brand standards and visual hierarchy
Unilever Packaging: Specialized for Unilever packaging compliance

General Profiles

All Checks: Comprehensive analysis using all available QC checks
General Key Visual: Standard checks for key visual/hero image analysis
General Packaging: Technical checks for packaging design compliance
Inclusive Accessibility: Focuses on accessibility and inclusive design practices

🎨 Consumer-Focused Analysis

All profiles now include pre-analysis instructions that guide the AI to focus specifically on consumer-facing visuals:

What Gets Analyzed ✅

Primary consumer-facing design elements
Main/largest visual panels in multi-panel layouts
Final product as it appears on shelf/in-market
Assembled final product (for folded/die-cut layouts)

What Gets Ignored ❌

Cut lines and registration marks
Production guides and technical markings
Color bars and printer calibration elements
Technical text and file information
Background production elements

Built-in Complexity Check

Step 1: Count distinct visual elements (logos, images, text blocks, etc.)
Step 2: Evaluate total element count
Step 3: Flag designs with >4 elements as potentially too cluttered

This ensures analysis focuses on the actual customer experience rather than production technicalities.
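
The threshold in the complexity check is plain arithmetic once the elements have been counted. A minimal sketch, assuming the AI model has already returned a list of distinct elements (the counting itself is done by the model, not by code like this); `is_potentially_cluttered` is a hypothetical helper name.

```python
from typing import Sequence

MAX_ELEMENTS = 4  # threshold stated above: more than 4 elements is flagged


def is_potentially_cluttered(elements: Sequence[str]) -> bool:
    """Return True when the design has more than MAX_ELEMENTS distinct elements."""
    return len(elements) > MAX_ELEMENTS


# Example: a logo, a product shot, a headline, a CTA and a badge is 5 elements.
flagged = is_potentially_cluttered(["logo", "product", "headline", "cta", "badge"])
```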

⚡ Parallel Processing Performance

The system now uses batch-based parallel processing to reduce analysis time:

Performance Architecture

Batch Size: Fixed at 15 QC checks per batch for optimal API usage
Concurrent Execution: ThreadPoolExecutor manages parallel API calls within each batch
Smart Batching: Automatically calculates batches based on enabled checks:
- Diageo Key Visual (11 checks) → 1 batch
- General Key Visual (21 checks) → 2 batches (15 + 6)
- All Checks (34 checks) → 3 batches (15 + 15 + 4)
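
The batch counts above follow from slicing the enabled checks into groups of 15. A minimal sketch of that arithmetic; `split_into_batches` is a hypothetical name and the server's own batching code may differ.

```python
from typing import List, Sequence

BATCH_SIZE = 15  # fixed batch size stated above


def split_into_batches(checks: Sequence[str],
                       batch_size: int = BATCH_SIZE) -> List[List[str]]:
    """Slice the enabled checks into consecutive batches of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [list(checks[i:i + batch_size])
            for i in range(0, len(checks), batch_size)]


def batch_sizes(check_count: int) -> List[int]:
    """Convenience helper: sizes of the batches for a given number of checks."""
    names = [f"check_{i}" for i in range(check_count)]
    return [len(batch) for batch in split_into_batches(names)]
```

With the profile sizes listed above, `batch_sizes(11)`, `batch_sizes(21)` and `batch_sizes(34)` give one, two and three batches respectively.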

Progress Tracking Enhancements

Batch Progress: "Processing batch 1 of 2" indicators
Individual Check Status: Real-time completion tracking within batches
Enhanced Visibility: See which checks are running in parallel
Error Resilience: Individual check failures don't stop the batch
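
One way to get this kind of error resilience with `ThreadPoolExecutor` is to catch exceptions per future rather than per batch. The sketch below is an assumption about the approach, not the code in `api_server.py`; `run_batch` and the `run_check` callable are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Sequence


def run_batch(checks: Sequence[str],
              run_check: Callable[[str], dict],
              max_workers: int = 15) -> Dict[str, dict]:
    """Run one batch of checks in parallel; a failing check is recorded, not raised."""
    results: Dict[str, dict] = {}
    if not checks:
        return results
    with ThreadPoolExecutor(max_workers=min(max_workers, len(checks))) as pool:
        # Map each future back to its check name so results can be labelled.
        futures = {pool.submit(run_check, name): name for name in checks}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = {"status": "ok", "result": future.result()}
            except Exception as exc:  # keep the rest of the batch running
                results[name] = {"status": "error", "error": str(exc)}
    return results
```

Because each result is collected as its future completes, a progress counter could be updated inside the loop to drive the per-check status display.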

Performance Benefits

Faster Analysis: Multiple checks run simultaneously instead of sequentially
Better Resource Utilization: API calls run concurrently while staying within rate limits
Improved User Experience: More detailed progress information with batch context
Scalable Architecture: System adapts batch count based on profile complexity

🔍 Quality Control Checks

The system includes 34+ specialized QC checks organized into categories:

Visual Design Checks

Logo Visibility: Ensures brand logo is clearly visible and prominent
Brand Assets Visibility: Verifies distinctive brand assets are recognizable
Product Visibility: Checks if product imagery is clear and prominent
Visual Hierarchy: Analyzes information hierarchy and visual flow
Background Contrast: Evaluates contrast for readability and accessibility
Face Visibility & Gaze Direction: Analyzes human faces in imagery
Supporting Images: Evaluates relevance and quality of supporting visuals

Text & Typography Checks

Text Readability: Ensures text is legible at appropriate viewing distances
Call to Action: Evaluates effectiveness of CTA text and placement
Lowercase Text: Checks adherence to brand typography guidelines
Word Count: Validates appropriate amount of text content
Imperative Verb: Ensures strong, action-oriented language

Layout & Composition Checks

Element Alignment: Verifies proper alignment of design elements
Visual Elements Count: Ensures appropriate number of visual elements
Curved Edges: Checks for modern, curved design elements
Aspect Ratio: Validates correct dimensions for intended use
Safety Area: Ensures important content stays within safe zones
New Visibility: Verifies "NEW" or promotional elements are prominent

Technical Checks

Image Resolution: Validates appropriate resolution for intended use
Color Format: Ensures correct color space (RGB/CMYK) for medium
File Naming: Checks adherence to file naming conventions
Layer Organization: Validates proper layer structure (for design files)
Print Bleed: Ensures adequate bleed areas for print materials
Crop Marks: Verifies presence of crop marks when required
Animation Transitions: Evaluates smooth transitions (for animated content)
Dark Mode Legibility: Tests readability in dark mode environments
Responsiveness: Checks adaptability across different screen sizes

📊 Scoring System

Individual Check Scores

Each QC check receives a score from 1 to 10:
8-10: Excellent (Green)
5-7: Good/Adequate (Yellow)
1-4: Needs Improvement (Red)

Overall Score Calculation

Weighted average of all individual check scores
Weights defined in the selected profile
Scaled to 100-point system for final score
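
A minimal sketch of the calculation described above. It assumes the 1-10 weighted average is scaled to 100 points by multiplying by 10, and that a check with no weight in the profile defaults to a weight of 1.0; both are assumptions, and `overall_score` is a hypothetical name.

```python
from typing import Dict


def overall_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of 1-10 check scores, scaled to a 100-point score."""
    if not scores:
        raise ValueError("at least one check score is required")
    # Assumption: checks missing from the profile weights count with weight 1.0.
    total_weight = sum(weights.get(name, 1.0) for name in scores)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    weighted_sum = sum(score * weights.get(name, 1.0)
                       for name, score in scores.items())
    # Assumption: scaling from the 10-point to the 100-point system is x10.
    return round(weighted_sum / total_weight * 10, 1)
```

For example, a logo check scoring 8 with weight 2 and a text check scoring 6 with weight 1 average to about 7.33, which becomes 73.3 on the 100-point scale.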

Letter Grades

A (85-100): Excellent - Ready for publication
B (70-84): Good - Minor improvements recommended
C (50-69): Adequate - Several improvements needed
D (35-49): Poor - Significant rework required
F (0-34): Fail - Major issues must be addressed
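
The grade bands above translate directly into a lookup. In this sketch, fractional scores fall into the band whose lower bound they reach (so 84.5 is a B), which is an assumption about how boundaries are handled; `letter_grade` is a hypothetical name.

```python
# Lower bound of each band, highest first, as listed above.
GRADE_BANDS = [(85, "A"), (70, "B"), (50, "C"), (35, "D"), (0, "F")]


def letter_grade(score: float) -> str:
    """Return the letter grade for a 0-100 overall score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for minimum, grade in GRADE_BANDS:
        if score >= minimum:
            return grade
    return "F"  # unreachable for valid input, kept as a safe default
```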

🗂️ Reference Asset Management

Adding Reference Assets

Navigate to Settings → Reference Assets
Click "Add New Reference Asset"
Upload brand guideline documents (PDF, images, text files)
Add descriptive title and brand tags
Reference assets become available for analysis

Using Reference Assets ✨

Select Reference Assets: Choose from dropdown during analysis setup
Automatic Integration: Selected assets are incorporated into every QC check prompt
Brand-Specific Context: AI receives detailed brand guidelines and standards
Enhanced Accuracy: Analysis becomes more brand-aware and consistent
Visual Feedback: HTML reports show reference asset usage status
Text Content Support: Text-based reference files (.txt, .md, .json) are read and included in prompts

Reference Asset Features

Real Integration: Reference assets are included in the analysis prompts (they are no longer just stored)
Per-Check Usage: Each QC check receives reference asset context
File Type Support: Images (.png, .jpg), documents (.pdf), and text files (.txt, .md, .json)
Content Limitation: Text content limited to 3000 characters for optimal prompt performance
Error Handling: Graceful degradation when reference assets are unavailable
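
The text-file handling described above (3000-character limit, graceful degradation) might look roughly like this. It is a sketch under assumptions: `load_reference_text` is a hypothetical helper, and returning `None` stands in for "continue the analysis without the reference asset".

```python
from pathlib import Path
from typing import Optional

MAX_REFERENCE_CHARS = 3000                  # limit stated above
TEXT_SUFFIXES = {".txt", ".md", ".json"}    # text types stated above


def load_reference_text(path: str,
                        limit: int = MAX_REFERENCE_CHARS) -> Optional[str]:
    """Return up to `limit` characters of a text reference asset, or None."""
    file_path = Path(path)
    if file_path.suffix.lower() not in TEXT_SUFFIXES:
        return None  # images and PDFs are not read as prompt text here
    try:
        text = file_path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return None  # missing or unreadable asset: carry on without it
    return text[:limit]
```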

📁 File Management

Input Files

Supported Formats: PNG, JPG, JPEG, GIF, WebP
Upload Location: Files temporarily stored in /uploads/session_id/
File Size: Recommended under 10MB for optimal performance
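
Before uploading, a client could pre-check a file against the limits above. A minimal sketch, assuming 10MB means 10 * 1024 * 1024 bytes and that the size limit is a recommendation rather than a hard server-side rule; `check_upload` is a hypothetical helper.

```python
from pathlib import Path
from typing import List

ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
RECOMMENDED_MAX_BYTES = 10 * 1024 * 1024  # assumption: 10MB as binary megabytes


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems with the file; an empty list means it looks fine."""
    problems: List[str] = []
    if Path(filename).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("unsupported format")
    if size_bytes > RECOMMENDED_MAX_BYTES:
        problems.append("larger than the recommended 10MB")
    return problems
```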

Output Files

Location: /output/ directory
Naming Convention: YYYYMMDD_HHMMSS_filename_report.html (HTML mode) or YYYYMMDD_HHMMSS_filename_data.json (JSON mode)
HTML Reports: Visual reports with expandable sections, scoring, and recommendations
JSON Data: Raw analysis data for programmatic processing
Retention: Files persist until manually deleted
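
To locate a report programmatically, the naming convention can be reproduced as below. This is a sketch under assumptions: that "filename" means the uploaded file's name without its extension, and that the mode values are `html` and `json` as in the API example; `output_filename` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path


def output_filename(source_name: str, mode: str, now: datetime) -> str:
    """Build an output name following YYYYMMDD_HHMMSS_filename_<suffix>."""
    stem = Path(source_name).stem            # assumption: extension is dropped
    stamp = now.strftime("%Y%m%d_%H%M%S")
    suffix = "report.html" if mode == "html" else "data.json"
    return f"{stamp}_{stem}_{suffix}"


example = output_filename("banner.jpg", "html", datetime(2025, 8, 12, 14, 52, 49))
```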

🔌 API Integration

Key Endpoints

Start Analysis

POST /api/start_analysis

Purpose: Start async analysis with progress tracking
Parameters: file, profile, mode, brand, reference_asset
Returns: session_id for progress tracking

Check Progress

GET /api/progress/{session_id}

Purpose: Get real-time analysis progress
Returns: Current step, progress percentage, check details

Get Results

GET /api/progress/{session_id}

Purpose: Retrieve completed analysis results (same endpoint as Check Progress)
Returns: Full analysis data when stage = 'complete'

Download Output

GET /output/{filename}

Purpose: Download generated HTML/JSON reports
Returns: File content for download

Example API Usage

import time

import requests

BASE_URL = 'http://localhost:7183'

# Start analysis: upload the image with the chosen profile and output mode
with open('image.jpg', 'rb') as f:
    response = requests.post(f'{BASE_URL}/api/start_analysis',
                             files={'file': f},
                             data={'profile': 'general', 'mode': 'html'})
response.raise_for_status()  # stop early on an HTTP error
session_id = response.json()['session_id']

# Poll the progress endpoint until the analysis reports completion
while True:
    progress = requests.get(f'{BASE_URL}/api/progress/{session_id}')
    progress.raise_for_status()
    data = progress.json()['progress']

    if data['stage'] == 'complete':
        results = data['result']
        print(f"Analysis complete! Score: {results['summary']['overall_score']}")
        break

    # Batch and check counters may be missing early on, so read them defensively
    print(f"Progress: Batch {data.get('current_batch', 1)} of {data.get('total_batches', 1)}"
          f" - {data.get('completed_checks', 0)}/{data.get('total_checks', '?')} checks")
    time.sleep(2)

⚙️ Configuration & Customization

Creating Custom Profiles

Copy an existing profile from /profiles/
Modify checks, weights, and LLM preferences
Save with descriptive filename (e.g., my_brand.json)
Restart server to load new profile
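
The copy-and-modify steps above can be scripted. The profile schema is not documented here, so the `weights` key in this sketch is purely hypothetical, as is the helper name `clone_profile`; inspect an existing file in /profiles/ for the real field names before adapting it.

```python
import json
from pathlib import Path
from typing import Dict


def clone_profile(source: Path, target: Path,
                  weight_overrides: Dict[str, float]) -> dict:
    """Copy a profile JSON file and override some check weights."""
    profile = json.loads(source.read_text(encoding="utf-8"))
    # "weights" is a hypothetical key; the real profile layout may differ.
    weights = profile.setdefault("weights", {})
    weights.update(weight_overrides)
    target.write_text(json.dumps(profile, indent=2), encoding="utf-8")
    return profile
```

A call such as `clone_profile(Path("profiles/general.json"), Path("profiles/my_brand.json"), {"logo_visibility": 2.0})` would then write the new profile next to the existing ones (both file names and the check key are illustrative).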

Environment Configuration

Edit config.env:

OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
FLASK_PORT=7183
DEBUG_MODE=false
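
For scripts that need the same settings, a file in this KEY=VALUE format can be read with a few lines of standard-library Python. This sketch does not describe how `api_server.py` itself loads `config.env` (it may use a dedicated library); `parse_env` is a hypothetical helper.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = "# API keys\nOPENAI_API_KEY=your_openai_key\nFLASK_PORT=7183\nDEBUG_MODE=false\n"
settings = parse_env(sample)
port = int(settings.get("FLASK_PORT", "7183"))
```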

Adding Custom QC Checks

Create new check in /visual_qc_apps/my_check/
Implement app.py with analysis logic
Register in profile configurations
Restart server to activate

🚨 Troubleshooting

Fixed Issues ✅

Analysis Stuck at "Step 1 of 1"

✅ RESOLVED: System now shows proper progress tracking with individual check names and accurate step counts

HTML Mode Saving JSON Files

✅ RESOLVED: Web UI correctly maintains output mode selection throughout analysis

"originalMode is not defined" JavaScript Error

✅ RESOLVED: Fixed JavaScript variable scoping issue in web UI

Reference Assets Not Working

✅ RESOLVED: Reference assets are now included in the analysis prompts (previously they were ignored)

Current Common Issues

Server Won't Start

Check API keys are set in config.env
Verify port 7183 is not in use: lsof -ti:7183
Check Python dependencies: pip install -r requirements.txt
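
On systems without `lsof`, the port check can be approximated from Python by trying to connect to the port. A minimal sketch using the standard `socket` module; `port_in_use` is a hypothetical helper, and it only detects listeners reachable on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("Port 7183 in use:", port_in_use(7183))
```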

Analysis Fails

Verify image file is valid and under 10MB
Check server logs for specific error messages
Ensure selected profile exists and is valid
Ensure reference asset exists if selected (system will continue without it)

Log Files

Server Logs: server.log - General server activity
Debug Logs: debug_mode.txt - Parameter debugging (if enabled)
Error Logs: Check console output for detailed error messages

📚 Additional Documentation

API Reference: Detailed API endpoint documentation
Environment Setup: Comprehensive setup instructions
Testing Guide: Testing procedures and examples
Triage System: Automatic content type detection
Profiles Guide: Profile configuration and customization

🤝 Contributing

Fork the repository
Create feature branch: git checkout -b feature/amazing-feature
Commit changes: git commit -m 'Add amazing feature'
Push to branch: git push origin feature/amazing-feature
Open a Pull Request

📄 License

🆘 Support

For technical support and questions:

Review documentation in /docs/ directory
Check existing issues and troubleshooting guide
Contact the development team for enterprise support

Visual AI QC Platform - Ensuring quality through intelligent automation 🎯✨