Visual AI QC - AI-Powered Quality Control Platform
AI-driven visual quality control system for marketing materials, advertisements, and design assets
🚀 Overview
Visual AI QC is an intelligent quality control platform that uses advanced AI (OpenAI GPT-4 and Google Gemini) to automatically analyze visual content against brand guidelines, technical specifications, and design best practices. The system provides comprehensive feedback, scoring, and recommendations for improving visual assets.
✨ Key Features
- 🤖 AI-Powered Analysis: Leverages both OpenAI GPT-4 and Google Gemini for comprehensive visual analysis
- 📊 Real-Time Progress Tracking: Watch your analysis progress live with detailed step-by-step updates
- 🎯 Brand-Specific Profiles: Customizable analysis profiles for different brands and use cases
- 📋 Comprehensive Scoring: Weighted scoring system with detailed breakdowns and recommendations
- 📄 Multiple Output Formats: Generate results in HTML (visual reports) or JSON (data format)
- 🔄 Async Processing: Non-blocking analysis with progress tracking for better user experience
- ⚡ Parallel Processing: Batch-based parallel execution of QC checks for improved performance
- 📁 Reference Asset Integration: Upload and use brand guidelines to enhance analysis accuracy
- 🎨 Consumer-Focused Analysis: AI automatically focuses on consumer-facing visuals, ignoring production elements
- ⚡ Streamlined Workflow: Direct profile selection without unnecessary triage steps
- 🌐 Web Interface: User-friendly web UI for easy file uploads and result viewing
- 🔌 REST API: Full API access for programmatic integration
🆕 Recent Improvements
Parallel Processing Architecture ⚡
- Batch-Based Execution: QC checks now run in parallel batches of 15 for dramatically improved performance
- Smart Batching Logic: Automatically organizes checks based on profile size (e.g., 11 checks = 1 batch, 25 checks = 2 batches)
- Enhanced Progress Tracking: Real-time batch progress with "Batch X of Y" indicators and individual check status
- Optimized Resource Usage: ThreadPoolExecutor manages concurrent API calls while respecting rate limits
Enhanced Analysis Workflow
- Removed Triage Steps: Analysis now uses your selected profile directly, reducing processing time
- Consumer-Focused Instructions: AI receives specific guidance to focus only on customer-facing visuals
- Production Element Filtering: Automatically ignores cut lines, registration marks, and technical elements
Reference Asset Integration
- Functional Reference Assets: Reference assets now actually enhance analysis (previously they were ignored)
- Brand-Aware Analysis: Selected reference assets provide AI with brand-specific context and guidelines
- Visual Feedback: HTML reports clearly show when reference assets were used in analysis
Improved User Experience
- Streamlined Process: Reduced from 5 steps to 3 steps for faster analysis
- Better Progress Tracking: More accurate progress indicators throughout analysis
- Enhanced Reports: HTML reports include reference asset usage and more detailed metadata
🏗️ System Architecture
Visual AI QC Platform
├── Web Interface (web_ui.html)
├── API Server (api_server.py) - Now with parallel processing
├── QC Analysis Engine (visual_qc_apps/) - Batch execution
├── Profile Management (profiles/)
├── Brand Guidelines DB (brand_guidelines/)
└── Output Management (output/)
🛠️ Quick Start
Prerequisites
- Python 3.8+
- OpenAI API Key
- Google AI API Key
- Required Python packages (see requirements.txt)
Installation
- Clone and Setup:
    cd Visual_AI_QC
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -r requirements.txt
- Configure Environment:
    cp config.env.example config.env
    # Edit config.env with your API keys:
    # OPENAI_API_KEY=your_openai_key_here
    # GOOGLE_API_KEY=your_google_key_here
- Start the Server:
    python api_server.py
- Access Web Interface: Open http://localhost:7183 in your browser
📝 How to Perform an Analysis
Using the Web Interface
- Upload Your File:
  - Click the upload area or drag & drop your image
  - Supported formats: PNG, JPG, JPEG, GIF, WebP
- Configure Analysis Settings:
  - Profile: Choose from available QC profiles (General, Brand-specific, etc.)
  - Output Mode: Select HTML Report (visual) or JSON Data (raw data)
  - AI Model: Choose between profile settings, OpenAI, or Gemini
  - Reference Asset: Optionally select a brand guideline for comparison
- Start Analysis:
  - Click "Analyze File"
  - Watch real-time progress updates showing batch processing in action
  - Progress shows "Batch X of Y" with individual check completion status
- View Results:
  - Detailed analysis results appear in the interface
  - Overall score, grade, and individual check results
  - Download HTML report or access JSON data
  - Files are saved automatically to the /output folder
Analysis Steps Breakdown
When you start an analysis, the system performs these streamlined steps:
- 🎯 Quality Control Analysis (5-95% complete)
  - Direct Profile Usage: Uses your selected QC profile immediately (no triage needed)
  - Consumer-Focused Analysis: AI focuses only on consumer-facing visuals, ignoring production elements
  - Reference Asset Integration: Incorporates selected reference assets into analysis prompts
  - Parallel Batch Execution: Runs QC checks in parallel batches of 15 with real-time progress updates
  - Comprehensive Evaluation: Each check analyzes specific aspects (logo visibility, text readability, etc.)
  - Brand-Aware Analysis: Uses profile-specific instructions and reference guidelines
- 📊 Score Calculation (95% complete)
  - Calculates weighted overall score from individual check results
  - Determines letter grade (A, B, C, D, F)
  - Compiles comprehensive feedback and recommendations
- 💾 Report Generation (100% complete)
  - Creates detailed HTML report (if HTML mode selected)
  - Saves results to output folder with reference asset tracking
  - Makes downloadable report available with full analysis details
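The progress percentages above can be sketched as a simple mapping. This is a minimal illustration only: the function name `analysis_progress` and the linear interpolation between 5% and 95% are assumptions, not the server's actual code.

```python
def analysis_progress(completed_checks: int, total_checks: int) -> int:
    """Map completed QC checks onto the 5-95% band of the analysis stage."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    # Clamp so out-of-range inputs cannot leave the 5-95% band.
    completed = max(0, min(completed_checks, total_checks))
    # Assumption: progress grows linearly from 5% (start) to 95% (all checks done).
    return round(5 + 90 * completed / total_checks)
```

Score calculation and report generation would then account for the remaining steps up to 100%.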
🎯 Available QC Profiles
Brand-Specific Profiles
- Diageo Key Visual: Optimized for Diageo brand guidelines and key visual requirements
- Diageo Packaging: Tailored for Diageo packaging design requirements
- Unilever Key Visual: Focused on Unilever brand standards and visual hierarchy
- Unilever Packaging: Specialized for Unilever packaging compliance
General Profiles
- All Checks: Comprehensive analysis using all available QC checks
- General Key Visual: Standard checks for key visual/hero image analysis
- General Packaging: Technical checks for packaging design compliance
- Inclusive Accessibility: Focuses on accessibility and inclusive design practices
🎨 Consumer-Focused Analysis
All profiles now include pre-analysis instructions that guide the AI to focus specifically on consumer-facing visuals:
What Gets Analyzed ✅
- Primary consumer-facing design elements
- Main/largest visual panels in multi-panel layouts
- Final product as it appears on shelf/in-market
- Assembled final product (for folded/die-cut layouts)
What Gets Ignored ❌
- Cut lines and registration marks
- Production guides and technical markings
- Color bars and printer calibration elements
- Technical text and file information
- Background production elements
Built-in Complexity Check
- Step 1: Count distinct visual elements (logos, images, text blocks, etc.)
- Step 2: Evaluate total element count
- Step 3: Flag designs with >4 elements as potentially too cluttered
This ensures analysis focuses on the actual customer experience rather than production technicalities.
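The three complexity-check steps can be sketched in a few lines. This is a hedged illustration: the function name `complexity_check` and the idea that the elements arrive as a list of labels are assumptions; only the ">4 elements" threshold comes from the description above.

```python
from typing import Dict, List

MAX_ELEMENTS = 4  # designs with more than 4 elements are flagged


def complexity_check(elements: List[str]) -> Dict[str, object]:
    """Count distinct visual elements and flag potentially cluttered designs."""
    # Step 1: count distinct elements (duplicates in the input are ignored).
    count = len(set(elements))
    # Steps 2 and 3: evaluate the count against the threshold and flag the design.
    return {"element_count": count, "too_cluttered": count > MAX_ELEMENTS}


print(complexity_check(["logo", "product shot", "headline", "CTA", "badge"]))
```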
⚡ Parallel Processing Performance
The system runs QC checks in parallel batches, so the checks in a batch execute concurrently instead of one after another:
Performance Architecture
- Batch Size: Fixed at 15 QC checks per batch for optimal API usage
- Concurrent Execution: ThreadPoolExecutor manages parallel API calls within each batch
- Smart Batching: Automatically calculates batches based on enabled checks:
  - Diageo Key Visual (11 checks) → 1 batch
  - General Key Visual (21 checks) → 2 batches (15 + 6)
  - All Checks (34 checks) → 3 batches (15 + 15 + 4)
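The batch arithmetic above can be reproduced with a short sketch. The helper name `split_into_batches` and the placeholder check names are hypothetical; only the batch size of 15 and the example check counts come from this README.

```python
from typing import List, Sequence

BATCH_SIZE = 15  # fixed batch size described above


def split_into_batches(checks: Sequence[str], batch_size: int = BATCH_SIZE) -> List[List[str]]:
    """Split the enabled checks into consecutive batches of at most batch_size."""
    return [list(checks[i:i + batch_size]) for i in range(0, len(checks), batch_size)]


for total in (11, 21, 34):
    placeholder_checks = [f"check_{i}" for i in range(total)]
    sizes = [len(batch) for batch in split_into_batches(placeholder_checks)]
    print(total, sizes)
# 11 [11]
# 21 [15, 6]
# 34 [15, 15, 4]
```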
Progress Tracking Enhancements
- Batch Progress: "Processing batch 1 of 2" indicators
- Individual Check Status: Real-time completion tracking within batches
- Enhanced Visibility: See which checks are running in parallel
- Error Resilience: Individual check failures don't stop the batch
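A minimal sketch of how a batch could run in parallel while isolating failures, assuming each check is a plain callable that returns a dict. The names `run_batch` and `run_all`, and the result shape, are assumptions for illustration; the actual engine in `api_server.py` may differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List

Check = Callable[[], dict]  # hypothetical signature for a single QC check


def run_batch(checks: Dict[str, Check], max_workers: int = 15) -> Dict[str, dict]:
    """Run every check in one batch concurrently; a failing check does not stop the rest."""
    results: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn): name for name, fn in checks.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = {"status": "ok", "result": future.result()}
            except Exception as exc:
                # Record the failure for this check only and keep collecting the others.
                results[name] = {"status": "error", "error": str(exc)}
    return results


def run_all(batches: List[Dict[str, Check]]) -> Dict[str, dict]:
    """Run batches one after another, reporting "batch X of Y" progress."""
    results: Dict[str, dict] = {}
    for index, batch in enumerate(batches, start=1):
        print(f"Processing batch {index} of {len(batches)}")
        results.update(run_batch(batch))
    return results
```

Batches run sequentially here while the checks inside a batch run concurrently, which keeps the number of simultaneous API calls bounded by the batch size.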
Performance Benefits
- Faster Analysis: Multiple checks run simultaneously instead of sequentially
- Better Resource Utilization: Optimized API usage with rate limiting consideration
- Improved User Experience: More detailed progress information with batch context
- Scalable Architecture: System adapts batch count based on profile complexity
🔍 Quality Control Checks
The system includes 34+ specialized QC checks organized into categories:
Visual Design Checks
- Logo Visibility: Ensures brand logo is clearly visible and prominent
- Brand Assets Visibility: Verifies distinctive brand assets are recognizable
- Product Visibility: Checks if product imagery is clear and prominent
- Visual Hierarchy: Analyzes information hierarchy and visual flow
- Background Contrast: Evaluates contrast for readability and accessibility
- Face Visibility & Gaze Direction: Analyzes human faces in imagery
- Supporting Images: Evaluates relevance and quality of supporting visuals
Text & Typography Checks
- Text Readability: Ensures text is legible at appropriate viewing distances
- Call to Action: Evaluates effectiveness of CTA text and placement
- Lowercase Text: Checks adherence to brand typography guidelines
- Word Count: Validates appropriate amount of text content
- Imperative Verb: Ensures strong, action-oriented language
Layout & Composition Checks
- Element Alignment: Verifies proper alignment of design elements
- Visual Elements Count: Ensures appropriate number of visual elements
- Curved Edges: Checks for modern, curved design elements
- Aspect Ratio: Validates correct dimensions for intended use
- Safety Area: Ensures important content stays within safe zones
- New Visibility: Verifies "NEW" or promotional elements are prominent
Technical Checks
- Image Resolution: Validates appropriate resolution for intended use
- Color Format: Ensures correct color space (RGB/CMYK) for medium
- File Naming: Checks adherence to file naming conventions
- Layer Organization: Validates proper layer structure (for design files)
- Print Bleed: Ensures adequate bleed areas for print materials
- Crop Marks: Verifies presence of crop marks when required
- Animation Transitions: Evaluates smooth transitions (for animated content)
- Dark Mode Legibility: Tests readability in dark mode environments
- Responsiveness: Checks adaptability across different screen sizes
📊 Scoring System
Individual Check Scores
- Each QC check receives a score from 1-10
- 8-10: Excellent (Green)
- 5-7: Good/Adequate (Yellow)
- 1-4: Needs Improvement (Red)
Overall Score Calculation
- Weighted average of all individual check scores
- Weights defined in the selected profile
- Scaled to 100-point system for final score
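As a worked example of the calculation above, the following sketch computes a weighted average of 1-10 check scores and scales it to 100. The function name `overall_score`, the check names, and the choice to scale by multiplying by 10 are assumptions; the README does not specify the exact scaling.

```python
from typing import Dict


def overall_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of 1-10 check scores, scaled to a 100-point score."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    # Assumption: a 1-10 average maps to the 100-point scale by multiplying by 10.
    return round(weighted_sum / total_weight * 10, 1)


# (8 * 2 + 6 * 1) / 3 = 7.33..., scaled to 73.3
print(overall_score({"logo_visibility": 8, "text_readability": 6},
                    {"logo_visibility": 2, "text_readability": 1}))
```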
Letter Grades
- A (85-100): Excellent - Ready for publication
- B (70-84): Good - Minor improvements recommended
- C (50-69): Adequate - Several improvements needed
- D (35-49): Poor - Significant rework required
- F (0-34): Fail - Major issues must be addressed
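The grade bands above translate directly into a lookup. In this sketch the name `letter_grade` is hypothetical, and treating fractional scores such as 84.5 as belonging to the lower band is an assumption, since the README lists whole-number ranges only.

```python
# (minimum score, grade) pairs taken from the bands above, highest first.
GRADE_BANDS = [(85, "A"), (70, "B"), (50, "C"), (35, "D"), (0, "F")]


def letter_grade(score: float) -> str:
    """Return the letter grade for an overall score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for minimum, grade in GRADE_BANDS:
        if score >= minimum:
            return grade
    return "F"  # unreachable for valid input; kept as a safe default
```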
🗂️ Reference Asset Management
Adding Reference Assets
- Navigate to Settings → Reference Assets
- Click "Add New Reference Asset"
- Upload brand guideline documents (PDF, images, text files)
- Add descriptive title and brand tags
- Reference assets become available for analysis
Using Reference Assets ✨
- Select Reference Assets: Choose from dropdown during analysis setup
- Automatic Integration: Selected assets are incorporated into every QC check prompt
- Brand-Specific Context: AI receives detailed brand guidelines and standards
- Enhanced Accuracy: Analysis becomes more brand-aware and consistent
- Visual Feedback: HTML reports show reference asset usage status
- Text Content Support: Text-based reference files (.txt, .md, .json) are read and included in prompts
Reference Asset Features
- Real Integration: Reference assets actually enhance analysis (no longer just stored)
- Per-Check Usage: Each QC check receives reference asset context
- File Type Support: Images (.png, .jpg), documents (.pdf), and text files (.txt, .md, .json)
- Content Limitation: Text content limited to 3000 characters for optimal prompt performance
- Error Handling: Graceful degradation when reference assets are unavailable
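The text-file handling described above (3000-character limit, graceful degradation) might look roughly like the sketch below. The function name `load_reference_text` and the choice to return `None` for unusable assets are assumptions, not the project's confirmed behavior.

```python
from pathlib import Path
from typing import Optional

MAX_REFERENCE_CHARS = 3000                # limit stated above
TEXT_EXTENSIONS = {".txt", ".md", ".json"}  # text-based reference files


def load_reference_text(path: str) -> Optional[str]:
    """Return up to 3000 characters of a text reference asset, or None if unusable."""
    file = Path(path)
    if file.suffix.lower() not in TEXT_EXTENSIONS:
        return None  # images and PDFs are not read as text in this sketch
    try:
        text = file.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return None  # missing or unreadable asset: analysis continues without it
    return text[:MAX_REFERENCE_CHARS]
```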
📁 File Management
Input Files
- Supported Formats: PNG, JPG, JPEG, GIF, WebP
- Upload Location: Files temporarily stored in /uploads/session_id/
- File Size: Recommended under 10MB for optimal performance
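Before uploading, a client can check a file against the constraints listed above. This is a small sketch: the helper name `check_upload` is hypothetical, and interpreting 10MB as 10 * 1024 * 1024 bytes is an assumption.

```python
from pathlib import Path
from typing import List

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
RECOMMENDED_MAX_BYTES = 10 * 1024 * 1024  # assumption: 10MB means 10 MiB


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > RECOMMENDED_MAX_BYTES:
        problems.append("larger than the recommended 10MB")
    return problems
```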
Output Files
- Location: /output/ directory
- Naming Convention: YYYYMMDD_HHMMSS_filename_report.html/data.json
- HTML Reports: Visual reports with expandable sections, scoring, and recommendations
- JSON Data: Raw analysis data for programmatic processing
- Retention: Files persist until manually deleted
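One reading of the naming convention above is a timestamp, the uploaded file's name, and a mode-specific suffix. The sketch below follows that reading; the function name `output_name`, dropping the upload's extension, and the exact suffixes `report.html` and `data.json` are assumptions.

```python
from datetime import datetime
from pathlib import Path

# Assumed suffix per output mode, derived from "report.html/data.json" above.
SUFFIX_BY_MODE = {"html": "report.html", "json": "data.json"}


def output_name(upload_name: str, mode: str, now: datetime) -> str:
    """Build an output filename of the form YYYYMMDD_HHMMSS_filename_<suffix>."""
    stem = Path(upload_name).stem  # assumption: the upload's extension is dropped
    return f"{now.strftime('%Y%m%d_%H%M%S')}_{stem}_{SUFFIX_BY_MODE[mode]}"


print(output_name("banner.jpg", "html", datetime(2024, 1, 2, 3, 4, 5)))
# 20240102_030405_banner_report.html
```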
🔌 API Integration
Key Endpoints
Start Analysis
POST /api/start_analysis
- Purpose: Start async analysis with progress tracking
- Parameters: file, profile, mode, brand, reference_asset
- Returns: session_id for progress tracking
Check Progress
GET /api/progress/{session_id}
- Purpose: Get real-time analysis progress
- Returns: Current step, progress percentage, check details
Get Results
GET /api/progress/{session_id}
- Purpose: Retrieve completed analysis results
- Returns: Full analysis data when stage = 'complete'
Download Output
GET /output/{filename}
- Purpose: Download generated HTML/JSON reports
- Returns: File content for download
Example API Usage
import time

import requests

# Start analysis
with open('image.jpg', 'rb') as f:
    response = requests.post('http://localhost:7183/api/start_analysis',
                             files={'file': f},
                             data={'profile': 'general', 'mode': 'html'})
response.raise_for_status()  # stop early if the upload was rejected
session_id = response.json()['session_id']

# Poll for progress until the stage is reported as 'complete'
while True:
    progress = requests.get(f'http://localhost:7183/api/progress/{session_id}')
    progress.raise_for_status()
    data = progress.json()['progress']
    if data['stage'] == 'complete':
        results = data['result']
        print(f"Analysis complete! Score: {results['summary']['overall_score']}")
        break
    print(f"Progress: Batch {data.get('current_batch', 1)} of {data.get('total_batches', 1)} - "
          f"{data['completed_checks']}/{data['total_checks']} checks")
    time.sleep(2)  # wait before polling again
⚙️ Configuration & Customization
Creating Custom Profiles
- Copy an existing profile from /profiles/
- Modify checks, weights, and LLM preferences
- Save with descriptive filename (e.g., my_brand.json)
- Restart server to load new profile
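The copy step can also be scripted. The sketch below assumes profiles are plain JSON files, as the `my_brand.json` example suggests; the helper name `clone_profile` is hypothetical, and it deliberately makes no assumptions about the keys inside a profile.

```python
import json
import shutil
from pathlib import Path


def clone_profile(source: str, target: str) -> dict:
    """Copy an existing profile to a new file and return its parsed contents."""
    src, dst = Path(source), Path(target)
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; choose another filename")
    shutil.copyfile(src, dst)
    # Parsing the copy confirms it is valid JSON before you start editing it.
    with dst.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Printing the returned dict's keys shows which sections (checks, weights, LLM preferences) your copy exposes for editing.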
Environment Configuration
Edit config.env:
OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
FLASK_PORT=7183
DEBUG_MODE=false
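To verify a config.env file before starting the server, a small parser is enough. This is a hedged sketch: `parse_env` and `missing_keys` are hypothetical helpers, and treating only the two API keys as required is an assumption. It does not reflect how `api_server.py` actually loads its configuration.

```python
from typing import Dict, Iterable, List

REQUIRED_KEYS = ("OPENAI_API_KEY", "GOOGLE_API_KEY")  # assumed to be mandatory


def parse_env(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """List required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


example = ["OPENAI_API_KEY=your_openai_key", "# GOOGLE_API_KEY not set yet", "FLASK_PORT=7183"]
print(missing_keys(parse_env(example)))
# ['GOOGLE_API_KEY']
```

To check a real file, pass an open file handle, for example `parse_env(open("config.env"))`.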
Adding Custom QC Checks
- Create new check in /visual_qc_apps/my_check/
- Implement app.py with analysis logic
- Register in profile configurations
- Restart server to activate
🚨 Troubleshooting
Fixed Issues ✅
Analysis Stuck at "Step 1 of 1"
- ✅ RESOLVED: System now shows proper progress tracking with individual check names and accurate step counts
HTML Mode Saving JSON Files
- ✅ RESOLVED: Web UI correctly maintains output mode selection throughout analysis
"originalMode is not defined" JavaScript Error
- ✅ RESOLVED: Fixed JavaScript variable scoping issue in web UI
Reference Assets Not Working
- ✅ RESOLVED: Reference assets now actually enhance analysis prompts (previously they were ignored)
Current Common Issues
Server Won't Start
- Check API keys are set in config.env
- Verify port 7183 is not in use: lsof -ti:7183
- Check Python dependencies: pip install -r requirements.txt
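On systems without `lsof`, the port check can be done from Python. This is a minimal sketch using the standard `socket` module; the helper name `port_is_free` is hypothetical, and it only tests whether something accepts connections on the local port.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


print("Port 7183 free:", port_is_free(7183))
```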
Analysis Fails
- Verify image file is valid and under 10MB
- Check server logs for specific error messages
- Ensure selected profile exists and is valid
- Ensure reference asset exists if selected (system will continue without it)
Log Files
- Server Logs: server.log - General server activity
- Debug Logs: debug_mode.txt - Parameter debugging (if enabled)
- Error Logs: Check console output for detailed error messages
📚 Additional Documentation
- API Reference: Detailed API endpoint documentation
- Environment Setup: Comprehensive setup instructions
- Testing Guide: Testing procedures and examples
- Triage System: Automatic content type detection
- Profiles Guide: Profile configuration and customization
🤝 Contributing
- Fork the repository
- Create feature branch: git checkout -b feature/amazing-feature
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
📄 License
This project is proprietary software. All rights reserved.
🆘 Support
For technical support and questions:
- Review documentation in /docs/ directory
- Check existing issues and troubleshooting guide
- Contact the development team for enterprise support
Visual AI QC Platform - Ensuring quality through intelligent automation 🎯✨