nickviljoen 96d0bf95e1 Reporting updated.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-14 09:14:00 +02:00

QC Report Dashboard

A Flask-based web application for viewing and analyzing QC reports stored in Box.com. This tool aggregates HTML reports by job number and provides both parsed data views and embedded report displays.

🔐 Secured with Azure AD Authentication - Users must sign in with their Microsoft account to access the application.

Features

  • 🔐 Azure AD Authentication: Secure login with Microsoft accounts using MSAL
  • Job Number Search: Find all QC reports associated with a campaign (job) number
  • 📊 Real-time Progress Indicator: Visual progress bar during campaign search
  • Dual View Mode:
    • Parsed Data View: Structured display of check results with filtering
    • Embedded Reports: View original HTML reports inline
  • Aggregated Summary: Overview of all checks across multiple files
  • Quick Navigation: Click "View Details" link next to error files to jump directly to that report
  • HTML Export Options:
    • Export Combined Report: All reports in a single HTML file
    • Export Error Reports Only: Filter to only files with errors
  • PDF Export: Export combined reports as a single PDF document (requires WeasyPrint setup)
  • Error Highlighting: Quickly identify files with errors
  • User Session Management: httpOnly cookies with automatic logout
  • Scalable Search: Efficiently searches through 3500+ campaigns with automatic pagination

Requirements

  • Python 3.8+
  • Box.com account with API access
  • Box JWT authentication configured
  • Azure AD application registration (shared with AI QC application)
  • Modern web browser with JavaScript enabled

Authentication

This application uses Azure AD (Microsoft Entra ID) for authentication via MSAL (Microsoft Authentication Library).

Azure AD Configuration

Tenant ID: e519c2e6-bc6d-4fdf-8d9c-923c2f002385
Client ID: 9079054c-9620-4757-a256-23413042f1ef

Required Redirect URIs (must be registered in Azure AD):

  • Development: http://localhost:7183
  • Production: https://your-production-domain.com

Security Features

  • httpOnly cookies - The token is not readable from JavaScript, which limits token theft via XSS
  • PKCE flow - Authorization code protection
  • RS256 JWT signatures - Cryptographic token validation
  • Real-time token validation - Verified against Azure AD on each request
  • SameSite=Lax - CSRF mitigation (the cookie is not sent on cross-site POST requests)
  • Secure flag (production) - HTTPS-only cookies
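
The cookie attributes listed above can be illustrated with a minimal, standard-library sketch. It is not the code in auth_middleware.py: the cookie name auth_token and the helper build_session_cookie are hypothetical, and a Flask application would more likely set the same attributes through its response object.

```python
from http.cookies import SimpleCookie


def build_session_cookie(token: str, production: bool) -> str:
    """Return a Set-Cookie header value carrying the listed attributes.

    The cookie name "auth_token" is an assumption made for this sketch.
    """
    cookie = SimpleCookie()
    cookie["auth_token"] = token
    morsel = cookie["auth_token"]
    morsel["httponly"] = True      # not readable from JavaScript
    morsel["samesite"] = "Lax"     # withheld on cross-site POST requests
    morsel["path"] = "/"
    if production:
        morsel["secure"] = True    # only sent over HTTPS
    return morsel.OutputString()


if __name__ == "__main__":
    print(build_session_cookie("header.payload.signature", production=True))
```

In development the Secure flag is left off so the cookie still works over plain http://localhost:7183.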

Installation

1. Clone the Repository

cd /path/to/web_hm_ai_qc_report

2. Create Virtual Environment

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

Dependencies include:

  • Flask 3.0.0 - Web framework
  • PyJWT 2.8.0 - JWT token validation
  • cryptography 41.0.7 - Cryptographic operations
  • requests 2.31.0 - HTTP requests for Azure AD
  • boxsdk 3.9.2 - Box.com API integration
  • beautifulsoup4 4.12.2 - HTML parsing
  • weasyprint 60.1 - PDF generation (optional)

4. Configure Azure AD

The application is pre-configured to use the Azure AD app shared with AI QC. No additional setup is needed unless you are creating a new Azure AD application.

5. Configure Box API

  1. Create a Box application at https://app.box.com/developers/console
  2. Configure JWT authentication
  3. Download the JSON config file
  4. Place it at config/box_config.json

6. Environment Configuration

Copy the example environment file and update it:

cp .env.example .env

Edit .env and configure:

# Box Configuration
BOX_CONFIG_PATH=config/box_config.json
BOX_REPORT_FOLDER_ID=133295752718  # CAMPAIGNS folder ID

# Flask Configuration
FLASK_APP=app.py
FLASK_ENV=development
SECRET_KEY=<generate-strong-random-key>

# Azure AD Configuration
AZURE_TENANT_ID=e519c2e6-bc6d-4fdf-8d9c-923c2f002385
AZURE_CLIENT_ID=9079054c-9620-4757-a256-23413042f1ef

# Server Configuration
HOST=0.0.0.0
PORT=7183

Generate SECRET_KEY:

python3 -c "import secrets; print(secrets.token_hex(32))"
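
How config.py reads these variables is not shown in this README. As a rough sketch, assuming the variables are already present in the process environment, a loader could fail fast on missing values and apply the defaults shown above. The Settings class and load_settings function are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    box_config_path: str
    box_report_folder_id: str
    secret_key: str
    azure_tenant_id: str
    azure_client_id: str
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables (illustrative only)."""
    required = ("SECRET_KEY", "AZURE_TENANT_ID", "AZURE_CLIENT_ID",
                "BOX_REPORT_FOLDER_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail at startup rather than on the first request.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        box_config_path=env.get("BOX_CONFIG_PATH", "config/box_config.json"),
        box_report_folder_id=env["BOX_REPORT_FOLDER_ID"],
        secret_key=env["SECRET_KEY"],
        azure_tenant_id=env["AZURE_TENANT_ID"],
        azure_client_id=env["AZURE_CLIENT_ID"],
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7183")),
    )
```

Passing the environment as a parameter keeps the loader easy to exercise with a plain dictionary.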

Running the Application

Development Mode

# Using the run script
./run.sh

# Or directly
python app.py

The application will be available at http://localhost:7183

Production Mode with Gunicorn

# Using the production run script
./run_prod.sh

# Or directly
gunicorn -c gunicorn_config.py wsgi:app
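
The contents of gunicorn_config.py are not reproduced here. A minimal sketch of such a file, using standard Gunicorn settings (bind, workers, timeout, accesslog, errorlog), might look like the following. The worker formula and the 60-second timeout are assumptions, chosen to leave headroom for the 5-10 second campaign searches.

```python
# Illustrative gunicorn_config.py; values are assumptions, not the shipped file.
import multiprocessing
import os

# Listen on the host and port used elsewhere in this README.
bind = f"{os.environ.get('HOST', '0.0.0.0')}:{os.environ.get('PORT', '7183')}"

# A common starting point: two workers per CPU core, plus one.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds before an unresponsive worker is restarted.
timeout = 60

# Log locations matching the logs/ directory in the project structure.
accesslog = "logs/access.log"
errorlog = "logs/error.log"
```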

Project Structure

web_hm_ai_qc_report/
├── app.py                  # Main Flask application
├── auth_middleware.py      # Azure AD authentication middleware
├── jwt_validator.py        # JWT token validation
├── box_client.py           # Box API client
├── report_parser.py        # HTML report parser
├── config.py               # Application configuration
├── wsgi.py                 # WSGI entry point
├── gunicorn_config.py      # Gunicorn configuration
├── requirements.txt        # Python dependencies
├── .env                    # Environment variables (not in git)
├── .env.example            # Example environment file
├── .gitignore             # Git ignore rules
├── templates/              # HTML templates
│   ├── index.html         # Search page with auth UI
│   ├── dashboard.html     # Main dashboard
│   ├── pdf_export.html    # PDF export template
│   ├── 404.html           # 404 error page
│   └── 500.html           # 500 error page
├── static/                 # Static files
│   ├── css/
│   │   └── style.css      # Custom styles
│   └── js/
│       └── auth.js        # MSAL authentication logic
├── config/                 # Configuration files
│   └── box_config.json    # Box JWT config (not in git)
└── logs/                   # Application logs
    ├── access.log
    └── error.log

Usage

1. Sign In

  1. Navigate to http://localhost:7183
  2. You'll see the "Authentication Required" screen
  3. Click "Sign in with Microsoft"
  4. Azure AD popup appears - enter your Microsoft credentials
  5. After successful login, you'll be redirected to the dashboard

2. Search for Reports

  1. Enter a campaign number (e.g., 2069052, 2069053)
  2. Click "Search Reports"
  3. Progress indicator appears showing:
    • "Connecting to Box..." - Authenticating with Box API
    • "Searching campaigns folder..." - Looking through 3500+ campaigns
    • "Processing results..." - Retrieving QC reports
    • "Success!" - Found X reports, redirecting...
  4. Search typically takes 5-10 seconds for campaigns later in the alphabetical list

3. View Dashboard

The dashboard shows:

  • Summary: Overall statistics (files checked, total checks, passed/errors/warnings)
  • Files with Errors: Quick list of problematic files with "View Details" links
    • Click the "View Details" link to jump directly to that report and see which checks failed
  • Parsed Data View: Structured check results with filtering options
  • Embedded Reports: Original HTML reports in iframes
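
The Summary figures are simple totals across the parsed reports. The sketch below shows one way to compute them. ReportResult and summarize are hypothetical names, and the fields are assumptions about what report_parser.py extracts.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Union


@dataclass
class ReportResult:
    file_name: str
    passed: int
    errors: int
    warnings: int


def summarize(reports: Iterable[ReportResult]) -> Dict[str, Union[int, List[str]]]:
    """Aggregate per-file check counts into dashboard-style totals."""
    items = list(reports)
    return {
        "files_checked": len(items),
        "total_checks": sum(r.passed + r.errors + r.warnings for r in items),
        "passed": sum(r.passed for r in items),
        "errors": sum(r.errors for r in items),
        "warnings": sum(r.warnings for r in items),
        # Drives the "Files with Errors" list.
        "files_with_errors": [r.file_name for r in items if r.errors > 0],
    }
```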

4. Filter Results

Use the filter buttons to view:

  • All reports
  • Only files with errors
  • Only files that passed all checks

5. Export Reports

Two export options are available:

Export Combined Report as HTML (Green button):

  • Downloads all reports in a single HTML file
  • Includes all files (passed and failed)
  • Can be converted to PDF using the browser's Print function

Export Error Reports Only (Red button):

  • Only appears when there are files with errors
  • Downloads only files that have errors
  • Excludes files that passed all checks
  • Filename includes "ERRORS_ONLY" suffix
  • Shows warning banner indicating filtered content
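
The difference between the two exports comes down to a filter and a filename suffix. A minimal sketch follows; the QC_Report_ filename prefix and the dictionary keys are assumptions, and only the "ERRORS_ONLY" suffix comes from the behaviour described above.

```python
from typing import Dict, List, Union

Report = Dict[str, Union[str, int]]


def select_for_export(reports: List[Report], errors_only: bool) -> List[Report]:
    """Return every report, or only those with at least one error."""
    if errors_only:
        return [r for r in reports if int(r["errors"]) > 0]
    return list(reports)


def export_filename(job_number: str, errors_only: bool) -> str:
    """Build the download filename (the prefix is hypothetical)."""
    suffix = "_ERRORS_ONLY" if errors_only else ""
    return f"QC_Report_{job_number}{suffix}.html"
```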

To convert HTML export to PDF:

  1. Open the downloaded HTML file in your browser
  2. Press Ctrl/Cmd + P to print
  3. Select "Save as PDF" as the destination

Note: Native PDF export requires WeasyPrint system dependencies (see Troubleshooting).

6. Sign Out

Click the "Logout" button in the navbar to sign out.

API Endpoints

Public Endpoints

  • GET /health - Health check endpoint (no authentication required)
  • GET /auth/status - Check authentication status
  • POST /auth/login - Process Azure AD token
  • POST /auth/logout - Clear authentication

Protected Endpoints (Require Authentication)

  • POST /search - Search for reports by job number (JSON API)
  • GET /dashboard/<job_number> - Dashboard for specific job
  • GET /api/report/<file_id> - Get parsed report data (JSON)
  • GET /api/report/<file_id>/raw - Get raw HTML report
  • GET /export/html/<job_number> - Export combined HTML report (all files)
  • GET /export/html/<job_number>/errors - Export HTML report (error files only)
  • GET /export/pdf/<job_number> - Export combined PDF (requires WeasyPrint)
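
For scripted access, a request to POST /search could be assembled as sketched below. The JSON field name job_number and the cookie name are assumptions, since the request schema is not documented here; check app.py for the actual format.

```python
import json
import urllib.request


def build_search_request(base_url: str, job_number: str,
                         cookie_header: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST /search request.

    The body field "job_number" is an assumption for illustration.
    """
    body = json.dumps({"job_number": job_number}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Protected endpoints expect the authentication cookie.
            "Cookie": cookie_header,
        },
    )
```

The prepared request can then be sent with urllib.request.urlopen(request) against a running instance.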

Box Folder Structure

The application expects reports in Box to be organized in the CAMPAIGNS folder structure:

CAMPAIGNS (133295752718)/
├── 2069052/               # Campaign number folder
│   ├── CAMPAIGN_ASSETS/
│   ├── JOBS/
│   └── QC/                # QC reports subfolder
│       ├── file1_QC.html
│       ├── file2_QC.html
│       └── file3_QC.html
└── 2069053/               # Another campaign
    └── QC/
        └── file4_QC.html

Search Behavior:

  • Application searches through all campaign folders (3500+) to find the specified campaign number
  • Automatic pagination handles large folder lists efficiently
  • Progress bar provides real-time feedback during search (typically 5-10 seconds)
  • Once the campaign is found, the application retrieves all HTML files from its QC subfolder
  • Falls back to a filename-based search if the folder structure is not found
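
The paginated lookup described above can be sketched independently of the Box SDK. Here fetch_page is a hypothetical stand-in for whatever call box_client.py makes to list one page of the CAMPAIGNS folder, and the page size of 1000 is an assumption.

```python
from typing import Callable, Dict, List, Optional

Item = Dict[str, str]


def find_campaign_folder(fetch_page: Callable[[int, int], List[Item]],
                         campaign_number: str,
                         page_size: int = 1000) -> Optional[Item]:
    """Scan folder listings page by page until the campaign is found."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        for item in page:
            if item["type"] == "folder" and item["name"] == campaign_number:
                return item
        if len(page) < page_size:
            # A short page means the listing is exhausted.
            return None
        offset += page_size
```

With roughly 3500 folders and this page size, a campaign near the end of the listing costs four listing calls, which is consistent with later campaigns taking longer to find. When the function returns None, the caller would move on to the filename-based fallback.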

Deployment

Prerequisites

  1. Azure AD Redirect URI: Add your production domain to Azure AD app registration
  2. HTTPS Certificate: Required for production (httpOnly cookies with Secure flag)
  3. Environment Variables: Update .env for production settings

Apache with mod_wsgi

  1. Install Apache and mod_wsgi:
sudo apt-get install apache2 libapache2-mod-wsgi-py3
  2. Create Apache configuration:
<VirtualHost *:443>
    ServerName your-domain.com

    SSLEngine on
    SSLCertificateFile /path/to/cert.pem
    SSLCertificateKeyFile /path/to/key.pem

    WSGIDaemonProcess qc_dashboard user=www-data group=www-data threads=5
    WSGIScriptAlias / /path/to/web_hm_ai_qc_report/wsgi.py

    <Directory /path/to/web_hm_ai_qc_report>
        WSGIProcessGroup qc_dashboard
        WSGIApplicationGroup %{GLOBAL}
        Require all granted
    </Directory>
</VirtualHost>

Nginx with Gunicorn

  1. Start Gunicorn:
gunicorn -c gunicorn_config.py wsgi:app
  2. Configure Nginx:
server {
    listen 443 ssl;
    server_name your-domain.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://127.0.0.1:7183;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /static {
        alias /path/to/web_hm_ai_qc_report/static;
    }
}

Systemd Service

Create /etc/systemd/system/qc-dashboard.service:

[Unit]
Description=QC Report Dashboard
After=network.target

[Service]
User=www-data
Group=www-data
WorkingDirectory=/path/to/web_hm_ai_qc_report
Environment="PATH=/path/to/web_hm_ai_qc_report/venv/bin"
Environment="FLASK_ENV=production"
ExecStart=/path/to/web_hm_ai_qc_report/venv/bin/gunicorn -c gunicorn_config.py wsgi:app

[Install]
WantedBy=multi-user.target

Enable and start:

sudo systemctl enable qc-dashboard
sudo systemctl start qc-dashboard

Troubleshooting

Authentication Issues

"Authentication Required" screen won't go away:

  • Clear browser cache and cookies
  • Verify http://localhost:7183 is in Azure AD redirect URIs
  • Check browser console for MSAL errors (F12)
  • Ensure JavaScript is enabled

"Token validation failed":

  • Check system clock (NTP sync)
  • Verify tenant ID matches: e519c2e6-bc6d-4fdf-8d9c-923c2f002385
  • Confirm Azure AD app is properly configured

Popup blocked:

  • Allow popups for localhost:7183 in browser settings
  • Try using a different browser

Box Authentication Failed

  • Verify box_config.json is in the config/ directory
  • Check that the Box service account has access to the report folder
  • Ensure JWT authentication is properly configured in Box

No Reports Found

  • Verify the campaign number exists in the CAMPAIGNS folder
  • Check that the campaign has a QC subfolder
  • Ensure HTML files exist in CAMPAIGNS/{CampaignNumber}/QC/
  • Verify BOX_REPORT_FOLDER_ID=133295752718 (CAMPAIGNS folder)
  • Check Box service account has access to the CAMPAIGNS folder

Search Takes Too Long

  • Normal search time: 5-10 seconds for campaigns late in alphabetical order
  • Campaign 2069052 is at position ~3507, requiring pagination through multiple API calls
  • Progress bar shows current status - this is expected behavior
  • Earlier campaign numbers (e.g., 1001391) will be found faster

PDF Export Fails

"PDF export temporarily unavailable" error:

WeasyPrint requires system dependencies. On macOS:

# Install dependencies
brew install gobject-introspection cairo pango gdk-pixbuf libffi

# Set environment variables
export PKG_CONFIG_PATH="/opt/homebrew/opt/libffi/lib/pkgconfig"

# Reinstall weasyprint
source venv/bin/activate
pip uninstall weasyprint
pip install weasyprint

# Re-enable the import in app.py (uncomment line 8):
#   from weasyprint import HTML

On Ubuntu/Debian:

sudo apt-get install python3-cffi python3-brotli libpango-1.0-0 libharfbuzz0b libpangoft2-1.0-0

Performance Issues

  • Increase Gunicorn workers in gunicorn_config.py
  • Implement caching for frequently accessed reports
  • Consider pagination for jobs with many reports
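
The caching suggestion could start as a small in-memory cache with a time-to-live, sketched below. ReportCache and its fetch callable are hypothetical and are not part of the current code base.

```python
import time
from typing import Callable, Dict, Tuple


class ReportCache:
    """Cache fetched report HTML per file ID for a limited time."""

    def __init__(self, fetch: Callable[[str], str],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, file_id: str) -> str:
        now = self._clock()
        hit = self._entries.get(file_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]               # still fresh: skip the download
        value = self._fetch(file_id)    # missing or expired: fetch again
        self._entries[file_id] = (now, value)
        return value
```

Each Gunicorn worker is a separate process, so every worker would hold its own copy of such a cache. A shared store would be needed to cache across workers.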

CORS Issues (Production)

If the application is deployed to a domain that is not registered as an Azure AD redirect URI:

  • Add the domain to Azure AD redirect URIs
  • Update MSAL redirectUri in static/js/auth.js if needed

Security Considerations

Production Checklist

  • Generate strong SECRET_KEY (64+ character random string)
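  • Set FLASK_ENV=production (Flask 2.3 and later no longer read this variable themselves, so it only takes effect if the application's own configuration checks it)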
  • Enable HTTPS (required for Secure cookies)
  • Add production domain to Azure AD redirect URIs
  • Restrict Box API access to minimum required permissions
  • Enable firewall rules to restrict server access
  • Set up log monitoring and alerting
  • Regular security updates for all dependencies

Authentication Flow

  1. User accesses application → Redirected to "Sign in with Microsoft"
  2. User clicks login → MSAL opens Azure AD popup
  3. User enters credentials → Azure AD validates
  4. Azure AD returns JWT token → Frontend sends to backend
  5. Backend validates JWT → Creates httpOnly cookie
  6. Subsequent requests include cookie → Validated on each request
  7. User clicks logout → Cookie cleared, MSAL session ended
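
Steps 5 and 6 involve more than a signature check. Once the RS256 signature has been verified (for example with PyJWT), the decoded claims still need to match this application. The sketch below covers only that second stage and is not the code in jwt_validator.py. check_claims and ClaimError are hypothetical names, and the 60-second leeway is an assumption.

```python
import time
from typing import Any, Mapping, Optional


class ClaimError(ValueError):
    """Raised when a verified token's claims do not fit this application."""


def check_claims(claims: Mapping[str, Any], tenant_id: str, client_id: str,
                 now: Optional[float] = None, leeway: int = 60) -> None:
    """Validate audience, tenant and time window of already-verified claims."""
    current = time.time() if now is None else now
    if claims.get("aud") != client_id:
        raise ClaimError("token was issued for a different application")
    if claims.get("tid") != tenant_id:
        raise ClaimError("token was issued by a different tenant")
    exp = claims.get("exp")
    if exp is None or current > exp + leeway:
        raise ClaimError("token expired")
    nbf = claims.get("nbf")
    if nbf is not None and current < nbf - leeway:
        raise ClaimError("token not yet valid")
```

The time-window checks are why a drifting system clock can surface as "Token validation failed".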

License

Internal use only - H&M QC System

Support

For issues or questions, contact the development team.

Version History

v2.2.0 - Enhanced Navigation and Export (January 2026)

  • Added Quick Navigation: "View Details" links next to error files for direct navigation
    • Automatically switches to Parsed Data View tab
    • Scrolls to and expands the specific report
    • Highlights the report briefly for easy identification
  • Added Error-Only Export: New export option to download only files with errors
    • Separate red button appears when errors are present
    • Filename includes "ERRORS_ONLY" suffix
    • Export includes warning banner to indicate filtered content
  • Improved User Experience: Streamlined workflow for reviewing and sharing error reports

v2.1.0 - CAMPAIGNS Folder Integration (December 2025)

  • Updated folder structure to support CAMPAIGNS/{CampaignNumber}/QC/ hierarchy
  • Added real-time progress indicator with status updates during search
  • Implemented automatic pagination for searching 3500+ campaign folders
  • Updated Box API integration with proper fields parameter for efficient data retrieval
  • Enhanced search performance with Box SDK iterator optimization
  • Changed folder ID to CAMPAIGNS folder (133295752718)
  • Added visual feedback: spinner, progress bar, and status messages
  • Improved error handling for missing QC subfolders

v2.0.0 - Authentication Update (November 2025)

  • Added Azure AD authentication with MSAL
  • Implemented JWT validation and httpOnly cookies
  • Protected all API endpoints
  • Added user session management
  • Updated to use port 7183 (shared with AI QC)

v1.0.0 - Initial Release

  • Basic QC report viewing functionality
  • Box.com integration
  • PDF export capability