QC Report Dashboard
A Flask-based web application for viewing and analyzing QC reports stored in Box.com. This tool aggregates HTML reports by job number and provides both parsed data views and embedded report displays.
🔐 Secured with Azure AD Authentication - Users must sign in with their Microsoft account to access the application.
Features
- 🔐 Azure AD Authentication: Secure login with Microsoft accounts using MSAL
- Job Number Search: Search for all QC reports associated with a campaign number
- 📊 Real-time Progress Indicator: Visual progress bar during campaign search
- Dual View Mode:
- Parsed Data View: Structured display of check results with filtering
- Embedded Reports: View original HTML reports inline
- Aggregated Summary: Overview of all checks across multiple files
- Quick Navigation: Click "View Details" link next to error files to jump directly to that report
- HTML Export Options:
- Export Combined Report: All reports in a single HTML file
- Export Error Reports Only: Filter to only files with errors
- PDF Export: Export combined reports as a single PDF document (requires WeasyPrint setup)
- Error Highlighting: Quickly identify files with errors
- User Session Management: httpOnly cookies with automatic logout
- Scalable Search: Efficiently searches through 3500+ campaigns with automatic pagination
Requirements
- Python 3.8+
- Box.com account with API access
- Box JWT authentication configured
- Azure AD application registration (shared with AI QC application)
- Modern web browser with JavaScript enabled
Authentication
This application uses Azure AD (Microsoft Entra ID) for authentication via MSAL (Microsoft Authentication Library).
Azure AD Configuration
Tenant ID: e519c2e6-bc6d-4fdf-8d9c-923c2f002385
Client ID: 9079054c-9620-4757-a256-23413042f1ef
Required Redirect URIs (must be registered in Azure AD):
- Development: http://localhost:7183
- Production: https://your-production-domain.com
Security Features
- ✅ httpOnly cookies - XSS attack prevention
- ✅ PKCE flow - Authorization code protection
- ✅ RS256 JWT signatures - Cryptographic token validation
- ✅ Real-time token validation - Verified against Azure AD on each request
- ✅ SameSite=Lax - CSRF protection
- ✅ Secure flag (production) - HTTPS-only cookies
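The cookie-related items above map onto keyword arguments of Flask's `Response.set_cookie()`. The following is a minimal sketch, not the code in auth_middleware.py: the helper name `session_cookie_options`, the cookie name `auth_token`, and the one-hour lifetime are assumptions made for illustration.

```python
def session_cookie_options(production: bool, max_age_seconds: int = 3600) -> dict:
    """Build keyword arguments for Flask's Response.set_cookie().

    Hypothetical helper; the one-hour default lifetime is an assumption.
    """
    return {
        "httponly": True,            # cookie is not readable from JavaScript
        "samesite": "Lax",           # cookie is withheld from most cross-site requests
        "secure": production,        # HTTPS-only when running in production
        "max_age": max_age_seconds,  # assumed session lifetime in seconds
        "path": "/",
    }


# Usage inside a Flask view, where `response` is a flask.Response:
#   response.set_cookie("auth_token", token, **session_cookie_options(production=True))
```

Keeping `secure` tied to the environment lets the same code run over plain HTTP on localhost during development.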
Installation
1. Clone the Repository
cd /path/to/web_hm_ai_qc_report
2. Create Virtual Environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
3. Install Dependencies
pip install -r requirements.txt
Dependencies include:
- Flask 3.0.0 - Web framework
- PyJWT 2.8.0 - JWT token validation
- cryptography 41.0.7 - Cryptographic operations
- requests 2.31.0 - HTTP requests for Azure AD
- boxsdk 3.9.2 - Box.com API integration
- beautifulsoup4 4.12.2 - HTML parsing
- weasyprint 60.1 - PDF generation (optional)
4. Configure Azure AD
The application is pre-configured to use the shared Azure AD app with AI QC. No additional setup needed unless creating a new Azure AD application.
5. Configure Box API
- Create a Box application at https://app.box.com/developers/console
- Configure JWT authentication
- Download the JSON config file
- Place it at config/box_config.json
6. Environment Configuration
Copy the example environment file and update it:
cp .env.example .env
Edit .env and configure:
# Box Configuration
BOX_CONFIG_PATH=config/box_config.json
BOX_REPORT_FOLDER_ID=133295752718 # CAMPAIGNS folder ID
# Flask Configuration
FLASK_APP=app.py
FLASK_ENV=development
SECRET_KEY=<generate-strong-random-key>
# Azure AD Configuration
AZURE_TENANT_ID=e519c2e6-bc6d-4fdf-8d9c-923c2f002385
AZURE_CLIENT_ID=9079054c-9620-4757-a256-23413042f1ef
# Server Configuration
HOST=0.0.0.0
PORT=7183
Generate SECRET_KEY:
python3 -c "import secrets; print(secrets.token_hex(32))"
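As an illustration of how the variables above might be read at startup, here is a small sketch using only the standard library. It is not the contents of config.py: the function name `load_settings`, the default values, and the production length check on `SECRET_KEY` are assumptions.

```python
import os


def load_settings(env=None) -> dict:
    """Read the documented variables from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    secret_key = env.get("SECRET_KEY", "")
    flask_env = env.get("FLASK_ENV", "development")

    # Assumed safeguard: refuse to start in production with a short key.
    if flask_env == "production" and len(secret_key) < 64:
        raise ValueError("SECRET_KEY must be at least 64 characters in production")

    return {
        "BOX_CONFIG_PATH": env.get("BOX_CONFIG_PATH", "config/box_config.json"),
        "BOX_REPORT_FOLDER_ID": env.get("BOX_REPORT_FOLDER_ID", ""),
        "SECRET_KEY": secret_key,
        "FLASK_ENV": flask_env,
        "AZURE_TENANT_ID": env.get("AZURE_TENANT_ID", ""),
        "AZURE_CLIENT_ID": env.get("AZURE_CLIENT_ID", ""),
        "HOST": env.get("HOST", "0.0.0.0"),
        "PORT": int(env.get("PORT", "7183")),  # port is stored as an integer
    }
```

A key produced by `secrets.token_hex(32)` is 64 hexadecimal characters long, so it passes the assumed length check.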
Running the Application
Development Mode
# Using the run script
./run.sh
# Or directly
python app.py
The application will be available at http://localhost:7183
Production Mode with Gunicorn
# Using the production run script
./run_prod.sh
# Or directly
gunicorn -c gunicorn_config.py wsgi:app
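For orientation, a Gunicorn configuration file is an ordinary Python module whose top-level names are read as settings. The sketch below shows what such a file could look like; it is not the project's actual gunicorn_config.py, and the worker formula and the 60-second timeout are assumptions (the formula follows the common "2 x cores + 1" rule of thumb).

```python
import multiprocessing
import os

# Listen on the documented host and port unless overridden in the environment.
bind = f"{os.environ.get('HOST', '0.0.0.0')}:{os.environ.get('PORT', '7183')}"

# Assumed sizing rule: two workers per CPU core, plus one.
workers = multiprocessing.cpu_count() * 2 + 1

# Assumed value: leaves headroom for Box searches that take several seconds.
timeout = 60

# Log files matching the logs/ directory in the project structure.
accesslog = "logs/access.log"
errorlog = "logs/error.log"
```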
Project Structure
web_hm_ai_qc_report/
├── app.py # Main Flask application
├── auth_middleware.py # Azure AD authentication middleware
├── jwt_validator.py # JWT token validation
├── box_client.py # Box API client
├── report_parser.py # HTML report parser
├── config.py # Application configuration
├── wsgi.py # WSGI entry point
├── gunicorn_config.py # Gunicorn configuration
├── requirements.txt # Python dependencies
├── .env # Environment variables (not in git)
├── .env.example # Example environment file
├── .gitignore # Git ignore rules
├── templates/ # HTML templates
│ ├── index.html # Search page with auth UI
│ ├── dashboard.html # Main dashboard
│ ├── pdf_export.html # PDF export template
│ ├── 404.html # 404 error page
│ └── 500.html # 500 error page
├── static/ # Static files
│ ├── css/
│ │ └── style.css # Custom styles
│ └── js/
│ └── auth.js # MSAL authentication logic
├── config/ # Configuration files
│ └── box_config.json # Box JWT config (not in git)
└── logs/ # Application logs
├── access.log
└── error.log
Usage
1. Sign In
- Navigate to http://localhost:7183
- You'll see the "Authentication Required" screen
- Click "Sign in with Microsoft"
- Azure AD popup appears - enter your Microsoft credentials
- After successful login, you'll be redirected to the dashboard
2. Search for Reports
- Enter a campaign number (e.g., 2069052, 2069053)
- Click "Search Reports"
- Progress indicator appears showing:
- "Connecting to Box..." - Authenticating with Box API
- "Searching campaigns folder..." - Looking through 3500+ campaigns
- "Processing results..." - Retrieving QC reports
- "Success!" - Found X reports, redirecting...
- Search typically takes 5-10 seconds for campaigns later in the alphabetical list
3. View Dashboard
The dashboard shows:
- Summary: Overall statistics (files checked, total checks, passed/errors/warnings)
- Files with Errors: Quick list of problematic files with "View Details" links
- Click the "View Details" link to jump directly to that report and see which checks failed
- Parsed Data View: Structured check results with filtering options
- Embedded Reports: Original HTML reports in iframes
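The summary figures listed above can be derived from per-file check counts. The sketch below shows one way to do that aggregation; the `ReportResult` shape and the `summarize` function are hypothetical and do not reflect the actual output of report_parser.py.

```python
from dataclasses import dataclass


@dataclass
class ReportResult:
    """Hypothetical per-file result: counts of checks by outcome."""
    filename: str
    passed: int
    errors: int
    warnings: int


def summarize(reports):
    """Aggregate per-file counts into the dashboard summary figures."""
    return {
        "files_checked": len(reports),
        "total_checks": sum(r.passed + r.errors + r.warnings for r in reports),
        "passed": sum(r.passed for r in reports),
        "errors": sum(r.errors for r in reports),
        "warnings": sum(r.warnings for r in reports),
        # Files listed under "Files with Errors"
        "files_with_errors": [r.filename for r in reports if r.errors > 0],
    }
```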
4. Filter Results
Use the filter buttons to view:
- All reports
- Only files with errors
- Only files that passed all checks
5. Export Reports
Two export options are available:
Export Combined Report as HTML (Green button):
- Downloads all reports in a single HTML file
- Includes all files (passed and failed)
- Can be converted to PDF using browser's Print function
Export Error Reports Only (Red button):
- Only appears when there are files with errors
- Downloads only files that have errors
- Excludes files that passed all checks
- Filename includes "ERRORS_ONLY" suffix
- Shows warning banner indicating filtered content
To convert HTML export to PDF:
- Open the downloaded HTML file in your browser
- Press Ctrl/Cmd + P to print
- Select "Save as PDF" as the destination
Note: Native PDF export requires WeasyPrint system dependencies (see Troubleshooting).
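The difference between the two HTML exports comes down to a filter and a filename suffix. A minimal sketch follows, assuming each report is a dict with `filename` and `errors` keys; that shape and the `QC_Report_` filename prefix are illustrative assumptions, and only the "ERRORS_ONLY" suffix comes from the description above.

```python
def select_for_export(reports, errors_only=False):
    """Return the reports to include in an export.

    `reports` is assumed to be a list of dicts with an integer 'errors' key.
    """
    if errors_only:
        return [r for r in reports if r["errors"] > 0]
    return list(reports)


def export_filename(job_number, errors_only=False):
    """Build a download filename; the 'QC_Report_' prefix is hypothetical."""
    suffix = "_ERRORS_ONLY" if errors_only else ""
    return f"QC_Report_{job_number}{suffix}.html"
```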
6. Sign Out
Click the "Logout" button in the navbar to sign out.
API Endpoints
Public Endpoints
- GET /health - Health check endpoint (no authentication required)
- GET /auth/status - Check authentication status
- POST /auth/login - Process Azure AD token
- POST /auth/logout - Clear authentication
Protected Endpoints (Require Authentication)
- POST /search - Search for reports by job number (JSON API)
- GET /dashboard/<job_number> - Dashboard for specific job
- GET /api/report/<file_id> - Get parsed report data (JSON)
- GET /api/report/<file_id>/raw - Get raw HTML report
- GET /export/html/<job_number> - Export combined HTML report (all files)
- GET /export/html/<job_number>/errors - Export HTML report (error files only)
- GET /export/pdf/<job_number> - Export combined PDF (requires WeasyPrint)
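As a rough illustration of calling the JSON search endpoint from a script, the sketch below builds (but does not send) a POST request with the standard library. The request body field `job_number` is an assumption, since the payload format is not documented here, and a real call would also need the authentication cookie issued after sign-in.

```python
import json
import urllib.request


def build_search_request(base_url: str, job_number: str) -> urllib.request.Request:
    """Prepare a POST /search request; the 'job_number' field name is assumed."""
    body = json.dumps({"job_number": job_number}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example: prepare a request against the development server.
request = build_search_request("http://localhost:7183", "2069052")
```

Passing the prepared request to `urllib.request.urlopen` would send it; without a valid session cookie the protected endpoint is expected to reject it.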
Box Folder Structure
The application expects reports in Box to be organized in the CAMPAIGNS folder structure:
CAMPAIGNS (133295752718)/
├── 2069052/ # Campaign number folder
│ ├── CAMPAIGN_ASSETS/
│ ├── JOBS/
│ └── QC/ # QC reports subfolder
│ ├── file1_QC.html
│ ├── file2_QC.html
│ └── file3_QC.html
└── 2069053/ # Another campaign
└── QC/
└── file4_QC.html
Search Behavior:
- Application searches through all campaign folders (3500+) to find the specified campaign number
- Automatic pagination handles large folder lists efficiently
- Progress bar provides real-time feedback during search (typically 5-10 seconds)
- Once campaign found, retrieves all HTML files from the QC subfolder
- Falls back to filename-based search if folder structure not found
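The paginated search described above can be sketched independently of the Box SDK. In this illustration, `fetch_page(offset, limit)` is a hypothetical callable standing in for whatever call box_client.py makes to list one page of campaign folders, and the page size of 1000 is an assumption.

```python
from typing import Callable, Iterator, Optional


def iter_folders(fetch_page: Callable[[int, int], list],
                 page_size: int = 1000) -> Iterator[dict]:
    """Yield folder entries page by page until a short page signals the end."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            return
        offset += page_size


def find_campaign(fetch_page: Callable[[int, int], list],
                  campaign_number: str,
                  page_size: int = 1000) -> Optional[dict]:
    """Return the folder whose name equals the campaign number, or None."""
    for folder in iter_folders(fetch_page, page_size):
        if folder["name"] == campaign_number:
            return folder  # stops early, so later pages are never requested
    return None
```

Because the generator stops as soon as a match is found, a campaign near the start of the list needs a single page request, while one near position 3507 needs four with this page size. That is consistent with later campaigns taking longer to find.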
Deployment
Prerequisites
- Azure AD Redirect URI: Add your production domain to Azure AD app registration
- HTTPS Certificate: Required for production (httpOnly cookies with Secure flag)
- Environment Variables: Update .env for production settings
Apache with mod_wsgi
- Install Apache and mod_wsgi:
sudo apt-get install apache2 libapache2-mod-wsgi-py3
- Create Apache configuration:
<VirtualHost *:443>
ServerName your-domain.com
SSLEngine on
SSLCertificateFile /path/to/cert.pem
SSLCertificateKeyFile /path/to/key.pem
WSGIDaemonProcess qc_dashboard user=www-data group=www-data threads=5
WSGIScriptAlias / /path/to/web_hm_ai_qc_report/wsgi.py
<Directory /path/to/web_hm_ai_qc_report>
WSGIProcessGroup qc_dashboard
WSGIApplicationGroup %{GLOBAL}
Require all granted
</Directory>
</VirtualHost>
Nginx with Gunicorn
- Start Gunicorn:
gunicorn -c gunicorn_config.py wsgi:app
- Configure Nginx:
server {
listen 443 ssl;
server_name your-domain.com;
ssl_certificate /path/to/cert.pem;
ssl_certificate_key /path/to/key.pem;
location / {
proxy_pass http://127.0.0.1:7183;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
location /static {
alias /path/to/web_hm_ai_qc_report/static;
}
}
Systemd Service
Create /etc/systemd/system/qc-dashboard.service:
[Unit]
Description=QC Report Dashboard
After=network.target
[Service]
User=www-data
Group=www-data
WorkingDirectory=/path/to/web_hm_ai_qc_report
Environment="PATH=/path/to/web_hm_ai_qc_report/venv/bin"
Environment="FLASK_ENV=production"
ExecStart=/path/to/web_hm_ai_qc_report/venv/bin/gunicorn -c gunicorn_config.py wsgi:app
[Install]
WantedBy=multi-user.target
Enable and start:
sudo systemctl enable qc-dashboard
sudo systemctl start qc-dashboard
Troubleshooting
Authentication Issues
"Authentication Required" screen won't go away:
- Clear browser cache and cookies
- Verify http://localhost:7183 is in Azure AD redirect URIs
- Check browser console for MSAL errors (F12)
- Ensure JavaScript is enabled
"Token validation failed":
- Check system clock (NTP sync)
- Verify tenant ID matches: e519c2e6-bc6d-4fdf-8d9c-923c2f002385
- Confirm Azure AD app is properly configured
Popup blocked:
- Allow popups for localhost:7183 in browser settings
- Try using a different browser
Box Authentication Failed
- Verify box_config.json is in the config/ directory
- Check that the Box service account has access to the report folder
- Ensure JWT authentication is properly configured in Box
No Reports Found
- Verify the campaign number exists in the CAMPAIGNS folder
- Check that campaign has a QC subfolder
- Ensure HTML files exist in CAMPAIGNS/{CampaignNumber}/QC/
- Verify BOX_REPORT_FOLDER_ID=133295752718 (CAMPAIGNS folder)
- Check Box service account has access to the CAMPAIGNS folder
Search Takes Too Long
- Normal search time: 5-10 seconds for campaigns late in alphabetical order
- Campaign 2069052 is at position ~3507, requiring pagination through multiple API calls
- Progress bar shows current status - this is expected behavior
- Earlier campaign numbers (e.g., 1001391) will be found faster
PDF Export Fails
"PDF export temporarily unavailable" error:
WeasyPrint requires system dependencies. On macOS:
# Install dependencies
brew install gobject-introspection cairo pango gdk-pixbuf libffi
# Set environment variables
export PKG_CONFIG_PATH="/opt/homebrew/opt/libffi/lib/pkgconfig"
# Reinstall weasyprint
source venv/bin/activate
pip uninstall weasyprint
pip install weasyprint
# Re-enable in app.py (remove comment from line 8)
from weasyprint import HTML
On Ubuntu/Debian:
sudo apt-get install python3-cffi python3-brotli libpango-1.0-0 libharfbuzz0b libpangoft2-1.0-0
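An alternative to commenting the import in and out by hand is to guard it, so the application starts either way and only the PDF route is affected. This is a sketch of that pattern, not the current code in app.py; the names `PDF_EXPORT_AVAILABLE` and `render_pdf` are hypothetical.

```python
try:
    # Raises ImportError if the package is missing and OSError if the
    # native libraries (Pango, Cairo, etc.) cannot be loaded.
    from weasyprint import HTML
    PDF_EXPORT_AVAILABLE = True
except (ImportError, OSError):
    HTML = None
    PDF_EXPORT_AVAILABLE = False


def render_pdf(html_text: str) -> bytes:
    """Render HTML to PDF bytes, or fail with a clear message."""
    if not PDF_EXPORT_AVAILABLE:
        raise RuntimeError("PDF export temporarily unavailable")
    return HTML(string=html_text).write_pdf()
```

With this guard, the HTML export routes keep working on machines where the WeasyPrint system dependencies are not installed.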
Performance Issues
- Increase Gunicorn workers in gunicorn_config.py
- Implement caching for frequently accessed reports
- Consider pagination for jobs with many reports
CORS Issues (Production)
If deploying to a different domain than Azure AD expects:
- Add the domain to Azure AD redirect URIs
- Update MSAL redirectUri in static/js/auth.js if needed
Security Considerations
Production Checklist
- Generate strong SECRET_KEY (64+ character random string)
- Set FLASK_ENV=production
- Enable HTTPS (required for Secure cookies)
- Add production domain to Azure AD redirect URIs
- Restrict Box API access to minimum required permissions
- Enable firewall rules to restrict server access
- Set up log monitoring and alerting
- Regular security updates for all dependencies
Authentication Flow
- User accesses application → Redirected to "Sign in with Microsoft"
- User clicks login → MSAL opens Azure AD popup
- User enters credentials → Azure AD validates
- Azure AD returns JWT token → Frontend sends to backend
- Backend validates JWT → Creates httpOnly cookie
- Subsequent requests include cookie → Validated on each request
- User clicks logout → Cookie cleared, MSAL session ended
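The "Backend validates JWT" step has two parts: verifying the RS256 signature and checking the token's claims. The sketch below covers only the second part and assumes the signature has already been verified (for example with PyJWT) so that `claims` is a trusted dict. The function name, the 60-second leeway, and the assumption that `aud` equals the client ID are illustrative and not taken from jwt_validator.py.

```python
import time


def check_claims(claims: dict, tenant_id: str, client_id: str,
                 now: float = None, leeway_seconds: int = 60):
    """Check tenant, audience, and validity window of already-verified claims.

    Returns (ok, reason). Must only be called after signature verification.
    """
    now = time.time() if now is None else now
    if claims.get("tid") != tenant_id:
        return False, "wrong tenant"
    if claims.get("aud") != client_id:
        return False, "wrong audience"
    # Leeway tolerates small clock differences between server and Azure AD.
    if claims.get("exp", 0) + leeway_seconds < now:
        return False, "token expired"
    if claims.get("nbf", 0) - leeway_seconds > now:
        return False, "token not yet valid"
    return True, "ok"
```

The time-window checks are why an unsynchronized system clock can produce "Token validation failed" errors even for a freshly issued token.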
License
Internal use only - H&M QC System
Support
For issues or questions, contact the development team.
Version History
v2.2.0 - Enhanced Navigation and Export (January 2026)
- Added Quick Navigation: "View Details" links next to error files for direct navigation
- Automatically switches to Parsed Data View tab
- Scrolls to and expands the specific report
- Highlights the report briefly for easy identification
- Added Error-Only Export: New export option to download only files with errors
- Separate red button appears when errors are present
- Filename includes "ERRORS_ONLY" suffix
- Export includes warning banner to indicate filtered content
- Improved User Experience: Streamlined workflow for reviewing and sharing error reports
v2.1.0 - CAMPAIGNS Folder Integration (December 2025)
- Updated folder structure to support CAMPAIGNS/{CampaignNumber}/QC/ hierarchy
- Added real-time progress indicator with status updates during search
- Implemented automatic pagination for searching 3500+ campaign folders
- Updated Box API integration with proper fields parameter for efficient data retrieval
- Enhanced search performance with Box SDK iterator optimization
- Changed folder ID to CAMPAIGNS folder (133295752718)
- Added visual feedback: spinner, progress bar, and status messages
- Improved error handling for missing QC subfolders
v2.0.0 - Authentication Update (November 2025)
- Added Azure AD authentication with MSAL
- Implemented JWT validation and httpOnly cookies
- Protected all API endpoints
- Added user session management
- Updated to use port 7183 (shared with AI QC)
v1.0.0 - Initial Release
- Basic QC report viewing functionality
- Box.com integration
- PDF export capability