The previous in-memory dict only worked with a single gunicorn worker.
With workers=2 in gunicorn_config.py, the async-search worker stored
the result in its own process memory, while the dashboard request
landed on the other worker ~50% of the time. The resulting cache miss
fell through to a synchronous Box fetch that exceeded the GCP load
balancer's 30s timeout, returning "stream timeout" to the user even
though the search itself had succeeded.
Cache entries are now stored as pickled files at storage/cache/<key>.pkl,
shared across workers via the existing volume mount. Writes are atomic
(tempfile + os.replace). The TTL is still 30 minutes. The public API
(cache_set/get/delete/cleanup) is unchanged, so call sites in
reporting/routes.py continue to work.
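
A minimal sketch of the file-backed cache described above, assuming keys are already filesystem-safe. CACHE_DIR is a stand-in for storage/cache, and the function bodies are illustrative rather than the project's actual code (cache_cleanup is omitted):

```python
import os
import pickle
import tempfile
import time

# Stand-in for storage/cache; the real path comes from the project's config.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "file_cache_sketch")
TTL_SECONDS = 30 * 60  # 30-minute TTL, as stated in the commit message


def _path(key):
    return os.path.join(CACHE_DIR, f"{key}.pkl")


def cache_set(key, value):
    os.makedirs(CACHE_DIR, exist_ok=True)
    # Write to a temp file in the same directory, then rename it over the
    # target, so readers in other workers never see a half-written file.
    fd, tmp = tempfile.mkstemp(dir=CACHE_DIR, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            pickle.dump({"stored_at": time.time(), "value": value}, fh)
        os.replace(tmp, _path(key))
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def cache_delete(key):
    try:
        os.unlink(_path(key))
    except FileNotFoundError:
        pass


def cache_get(key, default=None):
    try:
        with open(_path(key), "rb") as fh:
            entry = pickle.load(fh)
    except (FileNotFoundError, EOFError, pickle.UnpicklingError):
        return default
    if time.time() - entry["stored_at"] > TTL_SECONDS:
        cache_delete(key)  # expired: drop the file and report a miss
        return default
    return entry["value"]
```

The temp file is created in the same directory as the target because os.replace is only atomic within a single filesystem.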
Lifted the JWT-cookie auth pattern from the AI QC sibling project:
core/auth/middleware.py validates Azure AD JWTs and stores them in
an httpOnly cookie (hm_aiqc_auth_token). Tenant membership is
enforced by JWTValidator's tid check, which is sufficient for the
tenant-wide access policy chosen for this project.
templates/login.html now drives an MSAL.js popup that POSTs the
ID token to /auth/login. base.html exposes Azure config to all
pages so the logout button can also clear the MSAL session.
app.py's @before_request now checks the JWT cookie and exposes
g.user; modules read user identity via core.auth.current_user_email,
so usage logs and created_by columns record the signed-in
user's email rather than a session value.
Legacy username/password code removed: top-level auth_middleware.py,
jwt_validator.py, deploy/generate_password.py.
- core/health blueprint exposes GET /health for deploy smoke tests
- Replace db.create_all() + ensure_schema() ALTER patch with Alembic
- Initial migration captures current schema (5 tables, all indexes)
- docker-entrypoint runs wait_for_db.py + flask db upgrade before gunicorn
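
The contents of wait_for_db.py are not shown here. As a rough sketch, a wait step of this kind usually retries a connectivity check until it succeeds or gives up. The wait_for helper and its parameters below are hypothetical:

```python
import time


def wait_for(check, attempts=30, delay=1.0, sleep=time.sleep):
    """Call check() until it stops raising; return the number of attempts used."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            check()
            return attempt
        except Exception as exc:  # in real use, narrow to the DB driver's error type
            last_error = exc
            if attempt < attempts:
                sleep(delay)
    raise TimeoutError(f"database not reachable after {attempts} attempts") from last_error
```

In an entrypoint, check would open and close a database connection, and the script would exit non-zero on TimeoutError so that the migration and gunicorn steps do not run.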
The "Global Pricing Reference" is no longer a single file at
storage/reference/global_pricing.json. Pricing references are now
first-class DB rows (PricingReference model), uploadable as a library
in the Campaigns tab and selectable per-run alongside the campaign
presentation dropdown on the HM QC and Video QC configure pages.
New:
- core/models/pricing_reference.py — PricingReference model: id, name,
pdf_filename, pdf_path, parsed_content, parsed_data_json, status,
created_at/by. get_lookup() deserializes parsed_data_json; to_dict()
powers the dropdown API.
- /campaigns/pricing/upload — creates a PricingReference row, saves PDF
under storage/pricing_references/<id>/, kicks off background parse.
- /campaigns/pricing/<id> DELETE, /campaigns/api/pricing/list,
/campaigns/api/pricing/status/<id>.
- Campaigns index: "Pricing References" table card (mirrors the
presentations card) + upload form with optional name field.
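
To illustrate the two helper methods named above, here is a sketch that uses a plain dataclass in place of the real SQLAlchemy model. The class name, the "pending" default and the exact to_dict() keys are assumptions:

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class PricingReferenceSketch:
    id: int
    name: str
    pdf_filename: str
    status: str = "pending"  # assumed initial value; the source only names ready/error
    parsed_data_json: Optional[str] = None

    def get_lookup(self):
        """Deserialize parsed_data_json; empty dict when nothing is parsed yet."""
        if not self.parsed_data_json:
            return {}
        return json.loads(self.parsed_data_json)

    def to_dict(self):
        """Compact form suitable for a dropdown API response."""
        return {
            "id": self.id,
            "name": self.name,
            "pdf_filename": self.pdf_filename,
            "status": self.status,
        }
```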
Changed:
- pricing_parser: parse_pricing_pdf_to_dict returns (dict, raw_text);
new parse_pricing_reference(id) runs the parse against a DB row and
sets status to ready/error. Legacy file-based path removed.
- QCExecutor and VideoQCExecutor accept pricing_reference_id; load the
row into context['pricing_reference']={id, name, lookup}.
- BatchQCExecutor and BatchVideoQCExecutor thread pricing_reference_id
through to per-file executors.
- price_currency_check._validate_currency reads context instead of the
disk file; returns 'skipped_no_reference' if no ref attached.
- HM QC + Video QC /execute and /execute/batch routes pass
pricing_reference_id from the JSON payload.
- Configure templates for HM QC and Video QC add a second dropdown
"Pricing Reference (Optional)" loaded from /campaigns/api/pricing/list.
Backwards compatibility:
- app.py: on startup, if storage/reference/global_pricing.json exists
and the pricing_references table is empty, import it as a
"Default (legacy global)" PricingReference row so existing installs
keep a valid reference attached (user can pick it at configure time).
- config.py: retains GLOBAL_PRICING_{PDF,JSON}_PATH for the legacy
importer; adds PRICING_REF_STORAGE_PATH for the new per-row storage.
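
The startup import can be sketched as below, with a plain list standing in for the pricing_references table. The function name and row fields are illustrative only:

```python
import json
import os


def import_legacy_reference(json_path, existing_rows):
    """Import the legacy JSON as one row, but only if the table is empty."""
    if existing_rows or not os.path.exists(json_path):
        return None  # already migrated, or nothing to migrate
    with open(json_path, encoding="utf-8") as fh:
        data = json.load(fh)
    row = {
        "name": "Default (legacy global)",
        "parsed_data_json": json.dumps(data),
        "status": "ready",
    }
    existing_rows.append(row)
    return row
```

Checking for an empty table first makes the import idempotent across restarts.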
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Video QC: Switch to Google Gemini direct video analysis as default (OpenAI frame grid fallback)
- HM QC: Group reports by batch with collapsible sections, ZIP download per batch
- HM QC: Generate asset thumbnails (150px) displayed in report listings
- Speed: Remove artificial delays, add ThreadPoolExecutor(2) for parallel batch processing
- Price detection: Improved prompt with country context, detect all prices, increased text limit
- New Printer Check module: CSV-to-PDF cross-referencing ported from CrossMatch Rust app
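
A minimal sketch of the two-worker batch pattern mentioned above. The names run_batch and check_one are hypothetical, and the real executors do more per file:

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(files, check_one, max_workers=2):
    """Run check_one over files with a small thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of `files`, not completion order.
        return list(pool.map(check_one, files))
```

Threads suit this workload if each check mostly waits on network calls to an LLM API, which is an assumption about the per-file work.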
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces a new Campaigns module for uploading campaign presentation PDFs
that QC checks reference to validate assets against campaign-specific
guidelines (typography, layout, copy, pricing format). Also adds a global
pricing reference system that maps country codes to currency symbols and
formats for deterministic price/currency validation.
- New CampaignPresentation model + campaigns blueprint with CRUD routes
- PDF parsing via LlamaParse (text + multimodal page images)
- Global pricing PDF parsed into structured JSON lookup
- Campaign context injected into both image and video QC executors
- Quality checks enhanced with campaign guidelines in LLM prompts
- Price/currency check uses global pricing lookup (saves an LLM call)
- Campaign dropdown added to HM QC and Video QC configure pages
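
As an illustration of the deterministic price/currency check, the sketch below assumes the lookup maps a country code to a dict with a "symbol" key. That shape, the function name and the status strings are assumptions rather than the project's actual schema:

```python
def check_price_currency(lookup, country_code, price_text):
    """Compare the currency in a detected price string with the reference lookup."""
    expected = lookup.get(country_code)
    if expected is None:
        return "unknown_country"
    # Plain string comparison against the reference: no LLM call needed.
    return "pass" if expected["symbol"] in price_text else "fail"
```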
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New UsageLog model tracking every LLM API call (provider, model,
tokens, estimated cost, user, module, check name)
- Instrument LLMConfig.call_vision_api() to auto-log each call
- New /usage tab in nav bar with dashboard showing:
  - Summary cards (total calls, tokens, estimated cost)
  - Breakdowns by provider, model, tool, and user
  - Recent API calls table
  - Time filters (All Time, 30 Days, 7 Days, Today)
- Cost estimates based on per-model token pricing
- Pass logged-in user through executor context for tracking
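
Per-model cost estimation can be sketched as follows. The price table holds made-up placeholder values per million tokens, not real provider rates, and estimate_cost is a hypothetical name:

```python
# Placeholder (input, output) prices in USD per million tokens.
PRICES_PER_MILLION = {
    "example-model": (1.0, 3.0),
}


def estimate_cost(model, input_tokens, output_tokens, prices=PRICES_PER_MILLION):
    """Return the estimated cost in USD; unknown models cost 0.0."""
    in_price, out_price = prices.get(model, (0.0, 0.0))
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price
```

Returning 0.0 for an unknown model keeps logging from failing, at the cost of under-reporting spend until the table is updated.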
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The require_auth decorator was never applied to routes, leaving
the entire app publicly accessible. Added a before_request hook
that redirects unauthenticated users to the login page.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add Dockerfile, docker-compose.yml, .dockerignore for containerised deployment
- Add deploy/ scripts (deploy.sh, nginx/apache configs, password generator)
- Replace MSAL/Azure AD auth with local username/password authentication
- Add login.html template
- Simplify app.py, middleware, and auth routes for production use
- Update gunicorn_config.py and wsgi.py for Docker/production
- Update templates to work with new auth and URL prefix handling
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The popup login flow was broken because the Flask 302 redirect from
/ to /reporting/index caused MSAL in the popup to consume the auth
code hash before the parent window could detect it, leaving the
parent stuck on "Authenticating..." while the popup rendered the
full app.
- Switch signIn() from loginPopup() to loginRedirect()
- Add handleRedirectPromise() at start of initAuth() to process
the auth code on page load after returning from Microsoft
- Change root route from 302 redirect to direct template render
so the #code=... hash fragment is preserved for MSAL
- Switch signOut() from logoutPopup() to clearCache()
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New blueprint-based module system (hm_qc, video_qc, video_master,
reporting), core framework (database, config, templates), and
unified web interface with progress tracking and tab navigation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>