ai_qc

Author	SHA1	Message	Date
nickviljoen	71bb9a6295	fix(hp_copy_review): correct llm casing + route HP reports to /hp/ folder Two bugs surfaced by the first dev smoke test: 1. Profile JSON declared "llm": "gemini" (lowercase). llm_config's dispatcher compares model_name == "Gemini" case-sensitively (matches the rest of the codebase), so the check fell through to "Invalid model selected" and never reached the API. Every other profile uses "Gemini" with capital G. Spec mistake — fixed. 2. get_client_from_profile() resolves the per-report output folder from the profile_id via hardcoded prefix matches. No 'hp_' branch existed, so hp_copy_review reports landed under output-dev/general/ instead of output-dev/hp/ — the UI then couldn't find them. Added 'hp_' → 'hp' alongside the existing mappings. The check itself works correctly otherwise: profile_source was user_selected, brand resolved to 'hp', and the reference asset was successfully attached. Bug 1 just prevented Gemini from being called. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 22:07:25 +02:00
nickviljoen	68a2360811	feat(report): render hp_copy_review findings as a structured table Both HTML report generators (generate_html_content and generate_comprehensive_html_report) get a small special case: when a check result has a 'findings' array in its json_data, render it as a priority-coloured table with quote/issue/suggested-fix/source columns instead of the default response-text block. The summary field (when present) renders above the table. Falls back to text rendering when findings is absent, so every existing check is unaffected. All string fields from the LLM are HTML-escaped via html.escape() to neutralise stray <, >, &, or quote characters. Inline CSS for .findings-table / .priority-pill / .priority-high|medium|low / .muted is added to both stylesheets so the two generators stay visually in sync.	2026-05-17 21:37:35 +02:00
nickviljoen	0e833447c0	fix(brand-guidelines): inject xlsx Source Messaging summary into check prompts Task 5 review found that get_reference_asset_content treated all non-localization-matrix .xlsx files as opaque ('reference file uploaded'), never reading the Gemini summary that excel_processor writes. That meant hp_copy_review would see no canonical messaging and fire its score-0 fallback on every real asset. Extend the .xlsx branch to mirror the PDF pattern: when the file record has a summary_path (set by excel_processor after a successful source-messaging summary), read and inject the Markdown into the reference content block. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:28:32 +02:00
nickviljoen	568465f9be	fix(brand-guidelines): preserve localization-matrix parsing in xlsx dispatch The prior Task 2 commit (`295305e`) over-replaced existing logic that recognised certain .xlsx/.xls uploads as localization matrices and set asset_type='localization_matrix'. That field is load-bearing in two downstream sites (api_server.py:1628 and :1986) that build localization context for QC checks; destroying it would silently break any existing client using localization matrices. Restore the original try-localization-matrix-first path; only fall through to excel_processor (HP Source Messaging summary) when the file isn't a parseable localization matrix. Also restore .xls support and tag Source Messaging uploads as asset_type='source_messaging' so downstream code can distinguish them from localization matrices. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:03:56 +02:00
nickviljoen	295305ef2d	feat(brand-guidelines): route .xlsx uploads to excel_processor The /api/brand_guidelines POST handler now dispatches by extension: .pdf → pdf_processor.process_pdf_file (existing), .xlsx → excel_processor.process_excel_file (new). Same DB record shape; cover image is null for Excel since there's no first-page analogue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:02:05 +02:00
nickviljoen	bf89466d06	feat(settings): default-profile UI per client (admin-only) for Box webhook flow Adds a "Default Profile" sub-tab to the Settings modal. Lists the current client's profiles as radio buttons, shows which is the active default and whether it's a runtime override or the static value from client_config.py. Admins click a different profile + Set to override; clear-override button reverts to the static value. Storage layer: backend/client_defaults.json (gitignored, per-server), following the same pattern as user_access.json. Resolution order in client_config.get_default_profile(): override → static default_profile field → None. The Box webhook handler is the sole consumer that needs profile selection without a logged-in user; it now reads via get_default_profile() so overrides take effect. Why a separate JSON, not rewriting client_config.py: a buggy override write can never break server boot — worst case the override is ignored and the static value applies. Cleaner separation between "static config you check in" and "runtime overrides admins make". Backend: - client_config.get_default_profile(client_id) — resolver - client_config.set_default_profile(client_id, profile_id) — validates + writes (rejects profiles not in client's profile list) - client_config.clear_default_profile_override(client_id) - GET /api/clients/<id>/default_profile (any auth'd user) - PUT /api/clients/<id>/default_profile (admin-only, _require_admin) - DELETE /api/clients/<id>/default_profile (admin-only) - Box webhook handler in api_server.py now uses get_default_profile() Frontend: - New "Default Profile" tab button + tab content in Settings modal - showTab hook loads settings when tab activates - loadDefaultProfileSettings / saveDefaultProfile / clearDefaultProfileOverride functions - DOM-construction (createElement + textContent) used throughout — no innerHTML with interpolated values, so user-controllable strings (client_id, profile_id) can never cause XSS Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:50:20 +02:00
nickviljoen	b7e9c483de	feat(box-jwt): move source file to _PROCESSED after successful run Solves two problems at once: 1. Folder cleanliness — INCOMING accumulates indefinitely otherwise. 2. Duplicate-upload re-trigger — Box V2's FILE.UPLOADED trigger doesn't fire when the same filename is "uploaded as new version" of an existing file. By moving the source out of INCOMING after success, re-uploading the same filename becomes a genuinely-new file event again and the webhook fires normally. After report uploads successfully to the REPORTS folder, the worker: 1. find_or_create_subfolder(<INCOMING>, '_PROCESSED') — idempotent 2. move_file(file_id, <_PROCESSED>, new_name=f'{session_id}_{filename}') The session_id prefix gives the archived file a sortable timestamp and ties it back to the matching QC_Report_<session_id>_*.html in REPORTS. Defensive: the move only runs if the report upload to Box succeeded. If Box delivery failed, the source stays in INCOMING so a retry just means re-uploading. Move failures are non-fatal — logged + recorded in result_data['box_source_move_error'], analysis still marked complete. Adds four helpers to box_jwt_client.py: - find_subfolder_by_name(parent, name) → Optional[str] - create_subfolder(parent, name) → str - find_or_create_subfolder(parent, name) → str (idempotent) - move_file(file_id, target_folder, new_name=None) → Dict Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:29:45 +02:00
nickviljoen	c75f3a99b9	fix(reports): render check details for status='success' in generate_comprehensive_html_report generate_comprehensive_html_report filtered check rendering with `status == 'completed'`, but the modern check pipeline (process_single_check via /api/start_analysis and the Phase 4 Box webhook flow) returns `status == 'success'`. Only the legacy process_single_check_with_triage returns 'completed'. Result: every report produced by the modern pipeline had an empty "Detailed Analysis Results" section — just the heading with nothing below it. Surfaced when Nick ran a LOREAL Box-webhook test on 2026-05-17: webhook fired correctly, 4 LLM checks ran, scores came back, technical pre-flight rendered, but the per-check accordion was empty. Fix: accept either status value, so both modern and legacy code paths render correctly. Errored checks (status='error') still skipped. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:01:21 +02:00
nickviljoen	a99c8601f0	Merge develop into feature/box-jwt-integration Brings in the 4 commits that landed on develop after this branch was cut: the chore/untrack-env-files PR (#7) and the fix/tech-section-in-html-content PR (#8). Conflict resolution: - .gitignore: both branches added `backend/config/box_jwt_config.json` in slightly different positions. Kept both sets of additions — development.env + production.env (from develop) and box_jwt_config.json (from this branch). - api_server.py: auto-merged cleanly; the Phase 4 webhook endpoint and the Phase 3 technical-section fix touch different regions of the file. Verified after merge: api_server imports cleanly, box_webhook route registered, _render_technical_section_html callable, 60 QC apps and 15 profiles load. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 23:42:00 +02:00
nickviljoen	096eba747d	fix(tech-check): also render Technical section in generate_html_content Phase 3 patched generate_comprehensive_html_report() but missed the older generate_html_content() generator. The /api/start_analysis flow with output_mode='html' (the path the web UI's download button actually triggers) routes through generate_html_content, so the Technical Details section never appeared in user-downloaded reports despite the technical_report data being present in the underlying result_data. Mirrors the Phase 3 treatment exactly: pre-builds technical_html via _render_technical_section_html(), adds the .technical / .technical-grid / .tech-row CSS rules, and injects {technical_html} between the summary block and the Detailed Analysis Results header. generate_comprehensive_html_report() retains the same logic for the /api/process_file path (line 4187) and the new Box webhook flow (_run_box_triggered_analysis on the Phase 4 branch). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 23:28:52 +02:00
nickviljoen	8f995d557b	feat(box-jwt): JWT service-account client + webhook ingestion endpoint Adds machine-to-machine Box integration alongside the existing per-user OAuth scaffolding. The new JWT client (backend/box_jwt_client.py) is the auth/file/webhook surface used for unattended workflows: load the Custom App JSON config, sign a JWT assertion, exchange for a 60-minute service-account access token (cached + refreshed automatically), and expose file download/upload + V2 webhook CRUD + HMAC signature verification. Wires a new POST /api/box/webhook endpoint (NOT @auth.require_auth — it authenticates each delivery via Box's HMAC signature headers) that: 1. Verifies the signature against env-configured signing keys (BOX_WEBHOOK_PRIMARY_KEY / BOX_WEBHOOK_SECONDARY_KEY). 2. Dedups deliveries by box-delivery-id with a bounded in-memory cache. 3. Maps the source folder to a client via a new get_client_by_box_folder() helper on client_config. 4. Spawns a background thread that downloads the file, runs the same technical pre-flight + LLM check pipeline as the user-uploaded path, writes the HTML report to output/<client>/, uploads the report back to the client's box_reports_folder_id, and logs the run with a synthetic 'box_webhook' user. Webhook runs skip media-plan / localization / OCR context — those are user-UI concepts without a meaningful source in unattended runs. The existing /api/start_analysis path is unchanged. client_config.py gains three optional per-client fields used by the new flow when present: `box_folder_id`, `box_reports_folder_id`, and `default_profile`. Existing client entries keep working without them. .gitignore now excludes backend/config/box_jwt_config.json so the JWT config (with its embedded private key + passphrase) never lands in git. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 22:51:34 +02:00
nickviljoen	377efe30e5	feat(tech-check): show Technical Details section in HTML report Adds a new "Technical Details" card to generate_comprehensive_html_report() between the summary and the per-check detailed results. Renders only the fields present on the technical_report dict (file size, dimensions, DPI, page count, duration, fonts, etc. — vary by file type) and shows a prominent filename-vs-actual match badge when filename hints were parsed. If technical_report is absent or kind==unknown, the section is omitted entirely so reports for assets we can't inspect (e.g. exotic extensions) keep the existing layout unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 22:00:25 +02:00
nickviljoen	2b287f3dbb	feat(tech-check): wire pre-flight into visual + document analysis Runs technical_check.inspect() immediately after file save on both /api/start_analysis (visual flow) and /api/document/start_analysis (document flow). The report is stashed on progress_tracker[session_id] so it survives across the background thread boundary, then surfaces two ways: 1. Each LLM check in the visual flow gets a "Technical metadata" preamble prepended to its prompt via format_for_llm_prompt(), so the model knows the file's actual dimensions, format, page count, etc. without having to infer them visually. 2. result_data['technical_report'] in both flows carries the same dict through to the frontend for UI rendering (next commit). Pre-flight is best-effort: if it fails for any reason, analysis still proceeds without the preamble (silent except for the report.errors list). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 21:57:11 +02:00
nickviljoen	90563b8cf2	Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5) Multi-page PDF QC for AXA Ireland policy documents. Runs as a third mode alongside static + video, gated on profile.mode. New code isolated under backend/document_mode/ with new endpoints under /api/document/*. Phase 1 — Spine + 6 deterministic doc-scope checks ($0, runs in seconds): - Scope-aware dispatcher (document/targeted/page_sample/page_pair/page_each) - axa_font_inventory, axa_phone_inventory, axa_bold_words_definitions, axa_page_numbering, axa_print_code, axa_omg_versioning - Bootstrap bold-words dictionary extracted from Example 1 General Definitions Phase 3 — Old-vs-new diff (~$0.50/run, 3-5 min): - Page alignment via difflib SequenceMatcher (windowed fuzzy match) - Vision-LLM page-pair diff via Gemini 2.5 Pro (8 concurrent) - Two-slot upload UX, axa_policy_document_diff profile, mode=document_diff Phase 4 — PDF accessibility (PyMuPDF, $0): - 9 PDF/UA-1 aligned criteria (tagged structure, /MarkInfo, title, /Lang, encryption, font embedding, PDF version, XMP UA-conformance, alt-text) - _run_verapdf() stub for optional Java-based veraPDF integration later Phase 5 — Print preflight (PyMuPDF, $0): - 7 criteria (page geometry, bleed, image colour spaces, image DPI, transparency, PDF/X conformance, spot colours) Profile additions: - axa_policy_document — 8 deterministic checks, $0 cost - axa_policy_document_diff — 1 page-pair LLM check, ~$0.50/run API additions: - POST /api/document/start_analysis (single PDF) - POST /api/document/start_diff (old + new PDFs) Frontend additions: - Third profile.mode value (document_diff) in applyProfileMode() - Two-slot upload UX with PDF-only file pickers - checkFormValidity() branches by mode for the analyse-button gate Smoke-tested locally against Example 1 (Home Insurance V8, 86pp) and Example 2 (Landlord V1 vs V10, 68→74pp) with real findings caught including bold-words gaps, missing PDF/UA flag, transparency on press, V1→V10 bold-formatting fixes. Plan + integration map + gotchas in backend/AXA_DOCUMENT_MODE_PLAN.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 18:38:14 +02:00
nickviljoen	b32e8f0c8b	Add WSJ podcast profile to Dow Jones client; add file-naming check to all profiles	2026-04-29 18:09:58 +02:00
nickviljoen	24ea62b082	Fix /api/access_request iterating list_access_entries() as a list list_access_entries() returns a dict {default_clients, entries} but the endpoint iterated it directly, which yields the dict keys (strings) and then crashed on .get('is_admin') with "'str' object has no attribute 'get'". Read access_data['entries'] instead so admin recipients are collected correctly and the request email actually sends. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 08:20:25 +02:00
nickviljoen	f17a4ed6da	Box redirect URI: infer from hostname when X-Forwarded-Host is absent The previous fix relied on Apache forwarding X-Forwarded-Host, but on optical-dev that header isn't set. Apache uses ProxyPreserveHost (so request.host correctly resolves to optical-dev.oliver.solutions) but the backend connection is plain http and Flask sees no path prefix, so the fallback emitted "http://optical-dev.oliver.solutions/auth/box/callback" — which Box rejected as "insecure_redirect_uri" (no HTTPS) and which is also missing the required /ai_qc/ prefix. Resolution order is now: 1. BOX_REDIRECT_URI env var (escape hatch / unusual deploys). 2. X-Forwarded-Host header if Apache happens to send it. 3. Otherwise: infer from request.host. Any host that isn't localhost or 127.0.0.1 is treated as the optical-dev / optical-prod proxy and gets HTTPS + the /ai_qc/ prefix. localhost stays http and rootless. Verified all five paths (dev with and without XF-Host, laptop on localhost and 127.0.0.1, explicit override) produce the right URL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 15:55:14 +02:00
nickviljoen	7c3945417a	Compute Box OAuth redirect URI from the request Caught a redirect_uri_mismatch on the dev server: the env file was the localhost one (BOX_REDIRECT_URI=http://localhost:7183/auth/box/callback) which deploy.sh resets on every deploy, so the dev server kept telling Box "redirect me to localhost". Same thing would have hit prod. Switched to request-based detection so the same code works on laptop, dev, and prod: - box_client.build_authorize_url and exchange_code_for_tokens now take redirect_uri as an explicit parameter (the two URIs MUST match — Box rejects the token exchange otherwise). - New _box_redirect_uri() helper in api_server: prefers BOX_REDIRECT_URI if explicitly set (escape hatch), otherwise reads X-Forwarded-Host (set by Apache when behind the optical-dev / optical-prod reverse proxy, where the app is mounted at /ai_qc/), and falls back to request.host for direct local access. - Dropped the per-env BOX_REDIRECT_URI from the four env files. Templates keep it commented out as documentation, and now also list all three redirect URIs you'll need to register in the Box developer console. - box_client.is_configured() no longer gates on the redirect URI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 15:50:59 +02:00
nickviljoen	c4e18fcd99	PR1: Box.com OAuth + token storage First slice of the Box automation work. Adds the OAuth round-trip and a smoke-test endpoint, but no automation logic or watcher yet — those land in PR2 and PR3. - New `backend/box_client.py`: OAuth helpers (build_authorize_url, exchange_code_for_tokens, refresh_tokens, revoke_tokens), JWT-signed state for CSRF protection, get_box_user, get_valid_access_token (refreshes if expired and persists the rotated refresh token Box returns on every refresh), and a list_folder_items helper used by the smoke test. - New `backend/box_tokens.py`: thread-safe JSON-backed per-user token store at backend/box_tokens.json (gitignored — refresh tokens grant long-lived Box access). Persists access_token, refresh_token, computed access_token_expires_at, and the connected Box identity (id / login / name). - New endpoints in `backend/api_server.py`: - `GET /auth/box/login` — auth-required, redirects the signed-in user to Box's authorize URL with a JWT-signed state. - `GET /auth/box/callback` — verifies the state, exchanges the code, fetches /users/me, persists the tokens, and returns a small self-closing HTML page (closes the popup if opened from one). - `GET /api/box/status` — auth-required, returns {connected, configured, box_user_login, …} for the current user. - `POST /api/box/disconnect` — auth-required, best-effort revoke at Box and clear the local tokens. - `GET /api/box/test_folder?folder_id=…` — auth-required smoke test that lists a Box folder using the user's stored tokens. Default folder_id is "0" (the user's All Files root). Used to prove the OAuth round-trip works end-to-end before PR3 wires the watcher. - Box config in env (`BOX_CLIENT_ID` / `BOX_CLIENT_SECRET` / `BOX_REDIRECT_URI`) added to all four env files and both .env.template files (placeholders). Box rotates refresh tokens — every successful refresh returns a new pair and invalidates the previous one. `get_valid_access_token()` always writes the new pair back via `box_tokens.save_tokens()`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 15:39:27 +02:00
nickviljoen	125c5e7064	Simplify settings UX and add client access request flow Settings panel: - Reference Assets tab: collapse the Brand Name + Tags + Description form to a single Name field; the user-entered name now drives the dropdown label on the main configuration page (falls back to filename for legacy records). - Media Plan tab: add a Name field. Backend stores display_name on the plan record, and both the active-plan card and the main-page dropdown prefer display_name (falling back to original_filename for old plans). - Modal footer is now context-aware: Save Profile + Cancel show only on the Profile / Create Profile tabs; Reference Assets / QC Tools / Media Plan show a single green Save button that closes the modal. Client access request: - New "Request Client Access" tile on the client picker, alongside the user's existing client tiles. Opens a modal that auto-fills name + email from the MSAL session (read-only), shows checkboxes for clients the user does not already have, and accepts an optional reason. - New POST /api/access_request endpoint (auth-required) that takes identity from the verified session, validates the requested clients, looks up admin recipients via user_access.list_access_entries, and emails them via the new email_service module (Mailgun SMTP with STARTTLS). Reply-To is set to the requester. Logs an access_request event to the daily JSONL usage logs. - New GET /api/all_clients endpoint so the form can list clients the requester currently cannot see. - Mailgun SMTP credentials added to the four env files (and placeholders in the .env.template files) under SMTP_SERVER / SMTP_PORT / SMTP_USER / SMTP_PASSWORD / SENDER_EMAIL / ERROR_EMAIL / REPORT_EMAILS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 14:02:40 +02:00
nickviljoen	f85ba2069f	Refresh pricing and surface input/output tokens in reporting - Update Gemini 2.5 Pro output pricing from $5 to $10 per 1M tokens (verified against ai.google.dev on 2026-04-22); OpenAI GPT-4o unchanged. - Extend /api/client_usage_stats and /api/admin/users to return input tokens, output tokens, and per-provider cost breakdown. - Surface the new data in the client Reporting tab and admin users table, with K/M token formatting. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 20:59:41 +02:00
nickviljoen	6592c38b0a	Add per-user client access control and admin management Default-deny access model with admin grant/revoke via new User Access tab. /api/clients filters by user grants; client-scoped endpoints enforce access server-side. Admin role and client grants persist in user_access.json with audit trail in usage logs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 12:33:09 +02:00
nickviljoen	20259dcad0	Add Honda client, video QC, session refresh, Amazon check tuning - Add Honda client with static_general and video_general profiles - Add video QC capability using Gemini native video analysis (4 checks: visual_quality, brand_consistency, text_legibility, pacing_flow) - Add video_general profile assigned to all 8 clients - Extend session lifetime with MSAL silent token refresh (proactive every 45min + reactive on expiry), switch cache to localStorage - Re-enable OCR layout measurements for Amazon checks - Add scope boundary notes to all 6 Amazon checks to prevent cross- check penalization (locale errors isolated to logo_country only) - Relax margins left-alignment tolerance from 1% to 4% to account for logo lockup internal padding - Update brand guidelines DB with Amazon localization matrix and processed Dove PDF summary Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 14:53:52 +02:00
nickviljoen	0e211a3600	Fix client detection for Dow Jones, WSJ, MarketWatch and Boots profiles Reports for WSJ/MarketWatch/Dow Jones profiles were falling through to 'general' because get_client_from_profile() and the inline fallback mapping only handled loreal, diageo, unilever, and amazon. Added mappings for dow_jones/dj_/marketwatch/mw_/wsj_ -> dow_jones and boots_ -> boots in both locations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 13:20:09 +02:00
nickviljoen	6f3528b54f	Add Boots client QC profile with 5 compliance checks and split CLAUDE.md client docs New boots_static profile (5 checks, 2.0 weight each) for retail promotional artwork compliance: caveat rules, brand name accuracy (~170 names), offer mechanics, T&C wording, and currency/locale. Strict grading override (any check <6 = Fail). Guidelines embedded from 7 thematic guidance documents. Also splits client-specific documentation out of CLAUDE.md into separate CLAUDE_LOREAL.md, CLAUDE_AMAZON.md, CLAUDE_BOOTS.md, and CLAUDE_DOW_JONES.md files to reduce main file size. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 09:25:58 +02:00
nickviljoen	d5acdd962b	Fix long filename overflow in file queue, reports, and consolidated report - web_ui: add overflow:hidden and min-width:0 to .form-section to prevent long filenames breaking the CSS grid layout - web_ui: add overflow-x:hidden to queue list container - api_server: add word-break:break-all to .filename in individual reports - api_server: add table-layout:fixed and word-break to consolidated report table cells for proper text wrapping Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:09:06 +02:00
nickviljoen	eae86a27e9	Fix consolidated report using wrong grade: read from report data not recalculated The consolidated report was recalculating grades with determine_grade() which doesn't account for profile-specific grading overrides (e.g. Amazon checks where individual check failures force overall Fail). Now reads the actual grade stored in the report's summary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 14:13:41 +02:00
nickviljoen	c06b4d5848	Temporarily disable OCR measurements for demo stability OCR calibration needs more work - correct files were failing due to Tesseract bounding box inaccuracies on different server versions. Code is commented out and ready to re-enable after proper tuning. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 14:03:32 +02:00
nickviljoen	2856e175c3	Fix consolidated report: show human-readable explanations instead of raw JSON The Issue Summary column was showing raw LLM JSON output. Now extracts the explanation and recommendations from parsed json_data. Also uses display_name for check names (e.g. "Amazon Margins" not "Amazon_margins"). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 13:28:09 +02:00
nickviljoen	4d456f45e5	Add delete functionality for saved output reports - New API endpoint POST /api/delete_output_files (auth required) - Delete Selected button in saved files controls bar - Confirmation prompt before deletion - Auto-refreshes file list after successful delete - Path traversal protection via os.path.basename sanitization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 12:24:28 +02:00
nickviljoen	7b09897869	Fix OCR import error in process_single_check causing 500 on analysis Wrap OCR_RELEVANT_CHECKS import in try/except so analysis continues gracefully if ocr_measurement module fails to load. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 11:15:04 +02:00
nickviljoen	9f9777240a	Add OCR layout measurement module for precise spatial QC checks Adds Tesseract-based OCR pre-processing that computes pixel-level text positions, margins, spacing, and alignment before LLM analysis. This enables detection of subtle layout differences that vision models miss (e.g. 2.8% vs 6.4% headline margin, 83px vs 39px date gap). OCR measurements injected into 10 checks across all client profiles: - Amazon: margins, typography, headline_layout - Static General: element_alignment, safety_area, visual_hierarchy_general, text_readability_general, text_edge_clearance - L'Oreal: text_readability - Diageo/Unilever KV: visual_hierarchy Non-blocking: if Tesseract is unavailable, checks run with visual estimation only. Production requires: sudo apt install tesseract-ocr Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 11:00:07 +02:00
nickviljoen	d80a9fc9cf	Add localization matrix support via reference asset upload - New localization_processor.py: parses Excel localization matrices with MESSAGE A/B sections, extracting expected headline, dates, logo, legal per country - Excel files uploaded as reference assets are auto-detected and parsed as localization matrices if they contain MESSAGE A/B structure - During analysis, cross-references media plan creative_name (Message A/B) and country with parsed matrix to inject expected copy into QC prompts - LLM checks can now verify asset text matches the correct message version and market localization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 17:54:24 +02:00
nickviljoen	34a03be6cb	Amazon prompt tuning, report details for passing checks, and queue fix - Fix process queue bug: reset queue items to pending after completion so reprocessing works without page refresh - Show report details for 10/10 scores by extracting explanation and elements_found fields from Amazon check JSON responses, with a fallback that renders all JSON data as a structured summary - Add Amazon Static grade override: any individual check scoring below 6 forces overall grade to Fail (same logic as L'Oreal) - Box placement prompt: relax tape visibility rule from "all edges" to "at least one edge visible", prevent false positives on landscape formats where box sits near right edge - Headline layout prompt: fix LLM misreading one-word-per-line as combined lines in tall/portrait formats, score multi-sentence headlines in 9:16 as 6/10 (pass with recommendation) instead of fail Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-30 16:19:42 +02:00
nickviljoen	ff81d5aa38	Fix report save crash when recommended_adjustments is a list The background_contrast check now returns recommended_adjustments as a list instead of a string. The HTML generation called .lower() on it, causing an AttributeError that silently prevented reports from being saved to disk. Analysis completed fine but no output file was written. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-30 13:46:25 +02:00
nickviljoen	1c9c6be5ae	Cap overall score at 100 (120 for Unilever) across all code paths Profiles with weights totalling slightly over 10.0 (e.g. 6 x 1.67 = 10.02) could produce scores like 100.2. Added min() caps to all 4 score calculation locations to prevent scores exceeding the maximum. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-30 12:00:57 +02:00
nickviljoen	dd9ed44ad1	Fix saved files button alignment and add media plan selector to main page Long filenames no longer push View/Download buttons out of line. Added Media Plan dropdown in QC Configuration so users can opt-in to using a media plan per analysis. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-30 08:07:20 +02:00
nickviljoen	5429e4c684	Add media plan upload with automatic asset matching and validation Upload Excel media plans per client. On QC analysis, automatically match the uploaded file's name against the media plan's Asset IDs to validate dimensions and file type, and include the media plan context (country, language, placement, vendor) in QC check prompts. - New backend/media_plan_processor.py: Excel parsing, fuzzy filename matching, dimension/file-type validation, prompt context builder - New backend/media_plans/ directory for storage - API endpoints: POST/GET/DELETE /api/media_plan - Settings modal: new "Media Plan" tab for upload/manage - Analysis flow: auto-match + validation in response + context in prompts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 22:30:15 +02:00
nickviljoen	3f4cc149ad	Process multi-page PDF reference assets with LLM summarization PDF brand guidelines were previously ignored - QC checks received no content from uploaded PDFs. Now on upload, all pages are text-extracted, summarized by Gemini into a structured brand guidelines summary, and a cover image is extracted. QC checks receive the full summary in their prompt and the cover image as visual reference. - New backend/pdf_processor.py: text extraction, cover image, LLM summary - brand_guidelines_db.py: summary/cover path tracking, cleanup on delete - api_server.py: background processing on upload, summary-aware content retrieval, PDF cover image support, status/reprocess endpoints, startup backfill for existing unprocessed PDFs - web_ui.html: processing status badges and upload feedback for PDFs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 22:02:47 +02:00
nickviljoen	f3b09e2a09	Fix saved output files showing across all clients - Auto-save reports to client-specific subfolder (was saving to root) - Read client_id from form data (matching what frontend sends) - Add Amazon to profile-based client detection fallback - Fix get_client_from_profile matching 'static' as L'Oreal Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 13:10:52 +02:00
nickviljoen	06a6562a92	Admin as full page, track user logins, fix reporting bugs - Convert admin panel from popup modal to dedicated full page section - Log user_login events on auth status check to track all visitors - Admin user list now shows all users (login visitors + analysts) - Fix cost field name (total_cost_usd vs estimated_cost_usd fallback) - Round scores to 2 decimal places in reporting tables - Reset reporting dashboard when switching clients - Clear stale data on client switch before loading new client Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 13:03:52 +02:00
nickviljoen	8a82df4438	Add client-scoped reporting dashboard and admin panel Move reporting from settings modal into a dedicated tab within each client's main view with date range filtering, usage stats, and cost tracking. Add admin panel with platform-wide user activity overview, accessible only to configured admin users. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 12:34:20 +02:00
nickviljoen	b84b749ffe	L'Oreal Static: fail asset if any individual check fails For the loreal_static profile only, override the grade to Fail if any single check scores below 6, regardless of the overall weighted score. Both checks must pass for the asset to pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 19:56:35 +02:00
nickviljoen	8c9401bf3c	Add combined check, client-scoped assets, UI restructure, reporting, and report consolidation - Create visual_readability_contrast combined check merging text readability and background contrast into a single LLM call for L'Oreal Static profile - Update loreal_static.json to use combined check (2 checks, 100-point scale) - Add client_id filtering to brand guidelines (upload, fetch, backfill migration) - Restructure settings modal from 5 tabs to 4: Profile, Create New Profile, Reference Assets, Reporting (removed Model Selection, merged Tools into Profile) - Add GET /api/profile_usage_stats endpoint with summary cards and recent analyses - Add POST /api/consolidate_reports endpoint generating HTML summary with pass/fail highlighting from multiple selected reports - Add report selection checkboxes and consolidation controls to saved files list Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-04 19:10:32 +02:00
nickviljoen	8bc1256e82	Add usage tracking reports, profile versioning, and token tracking Implements three major feature enhancements: 1. Usage Tracking Reports - Command-line tool (generate_usage_report.py) for comprehensive usage reports - Supports text, JSON, and CSV output formats - Filters by date range, client, and user - Aggregates statistics by client, user, profile, and date - Automated report generation via cron jobs 2. Profile Auto-Versioning & Visibility Control - Automatic version control: edits create new versions (v2, v3, etc.) - Original profiles preserved for rollback capability - Profile visibility control (all clients vs client-specific) - Client-profile relationship management with dynamic updates - Audit trail with timestamps and user tracking 3. Actual Token Usage Tracking - Captures real token counts from OpenAI and Gemini APIs - Precise cost calculations instead of estimates (99% accuracy) - Per-check and per-provider token breakdowns - Pricing validation tool (validate_pricing.py) - Token usage optimization recommendations Key Files Added: - backend/generate_usage_report.py - Usage report generator - backend/validate_pricing.py - Pricing validation tool - backend/USAGE_REPORTS.md - Usage reports documentation - backend/PROFILE_MANAGEMENT.md - Profile versioning guide - backend/TOKEN_TRACKING_ENHANCEMENT.md - Token tracking guide - backend/PRICING_GUIDE.md - Pricing validation guide - backend/NEW_FEATURES_QUICKSTART.md - Quick start guide - IMPLEMENTATION_SUMMARY.md - Complete implementation overview Key Files Modified: - backend/api_server.py - Profile versioning, token passthrough - backend/client_config.py - Visibility-aware profile filtering - backend/llm_config.py - Token usage extraction from APIs - backend/usage_tracker.py - Actual token tracking and cost calculation - CLAUDE.md - Updated documentation with new features Benefits: - Accurate cost tracking with real token usage - Safe profile editing with version history - Flexible 
profile visibility for multi-tenant setup - Comprehensive usage analytics for optimization - Better budget forecasting and client billing Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-02 13:22:33 +02:00
nickviljoen	2d27356478	Add debug logging for client filtering - Log client_filter parameter received by backend - Log selectedClient value in frontend - Log number of files returned - Help diagnose why all files are showing for all clients Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-02 11:27:34 +02:00
nickviljoen	477780df09	Add client-specific output folders and 14-day auto-cleanup ## New Features ### Client-Specific Output Folders - Files now saved in client-specific subdirectories (loreal/, diageo/, unilever/, general/) - Automatic client detection from profile ID - Better organization for multi-client environment - Each client only sees their own QC reports ### Automatic File Cleanup - Auto-delete reports older than 14 days on every file listing request - Keeps output folder clean and manageable - Configurable cleanup age (default: 14 days) ### File Filtering by Client - API endpoint `/api/output_files` now accepts `?client=<client_id>` parameter - Frontend automatically filters files by selected client - No more cluttered file lists for clients ### Migration Script - `migrate_output_files.py` - Move existing files to client folders - Dry-run mode by default (use --execute to run) - Deletes files older than 14 days during migration - Supports both development (--dev) and production (--production) ## API Changes ### Modified Endpoints - `GET /api/output_files?client=<client_id>` - List files filtered by client - `GET /output/<client>/<filename>` - Serve files from client folders - `GET /output/<filename>` - Legacy route for backward compatibility ### New Functions - `get_client_from_profile(profile_id)` - Detect client from profile - `ensure_client_output_folder(client)` - Create client folders - `cleanup_old_files(max_age_days)` - Delete old files ## File Structure ``` output-dev/ ├── loreal/ │ └── 20260202_102514_Missing_text_report.html ├── diageo/ │ └── 20260202_103423_Product_shot_report.html ├── unilever/ │ └── 20260202_104512_Key_visual_report.html └── general/ └── 20260202_105634_Other_report.html ``` ## Frontend Changes - `loadSavedFiles()` now includes client parameter in API calls - Automatically filters saved files by selected client - Clean UI showing only relevant reports ## Usage ### Migration (Development) ```bash # Dry-run (no changes) python3 
migrate_output_files.py # Execute migration python3 migrate_output_files.py --execute ``` ### Migration (Production) ```bash # Dry-run for production folder python3 migrate_output_files.py --production # Execute migration python3 migrate_output_files.py --production --execute ``` Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-02 11:18:05 +02:00
nickviljoen	16741a96d6	Add L'Oréal Static General profile with multi-file queue and enhanced reporting ## New Features ### L'Oréal Static General Profile - Created new profile with 3 checks optimized for digital marketing assets - Even weighting (33.3% each) for 100-point scoring scale - Removed print-specific requirements (3m viewing distance) - Focus on marketing text vs product packaging distinction ### Multi-File Queue System (web_ui.html) - Added file queue functionality for batch processing - Users can now upload and process multiple files simultaneously - Queue displays file status (pending, analyzing, complete, error) - Individual file removal and queue clearing options - Progress tracking for batch operations ### New General QC Checks 1. background_contrast_general - Optimized for digital assets (no distance requirements) - Checks logo, product, and marketing text contrast - Detects overlapping and blending issues - Provides element-by-element breakdown 2. text_readability_general - Focus on marketing text only (excludes product packaging) - Checks for overlapping elements - Digital readability optimization - Specific issue identification 3. 
language_consistency (enhanced) - Better distinction between marketing and packaging text - Detailed language detection and reporting - Lists specific text analyzed ### Usage Tracking System - Added usage_tracker.py for analysis logging - Tracks user activity, profile usage, and costs - Daily log files in JSONL format - Cost estimation per LLM provider ## Bug Fixes ### Authentication & User Management - Fixed Flask 'g' import missing issue - Fixed user info access in background threads - Pass user_info to threads instead of accessing g.user - Improved error handling for usage logging ### HTML Report Generation - Fixed missing analysis details in reports - Now extracts and displays all JSON fields properly - Shows comprehensive breakdowns: - Analysis details - Elements checked (logo, product, text) - Marketing text found - Issues identified - Specific recommendations - No more blank "Pass/Fail" results ### Scoring System - Fixed usage_tracker to handle dict of check results (not list) - Better handling of model_used field variations - Skip non-dict check results gracefully ## Configuration Changes ### Model Versions (llm_config.py) - Fixed invalid GPT-4.1 model ID to gpt-4o - Added Gemini 3 Pro beta model option - AVAILABLE_MODELS dict for UI selection - Model version override support ### Profile Updates - Static General: 3 checks, total weight 10.0 - Each check: text_readability_general (3.33), background_contrast_general (3.33), language_consistency (3.34) - Maximum score: 100 points ## Technical Improvements - Enhanced prompt engineering for consistent LLM outputs - Mandatory detailed explanations in all checks - Structured JSON responses with comprehensive fields - Better error messages and fallback handling - Client configuration support (client_config.py) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-02 10:58:39 +02:00
nickviljoen	ef1abdcfc5	Sync profile weighting and weight_scale support to backend api_server.py - Added weight_scale support for profiles (default 100) - Enhanced /api/profiles endpoint to return detailed profile data - Includes check weights, LLM assignments, and enabled status - Matches functionality from root api_server.py for production deployment	2025-12-06 15:16:41 +02:00
nickviljoen	70237126f1	Apply tool descriptions update to backend api_server.py for production	2025-12-06 15:05:46 +02:00
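
The score cap from commit 1c9c6be5ae and the per-check fail override from commit b84b749ffe can be sketched together as below. This is a minimal illustration, not the repository's code: the function names, the `STRICT_PROFILES` set, and the overall pass mark of 60 are assumptions. Only the per-check threshold of 6, the caps of 100 and 120, and the `loreal_static` profile ID come from the commit messages.

```python
from typing import Dict

PER_CHECK_PASS = 6                    # from the log: a check scoring below 6 fails
STRICT_PROFILES = {"loreal_static"}   # assumed container; only this ID is named in the log


def overall_score(scores: Dict[str, float],
                  weights: Dict[str, float],
                  max_score: float = 100.0) -> float:
    """Weighted sum of 0-10 check scores, capped at max_score.

    Weights that total slightly over 10.0 (e.g. 6 x 1.67 = 10.02)
    would otherwise yield values such as 100.2.
    """
    raw = sum(scores[name] * weights[name] for name in scores)
    return min(round(raw, 2), max_score)


def grade(profile_id: str,
          scores: Dict[str, float],
          overall: float,
          pass_mark: float = 60.0) -> str:
    """Return "Pass" or "Fail"; pass_mark is a hypothetical threshold."""
    strict = profile_id in STRICT_PROFILES
    if strict and any(s < PER_CHECK_PASS for s in scores.values()):
        # One failing check forces a Fail regardless of the weighted total.
        return "Fail"
    return "Pass" if overall >= pass_mark else "Fail"
```

Passing `max_score=120.0` would model the Unilever cap mentioned in the same commit.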

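Several commits above (477780df09, f3b09e2a09) describe resolving a report's output folder from its profile ID by prefix matching. A hedged sketch of that idea follows. The helper name `client_folder_for_profile` and the prefix table are hypothetical; only the folder names `loreal`, `diageo`, `unilever`, `general`, and `hp` and the profile IDs `loreal_static` and `hp_copy_review` appear in the log.

```python
# Hypothetical prefix table; the real mappings live in get_client_from_profile().
PREFIX_TO_FOLDER = (
    ("loreal_", "loreal"),
    ("diageo_", "diageo"),
    ("unilever_", "unilever"),
    ("hp_", "hp"),
)


def client_folder_for_profile(profile_id: str) -> str:
    """Map a profile ID to its client output folder by prefix.

    Matching on a full prefix (including the underscore) avoids
    treating a bare word such as 'static' as a client name.
    Unknown profiles fall back to 'general'.
    """
    lowered = profile_id.lower()
    for prefix, folder in PREFIX_TO_FOLDER:
        if lowered.startswith(prefix):
            return folder
    return "general"
```

A profile whose prefix is missing from the table silently lands in `general/`, which is the kind of failure the `hp_` fix at the top of the log addresses.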
53 commits