ai_qc

Author	SHA1	Message	Date
Nick Viljoen	bc6b4e8482	Merged in feature/axa-formatting-diff (pull request #24) Feature/axa formatting diff	2026-05-19 10:45:59 +00:00
nickviljoen	29ee941037	refactor(formatting_diff): narrow scope to bold + italic only First real-data test against the AXA car-insurance PDFs surfaced a noise problem: the new document is a brand refresh — every page flips font (PublicoBanner-Bold→PublicoHeadline-Bold) and colour (#893f4a→#2e3092). At medium-per-finding that crashed the diff score to 0.0 and drowned the bold-regression signal AXA actually flagged. Drop font, size, colour comparators. Keep bold + italic — the attributes the vision-LLM consistently misses on dense layouts. The LLM already narrates colour-scheme rebrands and font swaps in its Modified / Style-changes blocks; running both layers on the same visual change just double-counts it. Tests inverted from "X change is flagged" to "X change is NOT flagged" to lock the scope decision in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 12:37:19 +02:00
nickviljoen	d327776c70	fix(diff_engine): guard compute_formatting_diff against per-pair failure If the deterministic formatting comparator raises on any single page-pair (e.g. unexpected span shape from a future PyMuPDF version), degrade to zero formatting findings for that pair instead of aborting the whole 52-page diff run. Logged for visibility. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:31:16 +02:00
nickviljoen	640bbe4671	docs(axa): note deterministic formatting layer added to axa_pdf_diff Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:23:15 +02:00
nickviljoen	0fd6a35562	fix(diff_report): _fmt_value labels italic flips correctly Previously every boolean attribute rendered as "Bold → Regular", producing "Italic: Bold → Regular" for italic flips. Now the helper takes the attribute name and emits "Italic → Regular" or "Bold → Regular" depending on which boolean attribute is being shown. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:22:39 +02:00
nickviljoen	7eaac85df3	feat(diff_report): render formatting_changes as a per-pair block Adds a "🎨 Formatting changes" block to the per-page diff report when the deterministic formatting layer finds typographic flips. Distinguishes page-wide style shifts from local span flips, lists up to three example quotes per aggregated finding, and HTML-escapes all user-controlled strings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:08:47 +02:00
nickviljoen	2b1bb9ccf0	feat(diff_engine): merge formatting_diff findings into pair_diffs run_page_pair_diff now invokes compute_formatting_diff alongside the LLM call for each aligned pair. When the deterministic layer finds typographic flips on a page the LLM saw as identical, the pair is re-classified as having differences with medium severity. Each aggregated finding contributes to the global medium-severity tally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:03:54 +02:00
nickviljoen	d21a8a276d	refactor(formatting_diff): harden page_wide threshold + None-key handling Three review-driven hardening tweaks: - page_wide now requires ≥3 matched spans (PAGE_WIDE_MIN_SPANS). Avoids labelling section-break pages with a single flipped heading as page-wide. - _collect_flips normalises bold/italic via bool() and font/color via "or ''" so callers passing dicts without those keys do not produce phantom flips against False/''. - Adds tests for empty span lists and the missing-bold-key case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 10:01:23 +02:00
nickviljoen	98679e7329	feat(document_mode): add deterministic span formatting diff New formatting_diff module compares span-level bold/italic/font/size/ color attributes between aligned page-pairs. Pure-Python; reads PyMuPDF metadata already captured during ingest. Aggregates identical flips into single findings and flags page-wide style shifts. Powers the AXA document_diff fix for missed formatting changes that the vision-LLM does not reliably detect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 09:56:34 +02:00
nickviljoen	f69e181520	feat(ingest): capture span color as #rrggbb string Adds a 'color' field to each span dict extracted by _extract_page_spans. Powers the upcoming deterministic formatting-diff layer for AXA document_diff mode. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 09:45:21 +02:00
nickviljoen	25bb472a53	docs: plan for AXA diff deterministic formatting layer Six-task TDD plan implementing the spec at docs/superpowers/specs/2026-05-19-axa-formatting-diff-design.md. Local execution only this cycle — Nick tests locally before any push to dev, and the prod bundle ships tonight alongside other AXA changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 09:39:40 +02:00
nickviljoen	5e1263380e	docs: spec for AXA diff deterministic formatting layer New formatting_diff module compares PyMuPDF span-level (bold, italic, font, size, color) attributes between aligned page-pairs to catch formatting changes the vision-LLM misses. Addresses AXA client feedback that page 18+ un-bolding of blue text was not surfaced in the diff report. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 09:32:00 +02:00
Nick Viljoen	77276f2a7c	Merged in feature/hp-cycle-1-onboarding (pull request #22) fix(hp_copy_review): correct llm casing + route HP reports to /hp/ folder	2026-05-17 20:10:50 +00:00
nickviljoen	71bb9a6295	fix(hp_copy_review): correct llm casing + route HP reports to /hp/ folder Two bugs surfaced by the first dev smoke test: 1. Profile JSON declared "llm": "gemini" (lowercase). llm_config's dispatcher compares model_name == "Gemini" case-sensitively (matches the rest of the codebase), so the check fell through to "Invalid model selected" and never reached the API. Every other profile uses "Gemini" with capital G. Spec mistake — fixed. 2. get_client_from_profile() resolves the per-report output folder from the profile_id via hardcoded prefix matches. No 'hp_' branch existed, so hp_copy_review reports landed under output-dev/general/ instead of output-dev/hp/ — the UI then couldn't find them. Added 'hp_' → 'hp' alongside the existing mappings. The check itself works correctly otherwise: profile_source was user_selected, brand resolved to 'hp', and the reference asset was successfully attached. Bug 1 just prevented Gemini from being called. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 22:07:25 +02:00
Nick Viljoen	03410cb271	Merged in feature/hp-cycle-1-onboarding (pull request #21) Feature/hp cycle 1 onboarding	2026-05-17 19:43:10 +00:00
nickviljoen	68a2360811	feat(report): render hp_copy_review findings as a structured table Both HTML report generators (generate_html_content and generate_comprehensive_html_report) get a small case: when a check result has a 'findings' array in its json_data, render it as a priority-coloured table with quote/issue/suggested-fix/source columns instead of the default response-text block. The summary field (when present) renders above the table. Fallback to text rendering when findings is absent, so every existing check is unaffected. All string fields from the LLM are HTML-escaped via html.escape() to neutralise stray <, >, &, or quote characters. Inline CSS for .findings-table / .priority-pill / .priority-high|medium|low / .muted is added to both stylesheets so the two generators stay visually in sync.	2026-05-17 21:37:35 +02:00
nickviljoen	0e833447c0	fix(brand-guidelines): inject xlsx Source Messaging summary into check prompts Task 5 review found that get_reference_asset_content treated all non-localization-matrix .xlsx files as opaque ('reference file uploaded'), never reading the Gemini summary that excel_processor writes. That meant hp_copy_review would see no canonical messaging and fire its score-0 fallback on every real asset. Extend the .xlsx branch to mirror the PDF pattern: when the file record has a summary_path (set by excel_processor after a successful source-messaging summary), read and inject the Markdown into the reference content block. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:28:32 +02:00
nickviljoen	4c19a0fb9d	feat(hp_copy_review): single-check LLM grader against Source Messaging Single Gemini call per asset. Prompt assembles attached Source Messaging summaries + media-plan language context + the asset image. Returns structured JSON with score, summary, and a findings array (priority, category, quote, issue, suggested fix, source reference). Empty findings = clean asset; missing reference -> score 0 with a clear message rather than running blind. Mirrors the boots_tandc_wording pattern: subclass FlaskAppTemplate, expose a static prompt template, let process_single_check inject reference-asset content and media-plan context at runtime. A standalone build_prompt() helper mirrors that assembly for unit- style smoke tests and ad-hoc prompt inspection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:25:30 +02:00
nickviljoen	014a9cb8ff	feat(hp): promote HP client + add hp_copy_review profile HP is no longer a placeholder. The client gets a new hp_copy_review profile (single weighted check, client-specific visibility) as its default, plus the generic static_general and video_general profiles it already had visibility into.	2026-05-17 21:08:18 +02:00
nickviljoen	568465f9be	fix(brand-guidelines): preserve localization-matrix parsing in xlsx dispatch The prior Task 2 commit (`295305e`) over-replaced existing logic that recognised certain .xlsx/.xls uploads as localization matrices and set asset_type='localization_matrix'. That field is load-bearing in two downstream sites (api_server.py:1628 and :1986) that build localization context for QC checks; destroying it would silently break any existing client using localization matrices. Restore the original try-localization-matrix-first path; only fall through to excel_processor (HP Source Messaging summary) when the file isn't a parseable localization matrix. Also restore .xls support and tag Source Messaging uploads as asset_type='source_messaging' so downstream code can distinguish them from localization matrices. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:03:56 +02:00
nickviljoen	295305ef2d	feat(brand-guidelines): route .xlsx uploads to excel_processor The /api/brand_guidelines POST handler now dispatches by extension: .pdf → pdf_processor.process_pdf_file (existing), .xlsx → excel_processor.process_excel_file (new). Same DB record shape; cover image is null for Excel since there's no first-page analogue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:02:05 +02:00
nickviljoen	c51e0729ce	fix(excel-processor): wrap extraction in try/except to honour 'never raises' Code review found that _extract_workbook_text was unwrapped — a corrupt/locked .xlsx or InvalidFileException would leak out of process_excel_file despite the docstring promising 'Never raises'. Wrap the extraction call too; on extraction failure, write a degraded summary explaining the failure and return cleanly. Verified by passing a non-existent file: the function returns a degraded summary instead of raising FileNotFoundError. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 20:55:54 +02:00
nickviljoen	abd36a9abe	fix(excel-processor): use literal trademark glyphs in summary prompt Spec requires "™, ®, ©" in the Approved Brand and Product Names section instructions; first pass wrote "TM, R, C" out of unfounded caution about encoding. Python 3 source handles UTF-8 fine and pdf_processor.py uses smart punctuation throughout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 20:52:38 +02:00
nickviljoen	ed46504ac6	feat(excel-processor): add openpyxl + Gemini summary pipeline for HP Source Messaging Mirrors pdf_processor.py — public process_excel_file() reads any HP Source Messaging Excel, extracts cells via openpyxl (skipping empty rows, capped at 50K chars), and summarises into structured Markdown via Gemini 2.5 Pro. Output saved as brand_guidelines/files/{file_id}_summary.md. On Gemini failure the processor writes a degraded summary containing the raw extraction so the reference asset stays usable. Test fixtures (real HP Excels) live under backend/tests/fixtures/hp/ and are gitignored.	2026-05-17 20:49:50 +02:00
nickviljoen	7d178f11ee	docs(plan): HP onboarding cycle 1 implementation plan 7-task plan against 2026-05-17-hp-cycle-1-onboarding-design.md: excel_processor → .xlsx dispatch → media-plan language field → HP client+profile → hp_copy_review check → findings-table renderer → dev smoke + deploy. Lightweight verification posture (py_compile + imports + profile load + python3 -c mini-tests + dev smoke runs) to match the project's existing style — no pytest scaffolding. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 20:32:56 +02:00
nickviljoen	53ba67c2c0	docs(spec): HP onboarding cycle 1 — hp_copy_review check Captures the brainstorm outcome for migrating HP off the deprecated hp-copy PHP/Make.com POC onto AI QC. Cycle 1 of 3 in HP onboarding (cycles 2 = Word/PPT processor, 3 = Box picker — both independent and shipped later). Locks the four design decisions reached during the brainstorm: - User selects the canonical Source Messaging reference asset at QC-run time (matches existing brand-guidelines UX) - Single hp_copy_review check, single Gemini call per asset, structured findings JSON output matching the Messi Copy Review document format - Excel processor mirrors pdf_processor.py: openpyxl extracts raw cell content, Gemini summarises into structured Markdown, saved as {file_id}_summary.md alongside the file - Media-plan `language` field is free-form text, included in the check prompt when present, omitted gracefully when absent No code yet — pick up with the writing-plans skill to draft the implementation plan against this spec. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 20:17:43 +02:00
Nick Viljoen	fc588c626d	Merged in feature/untrack-tracked-env-files (pull request #19) chore(env): untrack legacy env files so deploys stop clobbering them	2026-05-17 17:01:32 +00:00
nickviljoen	1057c5660f	chore(env): untrack legacy env files so deploys stop clobbering them config.env, backend/config.env, config/development.env, and config/production.env still contained real secrets and were getting silently reverted by `git reset --hard` during deploys — manual key-restore was required after both v1.3.0 and v1.3.1 to recover the in-place GOOGLE_API_KEY rotation. Move them to .gitignore alongside the already-untracked backend/config/*.env paths. The next deploy after this lands will delete them from disk one final time (because they were tracked in the prior commit). Same backup/restore dance documented for the previous secrets-untrack is needed for that single deploy; after it, the files are permanently untracked. This does NOT remove historical secrets from git history. Rotation of OPENAI_API_KEY, BOX_CLIENT_SECRET, SECRET_KEY, SMTP_PASSWORD remains a separate open follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 19:00:09 +02:00
Nick Viljoen	8e2d970d61	Merged in feature/boots-ppack-page-parallelism (pull request #17) perf(document-mode): parallelize per-page check dispatch in stages 3c/3d	2026-05-17 16:15:52 +00:00
nickviljoen	1c5dd980d4	perf(document-mode): parallelize per-page check dispatch in stages 3c/3d A 4-page Boots PPack run (7 page-scoped checks) was taking ~15 min because the dispatcher processed pages sequentially within each check — 28 Gemini calls in a single file. Asset-mode's ThreadPoolExecutor parallelism was bypassed because doc-mode called process_checks_in_batches once per page in a loop. Wrap the per-page dispatch in both Stage 3c (page_sample) and Stage 3d (page_each) with a ThreadPoolExecutor (max_workers=4). Extract the per-page work into a single nested helper used by both stages, which also tags each result with page_type so the existing artwork vs informational aggregation in Stage 3d keeps working. Aggregation logic, scoring, strict-grade override, and report shape are all unchanged. process_checks_in_batches is already reentrant (asset-mode uses it under its own internal ThreadPoolExecutor), so concurrent calls are safe. Progress-tracker writes intentionally tolerate races (visual only). Per-page exceptions are caught inside the helper so one bad page doesn't kill the doc — it just records a score-0 result. Expected: 15 min → ~3-4 min on the same 4-page PDF. Needs wall-time confirmation on dev with a real run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 18:14:27 +02:00
nickviljoen	8e50413b53	docs(spec): Phase 5 cycle 1 — Postgres database design Captures the brainstorm outcome for adding a Postgres database alongside the existing JSONL usage logs, ahead of the dashboard work. Decomposes Phase 5 into three independent cycles (DB first, then Docker, then dashboard) and locks the schema, transition strategy (dual-write), hosting (Docker on each VM), backup approach (pg_dump → GCS), and rollback escape hatch. No code changes yet — pick up with the writing-plans skill when returning to Phase 5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 17:02:03 +02:00
Nick Viljoen	86dec44124	Merged in fix/deploy-sigpipe-when-many-commits (pull request #16) fix(deploy): use git's own -n limit instead of | head -20	2026-05-17 13:26:41 +00:00
nickviljoen	a3b3f45f01	fix(deploy): use git's own -n limit instead of | head -20 When the deploy batch has more than 20 commits, the `git log ... | head -20` pipeline closes the pipe after 20 lines. git log gets SIGPIPE (exit 141), which `set -o pipefail` propagates, and `set -e` then exits the script silently: no prompt shown, no error message. Only bites for release-sized batches (>20 commits). First seen on the v1.3.0 prod deploy: 20 commits displayed, then the script returned to the shell without prompting. dev deploys never hit this because they typically only have 1-3 commits ahead. Fix: tell git to limit its own output via `-n 20`. Same display, no broken pipe. Also swap the count-by-wc-l for `git rev-list --count` which is more idiomatic and avoids any further pipe shenanigans. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 15:25:38 +02:00
Nick Viljoen	e25006039f	Merged in docs/box-client-onboarding-runbook (pull request #14) docs: add Box client onboarding runbook	2026-05-17 12:13:53 +00:00
nickviljoen	31b059de79	docs: add Box client onboarding runbook Documents the end-to-end process for adding a new client to the Box-webhook-driven QC pipeline: 1. Box admin: create INCOMING + REPORTS folders, invite service account 2. Code: add box_folder_id / box_reports_folder_id / default_profile to client_config.py, ship via PR 3. Verify service account access with `box_setup.py list-folder` 4. Register webhook via `box_setup.py register-all-clients` (or UI) 5. End-to-end test by uploading a sample asset, watching logs, confirming report appears + source moves to _PROCESSED 6. Optional: tune default_profile from the Settings UI without a code deploy 7. Promote to prod (develop→main PR, tag, deploy.sh prod) Includes a gotchas table for the issues most likely to come up: 403s from missing collaborator invites, signature verification failures, folder ID mismatches, replace-upload behavior, etc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 14:12:48 +02:00
Nick Viljoen	432162f167	Merged in feature/default-profile-ui (pull request #13) feat(settings): default-profile UI per client (admin-only) for Box webhook flow	2026-05-17 11:51:50 +00:00
nickviljoen	bf89466d06	feat(settings): default-profile UI per client (admin-only) for Box webhook flow Adds a "Default Profile" sub-tab to the Settings modal. Lists the current client's profiles as radio buttons, shows which is the active default and whether it's a runtime override or the static value from client_config.py. Admins click a different profile + Set to override; clear-override button reverts to the static value. Storage layer: backend/client_defaults.json (gitignored, per-server), following the same pattern as user_access.json. Resolution order in client_config.get_default_profile(): override → static default_profile field → None. The Box webhook handler is the sole consumer that needs profile selection without a logged-in user; it now reads via get_default_profile() so overrides take effect. Why a separate JSON, not rewriting client_config.py: a buggy override write can never break server boot — worst case the override is ignored and the static value applies. Cleaner separation between "static config you check in" and "runtime overrides admins make". 
Backend: - client_config.get_default_profile(client_id) — resolver - client_config.set_default_profile(client_id, profile_id) — validates + writes (rejects profiles not in client's profile list) - client_config.clear_default_profile_override(client_id) - GET /api/clients/<id>/default_profile (any auth'd user) - PUT /api/clients/<id>/default_profile (admin-only, _require_admin) - DELETE /api/clients/<id>/default_profile (admin-only) - Box webhook handler in api_server.py now uses get_default_profile() Frontend: - New "Default Profile" tab button + tab content in Settings modal - showTab hook loads settings when tab activates - loadDefaultProfileSettings / saveDefaultProfile / clearDefaultProfileOverride functions - DOM-construction (createElement + textContent) used throughout — no innerHTML with interpolated values, so user-controllable strings (client_id, profile_id) can never cause XSS Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:50:20 +02:00
Nick Viljoen	be00f24416	Merged in feature/box-processed-folder-move (pull request #12) feat(box-jwt): move source file to _PROCESSED after successful run	2026-05-17 11:31:01 +00:00
nickviljoen	b7e9c483de	feat(box-jwt): move source file to _PROCESSED after successful run Solves two problems at once: 1. Folder cleanliness — INCOMING accumulates indefinitely otherwise. 2. Duplicate-upload re-trigger — Box V2's FILE.UPLOADED trigger doesn't fire when the same filename is "uploaded as new version" of an existing file. By moving the source out of INCOMING after success, re-uploading the same filename becomes a genuinely-new file event again and the webhook fires normally. After report uploads successfully to the REPORTS folder, the worker: 1. find_or_create_subfolder(<INCOMING>, '_PROCESSED') — idempotent 2. move_file(file_id, <_PROCESSED>, new_name=f'{session_id}_{filename}') The session_id prefix gives the archived file a sortable timestamp and ties it back to the matching QC_Report_<session_id>_*.html in REPORTS. Defensive: the move only runs if the report upload to Box succeeded. If Box delivery failed, the source stays in INCOMING so a retry just means re-uploading. Move failures are non-fatal — logged + recorded in result_data['box_source_move_error'], analysis still marked complete. Adds four helpers to box_jwt_client.py: - find_subfolder_by_name(parent, name) → Optional[str] - create_subfolder(parent, name) → str - find_or_create_subfolder(parent, name) → str (idempotent) - move_file(file_id, target_folder, new_name=None) → Dict Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:29:45 +02:00
Nick Viljoen	4d08a23322	Merged in fix/comprehensive-report-status-filter (pull request #11) fix(reports): render check details for status='success' in generate_comprehensive_html_report	2026-05-17 11:05:25 +00:00
nickviljoen	c75f3a99b9	fix(reports): render check details for status='success' in generate_comprehensive_html_report generate_comprehensive_html_report filtered check rendering with `status == 'completed'`, but the modern check pipeline (process_single_check via /api/start_analysis and the Phase 4 Box webhook flow) returns `status == 'success'`. Only the legacy process_single_check_with_triage returns 'completed'. Result: every report produced by the modern pipeline had an empty "Detailed Analysis Results" section — just the heading with nothing below it. Surfaced when Nick ran a LOREAL Box-webhook test on 2026-05-17: webhook fired correctly, 4 LLM checks ran, scores came back, technical pre-flight rendered, but the per-check accordion was empty. Fix: accept either status value, so both modern and legacy code paths render correctly. Errored checks (status='error') still skipped. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:01:21 +02:00
Nick Viljoen	57ce396860	Merged in feature/loreal-box-folders (pull request #10) feat(clients): wire LOREAL Box folders for webhook-driven QC	2026-05-15 07:51:39 +00:00
nickviljoen	4a9ddee87f	feat(clients): wire LOREAL Box folders for webhook-driven QC First client to use the Phase 4 unattended-QC pipeline. Adds three optional fields to the loreal entry in client_config.py: - box_folder_id=381501258415 (AI-QC > INCOMING > AI QC LOREAL IN) - box_reports_folder_id=382076841334 (AI-QC > REPORTS > AI QC LOREAL REPORTS) - default_profile=loreal_static When a file lands in the INCOMING folder, /api/box/webhook will pick it up, run loreal_static (strict-grade), and upload the HTML report to the REPORTS folder. Other clients remain unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 09:50:40 +02:00
Nick Viljoen	1c8e1ea1a7	Merged in feature/box-jwt-integration (pull request #9) Feature/box jwt integration	2026-05-14 21:42:43 +00:00
nickviljoen	a99c8601f0	Merge develop into feature/box-jwt-integration Brings in the 4 commits that landed on develop after this branch was cut: the chore/untrack-env-files PR (#7) and the fix/tech-section-in-html-content PR (#8). Conflict resolution: - .gitignore: both branches added `backend/config/box_jwt_config.json` in slightly different positions. Kept both sets of additions — development.env + production.env (from develop) and box_jwt_config.json (from this branch). - api_server.py: auto-merged cleanly; the Phase 4 webhook endpoint and the Phase 3 technical-section fix touch different regions of the file. Verified after merge: api_server imports cleanly, box_webhook route registered, _render_technical_section_html callable, 60 QC apps and 15 profiles load. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 23:42:00 +02:00
Nick Viljoen	c99b8b7770	Merged in fix/tech-section-in-html-content (pull request #8) fix(tech-check): also render Technical section in generate_html_content	2026-05-14 21:29:37 +00:00
nickviljoen	096eba747d	fix(tech-check): also render Technical section in generate_html_content Phase 3 patched generate_comprehensive_html_report() but missed the older generate_html_content() generator. The /api/start_analysis flow with output_mode='html' (the path the web UI's download button actually triggers) routes through generate_html_content, so the Technical Details section never appeared in user-downloaded reports despite the technical_report data being present in the underlying result_data. Mirrors the Phase 3 treatment exactly: pre-builds technical_html via _render_technical_section_html(), adds the .technical / .technical-grid / .tech-row CSS rules, and injects {technical_html} between the summary block and the Detailed Analysis Results header. generate_comprehensive_html_report() retains the same logic for the /api/process_file path (line 4187) and the new Box webhook flow (_run_box_triggered_analysis on the Phase 4 branch). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 23:28:52 +02:00
Nick Viljoen	33278e4f62	Merged in chore/untrack-env-files (pull request #7) chore(secrets): untrack env files + add JWT path to .gitignore	2026-05-14 21:17:47 +00:00
nickviljoen	cfb13eb870	chore(secrets): untrack env files + add JWT path to .gitignore backend/config/development.env and backend/config/production.env were committed to the repo with real API keys, SMTP passwords, and Flask SECRET_KEY values. This commit: 1. Adds both files to .gitignore so future edits stop landing in git. 2. git rm --cached's them (local copies preserved on disk, just untracked). 3. Also pre-emptively adds backend/config/box_jwt_config.json to .gitignore — Phase 4 already gitignores it on a separate branch, but listing it here protects the file regardless of merge order. 4. Updates backend/config/.env.template with the new Box JWT-related vars (BOX_JWT_CONFIG_PATH, BOX_WEBHOOK_PRIMARY_KEY, BOX_WEBHOOK_SECONDARY_KEY) so the template is a complete reference for setting up a new environment from scratch. IMPORTANT — secrets still in git history after this commit. Removing them from history requires a destructive rewrite (git filter-repo + force-push every branch). Pragmatic alternative: rotate any secret that was ever in the files. Candidates: OPENAI_API_KEY, BOX_CLIENT_SECRET, SECRET_KEY, SMTP_PASSWORD. AZURE_TENANT_ID and AZURE_CLIENT_ID are public-ish identifiers and don't need rotating. GOOGLE_API_KEY just rotated this session. DEPLOY GOTCHA: deploy.sh does git reset --hard, which will delete the env files from /opt/ai_qc/backend/config/ on the server when this commit lands. Back them up before deploying, restore after: sudo cp /opt/ai_qc/backend/config/development.env /tmp/dev.env.bak # ...deploy... sudo cp /tmp/dev.env.bak /opt/ai_qc/backend/config/development.env sudo systemctl restart ai-qc.service Same dance on prod with production.env when promoting. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 23:13:18 +02:00
nickviljoen	65848bcda1	feat(box-jwt): add box_setup.py bootstrap CLI for webhook management One-off script used to register/inspect Box V2 webhooks against the service account. Subcommands: list-webhooks, list-folder, list-clients, create-webhook, delete-webhook, register-all-clients. Typical bootstrap flow on a fresh deploy: 1. Drop box_jwt_config.json on the server (gitignored, scp'd in). 2. Verify the service account can read each client folder: `python backend/scripts/box_setup.py list-folder <folder_id>` 3. Once a client's box_folder_id is set in client_config.py, register its webhook idempotently: `python backend/scripts/box_setup.py register-all-clients \ https://optical-dev.oliver.solutions/ai_qc/api/box/webhook` 4. Copy the signing keys from the Box Developer Console (Custom App → Webhooks) into BOX_WEBHOOK_PRIMARY_KEY / BOX_WEBHOOK_SECONDARY_KEY in the env file, then restart ai-qc.service. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 22:53:03 +02:00
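
Note on bf89466d06: the resolution order described there (override, then the static default_profile field, then None) can be sketched as below. This is a minimal illustration and not the repository's code. The in-memory dicts stand in for client_config.py and backend/client_defaults.json, and the "acme" client and the profile lists are hypothetical; only the three function names and the validation rule come from the commit message.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the static entries in client_config.py.
STATIC_CLIENTS: Dict[str, dict] = {
    "loreal": {
        "profiles": ["loreal_static", "static_general"],  # illustrative list
        "default_profile": "loreal_static",
    },
    "acme": {"profiles": ["static_general"]},  # hypothetical client, no default
}

# Hypothetical stand-in for the runtime overrides kept in client_defaults.json.
overrides: Dict[str, str] = {}


def get_default_profile(client_id: str) -> Optional[str]:
    """Resolve override first, then the static default_profile, then None."""
    if client_id in overrides:
        return overrides[client_id]
    return STATIC_CLIENTS.get(client_id, {}).get("default_profile")


def set_default_profile(client_id: str, profile_id: str) -> None:
    """Store an override, rejecting profiles not in the client's profile list."""
    client = STATIC_CLIENTS.get(client_id)
    if client is None or profile_id not in client["profiles"]:
        raise ValueError(f"{profile_id!r} is not a profile of {client_id!r}")
    overrides[client_id] = profile_id


def clear_default_profile_override(client_id: str) -> None:
    """Drop the override so the static value applies again."""
    overrides.pop(client_id, None)
```

Because the override lives apart from the static entries, clearing it (or ignoring an unreadable override store) falls back to the checked-in value, which matches the commit's stated reason for keeping a separate JSON file.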


212 commits