Commit graph

174 commits

Author SHA1 Message Date
nickviljoen
b7e9c483de feat(box-jwt): move source file to _PROCESSED after successful run
Solves two problems at once:

1. Folder cleanliness — INCOMING accumulates indefinitely otherwise.
2. Duplicate-upload re-trigger — Box V2's FILE.UPLOADED trigger doesn't
   fire when the same filename is "uploaded as new version" of an
   existing file. By moving the source out of INCOMING after success,
   re-uploading the same filename becomes a genuinely-new file event
   again and the webhook fires normally.

After the report uploads successfully to the REPORTS folder, the worker:
1. find_or_create_subfolder(<INCOMING>, '_PROCESSED') — idempotent
2. move_file(file_id, <_PROCESSED>, new_name=f'{session_id}_{filename}')

The session_id prefix gives the archived file a sortable timestamp and
ties it back to the matching QC_Report_<session_id>_*.html in REPORTS.

Defensive: the move only runs if the report upload to Box succeeded.
If Box delivery failed, the source stays in INCOMING so a retry just
means re-uploading. Move failures are non-fatal — logged + recorded
in result_data['box_source_move_error'], analysis still marked
complete.

Adds four helpers to box_jwt_client.py:
- find_subfolder_by_name(parent, name) → Optional[str]
- create_subfolder(parent, name) → str
- find_or_create_subfolder(parent, name) → str  (idempotent)
- move_file(file_id, target_folder, new_name=None) → Dict
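
A minimal sketch of the archive step described above, not the code in this commit. `InMemoryBox`, the `api` parameter, and `archive_source` are hypothetical names used so the example runs offline; only the four helper names and the `box_source_move_error` key come from the message, and the real helpers take no `api` argument.

```python
from typing import Dict, Optional


class InMemoryBox:
    """Hypothetical stand-in for the Box client so the sketch runs offline."""

    def __init__(self) -> None:
        self.folders: Dict[str, Dict[str, str]] = {}  # parent id -> {name: folder id}
        self.files: Dict[str, Dict[str, str]] = {}    # file id -> {"parent", "name"}
        self._next_id = 1000

    def find_subfolder_by_name(self, parent: str, name: str) -> Optional[str]:
        return self.folders.get(parent, {}).get(name)

    def create_subfolder(self, parent: str, name: str) -> str:
        self._next_id += 1
        folder_id = str(self._next_id)
        self.folders.setdefault(parent, {})[name] = folder_id
        return folder_id

    def move_file(self, file_id: str, target_folder: str,
                  new_name: Optional[str] = None) -> Dict[str, str]:
        entry = self.files[file_id]  # raises KeyError for an unknown file
        entry["parent"] = target_folder
        if new_name is not None:
            entry["name"] = new_name
        return entry


def find_or_create_subfolder(api, parent: str, name: str) -> str:
    # Idempotent: reuse the folder if it exists, create it only when missing.
    existing = api.find_subfolder_by_name(parent, name)
    return existing if existing is not None else api.create_subfolder(parent, name)


def archive_source(api, incoming_id: str, file_id: str, filename: str,
                   session_id: str, report_uploaded: bool, result_data: dict) -> bool:
    # Only archive when the report reached Box; otherwise leave the source
    # in INCOMING so a retry is just a re-upload.
    if not report_uploaded:
        return False
    try:
        processed_id = find_or_create_subfolder(api, incoming_id, "_PROCESSED")
        api.move_file(file_id, processed_id, new_name=f"{session_id}_{filename}")
        return True
    except Exception as exc:
        # Non-fatal: record the failure and let the analysis complete.
        result_data["box_source_move_error"] = str(exc)
        return False
```

Calling `archive_source` twice for the same INCOMING folder reuses the same `_PROCESSED` folder, which is the idempotency the helper list refers to.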

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 13:29:45 +02:00
Nick Viljoen
4d08a23322 Merged in fix/comprehensive-report-status-filter (pull request #11)
fix(reports): render check details for status='success' in generate_comprehensive_html_report
2026-05-17 11:05:25 +00:00
nickviljoen
c75f3a99b9 fix(reports): render check details for status='success' in generate_comprehensive_html_report
generate_comprehensive_html_report filtered check rendering with
`status == 'completed'`, but the modern check pipeline
(process_single_check via /api/start_analysis and the Phase 4 Box
webhook flow) returns `status == 'success'`. Only the legacy
process_single_check_with_triage returns 'completed'.

Result: every report produced by the modern pipeline had an empty
"Detailed Analysis Results" section — just the heading with nothing
below it. Surfaced when Nick ran a LOREAL Box-webhook test on
2026-05-17: webhook fired correctly, 4 LLM checks ran, scores came
back, technical pre-flight rendered, but the per-check accordion was
empty.

Fix: accept either status value, so both modern and legacy code paths
render correctly. Errored checks (status='error') are still skipped.
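
The filter change can be illustrated with a small sketch. `RENDERABLE_STATUSES` and `renderable_checks` are hypothetical names, and the result dicts are assumed to carry a `status` key as described above; this is not the actual body of generate_comprehensive_html_report.

```python
# Statuses whose details should be rendered: 'success' from the modern
# pipeline, 'completed' from the legacy triage path.
RENDERABLE_STATUSES = {"success", "completed"}


def renderable_checks(results):
    """Return only the check results whose details belong in the report."""
    return [r for r in results if r.get("status") in RENDERABLE_STATUSES]
```

A membership test against a set keeps both code paths working and makes it easy to see which statuses are rendered.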

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 13:01:21 +02:00
Nick Viljoen
57ce396860 Merged in feature/loreal-box-folders (pull request #10)
feat(clients): wire LOREAL Box folders for webhook-driven QC
2026-05-15 07:51:39 +00:00
nickviljoen
4a9ddee87f feat(clients): wire LOREAL Box folders for webhook-driven QC
First client to use the Phase 4 unattended-QC pipeline. Adds three
optional fields to the loreal entry in client_config.py:

- box_folder_id=381501258415 (AI-QC > INCOMING > AI QC LOREAL IN)
- box_reports_folder_id=382076841334 (AI-QC > REPORTS > AI QC LOREAL REPORTS)
- default_profile=loreal_static

When a file lands in the INCOMING folder, /api/box/webhook will pick
it up, run loreal_static (strict-grade), and upload the HTML report
to the REPORTS folder. Other clients remain unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 09:50:40 +02:00
Nick Viljoen
1c8e1ea1a7 Merged in feature/box-jwt-integration (pull request #9)
Feature/box jwt integration
2026-05-14 21:42:43 +00:00
nickviljoen
a99c8601f0 Merge develop into feature/box-jwt-integration
Brings in the 4 commits that landed on develop after this branch was
cut: the chore/untrack-env-files PR (#7) and the
fix/tech-section-in-html-content PR (#8).

Conflict resolution:
- .gitignore: both branches added `backend/config/box_jwt_config.json`
  in slightly different positions. Kept both sets of additions —
  development.env + production.env (from develop) and
  box_jwt_config.json (from this branch).
- api_server.py: auto-merged cleanly; the Phase 4 webhook endpoint and
  the Phase 3 technical-section fix touch different regions of the file.

Verified after merge: api_server imports cleanly, box_webhook route
registered, _render_technical_section_html callable, 60 QC apps and
15 profiles load.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 23:42:00 +02:00
Nick Viljoen
c99b8b7770 Merged in fix/tech-section-in-html-content (pull request #8)
fix(tech-check): also render Technical section in generate_html_content
2026-05-14 21:29:37 +00:00
nickviljoen
096eba747d fix(tech-check): also render Technical section in generate_html_content
Phase 3 patched generate_comprehensive_html_report() but missed the
older generate_html_content() generator. The /api/start_analysis flow
with output_mode='html' (the path the web UI's download button
actually triggers) routes through generate_html_content, so the
Technical Details section never appeared in user-downloaded reports
despite the technical_report data being present in the underlying
result_data.

Mirrors the Phase 3 treatment exactly: pre-builds technical_html via
_render_technical_section_html(), adds the .technical / .technical-grid
/ .tech-row CSS rules, and injects {technical_html} between the
summary block and the Detailed Analysis Results header.

generate_comprehensive_html_report() retains the same logic for the
/api/process_file path (line 4187) and the new Box webhook flow
(_run_box_triggered_analysis on the Phase 4 branch).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 23:28:52 +02:00
Nick Viljoen
33278e4f62 Merged in chore/untrack-env-files (pull request #7)
chore(secrets): untrack env files + add JWT path to .gitignore
2026-05-14 21:17:47 +00:00
nickviljoen
cfb13eb870 chore(secrets): untrack env files + add JWT path to .gitignore
backend/config/development.env and backend/config/production.env were
committed to the repo with real API keys, SMTP passwords, and Flask
SECRET_KEY values. This commit:

1. Adds both files to .gitignore so future edits stop landing in git.
2. Runs `git rm --cached` on them (local copies are preserved on disk,
   just untracked).
3. Also pre-emptively adds backend/config/box_jwt_config.json to
   .gitignore — Phase 4 already gitignores it on a separate branch, but
   listing it here protects the file regardless of merge order.
4. Updates backend/config/.env.template with the new Box JWT-related
   vars (BOX_JWT_CONFIG_PATH, BOX_WEBHOOK_PRIMARY_KEY,
   BOX_WEBHOOK_SECONDARY_KEY) so the template is a complete reference
   for setting up a new environment from scratch.

IMPORTANT — secrets still in git history after this commit. Removing
them from history requires a destructive rewrite (git filter-repo +
force-push every branch). Pragmatic alternative: rotate any secret
that was ever in the files. Candidates: OPENAI_API_KEY, BOX_CLIENT_SECRET,
SECRET_KEY, SMTP_PASSWORD. AZURE_TENANT_ID and AZURE_CLIENT_ID are
public-ish identifiers and don't need rotating. GOOGLE_API_KEY was
already rotated this session.

DEPLOY GOTCHA: deploy.sh does git reset --hard, which will delete the
env files from /opt/ai_qc/backend/config/ on the server when this
commit lands. Back them up before deploying, restore after:

    sudo cp /opt/ai_qc/backend/config/development.env /tmp/dev.env.bak
    # ...deploy...
    sudo cp /tmp/dev.env.bak /opt/ai_qc/backend/config/development.env
    sudo systemctl restart ai-qc.service

Same dance on prod with production.env when promoting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 23:13:18 +02:00
nickviljoen
65848bcda1 feat(box-jwt): add box_setup.py bootstrap CLI for webhook management
One-off script used to register/inspect Box V2 webhooks against the
service account. Subcommands: list-webhooks, list-folder, list-clients,
create-webhook, delete-webhook, register-all-clients.

Typical bootstrap flow on a fresh deploy:
1. Drop box_jwt_config.json on the server (gitignored, scp'd in).
2. Verify the service account can read each client folder:
   `python backend/scripts/box_setup.py list-folder <folder_id>`
3. Once a client's box_folder_id is set in client_config.py, register
   its webhook idempotently:
   `python backend/scripts/box_setup.py register-all-clients \
       https://optical-dev.oliver.solutions/ai_qc/api/box/webhook`
4. Copy the signing keys from the Box Developer Console (Custom App →
   Webhooks) into BOX_WEBHOOK_PRIMARY_KEY / BOX_WEBHOOK_SECONDARY_KEY
   in the env file, then restart ai-qc.service.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 22:53:03 +02:00
nickviljoen
8f995d557b feat(box-jwt): JWT service-account client + webhook ingestion endpoint
Adds machine-to-machine Box integration alongside the existing per-user
OAuth scaffolding. The new JWT client (backend/box_jwt_client.py) is
the auth/file/webhook surface used for unattended workflows: load the
Custom App JSON config, sign a JWT assertion, exchange for a 60-minute
service-account access token (cached + refreshed automatically), and
expose file download/upload + V2 webhook CRUD + HMAC signature
verification.
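
A sketch of the "cached + refreshed automatically" behaviour, under stated assumptions: `TokenCache`, the injected `fetch` callable (standing in for the JWT-assertion exchange), and the 60-second refresh margin are all hypothetical and not taken from box_jwt_client.py.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token and refreshes it shortly before expiry."""

    def __init__(self, fetch: Callable[[], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic,
                 refresh_margin: float = 60.0) -> None:
        self._fetch = fetch            # returns (token, ttl_seconds)
        self._clock = clock            # injectable for testing
        self._margin = refresh_margin  # refresh this many seconds early
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, ttl_seconds = self._fetch()
            self._token = token
            self._expires_at = now + ttl_seconds
        return self._token
```

Refreshing slightly early avoids handing out a token that expires mid-request.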

Wires a new POST /api/box/webhook endpoint (NOT @auth.require_auth — it
authenticates each delivery via Box's HMAC signature headers) that:

1. Verifies the signature against env-configured signing keys
   (BOX_WEBHOOK_PRIMARY_KEY / BOX_WEBHOOK_SECONDARY_KEY).
2. Dedups deliveries by box-delivery-id with a bounded in-memory cache.
3. Maps the source folder to a client via a new
   get_client_by_box_folder() helper on client_config.
4. Spawns a background thread that downloads the file, runs the same
   technical pre-flight + LLM check pipeline as the user-uploaded path,
   writes the HTML report to output/<client>/, uploads the report back
   to the client's box_reports_folder_id, and logs the run with a
   synthetic 'box_webhook' user.
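
Steps 1 and 2 can be sketched as below. This assumes Box's documented V2 scheme as I understand it (base64 of HMAC-SHA256 over the raw body followed by the BOX-DELIVERY-TIMESTAMP value, compared with BOX-SIGNATURE-PRIMARY / BOX-SIGNATURE-SECONDARY); verify that against the Box docs before relying on it. `sign`, `verify_delivery`, and `SeenDeliveries` are hypothetical names, not the ones in this commit.

```python
import base64
import hashlib
import hmac
from collections import OrderedDict
from typing import Optional


def sign(key: str, body: bytes, timestamp: str) -> str:
    """Base64(HMAC-SHA256(key, body + timestamp))."""
    digest = hmac.new(key.encode("utf-8"), body + timestamp.encode("utf-8"),
                      hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_delivery(body: bytes, timestamp: str,
                    primary_sig: Optional[str], secondary_sig: Optional[str],
                    primary_key: Optional[str], secondary_key: Optional[str]) -> bool:
    """Accept the delivery if either key reproduces its signature header."""
    pairs = ((primary_key, primary_sig), (secondary_key, secondary_sig))
    return any(
        bool(key and sig) and hmac.compare_digest(
            sign(key, body, timestamp).encode("ascii"), sig.encode("utf-8"))
        for key, sig in pairs
    )


class SeenDeliveries:
    """Bounded in-memory record of delivery ids; oldest entries are evicted."""

    def __init__(self, max_size: int = 1024) -> None:
        self._seen: "OrderedDict[str, None]" = OrderedDict()
        self._max = max_size

    def first_time(self, delivery_id: str) -> bool:
        if delivery_id in self._seen:
            return False
        self._seen[delivery_id] = None
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # drop the oldest id
        return True
```

An in-memory cache like this only dedups within one process; a multi-worker deployment would need shared storage.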

Webhook runs skip media-plan / localization / OCR context — those are
user-UI concepts without a meaningful source in unattended runs. The
existing /api/start_analysis path is unchanged.

client_config.py gains three optional per-client fields used by the new
flow when present: `box_folder_id`, `box_reports_folder_id`, and
`default_profile`. Existing client entries keep working without them.

.gitignore now excludes backend/config/box_jwt_config.json so the JWT
config (with its embedded private key + passphrase) never lands in git.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 22:51:34 +02:00
Nick Viljoen
95121f2fb9 Merged in feature/technical-preflight (pull request #6)
Feature/technical preflight
2026-05-14 20:07:10 +00:00
nickviljoen
377efe30e5 feat(tech-check): show Technical Details section in HTML report
Adds a new "Technical Details" card to generate_comprehensive_html_report()
between the summary and the per-check detailed results. Renders only
the fields present on the technical_report dict (file size, dimensions,
DPI, page count, duration, fonts, etc. — vary by file type) and shows
a prominent filename-vs-actual match badge when filename hints were
parsed.

If technical_report is absent or kind==unknown, the section is omitted
entirely so reports for assets we can't inspect (e.g. exotic
extensions) keep the existing layout unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 22:00:25 +02:00
nickviljoen
2b287f3dbb feat(tech-check): wire pre-flight into visual + document analysis
Runs technical_check.inspect() immediately after file save on both
/api/start_analysis (visual flow) and /api/document/start_analysis
(document flow). The report is stashed on progress_tracker[session_id]
so it survives across the background thread boundary, then surfaces
two ways:

1. Each LLM check in the visual flow gets a "Technical metadata"
   preamble prepended to its prompt via format_for_llm_prompt(), so the
   model knows the file's actual dimensions, format, page count, etc.
   without having to infer them visually.
2. result_data['technical_report'] in both flows carries the same dict
   through to the frontend for UI rendering (next commit).

Pre-flight is best-effort: if it fails for any reason, analysis still
proceeds without the preamble (silent except for the report.errors
list).
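
A sketch of the best-effort preamble step, assuming the report is a flat dict. `render_preamble` and `build_prompt` are hypothetical names; the real formatter is format_for_llm_prompt(), whose exact output is not shown in this commit.

```python
def render_preamble(report: dict) -> str:
    """Render the populated fields of a report as a short Markdown block."""
    lines = ["## Technical metadata"]
    for key in sorted(report):
        value = report[key]
        if key == "errors" or value is None:
            continue  # skip diagnostics and unknown fields
        lines.append(f"- {key}: {value}")
    return "\n".join(lines)


def build_prompt(check_prompt: str, report) -> str:
    """Prepend the preamble; fall back to the bare prompt on any failure."""
    if not report:
        return check_prompt
    try:
        preamble = render_preamble(report)
    except Exception:
        return check_prompt  # best-effort: analysis proceeds without it
    return f"{preamble}\n\n{check_prompt}"
```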

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:57:11 +02:00
nickviljoen
f4a95914b5 feat(tech-check): add machine-side pre-flight inspection module
New backend/technical_check.py extracts technical metadata from
uploaded assets via PIL (images), PyMuPDF (PDFs), and ffprobe (videos)
— no LLM, runs in milliseconds. Also opportunistically parses
dimension hints from the filename and compares them to the actual
file, returning a match/mismatch verdict.

Output is a JSON-serializable dict; format_for_llm_prompt() renders it
as a tight Markdown block that downstream prompts can prepend. Module
never raises — inspection errors land in `errors` so partial reports
still surface.
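
A sketch of the filename-hint comparison and the never-raise contract. The regex, the function names, and the report keys other than `errors` are assumptions for illustration; the real module may parse hints differently.

```python
import re
from typing import Optional, Tuple

# Matches WIDTHxHEIGHT tokens such as "300x250" that are not part of a longer number.
_DIMENSIONS = re.compile(r"(?<!\d)(\d{2,5})[xX](\d{2,5})(?!\d)")


def parse_dimension_hint(filename: str) -> Optional[Tuple[int, int]]:
    match = _DIMENSIONS.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def compare_to_actual(filename, actual_width, actual_height) -> dict:
    """Return a JSON-serializable verdict; problems land in 'errors'."""
    report = {"filename_hint": None, "match": None, "errors": []}
    try:
        hint = parse_dimension_hint(filename)
        if hint is not None:
            report["filename_hint"] = {"width": hint[0], "height": hint[1]}
            report["match"] = hint == (actual_width, actual_height)
    except Exception as exc:
        report["errors"].append(str(exc))  # partial report still surfaces
    return report
```

When the filename carries no hint, `match` stays `None` rather than `False`, so a missing hint is not reported as a mismatch.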

Standalone for this commit. Wiring into the upload flow and UI lands
in subsequent commits on this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:53:06 +02:00
Nick Viljoen
94af442393 Merged in chore/claude-md-after-phases-1-2 (pull request #5)
docs: update CLAUDE.md after Phases 1+2 (Dow Jones removed, demos added)
2026-05-14 19:40:44 +00:00
nickviljoen
bcd318a7b1 docs: update CLAUDE.md after Phases 1+2 (Dow Jones removed, demos added)
Updates the intro count (9 → 12 clients), adds Google/HP/Ferrero to
the client name list, and adds three table rows for the new demo
clients (Doc column marked _scope pending_ until per-client docs land).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:39:40 +02:00
Nick Viljoen
5d1eab493c Merged in feature/add-demo-clients (pull request #4)
Feature/add demo clients
2026-05-14 19:34:38 +00:00
Nick Viljoen
02ae248e92 Merged in feature/remove-dow-jones (pull request #3)
Feature/remove dow jones
2026-05-14 19:33:56 +00:00
nickviljoen
93dc030e0c feat(clients): add Google, HP, Ferrero as demo placeholders
Three new clients in demo/eval phase. Each uses Honda-style minimal
setup (static_general + video_general only) until real scope and test
assets arrive. Descriptions are placeholders to be replaced once scope
is confirmed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:30:18 +02:00
nickviljoen
5860abf0f9 docs(dow-jones): update CLAUDE.md after offboarding
Removes the Dow Jones row from the client/profile table and the four
Dow Jones profile names from the pre-session profile-load checklist.
Also updates the intro paragraph counts (9 clients, 15 profiles, 60+
checks) and drops Dow Jones from the client name list, so the intro
no longer contradicts the table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:14:30 +02:00
nickviljoen
d1826d83f1 chore(dow-jones): remove client_config entry
Drops the 'dow_jones' block from CLIENT_PROFILES. After this, the
client picker no longer renders Dow Jones; the four archived profiles
are unreachable from user flows. Nine clients remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:12:47 +02:00
nickviljoen
b23b7f2e17 chore(dow-jones): archive profiles, checks, and per-client doc
Moves the Dow Jones / MarketWatch / WSJ profile JSONs (4), check apps
(22), and CLAUDE_DOW_JONES.md into backend/_archive/dow_jones/. All
moves use git mv so history follows. Adds a restore-instructions
README. No loader changes needed — the archive lives outside the
scanned directories.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:11:54 +02:00
nickviljoen
69f6abca56 docs(dow-jones): add Phase 1 implementation plan
Step-by-step plan that turns the spec into 5 tasks: archive moves
(one commit), client_config edit (one commit), CLAUDE.md edits (one
commit), full verification, then push + PR with explicit user-confirm
gates. Defensive guards at each task halt execution if the codebase
has drifted from the spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:09:14 +02:00
nickviljoen
8437b63871 docs(dow-jones): add Phase 1 spec for client offboarding
Captures the design for removing Dow Jones from Visual AI QC: archive
location (backend/_archive/dow_jones/), file moves, code edits, things
explicitly not touched, and verification commands. Implementation
follows in subsequent commits on this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:06:10 +02:00
nickviljoen
9746ba249b docs: refresh CLAUDE_AXA.md status + add AI-usage breakdown
Updates the AXA client doc to reflect the 2026-05-10 state:
- Status line now reads 2026-05-10, covers Phase 6 (veraPDF), profile split,
  and dev deploy
- New "AI usage across AXA tools" section for client-facing communication
  (8 of 9 tools deterministic, only axa_pdf_diff uses AI)
- Open items expanded to include the pending source-PDF request and the
  prod-deployment hold

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 11:54:24 +02:00
nickviljoen
a80ff6dee4 Merged in feature/axa-accessibility-profile-split (pull request #2)
2026-05-10 11:21:34 +02:00
nickviljoen
a1cfc75309 Merge remote-tracking branch 'origin/develop' into feature/axa-accessibility-profile-split
# Conflicts:
#	CLAUDE_AXA.md
2026-05-10 11:20:09 +02:00
nickviljoen
a46ba9fc71 Split AXA accessibility check into its own profile
Removed axa_pdf_accessibility from axa_policy_document (was 8 checks, now 7)
and created a new axa_accessibility profile that contains only that check.
Marked the new profile strict_grade: true so a single PDF/UA-1 rule failure
forces an unmistakable Fail badge on the report — mirrors how axes4 PAC is
used in practice (single-purpose, binary verdict).

Lets users run accessibility-only QC without sitting through the rest of
the policy-document checks, and removes weight from the policy-document
score that the accessibility check wasn't really earning (its 0/10 verdict
was dragging the overall grade in a way that obscured the content checks).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 11:15:46 +02:00
Nick Viljoen
02ce0c774d Merged in feature/axa-verapdf-integration (pull request #1)
Wire veraPDF into axa_pdf_accessibility for PAC-equivalent PDF/UA-1 validation
2026-05-10 08:41:22 +00:00
nickviljoen
2aeff24136 Wire veraPDF into axa_pdf_accessibility for PAC-equivalent PDF/UA-1 validation
AXA's accessibility QC team uses axes4 PAC (PDF/UA-1 / Matterhorn Protocol)
as their compliance gate, but our existing 9-criterion deterministic check
runs surface-level only and would pass documents PAC fails. Wired up the
existing _run_verapdf() stub so veraPDF — the open-source Matterhorn
implementation — runs as a subprocess and drives the score when available.

Verified locally: veraPDF on EAA_v1.pdf reports the exact same Content (86)
and Metadata (1) failure counts as PAC's report on the same document family,
confirming protocol parity.

Falls back cleanly to the deterministic layer when veraPDF isn't installed,
so deploys are safe before the binary lands on dev/prod servers.
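
The fallback behaviour might look like the sketch below. The two runner callables and the result keys are hypothetical, and the only assumption about veraPDF itself is that its launcher is on PATH as `verapdf`; this is not the code in axa_pdf_accessibility.

```python
import shutil
from typing import Callable, Dict, Optional


def run_accessibility_check(
    pdf_path: str,
    run_verapdf: Callable[[str], Dict],
    run_deterministic: Callable[[str], Dict],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict:
    """Use veraPDF when the binary is available, else the deterministic layer."""
    if which("verapdf") is not None:
        try:
            return {"engine": "verapdf", **run_verapdf(pdf_path)}
        except Exception as exc:
            # A broken veraPDF install should not fail the whole check.
            fallback = run_deterministic(pdf_path)
            return {"engine": "deterministic", "verapdf_error": str(exc), **fallback}
    return {"engine": "deterministic", **run_deterministic(pdf_path)}
```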

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 10:36:03 +02:00
nickviljoen
418be66498 Merge develop into main: AXA document-mode + Boots PPack + WSJ tuning + docs restructure for v1.2.0
Major additions:
- AXA document-mode QC pipeline: multi-page PDF analysis (Phases 1, 3, 4, 5).
  New profiles axa_policy_document (8 checks) + axa_policy_document_diff (1 check).
  document_mode/ subsystem reused by Boots PPack.
- Boots Production Pack profile: 7-check multi-page document-mode QC built on AXA
  spine, with page classifier (cover/checklist/palette/notes/artwork) and
  artwork-page-only strict-grade exemption. Includes tuned prompts (closed-world
  brand-list semantics, ALL CAPS retail convention exception, stylised logotype
  exception, vision-LLM caveat fields for font weight / superscript / sizing).
- WSJ Static prompt tuning: complete-sentence whitelist for capitalization decision
  tree, graphic/illustrative headline awareness in typography check, split-layout
  logo placement convention, mandatory 30% sizing assessment with score cap.
  Validated across 3 WSJ-NY assets and 3 tuning iterations.

Documentation:
- CLAUDE.md restructured: monolithic 962-line root split into slim 211-line
  project-wide root + per-client CLAUDE_<CLIENT>.md files (10 clients fully
  covered). Auto-loaded session context drops ~88%. Added explicit session-start
  convention pointing to the right client doc.
- Stale 932-line backend/CLAUDE.md replaced with 18-line redirect.

Other:
- Box OAuth (PR1: token storage + redirect URI inference)
- Access-request endpoint fix (list_access_entries iteration)
2026-05-06 12:47:15 +02:00
nickviljoen
078a1f9a86 Merge feature/docs-restructure into develop: slim project-wide CLAUDE.md, full per-client doc coverage
2026-05-06 12:30:45 +02:00
nickviljoen
59a0b2408c Restructure CLAUDE.md docs: slim project-wide root, complete per-client coverage
Splits the monolithic CLAUDE.md (962 lines) into a slim project-wide root (211 lines)
plus per-client files. Auto-loaded context drops ~88% per session.

Changes:
- CLAUDE.md slimmed to project-wide essentials (architecture, auth, deployment, branch
  strategy, deploy scripts, prod troubleshooting, pre-session checklist). Adds explicit
  session-start convention pointing to CLAUDE_<CLIENT>.md for client-specific work.
  Updates client roster table to all 10 clients with profile counts.
- New CLAUDE_AXA.md: document-mode pipeline + axa_policy_document profiles
- New CLAUDE_DIAGEO.md: key_visual + packaging profiles, check inventories
- New CLAUDE_UNILEVER.md: profiles + zero-score logic for face/new visibility
- New CLAUDE_HONDA.md, CLAUDE_RANK.md, CLAUDE_GENERAL.md: stubs (clients use generic
  profiles only — kept for completeness and future expansion)
- backend/CLAUDE.md: stale 932-line duplicate replaced with 18-line redirect to root
  + backend-specific quick pointers

Per-client files (CLAUDE_LOREAL.md, CLAUDE_AMAZON.md, CLAUDE_BOOTS.md,
CLAUDE_DOW_JONES.md) unchanged — already had the right content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 12:29:16 +02:00
nickviljoen
f5aaf8da24 Merge feature/dow-jones-tuning into develop: WSJ Static prompt tuning
2026-05-06 12:03:56 +02:00
nickviljoen
9acda38adc Merge feature/boots-ppack into develop: Boots Production Pack profile (multi-page document mode) + tuned prompts
2026-05-06 12:03:51 +02:00
nickviljoen
f493a0182a Merge feature/axa-document-mode into develop: document-mode QC pipeline (Phases 1, 3, 4, 5)
2026-05-06 12:03:42 +02:00
nickviljoen
3b76bf2c9c Tune WSJ Static prompts: cap whitelist, graphic headline, split-layout logo, 30% sizing cap
- wsj_capitalization_punctuation: explicit complete-sentence whitelist + soft-flag pattern for Rule 5 price formatting (price_spacing_correct / price_bolded_correct accept needs_manual_check, new price_formatting_caveat field)
- wsj_typography_hierarchy: graphic/illustrative headline awareness — large stylised serif price/number graphics are recognised as the display headline; surrounding sans-serif copy is correctly classified as subhead/body. Stylised price headlines exempt from the period rule.
- wsj_logo_compliance: horizontal logo placement allows anchoring to the copy block on split/asymmetric layouts; mandatory sizing assessment block with worked examples, score capped at 6/10 for logos exceeding 30% of longest side.

Validated on 3 WSJ-NY test assets across 3 iterations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 12:01:59 +02:00
nickviljoen
cec11f1f6a Tune Boots PPack prompts: superscript guard, ALL CAPS / logotype exceptions, weight/sizing limits
Three rounds of prompt tuning against the Remington (4p), Easter Overlay
(18p), and Grenade (7p) sample packs. Easter Overlay (the noisiest)
climbed 72.38 → 78.97 → 80.04 across iterations, with strict-grade
violations dropping 27 → 18 → 14. Remaining violations are now genuine
compliance issues — the noise patterns are cleared.

boots_caveat_compliance:
- Superscript guard: vision LLM was flagging every roundel asterisk as
  superscript because the * glyph naturally sits high in its line.
  Strict two-feature rule now required (raised baseline AND visibly
  shrunk ~50-60% of body). Borderline cases → "needs_manual_check"
  with new superscript_caveat field. Caveat avg 4.4 → 7.27.
- Same vision-LLM caveat applied to weight_matching (Light vs Regular
  at small sizes is below detection threshold) and sizing_compliant
  (1-2pt size differences below detection threshold). New weight_caveat
  and sizing_caveat fields. Reserved 1-2 score band for unambiguous
  critical violations only.
- Explicit scoring principle: "when in doubt, prefer 7-8 with
  manual_check flags over a lower confident-violation score".

boots_brand_name_accuracy:
- ALL CAPS retail convention now explicitly acceptable. L'OREAL,
  ESTEE LAUDER, MAYBELLINE etc. no longer flagged as casing errors —
  only structural element mismatches (accents, hyphens, apostrophes,
  special chars) count.
- Stylised brand logotype exception: known logomarks like `17` for
  SEVENTEEN, &SISTERS ampersand styling, e.l.f. dot rendering are
  Pass — surfaced via new logotype_observations field.
- Brand name avg 5.53 → 7.47 → 6.67 (LLM run-to-run variability).

Strongest real catch in dataset: Easter Overlay page 14 is labelled
for the ROI market in production notes but uses £ instead of € on
the artwork. Exactly the pre-press error worth surfacing. Caught
consistently across all runs by boots_currency_locale.

CLAUDE_BOOTS.md updated with three-pack smoke-test table, vision-LLM
limitations summary, and the four reusable prompt-tuning patterns
that worked on this build.

Local-only — feature/boots-ppack remains unmerged until after Boots
show-and-tell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:26:11 +02:00
nickviljoen
50d0063b37 Add Boots Production Pack profile (multi-page document mode)
New profile boots_ppack for QCing multi-page Boots production packs
(PowerPoint-exported PDFs, 4-18 pages each). Built on top of AXA's
document-mode infrastructure — branched off feature/axa-document-mode
because it reuses the dispatcher, ingest, and result writer.

New checks:
- boots_logo_compliance — three-path scoring (master wordmark / partner
  lock-up / no branding) so OLIVER x BOOTS-style footer lock-ups aren't
  scored against master wordmark rules. Conservative without a formal
  Boots logo guideline.
- boots_colour_palette — verifies CMYK/RGB/Hex spec values on creative-
  guidance pages against canonical Boots Blue / Health Primary Blue /
  Offer Red, plus visual sanity-check on artwork pages.

Existing checks tuned:
- boots_brand_name_accuracy: closed-world list semantics. Brands not on
  the approved list now go to names_not_on_list (manual review) instead
  of failing — the list is sourced from the original 7 docs and is known
  incomplete (Remington, Imodium, Maybelline etc. are legitimate Boots-
  stocked brands not on it).
- boots_tandc_wording: explicit font-weight caveat — Boots Sharp Regular
  vs Light isn't reliably distinguishable by vision LLM at small sizes.
  Surfaced via font_weight_caveat field + needs_manual_check value.

Page classifier (document_mode/page_classifier.py):
Heuristic tags each page as cover / checklist / palette / notes /
artwork. Validated on all 10 sample packs.

Strict-grade exemption (Profile.strict_grade flag):
Only artwork-classified pages count towards Pass/Fail. Cover, checklist,
palette, and notes pages are still QC'd and reported as Informational
but cannot trigger a Fail. Banner shows exactly which artwork-page
checks fell below 6.
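
The exemption rule can be sketched as follows. The per-page result shape (`page_type`, `score`, `check`) and the function name are assumptions; only the artwork-only rule and the below-6 threshold come from the description above.

```python
def strict_grade(page_results, threshold: float = 6.0) -> dict:
    """Fail only when an artwork-classified page scores below the threshold."""
    violations = [
        r for r in page_results
        if r["page_type"] == "artwork" and r["score"] < threshold
    ]
    return {"verdict": "Fail" if violations else "Pass", "violations": violations}
```

Cover, checklist, palette, and notes pages pass straight through the filter, so a low score there is reported but never changes the verdict.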

Result writer extended:
- Per-page table with score + page_type pill for any page_each-scope
  check (auto-applied as fallback)
- Strict-grade banner (red on violation, green when clean)
- Page_type pills throughout the per-page strip

Smoke-test result (Remington 4-page pack, 2026-05-05):
Overall 70.75/100, strict-grade Fail. After two iterations of prompt
tuning, all three remaining strict-grade violations are real catches:
orphan asterisk in T&Cs, "they may not be stocked" wording deviation,
missing "Charges may apply". brand_name_accuracy 7.0 (was 3.0 before
list fix), logo_compliance 9.5 (was 1.5 before lock-up path fix).

Local-only — not pushed to dev or merged to develop until after Boots
show-and-tell. Same posture as feature/axa-document-mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:47:13 +02:00
nickviljoen
90563b8cf2 Add AXA document-mode QC pipeline (Phases 1, 3, 4, 5)
Multi-page PDF QC for AXA Ireland policy documents. Runs as a third mode
alongside static + video, gated on profile.mode. New code isolated under
backend/document_mode/ with new endpoints under /api/document/*.

Phase 1 — Spine + 6 deterministic doc-scope checks ($0, runs in seconds):
- Scope-aware dispatcher (document/targeted/page_sample/page_pair/page_each)
- axa_font_inventory, axa_phone_inventory, axa_bold_words_definitions,
  axa_page_numbering, axa_print_code, axa_omg_versioning
- Bootstrap bold-words dictionary extracted from Example 1 General Definitions

Phase 3 — Old-vs-new diff (~$0.50/run, 3-5 min):
- Page alignment via difflib SequenceMatcher (windowed fuzzy match)
- Vision-LLM page-pair diff via Gemini 2.5 Pro (8 concurrent)
- Two-slot upload UX, axa_policy_document_diff profile, mode=document_diff
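
One way to read "windowed fuzzy match" is sketched below: each old page is compared against new pages near its expected position and paired with the best match above a threshold. The window size, the threshold, and the function name are assumptions, not values from this commit.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def align_pages(old_pages: List[str], new_pages: List[str],
                window: int = 3, threshold: float = 0.6
                ) -> List[Tuple[int, Optional[int]]]:
    """Pair each old page index with its best-matching new page index, or None."""
    pairs: List[Tuple[int, Optional[int]]] = []
    cursor = 0  # expected position of the next match in new_pages
    for i, old_text in enumerate(old_pages):
        lo = max(0, cursor - window)
        hi = min(len(new_pages), cursor + window + 1)
        best_j: Optional[int] = None
        best_ratio = 0.0
        for j in range(lo, hi):
            ratio = SequenceMatcher(None, old_text, new_pages[j]).ratio()
            if ratio > best_ratio:
                best_j, best_ratio = j, ratio
        if best_ratio < threshold:
            best_j = None  # no page in the window is similar enough
        pairs.append((i, best_j))
        cursor = best_j + 1 if best_j is not None else cursor + 1
    return pairs
```

The window keeps the number of comparisons roughly linear in page count and lets the alignment survive inserted or removed pages, as in a 68→74 page revision.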

Phase 4 — PDF accessibility (PyMuPDF, $0):
- 9 PDF/UA-1 aligned criteria (tagged structure, /MarkInfo, title, /Lang,
  encryption, font embedding, PDF version, XMP UA-conformance, alt-text)
- _run_verapdf() stub for optional Java-based veraPDF integration later

Phase 5 — Print preflight (PyMuPDF, $0):
- 7 criteria (page geometry, bleed, image colour spaces, image DPI,
  transparency, PDF/X conformance, spot colours)

Profile additions:
- axa_policy_document — 8 deterministic checks, $0 cost
- axa_policy_document_diff — 1 page-pair LLM check, ~$0.50/run

API additions:
- POST /api/document/start_analysis (single PDF)
- POST /api/document/start_diff (old + new PDFs)

Frontend additions:
- Third profile.mode value (document_diff) in applyProfileMode()
- Two-slot upload UX with PDF-only file pickers
- checkFormValidity() branches by mode for the analyse-button gate

Smoke-tested locally against Example 1 (Home Insurance V8, 86pp) and
Example 2 (Landlord V1 vs V10, 68→74pp). Real findings caught include
bold-words gaps, a missing PDF/UA flag, transparency on press, and
V1→V10 bold-formatting fixes. Plan, integration map, and gotchas are in
backend/AXA_DOCUMENT_MODE_PLAN.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:38:14 +02:00
nickviljoen
67ed7fdd9d Add WSJ podcast profile to Dow Jones client; add file-naming check to all profiles
2026-04-29 18:17:36 +02:00
nickviljoen
b32e8f0c8b Add WSJ podcast profile to Dow Jones client; add file-naming check to all profiles
2026-04-29 18:09:58 +02:00
nickviljoen
24c716df77 Fix /api/access_request iterating list_access_entries() as a list
list_access_entries() returns a dict {default_clients, entries} but the
endpoint iterated it directly, which yields the dict keys (strings) and
then crashed on .get('is_admin') with "'str' object has no attribute
'get'". Read access_data['entries'] instead so admin recipients are
collected correctly and the request email actually sends.
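
The failure mode is easy to reproduce in isolation. In the sketch below the `email` field and both function names are hypothetical; the `default_clients` / `entries` shape and the `is_admin` lookup come from the description above.

```python
def admin_recipients_buggy(access_data: dict) -> list:
    # Iterating a dict yields its keys (strings), so .get() fails on the first one.
    return [e["email"] for e in access_data if e.get("is_admin")]


def admin_recipients(access_data: dict) -> list:
    # Fixed: iterate the list stored under 'entries'.
    return [e["email"] for e in access_data["entries"] if e.get("is_admin")]


access_data = {
    "default_clients": ["general"],
    "entries": [
        {"email": "admin@example.com", "is_admin": True},
        {"email": "user@example.com", "is_admin": False},
    ],
}
```

Calling `admin_recipients_buggy(access_data)` raises `AttributeError: 'str' object has no attribute 'get'`, which matches the error quoted in the message.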

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 08:24:22 +02:00
nickviljoen
24ea62b082 Fix /api/access_request iterating list_access_entries() as a list
list_access_entries() returns a dict {default_clients, entries} but the
endpoint iterated it directly, which yields the dict keys (strings) and
then crashed on .get('is_admin') with "'str' object has no attribute
'get'". Read access_data['entries'] instead so admin recipients are
collected correctly and the request email actually sends.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 08:20:25 +02:00
nickviljoen
f17a4ed6da Box redirect URI: infer from hostname when X-Forwarded-Host is absent
The previous fix relied on Apache forwarding X-Forwarded-Host, but on
optical-dev that header isn't set. Apache uses ProxyPreserveHost (so
request.host correctly resolves to optical-dev.oliver.solutions) but the
backend connection is plain http and Flask sees no path prefix, so the
fallback emitted "http://optical-dev.oliver.solutions/auth/box/callback"
— which Box rejected as "insecure_redirect_uri" (no HTTPS) and which is
also missing the required /ai_qc/ prefix.

Resolution order is now:
  1. BOX_REDIRECT_URI env var (escape hatch / unusual deploys).
  2. X-Forwarded-Host header if Apache happens to send it.
  3. Otherwise: infer from request.host. Any host that isn't localhost
     or 127.0.0.1 is treated as the optical-dev / optical-prod proxy and
     gets HTTPS + the /ai_qc/ prefix. localhost stays http and rootless.
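
The resolution order can be sketched as a pure function. `resolve_redirect_uri` and its parameters are hypothetical, and treating the X-Forwarded-Host case as HTTPS plus the /ai_qc/ prefix is an assumption drawn from the proxy setup described here; the real helper is _box_redirect_uri() in api_server.

```python
from typing import Optional

CALLBACK_PATH = "/auth/box/callback"
LOCAL_HOSTNAMES = ("localhost", "127.0.0.1")


def resolve_redirect_uri(env_override: Optional[str],
                         forwarded_host: Optional[str],
                         request_host: str) -> str:
    # 1. An explicit BOX_REDIRECT_URI always wins.
    if env_override:
        return env_override
    # 2. A forwarded host means we are behind the reverse proxy (assumption:
    #    HTTPS and mounted at /ai_qc/).
    if forwarded_host:
        return f"https://{forwarded_host}/ai_qc{CALLBACK_PATH}"
    # 3. Otherwise infer from the request host; strip any port to classify it.
    hostname = request_host.split(":")[0]
    if hostname in LOCAL_HOSTNAMES:
        return f"http://{request_host}{CALLBACK_PATH}"
    return f"https://{request_host}/ai_qc{CALLBACK_PATH}"
```

Keeping the logic free of Flask's `request` object makes each path easy to check in isolation.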

Verified all five paths (dev with and without XF-Host, laptop on
localhost and 127.0.0.1, explicit override) produce the right URL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:55:14 +02:00
nickviljoen
7c3945417a Compute Box OAuth redirect URI from the request
Caught a redirect_uri_mismatch on the dev server: the env file was the
localhost one (BOX_REDIRECT_URI=http://localhost:7183/auth/box/callback)
which deploy.sh resets on every deploy, so the dev server kept telling Box
"redirect me to localhost". Same thing would have hit prod.

Switched to request-based detection so the same code works on laptop, dev,
and prod:
- box_client.build_authorize_url and exchange_code_for_tokens now take
  redirect_uri as an explicit parameter (the two URIs MUST match — Box
  rejects the token exchange otherwise).
- New _box_redirect_uri() helper in api_server: prefers BOX_REDIRECT_URI
  if explicitly set (escape hatch), otherwise reads X-Forwarded-Host (set
  by Apache when behind the optical-dev / optical-prod reverse proxy,
  where the app is mounted at /ai_qc/), and falls back to request.host
  for direct local access.
- Dropped the per-env BOX_REDIRECT_URI from the four env files. Templates
  keep it commented out as documentation, and now also list all three
  redirect URIs you'll need to register in the Box developer console.
- box_client.is_configured() no longer gates on the redirect URI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:50:59 +02:00
nickviljoen
4939f990c5 Merge feature/box-oauth into develop (PR1: Box OAuth + token storage)
2026-04-27 15:42:46 +02:00