ai_qc/backend/BOX_CLIENT_ONBOARDING.md
nickviljoen 31b059de79 docs: add Box client onboarding runbook
Documents the end-to-end process for adding a new client to the
Box-webhook-driven QC pipeline:

1. Box admin: create INCOMING + REPORTS folders, invite service account
2. Code: add box_folder_id / box_reports_folder_id / default_profile
   to client_config.py, ship via PR
3. Verify service account access with `box_setup.py list-folder`
4. Register webhook via `box_setup.py register-all-clients` (or UI)
5. End-to-end test by uploading a sample asset, watching logs,
   confirming report appears + source moves to _PROCESSED
6. Optional: tune default_profile from the Settings UI without a code
   deploy
7. Promote to prod (develop→main PR, tag, deploy.sh prod)

Includes a gotchas table for the issues most likely to come up:
403s from missing collaborator invites, signature verification
failures, folder ID mismatches, replace-upload behavior, etc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 14:12:48 +02:00


Box Client Onboarding Runbook

This runbook adds a new client to the Box-webhook-driven QC pipeline (Phase 4). Run through it once per client. Most steps take about 5 minutes; allow about 30 minutes in total, including Box admin turnaround for collaborator invites.

Architectural reference: the JWT auth + webhook endpoint live in backend/box_jwt_client.py and backend/api_server.py (search for _run_box_triggered_analysis). The admin CLI is backend/scripts/box_setup.py. The JWT auth coexists with an older per-user OAuth flow in backend/box_client.py — different code path, dormant scaffolding, not used by this pipeline.


What you need before starting

  • Box admin access (or someone who can act as one) — to create folders and invite the service account.
  • SSH access to the dev server (optical-production-dev) — to run the bootstrap CLI and tail logs.
  • Repo write access — to land the client_config.py change as a PR.
  • The client's profile decisions — which profile should be the unattended-run default? (Pick from the client's existing profiles list.)

Already done at the platform level (don't redo per-client):

  • JWT config JSON at /opt/ai_qc/backend/config/box_jwt_config.json on each server
  • BOX_WEBHOOK_PRIMARY_KEY + BOX_WEBHOOK_SECONDARY_KEY in each server's env file
  • ffmpeg installed (for video pre-flight)

Step 1 — Box-side prep (admin task)

For client <CLIENT> (e.g. Diageo):

  1. Create two folders in Box:

    • AI-QC > INCOMING > AI QC <CLIENT> IN — where source assets land
    • AI-QC > REPORTS > AI QC <CLIENT> REPORTS — where QC reports land
  2. Invite the JWT service account as a collaborator on BOTH folders. Role: Editor or higher. (Editor lets it read uploads, write reports, and move files into the auto-created _PROCESSED subfolder. Co-owner also works.)

  3. Capture the folder IDs. Box shows them in the URL when you open a folder, or you can list them programmatically once invites are in:

    cd /opt/ai_qc
    venv/bin/python backend/scripts/box_setup.py list-folder <parent_AI-QC_folder_id>
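
If you are copying IDs out of browser URLs, a small helper can avoid transcription slips. This is a minimal sketch, assuming the usual Box web-app URL shape in which the folder ID is the numeric segment after /folder/. The function name folder_id_from_url is hypothetical and not part of box_setup.py.

```python
import re
from urllib.parse import urlparse


def folder_id_from_url(url: str) -> str:
    """Return the folder ID from a Box web-app folder URL, as a string.

    Assumes the URL path contains '/folder/<digits>'.
    """
    path = urlparse(url).path
    match = re.search(r"/folder/(\d+)", path)
    if match is None:
        raise ValueError(f"no folder ID found in URL: {url!r}")
    return match.group(1)


# Placeholder ID for illustration only.
print(folder_id_from_url("https://app.box.com/folder/123456789012"))  # prints 123456789012
```

The ID is returned as a string on purpose, since client_config.py stores folder IDs as strings.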
    

Step 2 — Code change

Edit backend/client_config.py and add three optional fields (box_folder_id, box_reports_folder_id, default_profile) to the client's entry:

'<client_id>': {
    'name': 'Client Display Name',
    'profiles': ['client_specific_profile', 'static_general', 'video_general'],
    'display_name': 'Client Display Name',
    'description': '...',
    'box_folder_id': '<INCOMING folder ID>',
    'box_reports_folder_id': '<REPORTS folder ID>',
    'default_profile': '<one of the profiles above>',
},

Then:

  • Push as a small PR → merge to develop
  • On the dev server: cd /opt/ai_qc && ./backend/scripts/deploy.sh dev
  • No env-file backup dance needed (this is a code-only change)
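
Before opening the PR, it can help to sanity-check the new entry. The following is a minimal sketch, not part of the codebase: validate_box_fields is a hypothetical helper, and it assumes only what this runbook states (folder IDs are strings, and default_profile must be one of the client's profiles) plus the fact that Box folder IDs are numeric.

```python
def validate_box_fields(client_id: str, entry: dict) -> list:
    """Return a list of human-readable problems with a client's Box fields."""
    problems = []

    # Folder IDs must be strings of digits; an int such as 123 would never
    # match the string ID compared against at webhook time.
    for key in ("box_folder_id", "box_reports_folder_id"):
        value = entry.get(key)
        if not isinstance(value, str) or not value.isdigit():
            problems.append(
                f"{client_id}: {key} should be a numeric string, got {value!r}"
            )

    # The unattended-run default has to be a profile the client actually has.
    default = entry.get("default_profile")
    if default not in entry.get("profiles", []):
        problems.append(
            f"{client_id}: default_profile {default!r} is not in profiles"
        )

    return problems
```

An empty list means the entry passes these checks. It does not prove the IDs point at the right folders; Step 3 covers that.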

Step 3 — Verify the service account got access

Before registering webhooks, sanity-check that the service account can actually read the folders the admin invited it to:

cd /opt/ai_qc
venv/bin/python backend/scripts/box_setup.py list-folder <INCOMING folder ID>
venv/bin/python backend/scripts/box_setup.py list-folder <REPORTS folder ID>

Expected: each command prints Folder <id> contains N items: even when the folder is empty (N = 0).

If you get Access Denied / HTTP 403: the service account isn't actually a collaborator yet. Box admin needs to retry the invite. Common causes:

  • Invite went to the wrong identity (Box has separate "user" and "app" identities — the JWT app is an app)
  • Invite is pending acceptance somewhere
  • Folder was created but invite wasn't applied at the right level

Don't proceed until both list-folder calls succeed.
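
If you onboard several clients at once, the two checks above can be scripted. This is a rough sketch under stated assumptions: it matches only the output strings quoted in this step (Folder <id> contains N items: on success, Access Denied or 403 on failure), and the function names are hypothetical. The real CLI output may differ in detail, so treat "unknown" as a prompt to read the output yourself.

```python
import re
import subprocess

# Success line as quoted in this runbook; the exact wording is an assumption.
OK_PATTERN = re.compile(r"Folder (\S+) contains (\d+) items?:")


def interpret_list_folder(output: str) -> str:
    """Classify list-folder output as 'ok', 'no-access', or 'unknown'."""
    if OK_PATTERN.search(output):
        return "ok"
    if "403" in output or "access denied" in output.lower():
        return "no-access"
    return "unknown"


def check_folder(folder_id: str) -> str:
    """Run the list-folder command from this step and classify its output."""
    result = subprocess.run(
        ["venv/bin/python", "backend/scripts/box_setup.py", "list-folder", folder_id],
        cwd="/opt/ai_qc",
        capture_output=True,
        text=True,
    )
    return interpret_list_folder(result.stdout + result.stderr)
```

Call check_folder once for the INCOMING folder ID and once for the REPORTS folder ID, and proceed only when both return "ok".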


Step 4 — Register the V2 webhook

Option A: CLI (recommended) — idempotent, batch-able, lives in version control:

cd /opt/ai_qc
venv/bin/python backend/scripts/box_setup.py register-all-clients \
    https://optical-dev.oliver.solutions/ai_qc/api/box/webhook

The script:

  • Scans client_config.py for every client with box_folder_id set
  • For each, checks Box for an existing webhook on that folder pointing at the given URL
  • Skips ones that already exist
  • Creates webhooks for any that are missing
  • Prints <client> (<folder_id>): CREATED webhook id=<id> or SKIP — webhook already exists

Safe to re-run any time; it won't duplicate.
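
The skip-or-create decision described above can be sketched as a pure function. This is an illustration of the idempotency idea only, not the script's actual implementation: plan_webhooks and the shape of the existing-webhook records (folder_id and address keys) are assumptions made for the example.

```python
def plan_webhooks(clients: dict, existing: list, url: str) -> list:
    """Decide, per client, whether a webhook must be created or can be skipped.

    clients:  mapping of client_id -> config entry (as in client_config.py)
    existing: webhooks already registered in Box, as dicts with
              'folder_id' and 'address' keys (hypothetical shape)
    url:      the webhook address being registered
    Returns a list of (client_id, folder_id, action) tuples.
    """
    already = {(w["folder_id"], w["address"]) for w in existing}
    plan = []
    for client_id, entry in clients.items():
        folder_id = entry.get("box_folder_id")
        if not folder_id:
            continue  # client is not Box-enabled
        if (folder_id, url) in already:
            plan.append((client_id, folder_id, "SKIP"))
        else:
            plan.append((client_id, folder_id, "CREATE"))
            already.add((folder_id, url))
    return plan
```

Because the key is the (folder, address) pair, a dev webhook does not cause the prod webhook on the same folder to be skipped, which matches the separate prod registration in Step 7.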

Option B: Box Developer Console UI — useful for one-off testing:

  • Box Developer Console → your Custom App → Webhooks tab → Create Webhook
  • URL: https://optical-dev.oliver.solutions/ai_qc/api/box/webhook
  • Content Type: Folder → search/pick the client's INCOMING folder
  • Event Triggers: tick FILE.UPLOADED only (do not tick others — they'd trigger spurious webhook deliveries)
  • Save

No new signing keys to generate — they're app-level, configured once for the whole Custom App.
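
For context on what those keys do, here is a minimal sketch of webhook signature verification, following the scheme in Box's webhook documentation as we understand it: an HMAC-SHA256 over the raw request body followed by the delivery timestamp, base64-encoded, and accepted if it matches either the primary or the secondary signature header. The function names are hypothetical, the headers are assumed to arrive as an upper-cased dict, and this is not the code in api_server.py. A production check would also reject stale delivery timestamps.

```python
import base64
import hashlib
import hmac


def compute_signature(key: str, body: bytes, timestamp: str) -> str:
    """Base64 HMAC-SHA256 of the raw body followed by the delivery timestamp."""
    digest = hmac.new(
        key.encode("utf-8"), body + timestamp.encode("utf-8"), hashlib.sha256
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_signature_valid(
    body: bytes, headers: dict, primary_key: str, secondary_key: str
) -> bool:
    """Accept the delivery if either signature header matches its key."""
    timestamp = headers.get("BOX-DELIVERY-TIMESTAMP", "")
    candidates = (
        ("BOX-SIGNATURE-PRIMARY", primary_key),
        ("BOX-SIGNATURE-SECONDARY", secondary_key),
    )
    for header_name, key in candidates:
        received = headers.get(header_name)
        if not key or not received:
            continue
        expected = compute_signature(key, body, timestamp)
        # Constant-time comparison avoids leaking how many characters matched.
        if hmac.compare_digest(received.encode("utf-8"), expected.encode("utf-8")):
            return True
    return False
```

Accepting either key is what lets one key be rotated while the other stays valid.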


Step 5 — End-to-end test

Open one terminal:

sudo journalctl -u ai-qc.service -f

In Box: upload a small test asset (image, PDF, or video) to the client's INCOMING folder.

Within a few seconds you should see (timestamps abbreviated):

Box webhook: dispatching session=<ts> client=<client_id> profile=<default_profile> file_id=...
Box webhook: downloaded <file> → uploads-dev/<ts>/<file>
Running check 1/N: <check_name>
...
Box webhook: uploaded report QC_Report_<ts>_<file>.html → folder <REPORTS folder ID>
Box webhook: moved source → _PROCESSED/<ts>_<file>
Box webhook: analysis complete for session <ts>, score <N>

Then in Box, verify:

  • A new QC_Report_<ts>_<original-filename>.html exists in the REPORTS folder
  • The source file has been moved into the auto-created _PROCESSED subfolder inside INCOMING. Its new name has the session_id prefix, which ties back to the corresponding report.
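
If you would rather check a captured log than read it by eye, the expected sequence can be expressed as a list of milestones. This is a minimal sketch based only on the log lines quoted above. The helper name and milestone labels are hypothetical, and the substring matching assumes the real log lines keep this wording.

```python
# Substrings taken from the expected log lines in this step; {sid} is the
# session ID shown as <ts> in the runbook.
MILESTONES = {
    "dispatched": "Box webhook: dispatching session={sid}",
    "report_uploaded": "Box webhook: uploaded report QC_Report_{sid}_",
    "source_moved": "Box webhook: moved source → _PROCESSED/{sid}_",
    "complete": "Box webhook: analysis complete for session {sid}",
}


def missing_milestones(log_text: str, session_id: str) -> list:
    """Return the names of expected milestones not found in the log text."""
    return [
        name
        for name, template in MILESTONES.items()
        if template.format(sid=session_id) not in log_text
    ]
```

An empty result means the run reached every milestone. A result of source_moved and complete with report_uploaded also missing points at the report-upload failure described under Common gotchas.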

Step 6 — (Optional) Tune the default profile from the UI

If the team finds that the static default_profile in code doesn't match how they want webhook-triggered runs to behave, an admin can change it without a code deploy:

  1. Open the app → pick the client in the picker
  2. ⚙️ Settings → Default Profile tab
  3. Click a different profile → Set as default

The override is persisted to backend/client_defaults.json (gitignored, per-server) and takes effect on the next webhook run. The Revert to static default action clears the override.
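
The precedence between the UI override and the static default can be sketched as follows. This is an assumption-laden illustration: the runbook does not document the layout of client_defaults.json, so the example assumes a flat {client_id: profile} mapping, and both function names are hypothetical.

```python
import json
from pathlib import Path


def load_overrides(path: Path) -> dict:
    """Read the per-server override file; a missing file means no overrides."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def effective_default_profile(client_id: str, static_config: dict, overrides: dict):
    """Prefer a valid UI override, else fall back to the default in code."""
    entry = static_config[client_id]
    override = overrides.get(client_id)
    # Ignore overrides that name a profile the client no longer has.
    if override in entry.get("profiles", []):
        return override
    return entry.get("default_profile")
```

Because the file is per-server and gitignored, dev and prod can end up with different effective defaults for the same client.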


Step 7 — Promote to prod

After the dev test passes:

  1. PR develop → main on Bitbucket. Merge.
  2. Tag main: e.g. v1.2.0, push the tag.
  3. On the prod server (optical-production):
    cd /opt/ai_qc
    ./backend/scripts/deploy.sh prod v1.2.0
    
  4. Once-per-environment prod prerequisites (you only do these the first time prod gets Phase 4, never again):
    • JWT config JSON at /opt/ai_qc/backend/config/box_jwt_config.json (scp from your laptop, chmod 600)
    • BOX_WEBHOOK_PRIMARY_KEY + BOX_WEBHOOK_SECONDARY_KEY in production.env — these are the same app-level keys as dev
    • sudo apt install ffmpeg (for video pre-flight)
  5. Register webhooks pointing at the prod URL (different from dev's URL — each webhook is bound to one address):
    cd /opt/ai_qc
    venv/bin/python backend/scripts/box_setup.py register-all-clients \
        https://optical-prod.oliver.solutions/ai_qc/api/box/webhook
    

The Box folders themselves are shared — you don't create new prod-only folders. Both dev and prod webhooks fire on the same client folders. If you don't want prod handling uploads yet, just don't register the prod webhooks until you're ready.


Common gotchas

| Symptom | Likely cause | Fix |
|---|---|---|
| 403 from list-folder | Service account isn't a collaborator on that folder yet | Box admin re-invites with Editor role |
| Box webhook: signature verification failed in logs | Signing keys in env don't match what the Custom App has | Box Developer Console → Manage Signature Keys → regenerate → update env on each server → restart service |
| Box webhook: no client configured for Box folder <id> | The folder ID Box sent doesn't match any box_folder_id in client_config.py | Check client_config.py against the actual Box folder ID; they're strings, must match exactly |
| Box webhook: skipping non-QC extension <ext> | User uploaded a file type we don't QC (e.g. .docx, .zip) | Working as intended; document for the client |
| Webhook fires correctly but source file stays in INCOMING | The report-upload step failed earlier; the move is gated on a successful report upload so the user can retry by re-uploading | Look upstream in the log for failed to upload report to Box: <error> and fix the cause (usually a permissions issue on the REPORTS folder) |
| Re-uploading the same filename doesn't trigger a fresh webhook | This is normal Box V2 behavior: same-name "replace" uploads create new versions of the existing file, which the folder-scoped webhook doesn't fire on | The auto-move-to-_PROCESSED step solves this for the happy path. If a file got stuck in INCOMING because of a previous failure, move/delete it manually so the next upload is a genuinely new file |
| Reports folder fills up indefinitely | No auto-cleanup of old reports (by design) | Manual cleanup, or add an age-based pruning script as a follow-up |
| _PROCESSED folder not auto-created | Service account doesn't have Editor (Viewer can't create subfolders) | Box admin upgrades the collaborator role to Editor |
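
The "no client configured" row comes down to an exact string comparison. A minimal sketch of such a lookup shows why an integer ID or stray whitespace in client_config.py produces that log line. The function name client_for_folder is hypothetical, not the pipeline's actual code.

```python
from typing import Optional


def client_for_folder(folder_id, clients: dict) -> Optional[str]:
    """Return the client whose box_folder_id equals folder_id exactly."""
    for client_id, entry in clients.items():
        if entry.get("box_folder_id") == folder_id:
            return client_id
    return None


# Placeholder data for illustration only.
clients = {"acme": {"box_folder_id": "123456"}}
print(client_for_folder("123456", clients))   # prints acme
print(client_for_folder(123456, clients))     # prints None (int is not str)
print(client_for_folder(" 123456", clients))  # prints None (leading space)
```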

What this onboarding does NOT cover

  • Removing a client from the integration — to stop processing: delete the webhook in the Box Developer Console (or box_setup.py delete-webhook <webhook_id>), then remove the box_folder_id field from client_config.py in a PR. Existing reports in the REPORTS folder are left alone.
  • Multiple webhook-triggered profiles per client — current schema is one default profile per client. If a client needs FILE.UPLOADED in one folder to run profile A and a different folder to run profile B, that's a schema change (one client_config.py entry per folder, or extend the schema to {folder_id: profile_id} maps).
  • Webhook health monitoring — there's no alert if Box stops delivering. If you suspect webhooks are silent, drop a fresh test asset and watch logs; if nothing fires, check Box Developer Console → Webhooks → the webhook's App Diagnostics tab.