marriott-box-image-video-ta.../deploy
DJP 9e6a75feb6 Manual-only runs, DB-based skip check, backfill-from-Box
Previously, a nightly APScheduler container ran the tagger against
every file in the configured Box folder. With ~5000 files expected,
that means ~5000 Box HTTP calls every night just to ask "is this
tagged?". Switch to manual-only runs and take the skip decision from
the local DB instead.

- `db.is_file_already_tagged(conn, file_id)` — returns True iff the
  DB has a row with status IN ('success','backfilled'). Used by both
  image and video loops in main.py instead of the previous
  `check_existing_metadata(box_client, file_id)` Box round-trip.
- `fetch_existing_metadata(box_client, file_id)` (main.py) — returns
  the user-defined template fields as a flat dict by stripping the
  Box `$id`/`$type`/etc. attrs from the SDK response.
- `_run_backfill(run_id, db_conn)` (main.py) — walks the Box folder
  and inserts a `status='backfilled'` row for every file Box already
  has marriottUsa metadata for. Read-only against Box; safe to re-run.
  Use this after first deploy, or to repopulate the DB from Box.
- `POST /api/backfill` mirrors `POST /api/runs` (background thread,
  same live-state record).
- SPA: new "Backfill from Box" button next to "Run now" (with a
  confirm dialog and a yellow `.status-backfilled` event treatment).
- docker-compose.yml: removed the `tagger` (scheduler) service.
  Manual triggers via the SPA / `POST /api/runs` only. scheduler.py
  stays in the repo for archival / opt-back-in.
- deploy.sh: readiness now checks the `api` container instead of
  `tagger`; `--logs` tails api logs.
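
The skip check, the metadata flattening, and the re-runnable backfill
described above can be sketched roughly as follows. This is a minimal
illustration, not the code in db.py or main.py: the `tag_results` table,
its columns, the helper names `strip_box_system_fields` and
`record_backfilled`, and the use of `sqlite3` (instead of the project's
Postgres connection) are all assumptions made to keep the example
self-contained.

```python
import sqlite3


def is_file_already_tagged(conn, file_id):
    """Return True iff the DB has a 'success' or 'backfilled' row for file_id."""
    row = conn.execute(
        "SELECT 1 FROM tag_results "
        "WHERE file_id = ? AND status IN ('success', 'backfilled') LIMIT 1",
        (file_id,),
    ).fetchone()
    return row is not None


def strip_box_system_fields(metadata):
    """Keep user-defined template fields; Box system attrs start with '$'."""
    return {k: v for k, v in metadata.items() if not k.startswith("$")}


def record_backfilled(conn, file_id):
    """Insert a 'backfilled' row unless the file already counts as tagged.

    Skipping existing rows is one (assumed) way to make a backfill safe
    to re-run; the real _run_backfill may do this differently.
    """
    if is_file_already_tagged(conn, file_id):
        return False
    conn.execute(
        "INSERT INTO tag_results (file_id, status) VALUES (?, 'backfilled')",
        (file_id,),
    )
    return True


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag_results (file_id TEXT, status TEXT)")
conn.execute("INSERT INTO tag_results VALUES ('111', 'success')")
conn.execute("INSERT INTO tag_results VALUES ('222', 'error')")

print(is_file_already_tagged(conn, "111"))  # tagged: status is 'success'
print(is_file_already_tagged(conn, "222"))  # not tagged: status is 'error'
print(record_backfilled(conn, "333"))       # first backfill inserts a row
print(record_backfilled(conn, "333"))       # re-run is a no-op
```

With a check like this, each run costs one local query per file rather
than one Box round-trip, and a failed file (any status other than
'success' or 'backfilled') is still picked up on the next manual run.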

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 15:41:10 -04:00
apache-marriott-tagging.conf.tmpl Dockerize, add Postgres request log, FastAPI + React SPA 2026-05-11 14:56:58 -04:00
deploy.sh Manual-only runs, DB-based skip check, backfill-from-Box 2026-05-11 15:41:10 -04:00