Run model: a long-running scheduler container (APScheduler) replaces the
systemd timer in Docker deployments. Each Gemini-analysed file is also
logged as a row in a Postgres `tagging_events` table (run_id, prompt, raw
response, validated metadata, Box-write outcomes, status, error, timing)
for search and audit. Box is still updated exactly as before and remains
the source of truth for "already tagged"; `db.log_event` swallows DB
failures so a database outage can't stop a tagging pass.
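
The best-effort logging described above could look roughly like the sketch below. It is an illustration under assumptions, not the actual `db.py`: the `connect` factory, the column subset, and the psycopg-style `%s` placeholders are hypothetical; only the `log_event` name and the `tagging_events` table come from this commit.

```python
import json
import logging
from typing import Any, Callable, Mapping

logger = logging.getLogger("tagger.db")

# Hypothetical subset of the tagging_events columns named in this commit.
_COLUMNS = ("run_id", "file_id", "status", "error", "validated_metadata")


def log_event(connect: Callable[[], Any], event: Mapping[str, Any]) -> bool:
    """Best-effort insert of one event row; never raises to the caller.

    `connect` is an assumed zero-argument factory returning a DB-API
    connection. Returns True if the row was written, False otherwise.
    """
    conn = None
    try:
        conn = connect()
        values = [
            json.dumps(event.get(col)) if col == "validated_metadata"
            else event.get(col)
            for col in _COLUMNS
        ]
        placeholders = ", ".join(["%s"] * len(_COLUMNS))
        sql = (
            f"INSERT INTO tagging_events ({', '.join(_COLUMNS)}) "
            f"VALUES ({placeholders})"
        )
        cur = conn.cursor()
        cur.execute(sql, values)
        conn.commit()
        return True
    except Exception:
        # A database outage must not abort the tagging pass: log and move on.
        logger.exception("tagging_events insert failed; continuing")
        return False
    finally:
        if conn is not None:
            try:
                conn.close()
            except Exception:
                pass
```

Returning a boolean instead of raising keeps the Box write path independent of Postgres, which matches Box staying the source of truth.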

Backend:
- `db.py` + `schema.sql` — append-only `tagging_events` with indexes on
run_id, file_id, created_at.
- `scheduler.py` — APScheduler BlockingScheduler with `SCHEDULE_CRON`
(default daily 02:00), `RUN_AT_STARTUP`, SIGTERM handling.
- `api.py` (FastAPI) — `/api/health`, `/api/me`, `/api/events?q=…`
(single-input search across file_name, folder_path, description,
status, file_id, validated_metadata::text, raw_response::text,
scenes::text), `POST /api/runs` (fire-and-forget pass in a background
thread), `/api/runs`, `/api/runs/{id}/events`. Every event response
carries a synthesised `box_url`.
- `auth.py` — Azure AD bearer-token validation against the tenant JWKS
(signature + aud + iss). `DEV_AUTH_BYPASS=true` short-circuits to a
configurable dev user, mirrored on the frontend by
`VITE_DEV_AUTH_BYPASS`.
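
The single-input search behind `/api/events?q=…` can be pictured as one `ILIKE` pattern applied to every searchable column. The sketch below is a guess at the shape, not the code in `api.py`: `build_search_query`, the `limit` parameter, the psycopg-style `%s` placeholders, and the `box_url` format are assumptions; the column list is taken from this commit.

```python
from typing import List, Tuple

# Columns named in the commit message; the ::text casts let ILIKE match
# inside JSON-typed columns.
SEARCH_COLUMNS = (
    "file_name",
    "folder_path",
    "description",
    "status",
    "file_id",
    "validated_metadata::text",
    "raw_response::text",
    "scenes::text",
)


def build_search_query(q: str, limit: int = 50) -> Tuple[str, List[object]]:
    """Return (sql, params) for a case-insensitive substring search."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = q.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    where = " OR ".join(f"{col} ILIKE %s" for col in SEARCH_COLUMNS)
    sql = (
        "SELECT * FROM tagging_events "
        f"WHERE {where} "
        "ORDER BY created_at DESC LIMIT %s"
    )
    params: List[object] = [pattern] * len(SEARCH_COLUMNS) + [limit]
    return sql, params


def box_url(file_id: str) -> str:
    """Assumed shape of the synthesised link to the file in the Box web app."""
    return f"https://app.box.com/file/{file_id}"
```

Passing the pattern as a bound parameter, rather than formatting it into the SQL string, keeps the user's search text out of the statement itself.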

Frontend (Vite + React + TS):
- `frontend/` SPA, Montserrat + black/white/#FFC407 palette.
- `@azure/msal-react` sign-in behind the bypass switch (auto sign-in when the bypass is off).
- Search bar across all logged fields, results list with metadata tags,
status pills, and "Open in Box ↗" links.
- "Run now" button kicks off a tagging pass via `POST /api/runs` and
polls `/api/runs/{id}/events` every 2 s for live progress.

Docker / compose:
- `docker-compose.yml` pins `name: marriott-tagging`. Three services:
`db` (postgres:16, named volume, bound to 127.0.0.1 only), `tagger`
(scheduler.py), `api` (uvicorn). Same image, different `command`.
- `Dockerfile` — python:3.12-slim, non-root user.

Deploy (optical-dev.oliver.solutions):
- `deploy/deploy.sh` — idempotent. Auto-picks free host ports
(POSTGRES_HOST_PORT 5435-5499, MARRIOTT_API_PORT 8003-8099), renders
`apache-marriott-tagging.conf` from the .tmpl, builds the SPA in a
one-shot node:20-alpine container, rsyncs `dist/` to
`/var/www/html/marriott-tagging/`, polls `/api/health`, and prints the
shared-vhost Include line.
- `apache-marriott-tagging.conf.tmpl` — proxy `/marriott-tagging/api/`
to the API container, alias `/marriott-tagging` to the SPA web-root,
SPA fallback to `index.html`.
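
The port auto-pick step can be illustrated as a scan over a range that takes the first port a test socket can bind. `deploy.sh` is a shell script, so this Python sketch only mirrors the idea; `is_port_free` and `pick_free_port` are hypothetical names, and the ranges are the ones quoted above.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_free_port(start: int, end: int, host: str = "127.0.0.1") -> int:
    """Return the first free port in [start, end], inclusive."""
    for port in range(start, end + 1):
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"no free port in {start}-{end}")


if __name__ == "__main__":
    # Ranges taken from the commit message.
    print("POSTGRES_HOST_PORT", pick_free_port(5435, 5499))
    print("MARRIOTT_API_PORT", pick_free_port(8003, 8099))
```

A probe like this is inherently racy, since another process can take the port between the check and the container start. For a deploy to stay idempotent, a previously chosen port would presumably be reused rather than re-picked on each run.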

The systemd unit files are left in place for the existing Ubuntu deployment
path; do not run the systemd timer and the scheduler container on the same
host, or the tagger will fire twice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
67 lines
1.1 KiB
Text
# ── Project-specific (security-critical, do NOT commit) ──────────────────────
# Box JWT keypair + client secrets
box_config.json
# Gemini API key + DB creds + scheduler config
.env
.env.*
!.env.example
# Local virtualenv
env/
venv/
# Python bytecode
__pycache__/
*.pyc
*.py[cod]
# Docker / Postgres bind-mount data (if anyone switches off the named volume)
data/
pgdata/

# Generated by deploy.sh — rebuilt from .tmpl every deploy
deploy/apache-marriott-tagging.conf

# Frontend build artefacts
frontend/node_modules/
frontend/dist/
frontend/.vite/

# ── Bitbucket boilerplate ────────────────────────────────────────────────────
# Node artifact files
node_modules/
dist/

# Compiled Java class files
*.class

# Log files
*.log

# Package files
*.jar

# Maven
target/

# JetBrains IDE
.idea/

# Unit test reports
TEST*.xml

# Generated by MacOS
.DS_Store

# Generated by Windows
Thumbs.db

# Applications
*.app
*.exe
*.war

# Large media files
*.mp4
*.tiff
*.avi
*.flv
*.mov
*.wmv