marriott-box-image-video-ta.../schema.sql
DJP 1f2c2ff8e1 Multi-token + fuzzy search; admin-only Run Now / Backfill
Search:
- Previously /api/events ran a single ILIKE %q% across the searched
  columns, so "female city" matched only rows containing the literal
  substring "female city". Now the query is tokenised on whitespace;
  every token must match somewhere (AND), and each token matches
  either by substring (ILIKE) across the searched columns OR by
  trigram similarity (pg_trgm) against a concatenated text blob with
  a 0.3 threshold, which handles typos like "femalle" → "female".
- Results ranked by summed similarity score across all tokens, then
  recency. Empty query falls back to "newest 100".
- schema.sql: CREATE EXTENSION IF NOT EXISTS pg_trgm (idempotent;
  applied by ensure_schema on api startup).
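
The search behaviour described above could be assembled roughly as follows. This is a minimal sketch, not the code in the repository: `build_search_query` is a hypothetical helper, the list of searched columns is an assumption taken from schema.sql (the commit message does not name them), and the `%s` placeholders assume a psycopg-style driver.

```python
from typing import List, Tuple

# Assumption: the searched columns are not named in the commit message;
# these three come from schema.sql and are illustrative only.
SEARCH_COLUMNS = ["file_name", "folder_path", "description"]
TRGM_THRESHOLD = 0.3  # threshold stated in the commit message


def build_search_query(q: str, limit: int = 100) -> Tuple[str, List[object]]:
    """Return (sql, params) for a multi-token, fuzzy search (hypothetical helper)."""
    tokens = q.split()  # tokenise on whitespace
    if not tokens:
        # Empty query: fall back to the newest rows.
        return (
            "SELECT * FROM tagging_events ORDER BY created_at DESC LIMIT %s",
            [limit],
        )

    # One text blob per row for trigram matching; COALESCE guards NULL columns.
    blob = "(" + " || ' ' || ".join(f"COALESCE({c}, '')" for c in SEARCH_COLUMNS) + ")"

    where_parts: List[str] = []
    where_params: List[object] = []
    score_parts: List[str] = []
    score_params: List[object] = []
    for tok in tokens:
        # A token matches by substring in any column OR by trigram similarity.
        ilikes = " OR ".join(f"{c} ILIKE %s" for c in SEARCH_COLUMNS)
        where_parts.append(f"({ilikes} OR similarity({blob}, %s) >= %s)")
        where_params += [f"%{tok}%"] * len(SEARCH_COLUMNS) + [tok, TRGM_THRESHOLD]
        score_parts.append(f"similarity({blob}, %s)")
        score_params.append(tok)

    sql = (
        f"SELECT *, ({' + '.join(score_parts)}) AS score "
        "FROM tagging_events "
        f"WHERE {' AND '.join(where_parts)} "  # every token must match (AND)
        "ORDER BY score DESC, created_at DESC LIMIT %s"
    )
    # Parameter order follows placeholder order: SELECT, then WHERE, then LIMIT.
    return sql, score_params + where_params + [limit]
```

Passing tokens as bound parameters keeps user input out of the SQL text. The sketch does not escape `%` or `_` inside a token, so those characters would still act as ILIKE wildcards.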

Admin gating:
- auth.py: User now carries `is_admin`. Computed from a
  comma-separated ADMIN_EMAILS env var (case-insensitive match
  against `preferred_username`/`upn`/`email` claim). New
  `require_admin` FastAPI dependency 403s non-admins.
- In DEV_AUTH_BYPASS mode the dev user is admin by default; flip
  DEV_AUTH_IS_ADMIN=false to test the read-only UX without enabling
  SSO.
- POST /api/runs and POST /api/backfill now gated by require_admin.
- /api/me carries is_admin so the SPA can hide the destructive
  buttons for non-admins.
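
The `is_admin` computation described above might look something like this. It is a sketch under stated assumptions: `parse_admin_emails` and `is_admin` are hypothetical names, and treating a match on any of the three claims as sufficient (rather than only the first claim present) is an assumption the commit message does not settle.

```python
from typing import Mapping, Set

# Claim names listed in the commit message; checking all of them is an assumption.
EMAIL_CLAIMS = ("preferred_username", "upn", "email")


def parse_admin_emails(raw: str) -> Set[str]:
    """Split a comma-separated ADMIN_EMAILS value into a lowercase set."""
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def is_admin(claims: Mapping[str, object], admin_emails: Set[str]) -> bool:
    """True if any identity claim matches an admin email, ignoring case."""
    for name in EMAIL_CLAIMS:
        value = claims.get(name)
        if isinstance(value, str) and value.strip().lower() in admin_emails:
            return True
    return False
```

A `require_admin` dependency would presumably call a check like this and raise FastAPI's `HTTPException(status_code=403)` when it returns False.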

Frontend:
- App.tsx fetches /api/me on mount and hides Run Now + Backfill
  unless `is_admin` is true. Non-admins still see search + results +
  recent-runs table.

docker-compose / .env.example: thread ADMIN_EMAILS +
DEV_AUTH_IS_ADMIN into the api container.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 15:51:50 -04:00

-- Marriott Box Tagger — request log
-- One row per file the tagger sent to Gemini (success or error).
-- Skipped-as-already-tagged files do not produce rows.
-- pg_trgm powers the fuzzy `similarity()` call in /api/events. Idempotent.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE TABLE IF NOT EXISTS tagging_events (
    id BIGSERIAL PRIMARY KEY,
    run_id UUID NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    file_id TEXT NOT NULL,
    file_name TEXT NOT NULL,
    folder_path TEXT,
    media_type TEXT NOT NULL CHECK (media_type IN ('image','video')),
    gemini_model TEXT NOT NULL,
    prompt TEXT,
    raw_response JSONB,
    description TEXT,
    scenes JSONB,
    validated_metadata JSONB,
    metadata_write_success BOOLEAN,
    description_write_success BOOLEAN,
    scene_comment_write_success BOOLEAN,
    status TEXT NOT NULL,
    error_message TEXT,
    duration_ms INTEGER
);
CREATE INDEX IF NOT EXISTS tagging_events_run_id_idx ON tagging_events (run_id);
CREATE INDEX IF NOT EXISTS tagging_events_file_id_idx ON tagging_events (file_id);
CREATE INDEX IF NOT EXISTS tagging_events_created_idx ON tagging_events (created_at DESC);