T-2: Extract getJobStatusColor() into utils/jobStatusMessages.ts; StatusBadge now uses the
shared helper (single source of truth for badge colors).
PR-7: GET /admin/production/queue-stats — returns Celery queue depths via Redis LLEN.
Production dashboard shows a live panel (10s refresh) with per-queue task counts.
PR-8: POST /admin/production/jobs/{id}/upload-final-vtt — Production/Admin can upload a
hand-crafted VTT to bypass AI, writing to GCS and advancing the job to PENDING_QC.
Upload modal added to FailuresList with language + type (captions/ad) selectors.
docker-compose.optical-dev.yml: enable USE_CELERY_FALLBACK=true, set worker replicas=1
for all pipeline workers (ffmpeg/tts/whisper) with WORKER_CONCURRENCY=2 so the full
pipeline runs on the 2-CPU optical-dev server until Cloud Run VPC Connector is ready.
Fix: remove unused effectiveMs variable in TimelinePreview (TS6133).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
R-8 — Linguist language competence:
- Add User.languages[] BCP-47 field to backend model + UserResponse schema
- Frontend: show amber warning in assign modal when selected linguist has no
competence listed for the target language
PM VTT editing (FinalDetail):
- PM and ADMIN can now edit captions/AD in the final review stage
- VttEditor becomes read-write with onCueSave wired to updateVttMutation
- Other roles remain read-only
Timeline right-click + add pause:
- Right-click anywhere on the timeline opens a context menu showing the timestamp
- If near a pause point marker: "Edit timing" + "Regenerate TTS" options
- If on empty space: "Add AD cue at Xs" → inserts a new AD cue in the editor
- Pause point markers widened from 1px → 2px (3px on hover) for easier clicking
- Right-click on a pause point marker directly opens the editor
VttEditor insertAtTimeMs prop:
- New prop triggers programmatic insert at a specific video timestamp
- Used by the timeline right-click "Add AD cue here" action
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- POST /{job_id}/actions/blocked_on_source (L-18): linguist/reviewer flags a source
video issue; moves job to QC_FEEDBACK and records blocked_on_source_reason/at/by
- POST /{job_id}/actions/promote_to_qc (PR-10): production/admin manually bypasses
AI processing for edge-case failures; adds audit history entry
- Reset reviewed_cues to 0 on submit_for_review (R-12) so reviewer must re-acknowledge
all cues after each linguist resubmit
- Add assert_job_in_user_org + get_user_org_ids to core/dependencies.py (used by
the new endpoints and the cross-tenant isolation test suite)
- Remove unused ingest_and_ai_task / translate_and_synthesize_task imports
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Blocks 1–5 of stabilization plan:
SECURITY
- validation.py: restore settings.upload_max_video_bytes (T-14 regression fix)
and JSON object key validation that was incorrectly removed
- MT-18: add accessible_org_ids filter to list_for_reviewer/list_for_linguist
so reviewers/linguists only see jobs from their own org in QC queue
- MT-17: add Membership.team_ids[], write to it on invitation acceptance and
direct team add/remove; migration backfills from Team.member_user_ids
- MT-19: validate all target_team_ids belong to invitation's org_id at creation
TESTS
- Restore test_cross_tenant_isolation.py (was deleted, only .pyc remained)
- Extend with MT-18 reviewer org isolation tests
QUICK WINS
- W-8: remove time.sleep(1) + dead debug block from POST /jobs (task was
undefined — would have caused NameError → HTTP 500 on every job creation)
- T-13: warn at startup when REDIS_URL configured but connection failed
- T-16: skip language_qc lifespan migration when count=0 (no DB scan on startup)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
__init__ was setting self.version = "0000-00-00-000000" on every instantiation,
overriding the subclass class variable. All migrations were recorded in DB
with the default version instead of their own, causing duplicate key errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Heavy pipeline tasks (ingest, translate, render, tts) now dispatch to
va-worker Cloud Run Job which has its own Dockerfile.cloudrun with ffmpeg.
API and lightweight Celery worker (notify/embed) don't need it.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Heavy pipeline tasks (ingest, translate, render, rerender) now dispatch
to a Cloud Run Job (va-worker) instead of local Celery workers. optical-dev
runs only api + lightweight worker (notify/embed) within its 2-CPU budget.
- backend/app/tasks/runner.py — Cloud Run Job entrypoint
- backend/app/services/cloud_run_dispatch.py — replaces .delay() for heavy tasks
- backend/Dockerfile.cloudrun — Cloud Run worker image (ffmpeg included)
- docker-compose.optical-dev.yml — 2-CPU safe overrides, disables heavy workers
- cloudbuild.yaml — builds va-worker image and updates Cloud Run Job
- deploy-dev.sh — uses 3-file compose, builds only api+worker locally
- routes_jobs, routes_admin_production, ingest_and_ai, translate_and_synthesize
— all dispatch sites updated to use cloud_run_dispatch.dispatch()
USE_CELERY_FALLBACK=true in .env.local to use Celery locally during dev.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- create_access_token gains optional org_ids: list[str] param; encodes
{exp, sub, org_ids, v:2} — org_ids is a prefilter hint only, never
used as authorization source of truth (Redis cache is authoritative)
- Login, MS login, refresh endpoints: fetch memberships and include
org_ids in issued access tokens via _get_user_org_ids() helper
- routes_invitations.py accept flow: same org_ids population on token
- get_current_user: reads org_ids from payload, attaches as transient
user.__dict__["org_ids"] — available to OrgScopedQuery for prefilter
- Force logout: rotate JWT_SECRET env var at deployment time (no code
change needed; all existing tokens immediately invalidated)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- JobBrief model (DRAFT→SUBMITTED→APPROVED→FULFILLED) with 6 CRUD
endpoints: list, create, get, patch (DRAFT only), submit, approve
- All endpoints use MembershipContext; read=VIEWER, mutate=MANAGER,
approve=ADMIN for org-scoped access
- create_job accepts brief_id Form field; validates APPROVED brief,
copies organization_id/project_id/deadline from brief, marks brief
FULFILLED after job insert
- organization_id now populated from project client_id on job create
(fixes missing multi-tenant field on new jobs)
- migration_2026-04-29-000001: job_briefs collection + 4 indexes
- Wired briefs router into main.py
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- GET /admin/production/failures: list failed jobs filtered by step/org
- POST /admin/production/bulk-retry: dispatch retry for up to 50 jobs
with "auto" (from failure.step) or "from_scratch" strategies
- FailuresList.tsx: accordion-grouped by error type, multi-select,
bulk retry action, step label, retry count (red >3), updated date
- Sidebar: "Failures" item with live badge for production/admin roles
(polls useJobs with processing_failed,tts_failed,render_failed)
- New useFailures / useBulkRetry hooks
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ingest_and_ai: exception handler now sets status=PROCESSING_FAILED and
writes Job.failure{step, type, message, retriable, occurred_at} instead
of leaving status unchanged (was silent failure).
translate_and_synthesize: replace the blanket TTS_FAILED status (even for
translation failures) with PROCESSING_FAILED + failure.step="translation"|
"tts" based on current job status at failure time.
render_accessible_video: add failure{step="render"} alongside existing
RENDER_FAILED status for UI consumption.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add JobFailure model (step, type, message, retriable, occurred_at,
retry_count) to job.py. Add PROCESSING_FAILED to JobStatus (legacy
TTS_FAILED/RENDER_FAILED preserved for back-compat).
Add missing Job fields that existed in DB but not the Pydantic model:
organization_id, brief_id, gcs_prefix, initial_linguist_id,
initial_reviewer_id, failure, retry_count.
Add JOB_TASK_FAILED, JOB_RETRY, JOB_BULK_RETRY to AuditAction enum.
Add migration 2026-04-29-000000: processing_failed in schema validator +
compound indexes (failure.step/status) and (status/org_id/created_at).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
list_jobs now uses MembershipContext (Redis-cached, 60s TTL) to build
org-scoped queries instead of per-request memberships.find(). Falls back
to legacy get_accessible_project_ids for users with no memberships.
get_job replaces the role-specific CLIENT/PM access check with
get_job_or_403() which uniformly checks organization_id membership for
all roles (returns 404 not 403 to avoid leaking cross-org job existence).
get_accessible_project_ids in dependencies.py now uses _cached_memberships
from authz.py, eliminating the duplicate uncached DB query.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
All 8 glossary route handlers now verify the requesting user has org
membership in the target client_id using assert_user_in_org() from
core/authz.py. Read endpoints require VIEWER, mutations require MANAGER,
archive requires ADMIN (org-level). Removed dead _assert_can_read() and
_require_client_staff() helpers. Removed unused require_roles/User/UserRole
imports. Also added get_job_or_403() to authz.py for MT-15.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The unconditional `if user.role in (CLIENT, PROJECT_MANAGER): return`
allowed any PM to access any client regardless of membership. Removed;
kept pm_client_ids legacy fallback for pre-migration users.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Prevent PM in org A from assigning linguist/reviewer from org B.
Added _assert_user_in_job_org() helper that resolves job org_id (with
project fallback) and checks db.memberships for the assignee. Also added
assert_user_in_org() and get_job_or_403() to core/authz.py for use in
upcoming MT-13 and MT-15 commits.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
W-4: team assignment (linguist/reviewer) stored on job at creation,
auto-assigned to all language QC states on first GET /language-qc
(lazy init via auto_assign_defaults)
L-3 WS: broadcast_to_job when reviewer opens VTT for editing;
QCDetail shows "User X is editing [lang]" banner (auto-clears 5s)
R-5: comment broadcast via broadcast_to_job on add_comment();
QCDetail invalidates comments query on language_qc_comment WS event
L-15: QCDetail subscribes to language_qc_assigned WS event →
refetches lang-qc data and shows toast
R-7: VttEditor gets onCuePlay prop; AD editor in QCDetail wires
handleAdCuePlay → switches to accessible video mode, seeks & plays
T-15: beforeunload warning in NewJob while upload is in progress
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
T-6: Add Blob URL native <track> in VideoWithCaptions so browser CC button works in fullscreen.
T-7: Sync hidden <audio> AD playback with video play/pause/seeked events.
T-11: Double Submit Cookie CSRF — _set_auth_cookies issues httponly refresh_token + readable csrf_token; /refresh validates X-CSRF-Token header; frontend reads csrf_token cookie and sends header on all refresh calls.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend:
- VttContentResponse gets etag field (SHA1 of captions+AD content)
- VttUpdateRequest gets if_match field (optional)
- GET /jobs/{id}/vtt: computes and returns etag
- PATCH /jobs/{id}/vtt: if if_match present, fetches current content, recomputes
hash, returns 409 Conflict if mismatch
Frontend:
- VttContentResponse type + VttUpdateRequest type updated
- QCDetail stores vttEtag from GET response
- All updateVttMutation calls pass if_match: vttEtag
- 409 responses show specific "Conflict: another user has modified" message
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend:
- ShareToken model (share_tokens collection)
- POST /jobs/{id}/share — create token (PM/PROD/ADMIN)
- GET /jobs/{id}/share — list active tokens
- DELETE /jobs/{id}/share/{token_id} — revoke token
- GET /public/share/{token} — unauthenticated preview with signed GCS URLs (6h TTL)
Returns video, captions, AD for all languages
Frontend:
- ShareView.tsx — public page at /share/:token with language switcher, video player, download tiles
- App.tsx — /share/:token route (no auth wrapper)
- QCDetail.tsx — "↗ Share link" button in header → modal to generate + copy link
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- POST /jobs/{job_id}/languages/bulk-assign — assigns linguist (required) and
reviewer (optional) across all or selected languages; supports only_unassigned
flag and optional deadline
- bulkAssignLanguages() added to API client
- QCDetail: "Assign all languages" button in Languages header; opens modal with
linguist/reviewer dropdowns, deadline, and skip-already-assigned checkbox
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Changes:
- Dashboard: add project_manager case (final review / QC counts / new job widgets)
and production case (AI pipeline / failures widgets)
- Sidebar: add project_manager to Final Review and Audit Log nav items;
live badge counts for QC Queue (pending_qc) and Final Review (pending_final_review)
- App.tsx: add project_manager to Final Review and Audit Log RoleGates (W-10, PM-18)
- Login: role-based redirect after login — linguist/reviewer → /qc/queue, others → /
- language_qc._assert_can_approve: enforce two-stage QC; remove linguist self-approve
fallback; require reviewer assignment + submitted_for_review_at (W-6)
- routes_jobs.complete_job: allow project_manager to complete jobs (W-9)
- notify.py: re-enable email notifications (W-7)
- Fix 400 on cue save: treat empty-string audio_description_vtt/captions_vtt as absent
both in backend (truthy check) and frontend (|| undefined) — root cause was adVtt
initialising to '' when job has no AD track
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Two-stage QC workflow: linguist edits + submits → reviewer approves/rejects per language.
New statuses: in_progress, pending_review, in_review. New service functions: submit_for_review,
open_review, assign_reviewer, reassign_reviewer, add_comment. Linguist and reviewer deadlines.
- Reject now resets language to in_progress so linguist can iterate without full re-assignment.
- QC comment threads per language (append-only), visible to all assignees.
- Email notifications via Mailgun on: assignment, submit-for-review, comment, approve, reject.
Best-effort (failures do not roll back QC actions). asyncio.gather for parallel fan-out.
- New audit actions: LANGUAGE_QC_REVIEWER_ASSIGN/REASSIGN, LANGUAGE_QC_SUBMIT,
LANGUAGE_QC_OPEN_REVIEW, LANGUAGE_QC_COMMENT.
- Inline project picker in NewJob: "+ Create new project…" option with name, default
languages, default linguist, default reviewer. Pre-fills languages on the new job.
- Project model extended with default_languages, default_linguist_id, default_reviewer_id.
- RBAC: CLIENT org-members can now create projects (backend guard relaxed).
- LinguistQueue: role toggle "As linguist / As reviewer" + new status tabs.
- QCDetail: two-slot assignment cards (linguist + reviewer), deadline display, role-aware
action buttons, comments panel with optimistic insert and 15s refetch.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
text-embedding-004 and text-multilingual-embedding-002 are not available
through this API key. gemini-embedding-001 (768-dim, multilingual) is.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Projected docs only have _id/source_term/translations; validating against
GlossaryTerm (which requires glossary_id, version_id, source_term_lower)
caused 500 on the terms endpoint. Return plain dicts instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- /audit-logs/user/{id}: now accepts email OR ObjectId, returns bare array
- /audit-logs/security: returns bare array instead of {logs, hours} wrapper
Both match AuditLogEntry[] that the frontend expects.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The legacy GET /audit-logs (returning wrong shape) shadowed the proper one.
Removed the duplicate and changed page/size params to skip/limit to match
the AuditLogQuery the frontend sends.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PyMongo Collection raises NotImplementedError on bool(), so 'if not self.collection'
crashes on every audit log write. Changed to 'if self.collection is None'.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- routes_admin.py: size query param max raised from 100 → 500 so
ClientDetail.tsx (size=200) no longer returns 422
- GlossaryDetail.tsx: three .toLocaleString() calls guarded with ?? 0
to prevent TypeError when term_count is undefined on first render
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
default_factory=PyObjectId produced "" (empty string) since
Annotated[str, ...] is a type annotation, not a callable factory.
Replace with lambda: str(ObjectId()) to generate a real unique ID.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
poetry.lock was out of sync with pyproject.toml (cost-tracker and
glossary deps added since last lock). Regenerated with Poetry 2.1.4.
Also updated Dockerfile.whisper-service from poetry==1.8.2 to 2.1.4
to match the main Dockerfile and avoid format incompatibility.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
poetry.lock was generated with 2.1.4 — using 1.8.2 caused
incompatible lock file error and failed Docker builds.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The function was copy-pasted identically in ingest_and_ai.py and
translate_and_synthesize.py. Extracted to tasks/_websocket_bridge.py
as the single definition; all four task modules now import from there.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- login and microsoft_login routes now use Depends(get_database) instead
of creating a per-request MongoClient — removes connection-pool churn
under load
- MicrosoftAuthService._get_openid_config/_get_jwks/validate_token are
now async, using httpx.AsyncClient instead of blocking requests.get —
removes ~400ms event-loop block per Microsoft login
- Removed unused AsyncIOMotorClient import from routes_auth.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
seed_default_admin now skips creation and logs a warning when
DEFAULT_ADMIN_PASSWORD is unset instead of falling back to the
hardcoded ChangeMe123! value. Existing-admin promotion path is
unaffected. Added DEFAULT_ADMIN_PASSWORD to .env.prod.example.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaced the bare except that leaked str(e) (JWT library internals,
claim validation messages) with a generic "Invalid refresh token" detail.
Full traceback is now logged server-side via the structured logger.
Re-raises HTTPException before the generic handler so valid 401s from
inner checks are not double-wrapped.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
get_current_user and get_current_user_optional now reject any token
whose payload carries type="refresh". Access tokens carry no type field
so the check is asymmetric and safe. Prevents a refresh-cookie value
from being replayed as a Bearer access token.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Removed /api/v1/auth/login from the rate-limit bypass list in both
rate_limiting.py and main.py. The existing 5-req/5-min limit for the
login endpoint was already configured but never applied.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
user.role stored as plain string in MongoDB — calling .value on it
caused AttributeError on every login, blocking all auth.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>