authz.py (new):
- MembershipContext — per-request membership dict for the current user
- get_membership_context FastAPI dependency
- require_org_role(min_role) — dependency factory keyed off org_id path param
- require_platform_admin()
- OrgScopedQuery — adds organization_id filter; platform admin passes through
- bump_user_membership_cache — invalidates Redis key on membership writes
dependencies.py:
- get_accessible_project_ids now queries memberships collection first;
legacy pm_client_ids / team.member_user_ids fallback retained until migration runs
(four job-route access checks at lines 608/1054/1181/1538 are fixed via this function)
routes_clients.py:
- _assert_pm_or_admin and _assert_client_access are now async and query memberships
- All 10 call sites updated with await + db arg
emailer.py:
- Switched from SendGrid to Mailgun REST API via httpx (already in requirements)
- _send() is now fully async; same public method signatures preserved
- send_completion_email uses _send()
config.py:
- Added mailgun_api_key, mailgun_domain, mailgun_from settings
- sendgrid_api_key kept with empty default for backward compat
migration_2026-04-28-000003:
- Backfills job.organization_id from project.client_id
- Creates (organization_id, status, created_at) sparse index on jobs
routes_organizations.py / routes_invitations.py:
- Call bump_user_membership_cache after every membership write
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend:
- New UserRole.PROJECT_MANAGER with pm_client_ids[] on User model
- New models: Client (slug-based), Team (member_user_ids[]), Project (client-scoped)
- Job model gains project_id field
- New GET/POST/PATCH/DELETE /clients, /clients/{id}/teams, /clients/{id}/projects,
/clients/{id}/pm routes (admin-only client CRUD; PM or admin for teams/projects)
- get_accessible_project_ids() helper: staff→all, PM→their clients' projects,
CLIENT→projects from teams they belong to (with legacy owner fallback)
- list_jobs, get_job, bulk_download, get_vtt_content, delete_job all use new isolation
Frontend:
- UserRole type gains 'project_manager'
- Job, JobCreateRequest gain project_id field
- Client, Team, Project, PMUser types added
- ApiClient: full client/team/project/PM CRUD methods
- useClients hook with all query/mutation hooks
- Admin pages: ClientList + ClientDetail (teams, members, projects, PM assignment)
- NewJob form: client + project picker (shown when clients exist)
- Sidebar: Clients nav item for admin and project_manager roles
- Routes: /admin/clients and /admin/clients/:clientId behind RoleGate
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New services/cost_tracker.py: sync httpx preflight()/record() + async wrappers;
BudgetExceeded exception; no-op when COST_TRACKER_BASE_URL is empty
- Preflight budget check added before ingestion (Gemini), per-language translation
(video-native + traditional), and per-language TTS dispatch
- _record_gemini_usage and _record_tts_cost now call cost_tracker directly;
removes broken asyncio.get_event_loop() hack from sync Celery worker
- Fix: _cost_ctx now threaded into extract_accessibility_targeted (video-native path)
- Fix: user_id/cost_project_id now propagated through dispatch_language_tts →
synthesize_cue_task.s() and the rerender_accessible_video.py re-render path
- Remove oliver-cost-tracker SDK dependency (was commented-out/never installed)
- Drop cost_tracker_outbox_path setting and get_cost_tracker() factory
- Update COST_TRACKER_BASE_URL default to optical-dev.oliver.solutions in
.env.prod.example, docker-compose.yml, and all Cloud Run service yamls
- Cloud Run yamls use Secret Manager ref (cost-tracker-api-key) for the API key
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add LINGUIST role to UserRole enum (backend + frontend)
- Grant linguists access to QC Review, Final Review, review notes, and VTT editing
- Add MongoDB migration to update schema validator with linguist role
- Add admin seed: vadymsamoilenko@oliver.agency is promoted to admin on startup
- Add User Management sidebar link for admin users
- Fix Login.tsx role type cast to use UserRole instead of hardcoded union
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rewrote VTT translation to two-step (text-only → Gemini → apply to original timestamps) preventing caption timing desync
- Added polling fallback for all processing states and Safari visibilitychange WebSocket reconnect
- Added 11 new TTS languages (cs, da, fi, hu, no, sk, sv, es-419, pt-BR, fr-CA)
- Updated caption/AD prompts to DCMP Captioning Key & Description Key standards (line splitting, ♪ music notation, italic tags, caption positioning, ethics guidelines)
- Added descriptive transcript generation (WCAG 2.1 §1.2.1) combining captions + AD into plain text
- Fixed amix normalize=0 to prevent audio loss in rendered videos
- Fixed AD re-timing double-count when source_ms is None
- Fixed cue block numbering to be 1-based in VttEditor and Timeline Preview
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add dynamic ElevenLabs voice catalog with provider toggle in the UI,
allowing users to browse ElevenLabs voices, configure stability and
similarity boost settings, and preview/synthesize with ElevenLabs TTS.
Backend:
- New elevenlabs_voices.py service with 1-hour cached API fetching
- TTS routes support ?provider= query param for voices and options
- Preview endpoint routes to ElevenLabs or Gemini based on provider
- stability/similarity_boost params flow through TTS synthesis pipeline
- TTSPreferences model extended with ElevenLabs-specific fields
- Deprecated hardcoded elevenlabs_voices config (now fetched dynamically)
Frontend:
- Provider toggle (Gemini/ElevenLabs) in VoiceSelector
- ElevenLabsSettingsPanel with stability and similarity boost sliders
- VoicePreviewButton supports provider-specific preview parameters
- API client passes provider param to voices, options, and preview endpoints
- New VoiceInfo, ProviderVoicesResponse, ProviderOptionsResponse types
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allow configuring Celery worker concurrency via environment variables
to take advantage of Cloud Run autoscaling:
- Add WORKER_CONCURRENCY, WHISPER_WORKER_CONCURRENCY, FFMPEG_WORKER_CONCURRENCY
settings to config.py with recommended values documented
- Update Dockerfile to use ${WORKER_CONCURRENCY} and ${WHISPER_WORKER_CONCURRENCY}
environment variables instead of hardcoded values
- Update docker-compose.yml to pass concurrency env vars to worker commands
- Add WHISPER_SERVICE_URL and FFMPEG_SERVICE_URL to relevant workers
Recommended settings:
Local mode: WHISPER=1, FFMPEG=1 (CPU/RAM constrained)
Cloud Run mode: WHISPER=10, FFMPEG=20 (match autoscaling limits)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Migrate CPU-intensive workloads to Cloud Run for autoscaling:
- Add Whisper HTTP service (FastAPI) with /transcribe endpoint
- Add FFmpeg HTTP service (FastAPI) with /encode, /probe, /extract-frame, etc.
- Add Dockerfiles for both services (8 vCPU, 32GB RAM, Gen2)
- Add Cloud Build config for CI/CD deployment
- Add Cloud Run service YAML configs with scale-to-zero
- Update whisper_transcribe.py to call Cloud Run when WHISPER_SERVICE_URL set
- Update video_renderer.py to call Cloud Run when FFMPEG_SERVICE_URL set
- Update whisper_service.py for Cloud Run compatibility (no settings dependency)
- Add ffmpeg_service_url and whisper_service_url to config.py
Services scale 0→N based on request load, falling back to local
execution when service URLs are not configured (hybrid mode).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a new "Video Native Mode" translation option that re-processes the
video through Gemini for each target language, generating captions and
audio descriptions directly from visual context. This produces more
natural and culturally appropriate content compared to traditional VTT
text translation.
Changes:
- Add translation_mode field to RequestedOutputs (video_native | traditional)
- Create gemini_ingestion_targeted.md prompt for target language generation
- Add extract_accessibility_targeted() method to Gemini service
- Modify translate_and_synthesize task to handle both translation modes
- Add Translation Mode UI selector in NewJob screen (video_native is default)
- Remove transcreation UI (replaced by video_native mode)
- Remove Google Translate service (replaced by Gemini translation)
- Add LanguageSelector component with searchable dropdown
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Medium model is faster and uses less memory (~1.5GB vs ~3GB)
while still providing good multilingual transcription quality.
Updated in:
- config.py
- docker-compose.yml
- whisper-worker-service.yaml
- cloudbuild.yaml
- Dockerfile (pre-download)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The setting in config.py (5.0) was overriding the default in
whisper_service.py (30.0). Now both are consistent at 30s.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Uses the multilingual large model for more accurate transcription
and sentence boundary detection.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements word-level speech analysis using faster-whisper to refine
AD pause points. Gemini's timestamps are snapped to natural speech gaps
(sentence/phrase boundaries) to prevent pauses mid-word.
Key changes:
- Add WhisperService for transcription and gap detection
- Add dedicated Celery task routed to 'whisper' queue
- Integrate refinement into render_accessible_video task
- Cache Whisper transcripts in MongoDB for reuse across languages
- Add dedicated whisper-worker with concurrency=1 to prevent OOM
Configuration:
- Uses faster-whisper 'base' model (multilingual, ~145MB)
- 5-second search window after Gemini's recommended point
- Falls back to original timestamp if no gap found
Infrastructure:
- New Docker stage: whisper-worker
- New Cloud Run service: accessible-video-whisper-worker
- Updated docker-compose.yml with whisper-worker service
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a comprehensive video review feature to the Final Review page that allows
reviewers to watch videos with caption overlays and add timestamped notes.
Backend:
- New ReviewNote model for MongoDB with job_id, asset_key, timestamp, content
- CRUD API endpoints at /jobs/{job_id}/review-notes
- Owner-only edit/delete permissions (admins can bypass)
- Database indexes for efficient querying
Frontend:
- VideoReviewPlayer component with video player and caption overlay
- NotesSidebar for viewing/adding notes with auto-highlight when video reaches timestamp
- SyncedCaptionList with auto-scroll and click-to-seek
- AssetTabs for switching between languages and accessible videos
- React Query hooks with 30s polling for collaborative updates
Features:
- Notes persist to database and are shared across all reviewers
- Notes highlight for 5 seconds when video playback reaches their timestamp
- Click note to seek video to that position
- Pause video to add note at current timestamp
- Accessible videos use retimed captions when available
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add model selection (flash vs pro) for quality control
- Add speed slider (0.5x - 2.0x) for pacing adjustment
- Add style presets (neutral, calm, energetic, professional, warm, documentary)
- Add custom style prompt option for advanced customization
- New /tts/options endpoint returns available TTS options
- Voice preview now tests all settings so users hear exact output
- Backward compatible: all new fields have sensible defaults
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Gemini TTS service with 30 voices and 24 languages
- Add TTS API endpoints for voice listing and preview
- Add per-language voice selection in job creation form
- Add voice override at QC approval stage
- Add VoiceSelector and VoicePreviewButton components
- Update TTSPreferences model with provider and voice mapping
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>