Logs pause point placements, segment creation, and final segment
calculation to help diagnose the 30s black footage issue.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Medium model is faster and uses less memory (~1.5GB vs ~3GB)
while still providing good multilingual transcription quality.
Updated in:
- config.py
- docker-compose.yml
- whisper-worker-service.yaml
- cloudbuild.yaml
- Dockerfile (pre-download)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Instead of a fixed 175ms buffer, the pause point is now placed
halfway between the end of the sentence and the start of the
next word. If the half-gap exceeds 2 seconds, uses 500ms instead.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Logs VTT content fetching, parsing, and current caption state
to help diagnose subtitle overlay issues in final review mode.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Dynamically detects CPU count with os.cpu_count() instead of
hardcoded 4 threads. Falls back to 4 if detection fails.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Log model name explicitly when loading and transcribing
- Log model load time
- Log transcription processing time
- Helps verify correct model is being used
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Environment variables were overriding config.py with 'base' model.
Updated defaults in:
- docker-compose.yml
- whisper-worker-service.yaml
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The setting in config.py (5.0) was overriding the default in
whisper_service.py (30.0). Now both are consistent at 30s.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Downloads the model (~3GB) at build time to avoid cold start delays.
Also updated comment to reflect large-v3 memory usage (~4-6GB).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Uses the multilingual large model for more accurate transcription
and sentence boundary detection.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
If consecutive AD cues have pause points within 5 seconds, they now
play back-to-back at the same pause point. This prevents AD from being
inserted mid-sentence when cues are close together.
Adds _consolidate_close_cues() method and consolidation_threshold
parameter to refine_all_pause_points().
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Increases the search window from ±10s to ±20s to maximize the chance
of finding a proper sentence ending and avoid falling back to Gemini's
potentially imprecise pause points.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When the first AD cue (index 0) cannot find a sentence boundary within
the ±10s search window, insert the AD at T=0:00 instead of using the
potentially mid-sentence Gemini pause point.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update Gemini prompt to require transcription with precise timestamps
- Add sentence_boundaries output field for validation
- Add pause_point_rationale field to explain each pause point choice
- Emphasize terminal punctuation only (., ?, !) - never commas
- Expand Whisper search window from ±5s to ±10s
- Increase post-pause buffer from 50ms to 175ms
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The "Download All Files" function was missing accessible_video_mp4 and
accessible_captions_vtt files that the backend provides. Updated both
the bulk download in JobsList and the individual Downloads page to
include all available file types.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace 3-stage redundant deletion with single prefix-based approach.
All job files are under {job_id}/ prefix, so listing and deleting by
prefix is simpler and catches all files including new types like
accessible_video.mp4 and ad_cues/*.mp3.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updated pause point algorithm:
- Search range: 5 seconds BEFORE to 5 seconds AFTER Gemini pause point
- ONLY considers sentence breaks (after periods, !, ?) - not phrase breaks
- Chooses the closest sentence break to the Gemini pause point
This ensures audio descriptions are inserted at natural sentence
boundaries, not in the middle of sentences after commas.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two fixes:
1. Snap pause point to gap.start (end of previous word) to prevent
cutting off the first word after the pause
2. Add explicit whisper_transcribe import in celery_worker.py
3. Fix misleading queue log message
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The pause point algorithm was snapping to gap.end (start of next word),
which caused the first word after the pause to be cut off. Changed to
snap to gap.start (end of previous word) instead.
Now the video pauses right after a word finishes, the AD plays during
the silence gap, and the next word plays in full when video resumes.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changed the Whisper transcription to run on dedicated whisper-worker
using the same dispatch pattern as FFmpeg:
1. apply_async() to dispatch to the whisper queue
2. Poll with ready() using async sleep to avoid blocking
3. Use allow_join_result() context manager
4. Get result only after task is ready
This ensures Whisper runs with concurrency=1 on a dedicated worker
to prevent memory overload while still allowing the render task
to wait for results without deadlocking.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Celery does not allow calling result.get() within a task as it causes
deadlocks. Changed the implementation to run Whisper transcription
directly using asyncio.to_thread() instead of dispatching to a separate
Celery queue.
The Whisper transcript is still cached in MongoDB for reuse across
language variants.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The rendering_video status was defined in job.py and frontend types but
was missing from the MongoDB schema validator, causing document update
failures when jobs transitioned to the rendering_video state.
Changes:
- Add migration script to update existing databases
- Update mongodb-init.js for new database setups
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements word-level speech analysis using faster-whisper to refine
AD pause points. Gemini's timestamps are snapped to natural speech gaps
(sentence/phrase boundaries) to prevent pauses mid-word.
Key changes:
- Add WhisperService for transcription and gap detection
- Add dedicated Celery task routed to 'whisper' queue
- Integrate refinement into render_accessible_video task
- Cache Whisper transcripts in MongoDB for reuse across languages
- Add dedicated whisper-worker with concurrency=1 to prevent OOM
Configuration:
- Uses faster-whisper 'base' model (multilingual, ~145MB)
- 5-second search window after Gemini's recommended point
- Falls back to original timestamp if no gap found
Infrastructure:
- New Docker stage: whisper-worker
- New Cloud Run service: accessible-video-whisper-worker
- Updated docker-compose.yml with whisper-worker service
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Show contextual buttons when jobs are ready for review:
- QC Review button (blue) when status is pending_qc
- Final Review button (purple) when status is pending_final_review
Buttons appear dynamically via WebSocket updates without refresh.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously only the final pending_qc status was broadcast via WebSocket.
Now all intermediate status changes (ingesting, ai_processing) are also
broadcast so the frontend can update in real-time during reprocessing.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added header rows to clarify that "Original only" refers to
"Languages Requested" in both Pending and Completed sections.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Backend returns UTC dates without timezone indicator. Added
parseUTCDate helper that appends 'Z' to ensure JavaScript
correctly interprets dates as UTC and converts to local time.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Users can see the filename on the job detail page if needed.
Search still works on filename even though it's not displayed.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Created By filter dropdown was empty because client_id was not
being returned by the API. Added client_id to JobResponse schema
and included it in the list_jobs response.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add created_by_name field to JobResponse schema and API
- Batch-fetch user names in list_jobs endpoint for efficiency
- Convert JobsList from card layout to sortable data table
- Add search box (job name, filename, created by user)
- Add user filter dropdown (populated from current jobs)
- Add status filter dropdown (individual statuses from current jobs)
- Add date range filter (All Time, Last 7 Days, Last 30 Days)
- Add sortable columns: Job Name, Created By, Date Created, Status
- Fetch all jobs for full client-side filtering capability
- Add responsive horizontal scroll for mobile
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Stabilized video URLs in VideoReviewPlayer to prevent the video from
resetting when WebSocket updates trigger React Query refetches. The
fix stores the initial video URL for each tab and reuses it instead
of using the refreshed signed URL, preventing unnecessary remounts.
Changes:
- Added stableVideoUrls state to cache video URLs per tab
- Removed key prop from video element to prevent forced remounts
- Event listeners now attached once on mount, not on URL change
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When jobs with accessible video option enabled enter video rendering
phase, the status now transitions to 'rendering_video' so users can
see why processing is taking longer. This provides better visibility
into the video rendering pipeline.
Changes:
- Added RENDERING_VIDEO status to JobStatus enum
- Updated render_accessible_video task to set new status
- Added status display to StatusBadge, jobStatusMessages
- Included new status in JobsList Translation filter group
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added staleTime: 0 and refetchOnMount: 'always' to useJobs hook to
ensure the jobs list always fetches fresh data when navigating to
the All Jobs tab after creating a new job.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Modified render_accessible_video.py to explicitly pass TMPDIR to
tempfile.TemporaryDirectory() so files are created in shared volume
- Updated docker-compose.yml to run containers as root initially,
chown /shared-tmp to app:app, then switch to app user for celery
- This ensures both worker containers can access the same temp files
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The main worker and ffmpeg-worker run in separate containers with
isolated filesystems. Add a shared volume mounted at /shared-tmp
and set TMPDIR so Python's tempfile module uses it. This allows
ffmpeg-worker to access video files created by the main worker.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Celery doesn't allow calling result.get() within a task by default to
prevent deadlocks. Use allow_join_result() context manager since we've
already confirmed the task is complete via ready() polling.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a dedicated Celery queue (ffmpeg) with concurrency=1 to serialize
all FFmpeg operations. This prevents CPU spikes when multiple render
tasks run in parallel with multiple languages.
Changes:
- Add ffmpeg_operations.py with run_ffmpeg_command and run_ffprobe_command tasks
- Update VideoRendererService to dispatch ffmpeg commands via the queue
- Add ffmpeg-worker service to docker-compose with --concurrency=1
- Configure main worker to exclude the ffmpeg queue
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Clarifies the UI when a job has no additional translation languages
requested, making it clearer that the job contains only the original
language assets.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a video's source language is non-English (e.g., German), the
language asset was incorrectly marked as 'Translated'. Now correctly
shows 'Original' with a green badge for the primary/source language.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add validation for accessible_video_gcs (file exists, size 0.1MB-5GB)
- Add validation for retimed_captions_vtt_gcs when accessible video exists
- Add AD Videos count to asset validation panel
- Include retimed captions in VTT file count
- Remove AI confidence from validation panel and backend checks
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix video event listeners not re-attaching when video element remounts
(add activeTab?.videoUrl to useEffect dependency array)
- Add retimed_captions_vtt to VTT API response for accessible videos
- Use retimed captions for accessible video tab in VideoReviewPlayer
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a comprehensive video review feature to the Final Review page that allows
reviewers to watch videos with caption overlays and add timestamped notes.
Backend:
- New ReviewNote model for MongoDB with job_id, asset_key, timestamp, content
- CRUD API endpoints at /jobs/{job_id}/review-notes
- Owner-only edit/delete permissions (admins can bypass)
- Database indexes for efficient querying
Frontend:
- VideoReviewPlayer component with video player and caption overlay
- NotesSidebar for viewing/adding notes with auto-highlight when video reaches timestamp
- SyncedCaptionList with auto-scroll and click-to-seek
- AssetTabs for switching between languages and accessible videos
- React Query hooks with 30s polling for collaborative updates
Features:
- Notes persist to database and are shared across all reviewers
- Notes highlight for 5 seconds when video playback reaches their timestamp
- Click note to seek video to that position
- Pause video to add note at current timestamp
- Accessible videos use retimed captions when available
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The method field (overlay/pause_insert) is metadata, not a downloadable
file. Including it in the downloads dict caused the frontend to render
a broken download link.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update _get_video_properties() to extract audio sample_rate, channels,
and pix_fmt in addition to video properties
- Add _extract_segment_reencoded() for frame-accurate cuts using
re-encoding instead of stream copy (fixes keyframe-only cut limitation)
- Add _create_freeze_segment_matched() to enforce source audio property
matching (fixes silent pauses caused by sample rate mismatch)
- Update _render_pause_insert_method() to use new methods with uniform
encoding parameters
- Add -video_track_timescale 90000 for consistent timebase across segments
Root causes fixed:
1. -c copy could only cut at keyframes, causing audio dropouts
2. Sample rate mismatch (48kHz source vs 44.1kHz MP3) caused silent
freeze-frame segments when concatenated
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The accessible video render task was being dispatched to the 'render' queue
but no worker was listening to it. Added 'render' to:
- Dockerfile CMD args for worker queue list
- celery_worker.py import and log message
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add new deliverable type that renders video with audio descriptions embedded.
Supports two AI-determined methods:
- Direct Overlay: ducks original audio and overlays AD TTS (for minimal dialogue)
- Pause-Insert: freeze-frame video, insert AD, re-time subtitles (for significant dialogue)
Backend:
- Add Pydantic schemas for Gemini analysis response
- Add Gemini prompt and analyze_accessible_video_placement() method
- Add video_renderer.py service using FFmpeg for both rendering methods
- Add vtt_retimer.py service for pause-insert subtitle adjustment
- Add render_accessible_video.py Celery task
- Modify TTS service to return individual per-cue segments
- Update translate_and_synthesize.py to save segments and trigger rendering
- Update download endpoint to include accessible video outputs
Frontend:
- Add accessible_video_mp4 checkbox to NewJob form
- Update TypeScript types for new deliverable
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add checkbox selection for pending final review jobs
- Add bulk actions toolbar with Complete/Return to QC options
- Mirror QC Review bulk action pattern with parallel API calls
- Rejection notes required, completion notes optional
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>