video-accessibility

Author	SHA1	Message	Date
michael	65a7404c87	fix: use proper signed URL generation for accessible video preview The generate_signed_url() was called with expiration=3600 as an integer, but GCS expects a datetime or timedelta. Now uses gcs_service.get_signed_url() which properly calculates the expiration timestamp. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-11 09:28:51 -06:00
michael	81d4e6a3cc	fix: convert datetime fields to ISO strings in edit state response The AccessibleVideoEditStateResponse schema expects string timestamps but the API was passing raw datetime objects from MongoDB. Now converts last_render_at and requested_at to ISO format strings. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-11 08:49:29 -06:00
michael	aa6777d2c2	feat: add QC accessible video review and editing capabilities - Reorder workflow: translations now happen BEFORE QC Review step - Add language tabs to switch between translated languages in QC - Add video mode tabs (Original Video / Accessible Video) - Add interactive timeline preview showing video segments and AD cues - Enable pause point adjustment with millisecond precision - Add TTS regeneration queue for selective cue re-synthesis - Add re-render controls with optional Whisper refinement - Persist video segments and TTS MP3s to GCS for editability - Add new RENDERING_QC job status for re-render operations - Create 5 new API endpoints for accessible video editing - Add rerender_accessible_video.py Celery task Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-11 08:32:27 -06:00
michael	c5f59b1079	fix: use local ffprobe for freeze segment duration measurement The previous implementation incorrectly used _get_video_duration which in Cloud Run mode uses the cached source video URI instead of actually measuring the freeze segment files. This caused all freeze segments to report the source video duration (~78s) instead of their actual duration. Changed to use _get_video_duration_local directly since freeze segments are local files and need to be measured directly via ffprobe. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 16:11:03 -06:00
michael	add958008a	fix: use actual freeze segment durations for VTT subtitle retiming Subtitles were appearing progressively out of sync (~1.0s early per AD) because the VTT retimer calculated freeze durations theoretically rather than using actual rendered segment durations. Changes: - video_renderer: Measure actual freeze segment duration after creation - video_renderer: Return updated placements with actual_freeze_duration - vtt_retimer: Prefer actual_freeze_duration over calculated values - render_task: Pass actual durations to VTT retimer This ensures subtitle timing matches the real video timeline regardless of any FFmpeg encoding variations. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 15:52:57 -06:00
michael	e44210ea64	feat: auto-rewrite TTS cues that fail synthesis When TTS synthesis fails after 3 retries, the system now: - Sends problematic cue text to Gemini for TTS-safe rewriting - Updates the VTT file in GCS with rewritten text - Retries TTS synthesis with the new text - Records successful rewrites in job.tts_rewrites field UI changes: - JobDetail shows amber caution box with original/rewritten text - JobsList shows warning icon next to jobs with rewrites - Error display clarifies text shown is "after rewrite attempt" Files changed: - backend/app/models/job.py: Add tts_rewrites field - backend/app/prompts/gemini_tts_rewrite.md: New prompt template - backend/app/services/gemini.py: Add rewrite_tts_cue method - backend/app/tasks/tts_synthesis.py: Add VTT update utilities - backend/app/tasks/translate_and_synthesize.py: Rewrite+retry logic - frontend/src/types/api.ts: Add TTSRewriteItem type - frontend/src/routes/jobs/JobDetail.tsx: Caution display - frontend/src/routes/jobs/JobsList.tsx: Warning indicator 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 14:42:50 -06:00
michael	76c4c60b0d	fix: add tts_failed and render_failed to MongoDB schema validator MongoDB was rejecting status updates to 'tts_failed' and 'render_failed' because these values weren't in the schema validator's enum, even though they were defined in the Python JobStatus model. This caused TTS failures to leave jobs stuck in 'tts_generating' status with no error feedback to users - the WriteError from MongoDB prevented the status and error fields from being updated. The migration adds both failed statuses to the jobs collection validator. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-05 14:09:41 -06:00
michael	83e4752327	feat: add server-side zip download for bulk job downloads Replace sequential browser-based bulk download with server-side zip generation. When users select "Download All Files" from bulk actions, the system now creates a single organized .zip file containing all job assets. Changes: - Add POST /jobs/bulk/download endpoint that streams zip to client - Add BulkDownloadRequest schema for the new endpoint - Create zip_download.py service with streaming zip generation - Update frontend to call new endpoint and download single zip file - Organize files in zip by job title and language subdirectories Zip structure: accessible_video_YYYYMMDD_HHMMSS.zip └── {job_title}/ ├── source.mp4 └── {lang}/ ├── captions.vtt ├── ad.vtt ├── ad.mp3 ├── accessible_video.mp4 └── accessible_captions.vtt 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 15:57:57 -06:00
michael	8606877d01	fix: properly set tts_failed status when TTS synthesis fails The TTS error handling had a bug where failed jobs stayed in 'tts_generating' status instead of being set to 'tts_failed'. Root cause: synthesize_cue_task used autoretry_for=(Exception,) which raises the original exception after max retries, not MaxRetriesExceededError. The exception handler never fired. Changes: - tts_synthesis.py: Replace autoretry_for with manual retry logic that returns a failure dict on final failure instead of raising - translate_and_synthesize.py: Add propagate=False to group.get() to safely retrieve all results including failures - translate_and_synthesize.py: Update outer exception handler to set job status to tts_failed, store error details, and broadcast status update via WebSocket Now TTS failures will: 1. Set job status to 'tts_failed' 2. Store detailed error info (cue index, text, message) 3. Show error in UI with retry button 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-04 10:45:33 -06:00
michael	8bd9be6353	fix: TypeScript type narrowing for TTS error display Use typeof checks for proper type narrowing of unknown error fields 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-03 21:41:12 -06:00
michael	6915cf46af	feat: add TTS retry functionality with detailed error reporting - Add POST /jobs/{id}/actions/retry_tts endpoint for retrying TTS - Frontend shows TTS-specific error details (cue index, blocked text) - Add "Retry TTS Generation" button on failed jobs - Guides users to edit problematic AD text before retrying 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-03 21:39:59 -06:00
michael	9436aa4c6b	fix: remove useMemo hook after early returns in QCList The useMemo hook was placed after early returns (isLoading, error), which violates React's rules of hooks. Hooks must be called unconditionally on every render. Replaced with simple inline computation since the operation is cheap. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-03 19:22:29 -06:00
michael	c512bdc184	feat: use AD VTT pause points instead of Gemini video analysis Optimize the accessible video workflow by eliminating the dedicated Gemini video analysis call for pause point estimation. Instead: - Use AD VTT cue start times as initial pause points for Whisper refinement - Add user-selectable accessible video method (pause_insert/overlay) at QC approval - Add bulk approval API endpoint with method selection - Add method selector UI to QCDetail page - Add bulk approval modal to QCList for jobs with accessible video Benefits: - Eliminates expensive Gemini API call with video upload - Faster workflow (~5-15 seconds saved per job) - Cost savings on Gemini video analysis - User control over accessible video integration method Backend changes: - Add accessible_video_method to RequestedOutputs and ApproveSourceRequest - Add POST /jobs/bulk/approve endpoint - Replace Gemini call with _build_placements_from_ad_vtt() helper - Mark analyze_accessible_video_placement() as deprecated Frontend changes: - Add method selector radio buttons to QCDetail - Add bulk approval modal with method selection to QCList - Update API client and React Query hooks 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-03 19:05:45 -06:00
michael	3689653135	refactor: remove manual language selection, always auto-detect Remove the "Original Video Language" control from job upload form. All videos now use AI auto-detection for source language, simplifying the UX and eliminating potential for incorrect manual selection. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-03 18:42:45 -06:00
michael	5342ab1a28	fix: prevent event loop closed error in video renderer Cloud Run calls Use context manager for AsyncClient instead of caching on singleton. Each asyncio.run() creates a new event loop, so cached clients bound to previous event loops fail when reused across jobs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-03 08:44:27 -06:00
michael	3e2099515a	fix: use async httpx client for true parallel Cloud Run calls Changed from httpx.Client (sync) to httpx.AsyncClient so that asyncio.gather() actually executes HTTP calls in parallel instead of blocking the event loop sequentially. Before: ~5 min for 18 segments (serial HTTP calls despite gather) After: ~30 sec for 18 segments (truly parallel HTTP calls) Changes: - _http_client: httpx.Client -> httpx.AsyncClient - _call_cloud_run_probe: sync -> async - _call_cloud_run_endpoint: sync -> async - Added await to all Cloud Run HTTP calls 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-03 08:11:46 -06:00
michael	95852f1357	fix: update Cloud Run service configs for compatibility - FFmpeg: Enable CPU throttling to reduce idle costs - Whisper: Keep CPU throttling disabled (model loading needs full CPU) - Remove readinessProbe (requires BETA launch stage) - Both services scale to zero when idle for cost savings 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 17:34:10 -06:00
michael	593d3bf346	cost: enable CPU throttling for Whisper and FFmpeg Cloud Run services Changed cpu-throttling from "false" to "true" for both services. This reduces costs when instances are idle between requests: - Idle CPU billed at ~10% of active rate instead of 100% - Instances still scale to zero after ~15 min of no traffic Trade-off: Slightly slower response when resuming from throttled state, but startup-cpu-boost is still enabled to mitigate cold starts. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 17:22:14 -06:00
michael	e2302d497d	perf: parallelize FFmpeg Cloud Run calls using asyncio.gather() Refactored _render_pause_insert to execute FFmpeg operations in parallel phases instead of sequentially: Phase 1: Parallel extraction - Generate shared silence (once, reused by all) - Extract ALL video segments simultaneously - Extract ALL freeze frames simultaneously - Extract final segment Phase 2: Parallel audio concatenation - Concatenate ALL audio tracks (silence + AD + silence) simultaneously Phase 3: Parallel freeze segment creation - Create ALL freeze segments simultaneously Phase 4: Assemble segments in correct order for final concatenation This reduces render time from ~3 minutes (serial) to ~30 seconds (parallel) for an 8-cue video when using Cloud Run FFmpeg service. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 17:18:23 -06:00
michael	87a4b1ab77	fix: use command_template instead of ffmpeg_args in _generate_silence_cloud_run The /run-ffmpeg Cloud Run endpoint expects command_template field with ffmpeg command placeholders, not ffmpeg_args. This fixes 422 validation errors when generating silence audio via Cloud Run. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 16:57:17 -06:00
michael	e68bac2f60	fix: correct FFmpeg probe request parameter name The /probe endpoint expects 'gcs_uri' but we were sending 'source_gcs_uri'. Fixed to match the ProbeRequest model in ffmpeg_http_service.py. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 16:43:01 -06:00
Michael Clervi	97b582fed1	.env.production	2026-01-02 22:32:40 +00:00
michael	7d2366d0f4	fix: add authentication for Cloud Run service calls Cloud Run services are deployed with --no-allow-unauthenticated, requiring an ID token in the Authorization header. - Add _get_cloud_run_id_token() helper using google-auth library - Update whisper_transcribe.py to include Bearer token in Cloud Run calls - Update video_renderer.py to include Bearer token in FFmpeg Cloud Run calls The ID token is fetched using the service account credentials (GOOGLE_APPLICATION_CREDENTIALS) and targets the Cloud Run service URL. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 11:41:07 -06:00
michael	77dc58b124	fix: add Cloud Run URLs to .env.local for docker-compose Docker-compose reads from root .env (symlinked to .env.local), not backend/.env. Added WHISPER_SERVICE_URL, FFMPEG_SERVICE_URL, and worker concurrency settings to enable Cloud Run autoscaling. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 10:54:27 -06:00
michael	b6cec63656	chore: remove .env from git tracking The .env file was tracked before .gitignore rules were added. Running `git rm --cached` removes it from the index while keeping the local file. Future changes to .env will now be ignored. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 10:29:51 -06:00
michael	9580979ac8	feat: add environment-based worker concurrency for Cloud Run mode Allow configuring Celery worker concurrency via environment variables to take advantage of Cloud Run autoscaling: - Add WORKER_CONCURRENCY, WHISPER_WORKER_CONCURRENCY, FFMPEG_WORKER_CONCURRENCY settings to config.py with recommended values documented - Update Dockerfile to use ${WORKER_CONCURRENCY} and ${WHISPER_WORKER_CONCURRENCY} environment variables instead of hardcoded values - Update docker-compose.yml to pass concurrency env vars to worker commands - Add WHISPER_SERVICE_URL and FFMPEG_SERVICE_URL to relevant workers Recommended settings: Local mode: WHISPER=1, FFMPEG=1 (CPU/RAM constrained) Cloud Run mode: WHISPER=10, FFMPEG=20 (match autoscaling limits) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 10:27:07 -06:00
Michael Clervi	6881565d75	.env and .gitignore	2026-01-02 16:21:41 +00:00
michael	79440929f4	feat: add Cloud Run HTTP services for Whisper and FFmpeg Migrate CPU-intensive workloads to Cloud Run for autoscaling: - Add Whisper HTTP service (FastAPI) with /transcribe endpoint - Add FFmpeg HTTP service (FastAPI) with /encode, /probe, /extract-frame, etc. - Add Dockerfiles for both services (8 vCPU, 32GB RAM, Gen2) - Add Cloud Build config for CI/CD deployment - Add Cloud Run service YAML configs with scale-to-zero - Update whisper_transcribe.py to call Cloud Run when WHISPER_SERVICE_URL set - Update video_renderer.py to call Cloud Run when FFMPEG_SERVICE_URL set - Update whisper_service.py for Cloud Run compatibility (no settings dependency) - Add ffmpeg_service_url and whisper_service_url to config.py Services scale 0→N based on request load, falling back to local execution when service URLs are not configured (hybrid mode). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 10:12:50 -06:00
michael	a8cc0b65a4	chore: update page title to Video Accessibility Platform 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-01 10:28:00 -06:00
michael	5e22f15e76	fix: TypeScript errors in JobDetail error display Use explicit null returns and String() casts for unknown types from job.error Record to satisfy TypeScript strict mode. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-01 10:22:20 -06:00
michael	c1c0b876fc	feat: add RENDER_FAILED status with error propagation to GUI - Add RENDER_FAILED job status for when video rendering fails - Fix _check_accessible_video_completion to detect failures and transition job status accordingly (was stuck in RENDERING_VIDEO forever) - Store detailed error info in job.error including failed_languages array - Call completion check after failures to properly update job status - Broadcast WebSocket notification on render failures Frontend: - Add render_failed to JobStatus type and StatusBadge (red styling) - Add tts_failed and render_failed to JobsList STATUS_LABELS - Enhance JobDetail error display with: - Warning icon and prominent styling - Error type and message - Failed languages list with per-language errors - Timestamp of when error occurred - Update ProgressIndicator to handle failed states with red dot 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-01 10:18:27 -06:00
michael	0c1c115a8f	fix: clarify language selector label Changed "Target Languages" to "Target Languages for Translation" for clarity in the new job form. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-01 09:38:29 -06:00
michael	e0850ca307	fix: include source language in jobs list language count The Languages column now shows total languages (source + translations) instead of just translation count, matching other parts of the UI. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-01 09:33:30 -06:00
michael	77be93b526	perf: parallelize video-native translations with asyncio.gather Video-native translation mode now processes all target languages in parallel using asyncio.gather() with a semaphore (max 3 concurrent) for rate limiting. This significantly reduces total translation time when multiple languages are selected. - Add MAX_CONCURRENT_VIDEO_NATIVE constant for rate limiting - Refactor video-native path to use parallel coroutines - Keep traditional VTT translation mode sequential - Handle per-language errors without stopping other translations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-01 09:21:07 -06:00
michael	d08c20914e	fix: use delete_blob() to avoid read-only generation property error The google-cloud-storage library made blob.generation read-only, causing job deletion to fail silently (0 GCS files deleted). Using bucket.delete_blob(name) instead avoids generation checking. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-31 13:57:59 -06:00
michael	d2d8e32819	feat: add video-native translation mode for multi-language content Add a new "Video Native Mode" translation option that re-processes the video through Gemini for each target language, generating captions and audio descriptions directly from visual context. This produces more natural and culturally appropriate content compared to traditional VTT text translation. Changes: - Add translation_mode field to RequestedOutputs (video_native | traditional) - Create gemini_ingestion_targeted.md prompt for target language generation - Add extract_accessibility_targeted() method to Gemini service - Modify translate_and_synthesize task to handle both translation modes - Add Translation Mode UI selector in NewJob screen (video_native is default) - Remove transcreation UI (replaced by video_native mode) - Remove Google Translate service (replaced by Gemini translation) - Add LanguageSelector component with searchable dropdown 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-31 13:50:05 -06:00
michael	81079c2d17	fix: handle race conditions and 404 errors in bulk job deletion - Deduplicate job IDs to prevent processing same job twice - Convert GCS blob iterator to list upfront to avoid stale generations - Clear blob.generation before delete to handle concurrent deletions - Catch NotFound errors gracefully for already-deleted blobs - Don't re-raise GCS errors - cleanup failures shouldn't block deletion - Treat already-deleted jobs as successful (idempotent delete) - Disable action dropdown during bulk operations in UI - Show spinner with "Please wait" message during deletion 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-30 15:46:02 -06:00
michael	e8b940aee8	feat: add TTS_FAILED status and robust error handling for TTS synthesis Add comprehensive error handling for TTS synthesis failures: Backend: - Add TTS_FAILED status to JobStatus enum for failed synthesis jobs - Add TTSSynthesisError exception with cue index and context tracking - Improve null-safe error handling in Gemini TTS response parsing - Add _synthesize_cue_with_retry() with exponential backoff (3 attempts) - Enhanced error logging with text preview and model context Frontend: - Add TTS_FAILED status styling (red badge) in StatusBadge component - Add tts_failed to JobStatus TypeScript type 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-30 14:26:07 -06:00
michael	6689778be7	feat: add dedicated TTS worker with parallel per-cue synthesis Break out TTS synthesis into a dedicated Celery worker (tts queue) with concurrency=8 for parallel processing. Each AD cue is now synthesized as a separate task, enabling up to 8 cues to be processed simultaneously. Key changes: - Add tts_synthesis.py with synthesize_cue_task for per-cue synthesis - Refactor translate_and_synthesize.py to dispatch cue tasks in parallel - Add tts-worker service to docker-compose.yml (concurrency=8) - Add Cloud Run service config for production deployment Benefits: - Parallel synthesis even for single jobs (e.g., 50 cues → 8 concurrent) - Natural rate limiting across multiple concurrent jobs - Fault tolerance with per-cue retries and GCS persistence 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-30 14:23:11 -06:00
michael	b11c3d0d4f	fix: rewrite VTT retiming algorithm to prevent captions during AD segments The VTT retimer had two bugs causing subtitles to display during freeze periods and become out of sync: 1. Same offset applied to both start and end times (should differ when pause falls between them) 2. Cues spanning pause points weren't split (causing captions during freeze) Changes: - Add _offset_at() for timestamps AT or AFTER pause points - Add _offset_before() for timestamps STRICTLY BEFORE pause points - Add _retime_cue() to split cues at pause points into multiple segments - Add _filter_short_segments() to remove <100ms segments after splitting - Rewrite retime_for_pause_insert() to use new helper methods Example fix for cue 8s-12s with pause at 10s (4s freeze): - Before: 8s-12s (displayed during freeze!) - After: 8s-10s + 14s-16s (gap during AD) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-30 09:01:03 -06:00
michael	eea1c25ab2	feat: enhance VTT editor with timing editing, cue insert/delete Add comprehensive editing capabilities to the QC Review VTT editor: - Inline editable timestamp inputs for start/end times - Insert cue before/after with midpoint timestamp calculation - Delete cue with confirmation modal - Real-time per-cue validation warnings for overlaps - Hover action buttons (insert before, insert after, delete) Also fixes VTT parser to properly handle empty/invalid timestamps. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-29 11:46:24 -06:00
michael	0a79a82b04	fix: resolve race condition preventing captions in batch uploads The video event listener effect had an empty dependency array, causing it to run only once on mount. For batch uploads where downloads load slower, the video element wasn't rendered yet when the effect ran, so listeners were never attached. Adding currentVideoUrl as a dependency ensures the effect re-runs when the video becomes available. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-29 10:59:30 -06:00
michael	37593dd4bc	refactor: simplify pause point algorithm with midpoint snapping and silence buffers Replace complex overlap/catch-up logic with simpler approach: - Snap pause points to midpoint between sentences (not sentence boundaries) - Add 500ms silence before AND after AD audio during freeze frame - Resume playback from same midpoint (no overlap, no visual jump-back) This eliminates audio/visual anomalies caused by the previous algorithm's complexity around sentence boundary snapping and audio catch-up. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-29 09:55:40 -06:00
michael	37f5e8d1b0	fix: validate pause points and frame extraction - Get video duration BEFORE the render loop (not after) - Clamp pause_point to 100ms before video end if it exceeds duration - Add validation in _extract_frame() to verify frame was created - Add debug logging for frame extraction timestamps This prevents "Frame file not found" errors when pause points calculated by Whisper refinement exceed the source video duration. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-29 09:32:13 -06:00
michael	40ece78652	feat: implement audio catch-up to eliminate visual jump-back artifact When a pause point falls between two sentences, the previous algorithm created a visual jump-back where the video rewound to resume_from after the AD played. This was distracting to viewers. New behavior: - Video plays normally to pause_point - Freeze frame shows + AD audio plays - Freeze frame CONTINUES while source audio from [resume_from, pause_point] plays (the "catch-up" audio) - Video resumes smoothly from pause_point (no visual jump) The audio from the overlap region plays twice (once during video, once during freeze extension) but this is acceptable as it's typically <1s and provides natural audio context around the AD. Implementation: - Add _extract_audio_segment() to extract catch-up audio from source - Add _concatenate_audio() to join AD + catch-up audio - Modify render loop to create extended freeze segments with combined audio - Resume video from pause_point instead of resume_from 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-29 09:02:46 -06:00
michael	ce7a1b182f	fix: improve FFmpeg error reporting and add input validation - Show last 1500 chars of stderr instead of first 500 to capture actual error messages (FFmpeg writes version banner first, errors at end) - Add validation for freeze segment creation: - Check duration > 0 - Verify frame and audio files exist - Add debug logging for parameters This helps diagnose FFmpeg failures that were previously showing only version/configuration info instead of the actual error. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-29 08:41:32 -06:00
michael	3588d3fa14	refactor: rewrite pause point refinement algorithm with ordered logic Completely rewrites the Whisper-based pause point refinement to use a two-phase approach with explicit ordering: Phase 1 - Individual refinement: 1. Check if pause point is "during speaking" (words within ±2s) - If NOT during speaking → use Gemini's exact point, no overlap 2. If during speaking, find nearest sentence boundary 3. Apply appropriate buffering based on context: - Case A: First sentence → pause 500ms before sentence starts - Case B: Last sentence → pause 500ms after sentence ends - Case C: Between sentences → full double buffer (overlap) Phase 2 - Consolidation (after all refinements): - Consolidate cues within 5s of each other to play back-to-back Key changes: - Add SentenceBoundary dataclass for tracking boundaries with context - Add _is_during_speaking() helper to detect speech proximity - Add _find_sentence_boundaries() with longest-gap fallback - Rewrite snap_pause_point() with new ordered algorithm - Update refine_all_pause_points() to pass words and use two phases 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-29 08:19:03 -06:00
michael	d092800676	fix: treat consolidated AD cues as single segment for buffering Previously, all consolidated cues shared the same pause_point AND resume_from, which caused the overlap video segment to play between each AD cue in a consolidated group. Now consolidated cues are treated as a single AD segment: - All cues in a group share the same pause_point (front buffer once) - Only the LAST cue keeps resume_from (back buffer once) - Other cues have resume_from = pause_point (no video between ADs) This ensures consolidated ADs play seamlessly back-to-back: - Video plays up to pause_point (front buffer) - AD_1 plays - AD_2 plays immediately (no video) - AD_n plays immediately (no video) - Video resumes from resume_from (back buffer) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-28 23:33:15 -06:00
michael	ee6a30e7a7	feat: always generate fresh Whisper transcripts (disable caching) Remove the cached transcript lookup - always run a fresh Whisper transcription for each accessible video render. This ensures we get accurate word timestamps for the current video file. The transcript is still saved to the job document for debugging and auditing purposes, but it will never be read back for reuse. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-28 23:25:35 -06:00
michael	e8fde7962f	chore: increase accessible-video-worker concurrency from 4 to 8 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-28 23:20:58 -06:00
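
The retiming rule described in commit b11c3d0d4f (different offsets for a cue's start and end, plus splitting cues that span a pause point) can be sketched in Python as follows. This is an illustrative sketch only, not the repository's vtt_retimer code: the function names echo the helper names in the commit message, but their signatures, the `(pause_point, freeze_duration)` tuple layout, and the `MIN_SEGMENT_S` constant name are assumptions. Only the 100 ms threshold and the 8s-12s example come from the commit message.

```python
from typing import List, Tuple

# Assumed layout: (pause_point_seconds, freeze_duration_seconds)
Pause = Tuple[float, float]
Segment = Tuple[float, float]

# The commit message says segments shorter than 100 ms are removed after splitting.
MIN_SEGMENT_S = 0.1


def offset_at(t: float, pauses: List[Pause]) -> float:
    """Freeze time added by pauses AT or before t (used for segment starts)."""
    return sum(duration for point, duration in pauses if point <= t)


def offset_before(t: float, pauses: List[Pause]) -> float:
    """Freeze time added by pauses STRICTLY before t (used for segment ends)."""
    return sum(duration for point, duration in pauses if point < t)


def retime_cue(start: float, end: float, pauses: List[Pause]) -> List[Segment]:
    """Split a cue at every pause point inside it and shift each piece."""
    # Pause points strictly inside the cue are where it must be split.
    cuts = sorted({point for point, _ in pauses if start < point < end})
    bounds = [start, *cuts, end]

    segments: List[Segment] = []
    for seg_start, seg_end in zip(bounds, bounds[1:]):
        # A segment that starts at a pause point resumes after the freeze;
        # a segment that ends at a pause point stops before the freeze.
        new_start = seg_start + offset_at(seg_start, pauses)
        new_end = seg_end + offset_before(seg_end, pauses)
        if new_end - new_start >= MIN_SEGMENT_S:
            segments.append((new_start, new_end))
    return segments


if __name__ == "__main__":
    # Example from the commit message: cue 8s-12s, pause at 10s, 4s freeze.
    print(retime_cue(8.0, 12.0, [(10.0, 4.0)]))
```

With the commit's example input, the cue is split into one piece that ends at the pause point and one that starts after the freeze, so no caption is shown during the audio description. Per commit add958008a, the freeze durations passed in would ideally be the measured durations of the rendered freeze segments rather than calculated ones.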

172 commits