video-accessibility

michael 05bde8326d	feat: add Whisper-based pause point refinement for audio descriptions	2025-12-27 08:27:48 -06:00

Implements word-level speech analysis using faster-whisper to refine AD pause points. Gemini's timestamps are snapped to natural speech gaps (sentence/phrase boundaries) to prevent pauses mid-word.

Key changes:
- Add WhisperService for transcription and gap detection
- Add dedicated Celery task routed to 'whisper' queue
- Integrate refinement into render_accessible_video task
- Cache Whisper transcripts in MongoDB for reuse across languages
- Add dedicated whisper-worker with concurrency=1 to prevent OOM

Configuration:
- Uses faster-whisper 'base' model (multilingual, ~145MB)
- 5-second search window after Gemini's recommended point
- Falls back to original timestamp if no gap found

Infrastructure:
- New Docker stage: whisper-worker
- New Cloud Run service: accessible-video-whisper-worker
- Updated docker-compose.yml with whisper-worker service

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
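
The snapping rule described in the commit message (search up to 5 seconds after the recommended point, fall back to the original timestamp if no gap is found) can be illustrated with a minimal sketch. This is not the code in whisper_service.py: the `Word` type, the `snap_to_gap` function, and the 0.3-second `min_gap` threshold are hypothetical names and values chosen for illustration, and the word timings are assumed to come from a word-level transcript sorted by start time.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One transcribed word with start/end times in seconds (hypothetical type)."""
    text: str
    start: float
    end: float


def snap_to_gap(target: float, words: List[Word],
                window: float = 5.0, min_gap: float = 0.3) -> float:
    """Return a pause point at or after `target` that falls in a speech gap.

    Scans the silences between consecutive words and picks the first one that
    is at least `min_gap` seconds long (an assumed threshold) and overlaps
    [target, target + window]. If none qualifies, the original timestamp is
    returned unchanged.
    """
    limit = target + window
    for prev, nxt in zip(words, words[1:]):
        gap_start, gap_end = prev.end, nxt.start
        if gap_end - gap_start < min_gap:
            continue  # too short to count as a natural pause
        if gap_start > limit:
            break  # words are sorted, so later gaps are also out of range
        if gap_end <= target:
            continue  # gap lies entirely before the recommended point
        # A target already inside the gap is kept; otherwise move to gap start.
        return max(gap_start, target)
    return target  # fallback: no suitable gap inside the window


words = [
    Word("Hello", 0.0, 0.5),
    Word("world.", 0.55, 1.0),
    Word("Next", 1.6, 2.0),
]
snapped = snap_to_gap(0.7, words)            # mid-word target moves to the gap
unchanged = snap_to_gap(0.7, words[:2])      # no qualifying gap, so fallback
```

Picking the first qualifying gap keeps the pause as close as possible to the recommended point; a variant could prefer the longest gap in the window instead.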
__pycache__	better tts config for worker	2025-10-08 18:47:28 -05:00
audit_logger.py	initial commit	2025-08-24 16:28:33 -05:00
emailer.py	initial commit	2025-08-24 16:28:33 -05:00
gcs.py	initial commit	2025-08-24 16:28:33 -05:00
gemini.py	feat: add accessible video (MP4 with embedded audio descriptions)	2025-12-26 11:06:41 -06:00
gemini_tts.py	feat: add TTS settings panel with model, speed, and style options	2025-12-22 15:22:14 -06:00
microsoft_auth.py	added MSAL microsoft authentication	2025-10-10 09:19:39 -05:00
secrets_manager.py	initial commit	2025-08-24 16:28:33 -05:00
translate.py	add support for non-English original video uploads	2025-12-22 10:33:58 -06:00
tts.py	feat: add accessible video (MP4 with embedded audio descriptions)	2025-12-26 11:06:41 -06:00
validation.py	feat: add accessible video validation, remove AI confidence check	2025-12-26 16:41:57 -06:00
video_renderer.py	fix: use allow_join_result for celery subtask result retrieval	2025-12-26 18:09:37 -06:00
vtt_retimer.py	feat: add accessible video (MP4 with embedded audio descriptions)	2025-12-26 11:06:41 -06:00
websocket.py	wrote docker files and deployment instructions	2025-10-08 16:00:12 -05:00
websocket_publisher.py	wrote docker files and deployment instructions	2025-10-08 16:00:12 -05:00
whisper_service.py	feat: add Whisper-based pause point refinement for audio descriptions	2025-12-27 08:27:48 -06:00