video-accessibility/backend/app/tasks
michael 05bde8326d feat: add Whisper-based pause point refinement for audio descriptions
Implements word-level speech analysis using faster-whisper to refine
AD pause points. Gemini's timestamps are snapped to natural speech gaps
(sentence/phrase boundaries) to prevent pauses mid-word.

Key changes:
- Add WhisperService for transcription and gap detection
- Add dedicated Celery task routed to 'whisper' queue
- Integrate refinement into render_accessible_video task
- Cache Whisper transcripts in MongoDB for reuse across languages
- Add dedicated whisper-worker with concurrency=1 to prevent OOM

Configuration:
- Uses faster-whisper 'base' model (multilingual, ~145MB)
- 5-second search window after Gemini's recommended point
- Falls back to original timestamp if no gap found
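
The snapping behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in this commit: the `Word` type, the function names `find_gaps` and `refine_pause_point`, and the 0.3-second minimum gap are all assumptions. Only the 5-second forward search window and the fallback to the original timestamp come from the commit message.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Word:
    """One word with start/end times in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


def find_gaps(words: Sequence[Word], min_gap: float = 0.3) -> List[Tuple[float, float]]:
    """Return (gap_start, gap_end) for silences of at least min_gap seconds.

    min_gap is an assumed threshold; the commit does not state one.
    """
    gaps = []
    for prev, nxt in zip(words, words[1:]):
        if nxt.start - prev.end >= min_gap:
            gaps.append((prev.end, nxt.start))
    return gaps


def refine_pause_point(
    recommended: float,
    words: Sequence[Word],
    window: float = 5.0,
    min_gap: float = 0.3,
) -> float:
    """Snap a recommended pause time to the first speech gap at or after it.

    Searches only within `window` seconds after `recommended` and returns
    `recommended` unchanged when no qualifying gap is found.
    """
    for gap_start, gap_end in find_gaps(words, min_gap):
        if gap_end <= recommended:
            continue  # gap lies entirely before the recommended point
        # If the point already falls inside this gap, keep it as-is.
        candidate = max(gap_start, recommended)
        if candidate <= recommended + window:
            return candidate
        break  # words are time-ordered, so later gaps are even farther away
    return recommended


words = [
    Word("Hello", 0.0, 0.4),
    Word("there.", 0.45, 0.9),
    Word("Next", 1.6, 1.9),
    Word("one", 1.95, 2.3),
]
print(refine_pause_point(0.6, words))  # mid-word point moves to the gap at 0.9
print(refine_pause_point(2.0, words))  # no later gap, so 2.0 is kept
```

With faster-whisper, word timings of this shape would typically come from transcribing with word-level timestamps enabled; how the actual WhisperService chooses among candidate gaps may differ from this first-gap rule.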

Infrastructure:
- New Docker stage: whisper-worker
- New Cloud Run service: accessible-video-whisper-worker
- Updated docker-compose.yml with whisper-worker service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 08:27:48 -06:00
__pycache__ removed mongodb change stream monitoring, added global websockets monitoring for notifications, broke symmetry between toasts and persistent notifications (and refined which notifications get sent and how) 2025-08-25 15:48:18 -05:00
__init__.py feat: add Whisper-based pause point refinement for audio descriptions 2025-12-27 08:27:48 -06:00
ffmpeg_operations.py feat: add dedicated ffmpeg queue to prevent server overload 2025-12-26 17:56:23 -06:00
ingest_and_ai.py fix: broadcast WebSocket updates for ingesting and ai_processing status 2025-12-27 07:38:25 -06:00
notify.py added websockets for live job status updates with toast notifications on job list page 2025-08-24 19:41:23 -05:00
render_accessible_video.py feat: add Whisper-based pause point refinement for audio descriptions 2025-12-27 08:27:48 -06:00
translate_and_synthesize.py feat: add accessible video (MP4 with embedded audio descriptions) 2025-12-26 11:06:41 -06:00
whisper_transcribe.py feat: add Whisper-based pause point refinement for audio descriptions 2025-12-27 08:27:48 -06:00