video-accessibility

Author	SHA1	Message	Date
Vadym Samoilenko	f22d568fc5	fix(security): fix false-positive injection blocks on French/multilingual VTT content - Remove ';' from command-injection pattern — semicolons are common in French and other European languages, not a shell injection risk in JSON context - Skip security pattern scanning for free-text fields (captions_vtt, audio_description_vtt, notes, etc.) — natural language always generates false positives against injection regexes - Add GET/HEAD to GCS CORS config so browsers can load signed VTT URLs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-13 19:11:01 +01:00
Vadym Samoilenko	d5e63129dd	feat(upload): PR-3 GCS resumable chunked upload for large videos Files >100 MB bypass the load balancer via browser→GCS direct upload: - POST /jobs/upload/init — creates GCS resumable session, returns job_id + session URI - POST /jobs/upload/complete — verifies GCS object, creates job, dispatches ingestion - Frontend sends 8 MB chunks with Content-Range directly to GCS session URI - infra/gcs-cors.json + deploy-dev.sh ensure_gcs_cors() enable browser CORS on bucket Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 11:35:13 +01:00
Vadym Samoilenko	ea21cace96	feat: replace SDK with direct HTTP integration to centralized cost tracker - New services/cost_tracker.py: sync httpx preflight()/record() + async wrappers; BudgetExceeded exception; no-op when COST_TRACKER_BASE_URL is empty - Preflight budget check added before ingestion (Gemini), per-language translation (video-native + traditional), and per-language TTS dispatch - _record_gemini_usage and _record_tts_cost now call cost_tracker directly; removes broken asyncio.get_event_loop() hack from sync Celery worker - Fix: _cost_ctx now threaded into extract_accessibility_targeted (video-native path) - Fix: user_id/cost_project_id now propagated through dispatch_language_tts → synthesize_cue_task.s() and the rerender_accessible_video.py re-render path - Remove oliver-cost-tracker SDK dependency (was commented-out/never installed) - Drop cost_tracker_outbox_path setting and get_cost_tracker() factory - Update COST_TRACKER_BASE_URL default to optical-dev.oliver.solutions in .env.prod.example, docker-compose.yml, and all Cloud Run service yamls - Cloud Run yamls use Secret Manager ref (cost-tracker-api-key) for the API key Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-27 13:36:15 +01:00
michael	95852f1357	fix: update Cloud Run service configs for compatibility - FFmpeg: Enable CPU throttling to reduce idle costs - Whisper: Keep CPU throttling disabled (model loading needs full CPU) - Remove readinessProbe (requires BETA launch stage) - Both services scale to zero when idle for cost savings 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 17:34:10 -06:00
michael	593d3bf346	cost: enable CPU throttling for Whisper and FFmpeg Cloud Run services Changed cpu-throttling from "false" to "true" for both services. This reduces costs when instances are idle between requests: - Idle CPU billed at ~10% of active rate instead of 100% - Instances still scale to zero after ~15 min of no traffic Trade-off: Slightly slower response when resuming from throttled state, but startup-cpu-boost is still enabled to mitigate cold starts. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 17:22:14 -06:00
michael	79440929f4	feat: add Cloud Run HTTP services for Whisper and FFmpeg Migrate CPU-intensive workloads to Cloud Run for autoscaling: - Add Whisper HTTP service (FastAPI) with /transcribe endpoint - Add FFmpeg HTTP service (FastAPI) with /encode, /probe, /extract-frame, etc. - Add Dockerfiles for both services (8 vCPU, 32GB RAM, Gen2) - Add Cloud Build config for CI/CD deployment - Add Cloud Run service YAML configs with scale-to-zero - Update whisper_transcribe.py to call Cloud Run when WHISPER_SERVICE_URL set - Update video_renderer.py to call Cloud Run when FFMPEG_SERVICE_URL set - Update whisper_service.py for Cloud Run compatibility (no settings dependency) - Add ffmpeg_service_url and whisper_service_url to config.py Services scale 0→N based on request load, falling back to local execution when service URLs are not configured (hybrid mode). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 10:12:50 -06:00
michael	6689778be7	feat: add dedicated TTS worker with parallel per-cue synthesis Break out TTS synthesis into a dedicated Celery worker (tts queue) with concurrency=8 for parallel processing. Each AD cue is now synthesized as a separate task, enabling up to 8 cues to be processed simultaneously. Key changes: - Add tts_synthesis.py with synthesize_cue_task for per-cue synthesis - Refactor translate_and_synthesize.py to dispatch cue tasks in parallel - Add tts-worker service to docker-compose.yml (concurrency=8) - Add Cloud Run service config for production deployment Benefits: - Parallel synthesis even for single jobs (e.g., 50 cues → 8 concurrent) - Natural rate limiting across multiple concurrent jobs - Fault tolerance with per-cue retries and GCS persistence 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-30 14:23:11 -06:00
michael	54638d1065	feat: switch Whisper model from large-v3 to medium Medium model is faster and uses less memory (~1.5GB vs ~3GB) while still providing good multilingual transcription quality. Updated in: - config.py - docker-compose.yml - whisper-worker-service.yaml - cloudbuild.yaml - Dockerfile (pre-download) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-28 22:35:47 -06:00
michael	6baf6cc254	fix: set WHISPER_MODEL env var default to large-v3 Environment variables were overriding config.py with 'base' model. Updated defaults in: - docker-compose.yml - whisper-worker-service.yaml 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-28 21:56:41 -06:00
michael	1395807b23	feat: increase whisper worker memory to 8GB for large-v3 model - docker-compose.yml: 4G -> 8G limit, 2G -> 4G reservation - whisper-worker-service.yaml: 4Gi -> 8Gi limit, 2Gi -> 4Gi request - cloudbuild.yaml: 4Gi -> 8Gi, WHISPER_MODEL=base -> large-v3 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-28 21:23:29 -06:00
michael	05bde8326d	feat: add Whisper-based pause point refinement for audio descriptions Implements word-level speech analysis using faster-whisper to refine AD pause points. Gemini's timestamps are snapped to natural speech gaps (sentence/phrase boundaries) to prevent pauses mid-word. Key changes: - Add WhisperService for transcription and gap detection - Add dedicated Celery task routed to 'whisper' queue - Integrate refinement into render_accessible_video task - Cache Whisper transcripts in MongoDB for reuse across languages - Add dedicated whisper-worker with concurrency=1 to prevent OOM Configuration: - Uses faster-whisper 'base' model (multilingual, ~145MB) - 5-second search window after Gemini's recommended point - Falls back to original timestamp if no gap found Infrastructure: - New Docker stage: whisper-worker - New Cloud Run service: accessible-video-whisper-worker - Updated docker-compose.yml with whisper-worker service 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-27 08:27:48 -06:00
michael	af2562096a	initial commit	2025-08-24 16:28:33 -05:00
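
Note on 05bde8326d (pause point refinement): the commit describes snapping Gemini's suggested pause timestamp to a natural speech gap inside a 5-second search window, and falling back to the original timestamp when no gap is found. The following is only a minimal sketch of that idea, not the repository's code. The function name `refine_pause_point`, the `(start, end)` word tuple shape, and the `min_gap` threshold are assumptions made for illustration; the real implementation works on faster-whisper word timestamps and may choose gaps differently.

```python
from typing import List, Tuple

# Hypothetical shape: (start_sec, end_sec) of one transcribed word.
Word = Tuple[float, float]


def refine_pause_point(
    suggested: float,
    words: List[Word],
    window: float = 5.0,   # search window from the commit message
    min_gap: float = 0.3,  # assumed silence threshold, not from the source
) -> float:
    """Return the start of the first silence of at least min_gap seconds
    that begins within [suggested, suggested + window]; otherwise return
    the suggested timestamp unchanged."""
    limit = suggested + window
    ordered = sorted(words)
    # Walk consecutive word pairs and inspect the silence between them.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Never move the pause earlier than the suggested point.
        gap_start = max(prev_end, suggested)
        if gap_start > limit:
            break
        if next_start - gap_start >= min_gap:
            return gap_start
    # No usable gap in the window: keep the original timestamp.
    return suggested
```

With words at 0.0-1.0, 1.05-2.0 and 2.6-3.0 seconds, a suggested pause at 1.5 (mid-word) would move to 2.0, the end of the word being spoken.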

12 commits
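
Note on d5e63129dd (resumable chunked upload): the commit states that the frontend sends 8 MB chunks with a Content-Range header directly to the GCS session URI. As a rough illustration of how those byte ranges line up, here is a small sketch. It is written in Python for consistency with the backend, whereas the actual client is browser code that this log does not show. The helper name `content_ranges` is hypothetical. The header format `bytes start-end/total` is the standard Content-Range form, and 8 MiB satisfies the GCS requirement that non-final chunks be a multiple of 256 KiB.

```python
from typing import Iterator, Tuple

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MB chunk size named in the commit message


def content_ranges(
    total_size: int, chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end_inclusive, header_value) for each chunk of an upload.

    header_value is the Content-Range value to send with that chunk.
    """
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = 0
    while start < total_size:
        # The end offset is inclusive; the last chunk may be shorter.
        end = min(start + chunk_size, total_size) - 1
        yield start, end, f"bytes {start}-{end}/{total_size}"
        start = end + 1


if __name__ == "__main__":
    # A 20 MiB file splits into two full chunks and one 4 MiB remainder.
    for _, _, header in content_ranges(20 * 1024 * 1024):
        print(header)
```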