video-accessibility

Author	SHA1	Message	Date
Vadym Samoilenko	ea21cace96	feat: replace SDK with direct HTTP integration to centralized cost tracker - New services/cost_tracker.py: sync httpx preflight()/record() + async wrappers; BudgetExceeded exception; no-op when COST_TRACKER_BASE_URL is empty - Preflight budget check added before ingestion (Gemini), per-language translation (video-native + traditional), and per-language TTS dispatch - _record_gemini_usage and _record_tts_cost now call cost_tracker directly; removes broken asyncio.get_event_loop() hack from sync Celery worker - Fix: _cost_ctx now threaded into extract_accessibility_targeted (video-native path) - Fix: user_id/cost_project_id now propagated through dispatch_language_tts → synthesize_cue_task.s() and the rerender_accessible_video.py re-render path - Remove oliver-cost-tracker SDK dependency (was commented-out/never installed) - Drop cost_tracker_outbox_path setting and get_cost_tracker() factory - Update COST_TRACKER_BASE_URL default to optical-dev.oliver.solutions in .env.prod.example, docker-compose.yml, and all Cloud Run service yamls - Cloud Run yamls use Secret Manager ref (cost-tracker-api-key) for the API key Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-27 13:36:15 +01:00
Vadym Samoilenko	ae2c474061	feat: integrate oliver-cost-tracker SDK into video-accessibility Add AI cost tracking to all Gemini and TTS call sites: - config.py: add COST_TRACKER_* env vars (base_url, api_key, source_app, outbox_path, enabled) - dependencies.py: add get_cost_tracker() factory (lru_cache, graceful degradation if SDK not installed) - models/job.py: add cost_tracker_project_id field for cost attribution - services/gemini.py: - add import time, _record_gemini_usage() helper (reads usage_metadata) - add _cost_ctx kwarg to extract_accessibility, extract_accessibility_targeted, transcreate_content, translate_vtt, rewrite_tts_cue - record usage after every generate_content call via asyncio.create_task() - tasks/ingest_and_ai.py: pass _cost_ctx (user_id, job_id, project_id) to extract_accessibility - tasks/translate_and_synthesize.py: build _cost_ctx from job_doc and pass to transcreate_content + translate_vtt calls - tasks/tts_synthesis.py: add user_id + cost_project_id kwargs, add _record_tts_cost() helper (records len(text) chars to cost tracker) - pyproject.toml: document SDK install instructions (comment) - .env.prod.example: add COST_TRACKER_* vars Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-27 11:30:46 +01:00
michael	05bde8326d	feat: add Whisper-based pause point refinement for audio descriptions Implements word-level speech analysis using faster-whisper to refine AD pause points. Gemini's timestamps are snapped to natural speech gaps (sentence/phrase boundaries) to prevent pauses mid-word. Key changes: - Add WhisperService for transcription and gap detection - Add dedicated Celery task routed to 'whisper' queue - Integrate refinement into render_accessible_video task - Cache Whisper transcripts in MongoDB for reuse across languages - Add dedicated whisper-worker with concurrency=1 to prevent OOM Configuration: - Uses faster-whisper 'base' model (multilingual, ~145MB) - 5-second search window after Gemini's recommended point - Falls back to original timestamp if no gap found Infrastructure: - New Docker stage: whisper-worker - New Cloud Run service: accessible-video-whisper-worker - Updated docker-compose.yml with whisper-worker service 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-27 08:27:48 -06:00
michael	46b6f25fd0	upgrade to Gemini 3 Pro preview model - Change model from gemini-2.5-pro to gemini-3-pro-preview - Upgrade google-genai package from ^1.31.0 to ^1.56.0 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-22 14:02:02 -06:00
michael	af2562096a	initial commit	2025-08-24 16:28:33 -05:00

5 commits
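
Note on commit 05bde8326d: the pause point refinement it describes (snap Gemini's timestamp to a natural speech gap within a 5-second window, else keep the original timestamp) can be sketched roughly as below. This is an illustrative sketch only, not the repository's WhisperService code. The `Word` type, the `snap_to_gap` name, and the `min_gap` threshold are assumptions made for the example; only the 5-second window and the fallback behaviour come from the commit message. Word timings are assumed to be sorted by start time, as a word-level transcript would normally provide them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One transcribed word with timings in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


def snap_to_gap(
    words: List[Word],
    suggested: float,
    window: float = 5.0,   # search window after the suggested point (from the commit message)
    min_gap: float = 0.3,  # assumed minimum silence that counts as a "natural gap"
) -> float:
    """Return a pause point inside the first speech gap that overlaps
    [suggested, suggested + window]; fall back to `suggested` if none exists."""
    window_end = suggested + window
    for prev, nxt in zip(words, words[1:]):
        gap_start, gap_end = prev.end, nxt.start
        if gap_end - gap_start < min_gap:
            continue  # too short to be a sentence or phrase boundary
        if gap_start > window_end:
            break  # words are sorted, so every later gap is outside the window
        if gap_end <= suggested:
            continue  # gap lies entirely before the suggested point
        # The gap overlaps the window: pause at its start, or at the
        # suggested point if that already falls inside the silence.
        return max(gap_start, suggested)
    return suggested  # no usable gap: keep the original timestamp


if __name__ == "__main__":
    words = [
        Word("Hello", 0.0, 0.5),
        Word("world.", 0.55, 1.0),
        Word("Next", 1.8, 2.1),
        Word("bit", 2.15, 2.4),
    ]
    print(snap_to_gap(words, 0.7))  # mid-word suggestion moves to the gap at 1.0
    print(snap_to_gap(words, 2.2))  # no gap within the window, stays at 2.2
```

Returning the start of the gap rather than its midpoint is a design choice for the sketch: it keeps the pause as close as possible to the suggested point while still avoiding a cut mid-word.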