Commit graph

5 commits

Author SHA1 Message Date
Vadym Samoilenko
ea21cace96 feat: replace SDK with direct HTTP integration to centralized cost tracker
- New services/cost_tracker.py: sync httpx preflight()/record() + async wrappers;
  BudgetExceeded exception; no-op when COST_TRACKER_BASE_URL is empty
- Preflight budget check added before ingestion (Gemini), per-language translation
  (video-native + traditional), and per-language TTS dispatch
- _record_gemini_usage and _record_tts_cost now call cost_tracker directly;
  removes broken asyncio.get_event_loop() hack from sync Celery worker
- Fix: _cost_ctx now threaded into extract_accessibility_targeted (video-native path)
- Fix: user_id/cost_project_id now propagated through dispatch_language_tts →
  synthesize_cue_task.s() and the rerender_accessible_video.py re-render path
- Remove oliver-cost-tracker SDK dependency (was commented-out/never installed)
- Drop cost_tracker_outbox_path setting and get_cost_tracker() factory
- Update COST_TRACKER_BASE_URL default to optical-dev.oliver.solutions in
  .env.prod.example, docker-compose.yml, and all Cloud Run service yamls
- Cloud Run yamls use Secret Manager ref (cost-tracker-api-key) for the API key

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 13:36:15 +01:00
Vadym Samoilenko
ae2c474061 feat: integrate oliver-cost-tracker SDK into video-accessibility
Add AI cost tracking to all Gemini and TTS call sites:

- config.py: add COST_TRACKER_* env vars (base_url, api_key, source_app,
  outbox_path, enabled)
- dependencies.py: add get_cost_tracker() factory (lru_cache, graceful
  degradation if SDK not installed)
- models/job.py: add cost_tracker_project_id field for cost attribution
- services/gemini.py:
  - add import time, _record_gemini_usage() helper (reads usage_metadata)
  - add _cost_ctx kwarg to extract_accessibility, extract_accessibility_targeted,
    transcreate_content, translate_vtt, rewrite_tts_cue
  - record usage after every generate_content call via asyncio.create_task()
- tasks/ingest_and_ai.py: pass _cost_ctx (user_id, job_id, project_id) to
  extract_accessibility
- tasks/translate_and_synthesize.py: build _cost_ctx from job_doc and pass
  to transcreate_content + translate_vtt calls
- tasks/tts_synthesis.py: add user_id + cost_project_id kwargs, add
  _record_tts_cost() helper (records len(text) chars to cost tracker)
- pyproject.toml: document SDK install instructions (comment)
- .env.prod.example: add COST_TRACKER_* vars

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 11:30:46 +01:00
michael
05bde8326d feat: add Whisper-based pause point refinement for audio descriptions
Implements word-level speech analysis using faster-whisper to refine
AD pause points. Gemini's timestamps are snapped to natural speech gaps
(sentence/phrase boundaries) to prevent pauses mid-word.

Key changes:
- Add WhisperService for transcription and gap detection
- Add dedicated Celery task routed to 'whisper' queue
- Integrate refinement into render_accessible_video task
- Cache Whisper transcripts in MongoDB for reuse across languages
- Add dedicated whisper-worker with concurrency=1 to prevent OOM

Configuration:
- Uses faster-whisper 'base' model (multilingual, ~145MB)
- 5-second search window after Gemini's recommended point
- Falls back to original timestamp if no gap found

Infrastructure:
- New Docker stage: whisper-worker
- New Cloud Run service: accessible-video-whisper-worker
- Updated docker-compose.yml with whisper-worker service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 08:27:48 -06:00
michael
46b6f25fd0 upgrade to Gemini 3 Pro preview model
- Change model from gemini-2.5-pro to gemini-3-pro-preview
- Upgrade google-genai package from ^1.31.0 to ^1.56.0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 14:02:02 -06:00
michael
af2562096a initial commit 2025-08-24 16:28:33 -05:00