video-accessibility/backend
Vadym Samoilenko 76bee82119 fix(pipeline): fix 5 QA tickets — caption alignment, glossary, source_has_ad render, filler words, NL error surfacing
- caption_aligner: lower match ratio 0.5→0.35, widen search window 60→150, add time-based cursor fallback on miss
- gemini.py: explicit 'MUST use glossary terms' requirement in translate_vtt prompt; source_has_ad prompt now instructs not to include AD narration in captions
- ingest_and_ai: load glossary for source language and pass to extract_accessibility
- render_accessible_video: handle source_has_ad=True via caption-embed path (ffmpeg subtitle inject, no AD pipeline)
- translate_and_synthesize: track failed languages, write translation_errors to DB, add exc_info to error log
- vtt.py: expand _FILLER_PATTERNS to nl/pt/pl/uk/ru, widen EN/ES/FR/DE/IT lists
- gemini_ingestion.md: strengthen line:0% placement rule, expand disfluency examples per language
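
For illustration only, the caption_aligner change above (lower match ratio, wider search window, time-based cursor fallback on a miss) could look roughly like the sketch below. This is not the repository's code: the function name `align_cues`, its parameters, the input shapes, and the use of `difflib.SequenceMatcher` are all assumptions; only the values 0.35 and 150 come from the commit message.

```python
from bisect import bisect_left
from difflib import SequenceMatcher


def align_cues(cues, words, min_ratio=0.35, window=150):
    """Hypothetical sketch of fuzzy caption alignment with a time fallback.

    cues:  list of (text, approx_start_seconds), in playback order.
    words: list of (token, start_seconds) from a transcript, in order.
    Returns a list of (text, start_seconds, matched) tuples.
    """
    starts = [start for _, start in words]
    cursor = 0  # index in `words` where the next search begins
    aligned = []
    for text, approx_start in cues:
        tokens = text.lower().split()
        n = max(len(tokens), 1)
        best_ratio, best_idx = 0.0, None
        # Only look at the next `window` word positions after the cursor.
        for i in range(cursor, min(cursor + window, len(words))):
            candidate = [w.lower() for w, _ in words[i:i + n]]
            ratio = SequenceMatcher(None, tokens, candidate).ratio()
            if ratio > best_ratio:
                best_ratio, best_idx = ratio, i
        if best_idx is not None and best_ratio >= min_ratio:
            # Hit: take the transcript time and move the cursor past the match.
            aligned.append((text, starts[best_idx], True))
            cursor = best_idx + n
        else:
            # Miss: keep the cue's own time and advance the cursor by time,
            # so one bad cue does not leave later searches stuck behind it.
            aligned.append((text, approx_start, False))
            cursor = max(cursor, bisect_left(starts, approx_start))
    return aligned
```

A lower ratio and a wider window both make a match more likely, at the cost of more false positives; the time-based fallback limits the damage from a miss by keeping the cursor close to the cue's expected position.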

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 18:36:59 +01:00
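
As a rough sketch of the translate_and_synthesize item in the commit message above (track failed languages, record translation_errors, log with exc_info), the pattern might be written as follows. The function `translate_all`, the `translate` callable, and the shape of the returned error mapping are hypothetical; how the real task persists translation_errors to the database is not shown in this listing.

```python
import logging

logger = logging.getLogger(__name__)


def translate_all(source_vtt, languages, translate):
    """Hypothetical sketch: translate per language, collecting failures.

    `translate` is any callable (source_vtt, lang) -> translated text.
    Returns (outputs, errors); `errors` maps language code to a short
    message that a caller could store in a translation_errors field.
    """
    outputs, errors = {}, {}
    for lang in languages:
        try:
            outputs[lang] = translate(source_vtt, lang)
        except Exception as exc:
            # exc_info=True attaches the full traceback to the log record.
            logger.error("Translation failed for %s", lang, exc_info=True)
            errors[lang] = f"{type(exc).__name__}: {exc}"
    return outputs, errors
```

Catching the exception per language lets the remaining languages finish, while the returned mapping lets a failure such as the NL case be surfaced to the user instead of only appearing in the logs.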
app fix(pipeline): fix 5 QA tickets — caption alignment, glossary, source_has_ad render, filler words, NL error surfacing 2026-05-08 18:36:59 +01:00
scripts feat(help): in-app role-based help guides + screenshot capture pipeline 2026-05-01 13:08:13 +01:00
tests feat(pause-insert): adaptive buffer, forward-snap, timeline drag + share link fix 2026-05-01 16:09:09 +01:00
.dockerignore fixed dockerignore 2025-10-08 17:17:39 -05:00
.env.example feat: Client → Team → Project isolation system with Project Manager role 2026-04-27 15:11:13 +01:00
.gitignore feat: per-client glossary — hybrid exact/vector retrieval + AI injection 2026-04-29 13:03:38 +01:00
celery_worker.py fix: pause at start of gap + add explicit whisper_transcribe import 2025-12-27 09:11:29 -06:00
cors-config.json initial commit 2025-08-24 16:28:33 -05:00
create_test_users.py added production user role and made it default for new MSAL users - production can access everything EXCEPT user management - that's only for admin 2025-10-10 10:07:30 -05:00
debug_login.py initial commit 2025-08-24 16:28:33 -05:00
Dockerfile fix(docker): add ffmpeg to base image — fixes pydub AudioSegment in worker 2026-04-30 19:12:57 +01:00
Dockerfile.cloudrun feat(infra): move heavy workers to Cloud Run Jobs 2026-04-29 21:47:10 +01:00
Dockerfile.ffmpeg-service feat: add Cloud Run HTTP services for Whisper and FFmpeg 2026-01-02 10:12:50 -06:00
Dockerfile.whisper-service fix: add --no-root to poetry install in Dockerfiles (Poetry 2.x) 2026-04-29 14:35:28 +01:00
gunicorn_conf.py initial commit 2025-08-24 16:28:33 -05:00
migrate.py initial commit 2025-08-24 16:28:33 -05:00
optical-414516-80e2475f6412.json initial commit 2025-08-24 16:28:33 -05:00
poetry.lock chore(deps): upgrade google-cloud-texttospeech to ^2.36.0 2026-05-08 17:26:30 +01:00
pyproject.toml chore(deps): upgrade google-cloud-texttospeech to ^2.36.0 2026-05-08 17:26:30 +01:00
setup_secrets.py initial commit 2025-08-24 16:28:33 -05:00
simple_login_test.py initial commit 2025-08-24 16:28:33 -05:00
test_auth.py initial commit 2025-08-24 16:28:33 -05:00
test_db.py initial commit 2025-08-24 16:28:33 -05:00
test_endpoint.py initial commit 2025-08-24 16:28:33 -05:00
test_mp3_serving.py initial commit 2025-08-24 16:28:33 -05:00
uv.lock docs: add canonical documentation + audit cleanup 2026-04-29 14:22:51 +01:00