video-accessibility/backend
Vadym Samoilenko fddf803b74 feat(translation): enforce EN-first pipeline with cue-preserving translations
All translations now derive strictly from the approved English master VTT,
eliminating the cue-count and timestamp drift reported by linguists
(e.g. PL AD = 11 cues vs EN AD = 17 cues).
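
Drift of this kind (a cue-count mismatch such as 11 vs 17, or shifted timestamps) can be detected by comparing the timing lines of the two files. The following is a minimal sketch only, not the project's VTTEditor code: the names cue_timings and check_cue_alignment are hypothetical, and it assumes standard WebVTT timing lines of the form "HH:MM:SS.mmm --> HH:MM:SS.mmm" with an optional hour field.

```python
import re

# Matches a WebVTT cue timing line, e.g. "00:00:01.000 --> 00:00:04.000".
# The hour field is optional in WebVTT; cue settings after the end time are ignored.
TIMING_RE = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def _normalise(timestamp: str) -> str:
    """Add the optional hour field so "00:01.000" equals "00:00:01.000"."""
    return timestamp if timestamp.count(":") == 2 else "00:" + timestamp


def cue_timings(vtt_text: str) -> list[tuple[str, str]]:
    """Return the (start, end) pair of every cue, in document order."""
    timings = []
    for line in vtt_text.splitlines():
        match = TIMING_RE.match(line.strip())
        if match:
            timings.append((_normalise(match.group(1)), _normalise(match.group(2))))
    return timings


def check_cue_alignment(master_vtt: str, translated_vtt: str) -> None:
    """Raise ValueError unless the translation matches the master cue-for-cue.

    Hypothetical stand-in for a validator such as VTTEditor.assert_cue_alignment;
    the real method's signature and behaviour are not shown in this commit.
    """
    master = cue_timings(master_vtt)
    translated = cue_timings(translated_vtt)
    if len(master) != len(translated):
        raise ValueError(
            f"cue count mismatch: master={len(master)}, translation={len(translated)}"
        )
    for index, (m, t) in enumerate(zip(master, translated), start=1):
        if m != t:
            raise ValueError(
                f"timestamp drift at cue {index}: master={m}, translation={t}"
            )
```

The check only compares timings, so it says nothing about translation quality; it flags structural drift from the EN master and leaves the text to linguists.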

Key changes:
- Remove video_native translation mode entirely; all languages go through
  translate_vtt() which guarantees 1:1 cue alignment with EN master
- Transcreation languages now use translate_vtt(style="transcreate") —
  same cue-preserving contract, culturally-adapted instructions
- Post-translation cue alignment validator added (VTTEditor.assert_cue_alignment)
- After ingestion, job moves to PENDING_QC (EN-only) instead of TRANSLATING;
  translation pipeline dispatches automatically when EN QC is approved
- New POST /jobs/{id}/retranslate-language endpoint for PM/admin to fix
  legacy video_native jobs on demand
- Frontend: origin badge (EN-aligned / transcreated / video-native warning),
  EN-first gate banner on target-language cards, Re-translate from EN button
  with confirm modal, removed translation mode selector from NewJob
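
The EN-first gate described in the list above can be pictured as a small state transition. This is a rough sketch under stated assumptions, not the backend's actual code: PENDING_QC and TRANSLATING are the status names from this commit, while INGESTING, status_after_ingestion, on_en_qc_approved, and the dispatch callback are hypothetical, and moving to TRANSLATING once dispatch starts is assumed rather than confirmed.

```python
from enum import Enum
from typing import Callable, Iterable


class JobStatus(Enum):
    INGESTING = "INGESTING"      # hypothetical name for the pre-QC stage
    PENDING_QC = "PENDING_QC"    # named in the commit message
    TRANSLATING = "TRANSLATING"  # named in the commit message


def status_after_ingestion() -> JobStatus:
    """After ingestion the job waits for EN QC instead of translating."""
    return JobStatus.PENDING_QC


def on_en_qc_approved(
    status: JobStatus,
    target_languages: Iterable[str],
    dispatch: Callable[[str], None],
) -> JobStatus:
    """Dispatch one translation per target language once EN QC is approved.

    `dispatch` is a hypothetical callback (for example, one that enqueues a
    worker task); translations may only start from PENDING_QC.
    """
    if status is not JobStatus.PENDING_QC:
        raise ValueError(f"cannot start translation from {status.name}")
    for language in target_languages:
        dispatch(language)
    return JobStatus.TRANSLATING
```

Keeping the gate in one function means no target language can be translated before the English master is approved, which is the property the commit relies on.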

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 12:11:35 +01:00
app feat(translation): enforce EN-first pipeline with cue-preserving translations 2026-05-06 12:11:35 +01:00
scripts feat(help): in-app role-based help guides + screenshot capture pipeline 2026-05-01 13:08:13 +01:00
tests feat(pause-insert): adaptive buffer, forward-snap, timeline drag + share link fix 2026-05-01 16:09:09 +01:00
.dockerignore fixed dockerignore 2025-10-08 17:17:39 -05:00
.env.example feat: Client → Team → Project isolation system with Project Manager role 2026-04-27 15:11:13 +01:00
.gitignore feat: per-client glossary — hybrid exact/vector retrieval + AI injection 2026-04-29 13:03:38 +01:00
celery_worker.py fix: pause at start of gap + add explicit whisper_transcribe import 2025-12-27 09:11:29 -06:00
cors-config.json initial commit 2025-08-24 16:28:33 -05:00
create_test_users.py added production user role and made it default for new MSAL users - production can access everything EXCEPT user management - that's only for admin 2025-10-10 10:07:30 -05:00
debug_login.py initial commit 2025-08-24 16:28:33 -05:00
Dockerfile fix(docker): add ffmpeg to base image — fixes pydub AudioSegment in worker 2026-04-30 19:12:57 +01:00
Dockerfile.cloudrun feat(infra): move heavy workers to Cloud Run Jobs 2026-04-29 21:47:10 +01:00
Dockerfile.ffmpeg-service feat: add Cloud Run HTTP services for Whisper and FFmpeg 2026-01-02 10:12:50 -06:00
Dockerfile.whisper-service fix: add --no-root to poetry install in Dockerfiles (Poetry 2.x) 2026-04-29 14:35:28 +01:00
gunicorn_conf.py initial commit 2025-08-24 16:28:33 -05:00
migrate.py initial commit 2025-08-24 16:28:33 -05:00
optical-414516-80e2475f6412.json initial commit 2025-08-24 16:28:33 -05:00
poetry.lock chore: update poetry.lock after adding lameenc dependency 2026-04-30 18:34:04 +01:00
pyproject.toml fix(tts): replace pydub MP3 export with lameenc (pure Python, no system ffmpeg) 2026-04-30 18:24:15 +01:00
setup_secrets.py initial commit 2025-08-24 16:28:33 -05:00
simple_login_test.py initial commit 2025-08-24 16:28:33 -05:00
test_auth.py initial commit 2025-08-24 16:28:33 -05:00
test_db.py initial commit 2025-08-24 16:28:33 -05:00
test_endpoint.py initial commit 2025-08-24 16:28:33 -05:00
test_mp3_serving.py initial commit 2025-08-24 16:28:33 -05:00
uv.lock docs: add canonical documentation + audit cleanup 2026-04-29 14:22:51 +01:00