- caption_aligner: lower match ratio 0.5→0.35, widen search window 60→150, add time-based cursor fallback on miss
- gemini.py: explicit 'MUST use glossary terms' requirement in translate_vtt prompt; source_has_ad prompt now instructs not to include AD narration in captions
- ingest_and_ai: load glossary for source language and pass to extract_accessibility
- render_accessible_video: handle source_has_ad=True via caption-embed path (ffmpeg subtitle inject, no AD pipeline)
- translate_and_synthesize: track failed languages, write translation_errors to DB, add exc_info to error log
- vtt.py: expand _FILLER_PATTERNS to nl/pt/pl/uk/ru, widen EN/ES/FR/DE/IT lists
- gemini_ingestion.md: strengthen line:0% placement rule, expand disfluency examples per language
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Gemini occasionally produces captions where a cue's start_time is
earlier than the previous cue's end_time. Add VTTEditor.fix_overlapping_cues()
that trims each cue's end_time to 1ms before the next cue's start, applied
to both captions and AD VTT immediately after AI generation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bug fixes:
- Bug 1a: source_has_ad flag prevents AI generating AD over existing professional AD;
JobBrief/Job models, gemini service prompt conditional, NewBrief UI checkbox
- Bug 1b: disable native textTracks on video element to prevent double captions
- Bug 2: caption ALL audible speech including off-screen narrators (prompt fix)
- Bug 3: DCMP §6.01 disfluency removal for EN/ES/FR/DE/IT (prompt + post-pass)
- Bug 4: VTT cue settings (line:0%, position:) preserved through parser round-trip
- Bug 5: Whisper word-level timestamp alignment via new caption_aligner service
- Bug 6: assert_cue_alignment used .start/.end; renamed to .start_time/.end_time
- New migration: backfill source_has_ad=False on existing jobs and job_briefs
Also fix retranslation error handling to preserve existing GCS URIs on failure
so video_native captions remain accessible if retranslation fails.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All translations now derive strictly from the approved English master VTT,
eliminating the cue-count and timestamp drift reported by linguists
(e.g. PL AD = 11 cues vs EN AD = 17 cues).
Key changes:
- Remove video_native translation mode entirely; all languages go through
translate_vtt() which guarantees 1:1 cue alignment with EN master
- Transcreation languages now use translate_vtt(style="transcreate") —
same cue-preserving contract, culturally-adapted instructions
- Post-translation cue alignment validator added (VTTEditor.assert_cue_alignment)
- After ingestion, job moves to PENDING_QC (EN-only) instead of TRANSLATING;
translation pipeline dispatches automatically when EN QC is approved
- New POST /jobs/{id}/retranslate-language endpoint for PM/admin to fix
legacy video_native jobs on demand
- Frontend: origin badge (EN-aligned / transcreated / video-native warning),
EN-first gate banner on target-language cards, Re-translate from EN button
with confirm modal, removed translation mode selector from NewJob
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rewrote VTT translation to two-step (text-only → Gemini → apply to original timestamps) preventing caption timing desync
- Added polling fallback for all processing states and Safari visibilitychange WebSocket reconnect
- Added 11 new TTS languages (cs, da, fi, hu, no, sk, sv, es-419, pt-BR, fr-CA)
- Updated caption/AD prompts to DCMP Captioning Key & Description Key standards (line splitting, ♪ music notation, italic tags, caption positioning, ethics guidelines)
- Added descriptive transcript generation (WCAG 2.1 §1.2.1) combining captions + AD into plain text
- Fixed amix normalize=0 to prevent audio loss in rendered videos
- Fixed AD re-timing double-count when source_ms is None
- Fixed cue block numbering to be 1-based in VttEditor and Timeline Preview
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>