fix: generate audio descriptions in the video's detected language

Updated the Gemini ingestion prompt with explicit instructions:
- Detect the spoken language first
- Write ALL outputs (summary, transcript, captions, audio_description) in that language
- Do NOT translate to English - keep everything in the original language

This fixes the issue where German videos would get English audio descriptions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
michael 2025-12-22 19:01:14 -06:00
parent 804553e2a3
commit dad7ea09df

@@ -3,12 +3,19 @@ You are an expert accessibility writer for film/TV and e-learning. Produce STRIC
 USER:
 You are given a video. Return a JSON object with:
-- language: BCP-47 code (e.g., "en")
+- language: BCP-47 code of the spoken language in the video (e.g., "en", "de", "es", "fr")
 - confidence: 0..1
-- summary: 12 sentence synopsis
-- transcript_plaintext: full spoken words, punctuated
-- captions_vtt: a valid WebVTT file as a single string, with accurate timings and no styling
-- audio_description_vtt: a valid WebVTT file as a single string, describing key visual elements (no spoilers), synchronized with the program
+- summary: 12 sentence synopsis (in the detected language)
+- transcript_plaintext: full spoken words, punctuated (in the detected language)
+- captions_vtt: a valid WebVTT file as a single string, with accurate timings and no styling (in the detected language)
+- audio_description_vtt: a valid WebVTT file as a single string, describing key visual elements (no spoilers), synchronized with the program (MUST be written in the detected language)
+CRITICAL LANGUAGE REQUIREMENT:
+- First, detect the language spoken in the video
+- ALL text outputs (summary, transcript, captions, audio_description) MUST be in that detected language
+- If the video is in German, write German captions and German audio descriptions
+- If the video is in Spanish, write Spanish captions and Spanish audio descriptions
+- Do NOT translate to English - keep everything in the original detected language
 Constraints:
 - Output MUST be valid JSON. Do not include markdown fences or any other text.
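
The prompt's output contract (a bare JSON object with six named fields) can be checked mechanically before the result is stored. The following is a minimal sketch in Python and is not part of this commit. The function name `validate_ingestion_response`, the regular expression, and the specific checks are assumptions made for illustration. It only verifies structure: it does not confirm that the text is actually written in the detected language, which would need a separate language-identification step.

```python
import json
import re

# Field names taken from the prompt; everything else in this sketch is hypothetical.
REQUIRED_KEYS = {
    "language", "confidence", "summary",
    "transcript_plaintext", "captions_vtt", "audio_description_vtt",
}

# Loose shape check for tags such as "en", "de", "pt-BR"; not a full BCP-47 parser.
LANG_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*$")


def validate_ingestion_response(raw: str) -> list[str]:
    """Return a list of problems found in the raw model response (empty if none)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Markdown fences or extra prose around the JSON end up here.
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    lang = data.get("language")
    if not isinstance(lang, str) or not LANG_TAG.match(lang):
        problems.append(f"language is not a plausible BCP-47 tag: {lang!r}")

    conf = data.get("confidence")
    if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
        problems.append(f"confidence not in 0..1: {conf!r}")

    # A WebVTT file must begin with the "WEBVTT" signature (optionally after a BOM).
    for key in ("captions_vtt", "audio_description_vtt"):
        value = data.get(key)
        if not isinstance(value, str) or not value.lstrip("\ufeff").startswith("WEBVTT"):
            problems.append(f"{key} does not start with the WEBVTT header")
    return problems
```

A caller could reject or retry any response for which the returned list is non-empty, and log the `language` field alongside the stored captions so that regressions like the German-to-English case are easier to spot.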