obsidian/wiki/tech-patterns/vtt-descriptive-transcript-regeneration.md
2026-05-01 12:15:13 +01:00

title: VTT Edit → Descriptive Transcript Regeneration
description: Pattern for keeping descriptive_transcript.txt in sync when captions or audio description VTTs are edited via PATCH /vtt
tags: fastapi, gcs, vtt, accessibility, celery
created: 2026-05-01
updated: 2026-05-01
projects: video-accessibility

VTT Edit → Descriptive Transcript Regeneration

Problem

When a reviewer edits captions.vtt or ad.vtt via PATCH /jobs/{id}/vtt, the descriptive transcript (descriptive_transcript.txt) goes stale: it still reflects the pre-edit VTT content. The staleness goes unnoticed because the PATCH handler saves the VTT without regenerating the transcript.

Pattern

In the PATCH handler, after writing the edited VTT(s) to GCS but before the MongoDB update:

  1. Determine which stream was edited (request body) and which was not
  2. Read the unchanged stream from GCS
  3. Merge both streams via generate_descriptive_transcript(captions_text, ad_text)
  4. Upload the new transcript to GCS
  5. Update lang_output["descriptive_transcript_gcs"] so the MongoDB doc points to the fresh file

Wrap the whole sequence in a broad try/except so a transcript failure never blocks the VTT save; the error is logged as a warning instead.

# After GCS uploads for captions/AD:
if request.captions_vtt or request.audio_description_vtt:
    try:
        from ...services.descriptive_transcript import (
            generate_descriptive_transcript as _gen_transcript,
        )
        captions_text = request.captions_vtt
        if not captions_text:
            cc_gcs = lang_output.get("captions_vtt_gcs")
            if cc_gcs:
                _blob = gcs_service.bucket.blob(
                    cc_gcs.replace(f"gs://{settings.gcs_bucket}/", "")
                )
                captions_text = await asyncio.get_running_loop().run_in_executor(
                    gcs_service.executor, _blob.download_as_text
                )
        ad_text = request.audio_description_vtt
        if not ad_text:
            ad_gcs = lang_output.get("ad_vtt_gcs")
            if ad_gcs:
                _blob = gcs_service.bucket.blob(
                    ad_gcs.replace(f"gs://{settings.gcs_bucket}/", "")
                )
                ad_text = await asyncio.get_running_loop().run_in_executor(
                    gcs_service.executor, _blob.download_as_text
                )
        transcript_text = _gen_transcript(captions_text or "", ad_text or "")
        if transcript_text:
            transcript_uri = await upload_vtt_to_gcs(
                transcript_text,
                f"{job_id}/{target_language}/descriptive_transcript.txt",
            )
            lang_output["descriptive_transcript_gcs"] = transcript_uri
    except Exception as _tr_err:
        logger.warning(
            f"Failed to regenerate descriptive transcript for job {job_id}: {_tr_err}"
        )
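
The snippet above calls generate_descriptive_transcript without showing it. As a rough illustration of what such a merge could look like, here is a minimal sketch that interleaves caption and audio description cues by start time. Everything in it is an assumption: the names parse_cues and merge_transcript, the "[Description]" label, and the plain-text output format are hypothetical and not taken from the real services.descriptive_transcript module.

```python
import re

# Matches a WebVTT cue timing line such as "00:00:01.000 --> 00:00:04.000".
# The hours field is optional in WebVTT, so "00:01.000" is accepted too.
_TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def _to_seconds(stamp: str) -> float:
    """Convert 'HH:MM:SS.mmm' or 'MM:SS.mmm' to seconds."""
    parts = stamp.split(":")
    hours = int(parts[0]) if len(parts) == 3 else 0
    return hours * 3600 + int(parts[-2]) * 60 + float(parts[-1])


def parse_cues(vtt_text: str) -> list[tuple[float, str]]:
    """Return (start_seconds, text) for every cue in a VTT string."""
    cues = []
    normalized = vtt_text.replace("\r\n", "\n").strip()
    # Blocks (header, notes, cues) are separated by blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                text = " ".join(part.strip() for part in lines[i + 1:] if part.strip())
                if text:
                    cues.append((_to_seconds(match.group(1)), text))
                break  # blocks without a timing line are skipped
    return cues


def merge_transcript(captions_text: str, ad_text: str) -> str:
    """Hypothetical stand-in for generate_descriptive_transcript."""
    # kind 0 = description, kind 1 = caption; on equal start times the
    # description sorts first.
    entries = [(start, 0, text) for start, text in parse_cues(ad_text)]
    entries += [(start, 1, text) for start, text in parse_cues(captions_text)]
    entries.sort()
    return "\n".join(
        f"[Description] {text}" if kind == 0 else text
        for _, kind, text in entries
    )
```

With two empty inputs this sketch returns an empty string, which is compatible with the `if transcript_text:` guard in the handler: nothing is uploaded when there is nothing to merge.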

Notes

  • Downloads run through run_in_executor on gcs_service.executor (the GCS service's thread pool), which keeps blocking GCS SDK calls off the event loop
  • The import is local to the handler, which avoids a circular import at module load time; because it sits inside the try, an ImportError (for example, if the module is absent) is logged rather than failing the request
  • Always set the GCS pointer in lang_output before the MongoDB update. A single-document write in MongoDB is atomic, so the VTT path and the transcript path are updated together
  • This pattern applies to any derived artifact that depends on two source VTT files
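
The "use the edited text, otherwise read the stored copy" step appears twice in the handler and would recur for any other derived artifact. A minimal sketch of factoring it out is shown below. The helper names (blob_name_from_uri, run_blocking, resolve_stream) and the open_blob parameter are hypothetical; open_blob stands in for gcs_service.bucket.blob and only needs to return an object with a blocking download_as_text() method.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional


def blob_name_from_uri(gcs_uri: str, bucket: str) -> str:
    """Strip the 'gs://<bucket>/' prefix; other strings pass through unchanged."""
    return gcs_uri.removeprefix(f"gs://{bucket}/")


async def run_blocking(executor: ThreadPoolExecutor, fn: Callable[[], str]) -> str:
    """Run a blocking zero-argument callable on the executor and await it."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, fn)


async def resolve_stream(
    edited_text: Optional[str],
    stored_uri: Optional[str],
    bucket: str,
    executor: ThreadPoolExecutor,
    open_blob: Callable[[str], Any],  # hypothetical stand-in for bucket.blob
) -> str:
    """Prefer the text from the request; otherwise read the stored copy."""
    if edited_text:
        return edited_text
    if not stored_uri:
        return ""  # stream was never produced for this language
    blob = open_blob(blob_name_from_uri(stored_uri, bucket))
    return await run_blocking(executor, blob.download_as_text)
```

Under these assumptions the handler body would shrink to two resolve_stream calls (one per stream) followed by the merge and upload, still inside the same try/except.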