obsidian/wiki/tech-patterns/redis-celery-worker-queue.md
2026-04-15 10:48:47 +01:00

title: Redis + Celery Async Worker Queue
aliases: celery, task-queue, worker, redis-queue
tags: redis, celery, async, worker, queue, python
sources: 01 Projects/enterprise-ai-hub-nexus, 01 Projects/video-accessibility, 01 Projects/pdf-accessibility
created: 2026-04-15
updated: 2026-04-15

Redis + Celery Async Worker Queue

Pattern for offloading long-running AI/processing tasks to background workers. Used in the heaviest Oliver processing pipelines.

Key Takeaways

  • Redis is both the message broker AND result backend for Celery
  • Use Celery when tasks take >5s (AI inference, video processing, PDF analysis)
  • Use Celery beat for scheduled recurring tasks (e.g., SharePoint sync)
  • PDF Accessibility uses a Redis queue directly (pdf:queue) without Celery, driven by a simpler worker.py daemon
  • Always poll for task status from the frontend; never block on long tasks
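
The Celery-free variant mentioned above can be sketched as a small blocking loop. This is a minimal illustration, not the actual worker.py: it assumes jobs are JSON strings pushed onto the pdf:queue list, and `run_worker`, `handle`, and `max_jobs` are hypothetical names introduced here.

```python
import json
from typing import Any, Callable, Optional


def run_worker(
    client: Any,
    handle: Callable[[dict], None],
    queue: str = "pdf:queue",
    max_jobs: Optional[int] = None,
) -> int:
    """Pop JSON jobs from a Redis list and pass each one to `handle`.

    `client` is anything with a redis-py style `blpop(keys, timeout)` method.
    `max_jobs` exists only so the loop can be stopped in tests; a real daemon
    would leave it as None and run until the process is killed.
    """
    processed = 0
    while max_jobs is None or processed < max_jobs:
        item = client.blpop(queue, timeout=5)  # blocks up to 5s, None on timeout
        if item is None:
            continue
        _key, payload = item  # blpop returns a (queue name, value) pair
        handle(json.loads(payload))  # assumption: payloads are JSON objects
        processed += 1
    return processed
```

With redis-py, this would be started as something like `run_worker(redis.Redis(host="redis"), process_pdf)`, where `process_pdf` is a hypothetical handler, and a producer would enqueue work with `rpush` on the same key.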

When to Use

  • Video processing pipelines (multi-phase, minutes-long)
  • Scheduled sync jobs (Celery beat)
  • Any task that would time out an HTTP request (>30s)
  • Parallel AI analysis tasks

Key Details

Standard Setup

# docker-compose.yml
services:
  redis:
    image: redis:7
    ports: ["6379:6379"]
  worker:
    build: ./backend
    command: celery -A app.celery worker --loglevel=info
    depends_on: [redis]
  beat:
    build: ./backend
    command: celery -A app.celery beat --loglevel=info
    depends_on: [redis]
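
The beat service only emits tasks that are registered in a schedule. A minimal sketch of that configuration follows, assuming the Celery instance is exposed as `celery` in a module named `app` (one way `-A app.celery` can resolve). The Redis database numbers, the hourly interval, and the task name `app.sync_sharepoint` are hypothetical.

```python
from celery import Celery
from celery.schedules import crontab

# Redis as both broker and result backend; "redis" is the compose service name.
# The database numbers 0 and 1 are an assumption.
celery = Celery(
    "app",
    broker="redis://redis:6379/0",
    backend="redis://redis:6379/1",
)

celery.conf.timezone = "UTC"
celery.conf.beat_schedule = {
    "sharepoint-sync-hourly": {
        "task": "app.sync_sharepoint",  # hypothetical task name
        "schedule": crontab(minute=0),  # at minute 0 of every hour
    },
}
```

Beat only enqueues the task on schedule; a worker still has to have a task registered under that name to execute it.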

Task Definition

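# Assumes the Celery instance is importable as `celery` from the `app` module,
# matching `-A app.celery` in docker-compose.yml.
from app import celery

@celery.task
def process_video(video_id: str) -> None:
    # Long-running pipeline: phases run sequentially inside one task,
    # so an exception in any phase fails the whole job.
    phase_1_ingest(video_id)
    phase_2_caption(video_id)   # Gemini 2.5 Pro
    phase_3_translate(video_id)
    phase_4_tts(video_id)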
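
Because the task runs for minutes, the frontend usually benefits from knowing which phase is active. One possible shape for that, sketched without any Celery dependency: `run_phases` and the `report` callback are hypothetical names, not part of the existing pipeline.

```python
from typing import Callable, Sequence, Tuple

Phase = Tuple[str, Callable[[str], None]]


def run_phases(
    video_id: str,
    phases: Sequence[Phase],
    report: Callable[[dict], None],
) -> None:
    """Run each phase in order, reporting progress before it starts."""
    total = len(phases)
    for step, (name, phase) in enumerate(phases, start=1):
        report({"phase": name, "step": step, "total": total})
        phase(video_id)  # an exception here stops the remaining phases
```

Inside a Celery task declared with `bind=True`, `report` could be `lambda meta: self.update_state(state="PROGRESS", meta=meta)`, where `PROGRESS` is a custom state name rather than a built-in one.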

Polling Pattern (Frontend)

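// Poll every 2s until the job leaves 'pending', then resolve with the final status
const poll = async (jobId) => {
  const { status } = await api.get(`/jobs/${jobId}/status`)
  if (status !== 'pending') return status
  await new Promise((resolve) => setTimeout(resolve, 2000))
  return poll(jobId)
}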
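
The polling loop only stops if the status endpoint eventually returns something other than 'pending'. A minimal sketch of the backend side of that contract, assuming the endpoint derives its answer from the Celery task state; the function name and the 'complete' and 'failed' values are hypothetical.

```python
# Built-in Celery states that mean the task will not make further progress.
FAILED_STATES = {"FAILURE", "REVOKED"}


def to_job_status(celery_state: str) -> str:
    """Collapse a Celery state into 'pending', 'complete', or 'failed'."""
    if celery_state == "SUCCESS":
        return "complete"
    if celery_state in FAILED_STATES:
        return "failed"
    # PENDING, RECEIVED, STARTED, RETRY, and custom progress states
    # all count as still running from the frontend's point of view.
    return "pending"
```

Celery also reports `PENDING` for task IDs it has never seen, so an endpoint built this way should check that the job exists before answering, or the frontend will poll a nonexistent job forever.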

Projects Using This Pattern

Pipeline Phases (Video Accessibility)

1. Upload → Ingestion worker
2. Gemini 2.5 Pro → VTT captions
3. Audio Description generation
4. QC review (approve/reject/edit VTT)
5. Translation → 50+ languages
6. TTS synthesis (GCP TTS + ElevenLabs)
7. Final delivery
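
The phase order above, including the manual QC gate, can be modeled as a small lookup. This is only a sketch: the identifiers paraphrase the list, and the rule that a job stays in QC until the decision is "approve" is an assumption, not something taken from the spec.

```python
from typing import Optional

# Identifiers paraphrase the numbered list above; they are illustrative only.
PHASES = [
    "ingestion",
    "captions",
    "audio_description",
    "qc_review",
    "translation",
    "tts",
    "delivery",
]


def next_phase(current: str, qc_decision: Optional[str] = None) -> Optional[str]:
    """Return the phase that follows `current`, or None after delivery.

    Assumption: the job stays in `qc_review` until the decision is "approve".
    """
    if current == "qc_review" and qc_decision != "approve":
        return "qc_review"
    index = PHASES.index(current)
    return PHASES[index + 1] if index + 1 < len(PHASES) else None
```

For example, `next_phase("qc_review", "approve")` moves the job on to translation, while a reject or edit decision keeps it at the review step.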

Gotchas & Lessons

  • Run Celery beat in its own container, and run only one beat instance: it schedules tasks independently of the workers, and a second beat process would enqueue every scheduled task twice
  • Long Celery jobs that need M365 access must refresh their tokens proactively (Enterprise Nexus)
  • A plain worker.py daemon is a simpler alternative to Celery for single-queue use (PDF Accessibility pattern)
  • Always store job status in the DB (not just Redis) so it survives a Redis restart
  • video_accessibility_development_plan.txt is the authoritative spec; always read it before touching that pipeline
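
Persisting job status outside Redis can be as small as one table. The sketch below uses SQLite from the standard library purely for illustration; the table layout and function names are hypothetical, and the projects' actual database is not specified here.

```python
import sqlite3
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create the jobs table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "job_id TEXT PRIMARY KEY, "
        "status TEXT NOT NULL, "
        "updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )


def save_status(conn: sqlite3.Connection, job_id: str, status: str) -> None:
    """Insert the job, or overwrite the row if the job_id already exists."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO jobs (job_id, status) VALUES (?, ?)",
            (job_id, status),
        )


def load_status(conn: sqlite3.Connection, job_id: str) -> Optional[str]:
    """Return the stored status, or None for an unknown job."""
    row = conn.execute(
        "SELECT status FROM jobs WHERE job_id = ?", (job_id,)
    ).fetchone()
    return row[0] if row else None
```

A worker would call `save_status` at each state change, and the status endpoint would read from this table, so a Redis restart loses at most the in-flight queue and not the job history.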