Celery + Redis: Must Flush Redis on Deterministic Errors

When a Celery task fails with a deterministic error (type error, missing config, wrong data shape), its message stays in the Redis broker queue and workers retry it in a loop. Resetting the job status in MongoDB alone is NOT enough: the Redis queue entry must also be cleared.

Key Points

A failed Celery task leaves its message in the Redis broker queue
Workers pick it up again on the next cycle, hit the same deterministic error, and fail again, producing a loop that can never succeed
Resetting MongoDB job status (e.g., status = "pending") does NOT remove the task from Redis
Fix: clear the stale Redis queue entry, fix the root cause, then re-enqueue the job from scratch
Deterministic errors (type errors, config errors, wrong data shape) will never succeed on retry — retrying them wastes worker cycles and blocks the queue
When all retries fail identically, the error is NOT transient — diagnose the root cause before re-running
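
The last point can be turned into a quick check. The helper below is a minimal sketch, not part of Celery: `failures_look_deterministic` is a hypothetical name, and it assumes you have collected the exception objects from each failed attempt.

```python
def failures_look_deterministic(errors):
    """Return True when two or more attempts failed with the same type and message."""
    # Reduce each failure to a (type name, message) signature and compare.
    signatures = {(type(e).__name__, str(e)) for e in errors}
    return len(errors) >= 2 and len(signatures) == 1


# Identical failures across attempts suggest a deterministic root cause.
same = [TypeError("expected bytes, got bytearray")] * 3
mixed = [TimeoutError("upstream slow"), TypeError("expected bytes, got bytearray")]
```

Here `failures_look_deterministic(same)` is true and `failures_look_deterministic(mixed)` is false. A single failure is deliberately not enough to decide either way.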

Details

The Stuck Queue Scenario

1. Task enqueued → Redis queue: [task_id_abc123]
2. Worker picks up task → crashes (TypeError: bytearray vs bytes)
3. Celery marks as failed, increments retry count
4. Task re-queued for next retry → Redis queue: [task_id_abc123]
5. Repeat until max_retries exhausted
6. Job status in MongoDB: still "processing" (or "failed")
7. Developer resets MongoDB status to "pending" 
8. NEW task enqueued → Redis queue: [task_id_abc123, task_id_xyz789]
9. OLD task_id_abc123 is still in the queue, so it STILL runs and fails
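
The sequence above can be reproduced without Celery or Redis. The following is a toy simulation, not Celery's actual implementation: a `deque` stands in for the Redis list, a dict stands in for the MongoDB job document, and `run_task` and `drain` are hypothetical helpers written for this illustration.

```python
from collections import deque


def run_task(task_id, payload):
    """Hypothetical task body: fails deterministically on a non-bytes payload."""
    if not isinstance(payload, bytes):
        raise TypeError(f"{task_id}: expected bytes, got {type(payload).__name__}")
    return len(payload)


def drain(queue, payloads, max_cycles=5):
    """Pop tasks one at a time and requeue failures, mimicking a retry. Returns a log."""
    log = []
    for _ in range(max_cycles):
        if not queue:
            break
        task_id = queue.popleft()
        try:
            run_task(task_id, payloads[task_id])
            log.append((task_id, "ok"))
        except TypeError:
            log.append((task_id, "failed"))
            queue.append(task_id)  # the retry puts the same task back
    return log


# Toy stand-ins for the broker queue and the job record.
queue = deque(["task_id_abc123"])
payloads = {"task_id_abc123": bytearray(b"mp3")}
job = {"status": "processing"}

drain(queue, payloads, max_cycles=3)  # three identical failures

# Resetting the job record and enqueuing a new task leaves the old entry in place.
job["status"] = "pending"
queue.append("task_id_xyz789")
payloads["task_id_xyz789"] = b"mp3"  # the new task carries corrected data
stale = list(queue)

# Clearing the queue before re-enqueuing removes the stale entry.
queue.clear()
queue.append("task_id_xyz789")
clean_log = drain(queue, payloads)
```

After the status reset, `stale` still contains `task_id_abc123` ahead of the new task. Only the explicit `queue.clear()` gets rid of it.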

The Fix

# Option 1: Full Redis flush (nuclear: deletes every key in every database on the instance, not just the Celery queues)
docker compose exec redis redis-cli FLUSHALL

# Option 2: Clear the default queue ("celery")
docker compose exec redis redis-cli DEL celery

# Option 3: Clear a named queue (e.g., the tts queue)
docker compose exec redis redis-cli DEL tts
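
The targeted options can also be scripted. This is a minimal sketch under stated assumptions: `purge_queues` is a hypothetical helper, and `client` is assumed to be a redis-py style object exposing `llen` and `delete`. The `FakeRedis` class exists only so the example runs without a Redis server.

```python
def purge_queues(client, queue_names):
    """Delete the given broker queue keys and report how many messages each held."""
    removed = {}
    for name in queue_names:
        removed[name] = client.llen(name)  # pending messages before deletion
        client.delete(name)                # same effect as `redis-cli DEL <name>`
    return removed


class FakeRedis:
    """In-memory stand-in for a Redis client (illustration only)."""

    def __init__(self, lists):
        self.lists = {k: list(v) for k, v in lists.items()}

    def llen(self, name):
        return len(self.lists.get(name, []))

    def delete(self, *names):
        return sum(1 for n in names if self.lists.pop(n, None) is not None)


fake = FakeRedis({"celery": ["msg1"], "tts": ["msg2", "msg3"]})
removed = purge_queues(fake, ["tts"])  # only the tts queue is cleared
```

Against a real broker, `client` would be something like `redis.Redis(host="redis", port=6379, db=0)`. The host, port, and database number are assumptions about the compose setup. Messages a worker has already reserved appear to be tracked by the kombu Redis transport under separate `unacked` keys, so it is safer to stop the workers before purging.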

After flushing:

1. Fix the underlying code error
2. Rebuild/restart the affected worker container
3. Re-enqueue the job via the application (not by resetting MongoDB status alone)

When to Use This Procedure

Error type	Retry useful?	Redis flush needed?
Network timeout to external API	Yes	No
Rate limit (429)	Yes (with backoff)	No
TypeError, AttributeError	No	Yes
Missing env var / config	No	Yes
File not found (runtime dep)	No	Yes
DB connection error (transient)	Yes	No
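
The table can be expressed as a small retry policy. This is a sketch, not Celery's retry machinery: `should_retry` and `call_with_retries` are hypothetical names, and the mapping of built-in exception types to the table rows is an assumption. HTTP-specific errors such as a 429 would need the exception class of whatever client library is in use.

```python
# Which exception types count as transient is an assumption made for this
# sketch; adjust both tuples to the errors your tasks actually raise.
TRANSIENT = (TimeoutError, ConnectionError)
DETERMINISTIC = (TypeError, AttributeError, KeyError, FileNotFoundError)


def should_retry(exc):
    """Return True only for errors that might succeed on a later attempt."""
    if isinstance(exc, DETERMINISTIC):
        return False
    # Unknown error types are not retried by default.
    return isinstance(exc, TRANSIENT)


def call_with_retries(func, max_retries=3):
    """Call func, retrying transient failures up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == max_retries or not should_retry(exc):
                raise  # deterministic or exhausted: surface the error immediately
```

In a Celery task, the same idea is usually expressed by passing only the transient exception types to the `autoretry_for` task option, so a `TypeError` fails once instead of cycling through the queue.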

Checking Queue Depth

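# List all Redis keys (not every key is a queue; KEYS blocks the server while it scans, so prefer `redis-cli --scan` on large instances)
docker compose exec redis redis-cli KEYS '*'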

# Check queue length
docker compose exec redis redis-cli LLEN celery
docker compose exec redis redis-cli LLEN tts
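
Two depth samples taken a short while apart show whether a queue is draining. The sketch below assumes a redis-py style `client` with an `llen` method. `queue_depths` and `stuck_queues` are hypothetical helpers, and `StubClient` stands in for a real connection.

```python
def queue_depths(client, queue_names):
    """Return {queue_name: pending_message_count} using LLEN on each queue key."""
    return {name: client.llen(name) for name in queue_names}


def stuck_queues(before, after):
    """Queues that are non-empty and did not shrink between two samples."""
    return [q for q in after if after[q] > 0 and after[q] >= before.get(q, 0)]


class StubClient:
    """Minimal stand-in returning fixed lengths (illustration only)."""

    def __init__(self, lengths):
        self.lengths = lengths

    def llen(self, name):
        return self.lengths.get(name, 0)


before = queue_depths(StubClient({"celery": 1, "tts": 4}), ["celery", "tts"])
after = queue_depths(StubClient({"celery": 0, "tts": 4}), ["celery", "tts"])
suspects = stuck_queues(before, after)
```

In this example `suspects` contains only `tts`, the queue whose depth did not change. A flat depth is only a hint: a queue that is refilled as fast as it drains looks the same, so check the worker logs before flushing.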

Related Concepts

wiki/concepts/lameenc-bytearray-gcs-upload: example of a deterministic error (bytearray TypeError) that caused this scenario
wiki/concepts/celery-queue-worker-specialization: queue naming and which workers consume which queue

Sources

daily/2026-04-30.md — Session 17:09, Celery retry loop after lameenc bytearray TypeError
