| title | aliases | tags | sources | created | updated |
| --- | --- | --- | --- | --- | --- |
| Celery + Redis: Must Flush Redis on Deterministic Errors | celery-redis-flush, celery-stuck-queue, redis-task-retry-loop | celery, redis, python, gotcha, worker, debugging | | 2026-04-30 | 2026-04-30 |
Celery + Redis: Must Flush Redis on Deterministic Errors
When a Celery task that retries on failure hits a deterministic error (type error, missing config, wrong data shape), a message with the same task ID goes back onto the Redis queue and workers retry it in a loop. Resetting the job status in MongoDB alone is NOT enough: the Redis queue entry must also be cleared.
Key Points
- Celery task failure leaves the task ID in the Redis broker queue
- Workers retry the task on the next cycle, hit the same deterministic error, and fail again; the loop lasts until max_retries is exhausted (forever if max_retries is None)
- Resetting MongoDB job status (e.g., status = "pending") does NOT remove the task from Redis
- Fix: flush Redis + re-enqueue the job from scratch
- Deterministic errors (type errors, config errors, wrong data shape) will never succeed on retry — retrying them wastes worker cycles and blocks the queue
- When all retries fail identically, the error is NOT transient — diagnose the root cause before re-running
Details
The Stuck Queue Scenario
1. Task enqueued → Redis queue: [task_id_abc123]
2. Worker picks up the task → it fails (TypeError: bytearray vs bytes)
3. Celery records the failure and increments the retry count
4. Task is re-queued for the next retry → Redis queue: [task_id_abc123]
5. Steps 2-4 repeat while retries remain (max_retries)
6. Job status in MongoDB: still "processing" (or "failed")
7. Developer resets the MongoDB status to "pending" while the old task is still queued
8. A NEW task is enqueued → Redis queue: [task_id_abc123, task_id_xyz789]
9. The OLD task_id_abc123 STILL runs and fails again
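Steps 7 and 8 can be reproduced with a toy in-memory model. This is only an illustrative sketch: the `deque` stands in for the Redis list and the dict for the MongoDB job document, and the names `broker_queue`, `job_status`, and `job-1` are hypothetical, not taken from the application.

```python
from collections import deque

# Stand-ins for the two independent stores (hypothetical names).
broker_queue = deque()                  # plays the role of the Redis list
job_status = {"job-1": "processing"}    # plays the role of the MongoDB document


def enqueue(task_id):
    """Append a task ID to the broker queue."""
    broker_queue.append(task_id)


def reset_status(job_id):
    """Reset the job status. Note that the broker queue is never touched."""
    job_status[job_id] = "pending"


enqueue("task_id_abc123")   # old task, still waiting for its next retry
reset_status("job-1")       # step 7: only the status store changes
enqueue("task_id_xyz789")   # step 8: a second task joins the first

print(list(broker_queue))   # ['task_id_abc123', 'task_id_xyz789']
```

The status reset and the queue are separate pieces of state, which is why the old task ID survives the reset.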
The Fix
# Option 1: Full Redis flush (nuclear: deletes every key in every Redis database, not only the Celery queues)
docker compose exec redis redis-cli FLUSHALL
# Option 2: Clear only the default queue (named "celery" unless configured otherwise)
docker compose exec redis redis-cli DEL celery
# Option 3: Clear a named queue (e.g., tts queue)
docker compose exec redis redis-cli DEL tts
After flushing:
- Fix the underlying code error
- Rebuild/restart the affected worker container
- Re-enqueue the job via the application (not by resetting MongoDB status alone)
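Deleting a whole queue also discards every healthy job waiting in it. A narrower alternative, which this note does not itself describe, is to remove only the messages for the stuck task ID. The sketch below makes several assumptions: `client` is a redis-py style object exposing `lrange` and `lrem`, each queue entry is a JSON message with the task ID under `headers.id`, and the helper names `task_id_of` and `remove_task` are hypothetical. Check the actual message layout with `LRANGE` before relying on it.

```python
import json


def task_id_of(raw_message):
    """Return the task ID carried by one raw broker message, or None."""
    try:
        payload = json.loads(raw_message)
    except (TypeError, ValueError):
        return None  # not JSON: leave the entry alone
    if not isinstance(payload, dict):
        return None
    headers = payload.get("headers")
    # Assumption: the task ID is stored at headers["id"].
    return headers.get("id") if isinstance(headers, dict) else None


def remove_task(client, queue, task_id):
    """Remove all messages for task_id from one queue list; return how many."""
    removed = 0
    for raw in client.lrange(queue, 0, -1):
        if task_id_of(raw) == task_id:
            # LREM with count=0 removes every element equal to raw.
            removed += client.lrem(queue, 0, raw)
    return removed
```

This is not atomic, and it only sees messages still waiting in the list. A message that a worker has already reserved is not affected, so stopping the workers first is the safer order.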
When to Use This Procedure
| Error type | Retry useful? | Redis flush needed? |
| --- | --- | --- |
| Network timeout to external API | Yes | No |
| Rate limit (429) | Yes (with backoff) | No |
| TypeError, AttributeError | No | Yes |
| Missing env var / config | No | Yes |
| File not found (runtime dep) | No | Yes |
| DB connection error (transient) | Yes | No |
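The table can be approximated in code as an exception classifier. This is a minimal sketch using only built-in exception types: the mapping of table rows to classes is an assumption (for instance, `KeyError` standing in for a missing environment variable), and `should_retry` is a hypothetical helper. Rate limits (429) have no built-in exception and would need the HTTP client's own error type.

```python
# Errors that may succeed on a later attempt (timeouts, dropped connections).
TRANSIENT_ERRORS = (TimeoutError, ConnectionError)

# Errors that fail the same way every time until code or config changes.
DETERMINISTIC_ERRORS = (TypeError, AttributeError, KeyError, FileNotFoundError)


def should_retry(exc):
    """Return True only for errors classified as transient."""
    if isinstance(exc, DETERMINISTIC_ERRORS):
        return False
    # Unknown error types are not retried by default.
    return isinstance(exc, TRANSIENT_ERRORS)


print(should_retry(TimeoutError()))          # True
print(should_retry(TypeError("bytearray")))  # False
```

With Celery, a tuple like `TRANSIENT_ERRORS` could be passed as a task's `autoretry_for` option so that only those exceptions trigger a retry and deterministic errors fail immediately.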
Checking Queue Depth
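# List all Redis keys (each queue is one key; KEYS scans the whole keyspace, so avoid it on large production instances)
docker compose exec redis redis-cli KEYS '*'
# Check how many messages are waiting in each queue
docker compose exec redis redis-cli LLEN celery
docker compose exec redis redis-cli LLEN tts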
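The same depth check can be scripted. The sketch below assumes `client` is a redis-py style object with an `llen` method, and `queue_depths` is a hypothetical helper. The queue names `celery` and `tts` come from the commands above.

```python
def queue_depths(client, queue_names=("celery", "tts")):
    """Return {queue name: number of waiting messages} using LLEN."""
    return {name: client.llen(name) for name in queue_names}


# Possible usage, assuming the redis package is installed and Redis is
# reachable on localhost:6379 (both are assumptions):
#
#   import redis
#   client = redis.Redis(host="localhost", port=6379, db=0)
#   print(queue_depths(client))
```

LLEN counts only messages still waiting in the list. A task that a worker has already reserved is not included, so a depth of 0 does not by itself prove that nothing is left to run.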