vault backup: 2026-05-08 17:09:20

Vadym Samoilenko 2026-05-08 17:09:20 +01:00
parent 8a3c404dee
commit b447a4d829
2 changed files with 54 additions and 0 deletions


@@ -273,3 +273,9 @@ tags: [daily]
- 17:05 | `video-accessibility`
- **Asked:** Accessibility video is not generating; check the server logs and fix it.
- **Done:** Added an `_extract_retry_after` method to the `GeminiTTSService` class and deployed.
- 17:07 | `video-accessibility`
- **Asked:** Fix accessibility video generation failing due to server errors.
- **Done:** Implemented dynamic retry delay parsing from API rate-limit responses to handle 429 errors properly.
- 17:08 | `video-accessibility`
- **Asked:** Fix accessibility video generation failures due to API rate limits.
- **Done:** Implemented dynamic retry delay parsing from Gemini API responses to handle 429 errors correctly instead of using fixed backoff.


@@ -0,0 +1,48 @@
---
tags: [tech-patterns, auto-generated]
source: video-accessibility
created: 2026-05-08
---
# Adaptive Rate Limit Backoff with API Retry-After Headers
## When to use
When integrating with third-party APIs that enforce rate limits (429 responses) and return dynamic retry delays, especially when multiple concurrent jobs can exhaust limits unexpectedly. Use this pattern to avoid retry storms and respect API-mandated wait times.
## Prerequisites
- Celery or similar async task queue for job retry management
- asyncio for concurrent request handling
- API that returns a `Retry-After` header or a `retryDelay` field in its 429 responses
- Rate limiting already in place but insufficient for concurrent workloads
## Steps
1. Parse the rate limit response (429 status code) to extract retry delay information:
- Check HTTP `Retry-After` header (standard)
- Parse API-specific JSON fields like `retryDelay: "37s"` or text patterns like `"retry in 37s"`
2. Extract numeric delay value and convert to seconds
3. Use extracted delay as countdown for Celery task retry:
```python
# Inside a bound task (bind=True); retry() raises celery.exceptions.Retry
raise task.retry(countdown=retry_delay_seconds)
```
4. For asyncio-based retries, sleep for the API-provided delay before retrying instead of using a fixed backoff:
```python
await asyncio.sleep(retry_delay_seconds)
await retry_operation()
```
5. Log the parsed delay so rate-limit events can be tracked and the wait actually applied can be checked against what the API requested
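
Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the project's `_extract_retry_after` implementation: the function name `parse_retry_delay`, the helper `_find_key`, and the exact response shapes are assumptions based on the formats listed above.

```python
import json
import re
from typing import Mapping, Optional

# Matches durations such as "37s" or "37.5s" (assumed format of retryDelay).
_DELAY_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*s\b", re.IGNORECASE)


def parse_retry_delay(headers: Mapping[str, str], body: str) -> Optional[float]:
    """Return the server-requested delay in seconds, or None if absent."""
    # 1. Standard header, numeric form only (e.g. "Retry-After: 37").
    for name, value in headers.items():
        if name.lower() == "retry-after" and value.strip().isdigit():
            return float(value.strip())

    # 2. API-specific JSON field, searched at any nesting depth.
    try:
        found = _find_key(json.loads(body), "retryDelay")
    except ValueError:  # body is not valid JSON
        found = None
    if isinstance(found, str):
        match = _DELAY_PATTERN.search(found)
        if match:
            return float(match.group(1))

    # 3. Free-text fallback such as "Please retry in 37s".
    match = re.search(r"retry in\s+(\d+(?:\.\d+)?)\s*s", body, re.IGNORECASE)
    return float(match.group(1)) if match else None


def _find_key(node, key):
    """Depth-first search for `key` in nested dicts and lists."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        result = _find_key(child, key)
        if result is not None:
            return result
    return None
```

Returning `None` when nothing can be parsed lets the caller decide on a conservative default delay rather than silently retrying too early.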
## Key Configuration
- **Static backoff** (old): 1-3 seconds — insufficient for APIs requiring 30+ second waits
- **Dynamic backoff** (new): Parse `retryDelay` from API response and use as source of truth
- **Concurrency control**: Monitor the parallel job count; if tests or increased traffic trigger 8+ concurrent TTS requests against a 10 RPM limit, the limit will be hit almost immediately
- **Retry limit**: Set reasonable max retries to prevent infinite loops (e.g., 3-5 attempts)
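
For the asyncio side, the concurrency cap and the retry limit might be combined as in the sketch below. `RateLimited` and `call_with_retry` are hypothetical names introduced for illustration; the sketch assumes the caller raises `RateLimited` with the delay already parsed from the 429 response.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error carrying the delay parsed from a 429 response."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


async def call_with_retry(
    operation: Callable[[], Awaitable[T]],
    semaphore: asyncio.Semaphore,
    max_attempts: int = 4,
) -> T:
    """Run `operation`, waiting the server-provided delay between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            # Hold a concurrency slot only while the request is in flight.
            async with semaphore:
                return await operation()
        except RateLimited as exc:
            if attempt == max_attempts:
                raise  # Give up instead of looping forever.
            # Dynamic backoff: the API-provided delay is the source of truth.
            await asyncio.sleep(exc.retry_after)
    raise RuntimeError("max_attempts must be at least 1")
```

Sharing one `asyncio.Semaphore` across all jobs bounds how many requests are in flight at once; it does not by itself enforce a per-minute quota, so the parsed delay is still needed when a 429 arrives.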
## Gotchas
- **Hardcoded backoff too short**: Fixed 1-3 second delays will immediately fail against 429 responses requiring 30+ seconds; always parse and respect API-provided delays
- **Ignoring Retry-After header**: Retrying before the required delay causes immediate re-throttling and cascading failures
- **Concurrency spike**: Tests or batch operations can suddenly exceed rate limits previously considered safe; monitor actual concurrent request rates vs. API limits
- **Pattern parsing**: Different APIs format retry delays differently; implement flexible parsing for both the header format (`Retry-After: 37`) and the JSON format (`"retryDelay": "37s"`)
- **Task queue vs asyncio mismatch**: Use Celery `countdown` for Celery tasks; use `asyncio.sleep()` for async operations — don't mix patterns
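
One more parsing case worth covering: the HTTP specification allows `Retry-After` to carry either a number of seconds or an HTTP-date. The sketch below handles both with the standard library; the function name `retry_after_seconds` and the `now` parameter (added to make the function testable) are assumptions, not part of the project.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After header value to a non-negative delay in seconds."""
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "37"
    try:
        target = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None  # Unrecognized format; the caller picks a fallback.
    if target.tzinfo is None:
        # Treat a date without an explicit offset as UTC.
        target = target.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no further waiting is required.
    return max(0.0, (target - now).total_seconds())
```

For example, with the current time at 16:09:23 UTC on 2026-05-08, the header value `Fri, 08 May 2026 16:10:00 GMT` yields a 37-second delay.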
## Source
Project: video-accessibility