# A1→A2 Empty Folder Handling

**Purpose:** Avoid spam emails and false-positive permanent failures for the common workflow where campaign managers create an A1 campaign before uploading the master assets.

**Initial implementation:** January 31, 2026
**Reworked:** April 28, 2026. Empty folders are now treated as expected client workflow rather than failures.

**Related files:**
- `scripts/a1_to_a2_box_uploader.py` (main script)
- `scripts/shared/database.py` (retry tracking methods)
- `database/migrations/003_add_a1_retry_tracking.sql` (schema)


---

## How It Works (current behavior)

### The empty-folder case (most common)
When a campaign is at A1 in DAM but the Master Assets folder is empty, the script treats this as a normal pre-asset state, not a failure.

**Flow:**
1. Every poll: `a1_retry_count` is incremented for visibility, the script logs `No master assets yet (poll N) - skipping until assets appear`, and exits silently.
2. At poll 20 (~1 hour at the 3-minute orchestrator cadence) the script sends a single `a1_to_a2_no_assets_warning` email so genuinely-stuck campaigns still surface.
3. After poll 20, the script keeps skipping silently. **`a1_permanently_failed` is never auto-set for empty folders.**
4. When assets eventually appear and A1→A2 succeeds, `db.reset_a1_retry()` clears the counter automatically.

The threshold lives in `scripts/a1_to_a2_box_uploader.py` as `EMPTY_FOLDER_WARNING_THRESHOLD = 20`.

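The empty-folder flow can be sketched as a small decision function. This is only an illustration under stated assumptions: `empty_folder_action` and its return values are hypothetical names, not taken from the script; only the threshold constant and the template name come from this document.

```python
EMPTY_FOLDER_WARNING_THRESHOLD = 20  # value documented for the uploader script


def empty_folder_action(poll_count: int,
                        threshold: int = EMPTY_FOLDER_WARNING_THRESHOLD) -> str:
    """Decide what to do on a poll that found an empty Master Assets folder.

    poll_count is assumed to be a1_retry_count *after* it was incremented
    for this poll. Hypothetical helper; the real script may differ.
    """
    if poll_count == threshold:
        return "warn"  # send the single a1_to_a2_no_assets_warning email
    return "skip"      # log and skip silently; never mark permanently failed


# Replay 25 polls of a campaign whose folder stays empty.
actions = [empty_folder_action(n) for n in range(1, 26)]
print(actions.count("warn"))  # 1
```

Using an equality check rather than `>=` is what keeps the warning to a single email per stuck period.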
### The genuine-error case
The 3-retries-then-permanently-fail behavior **still exists** for actual folder-level errors (e.g. `Assets folder not found (tried Master Assets)`), which are caught by the script's exception handler. These errors do mark `a1_permanently_failed=TRUE` after 3 failures and do send the retry and permanently-failed emails.

`db.increment_a1_retry()` accepts `mark_failed_at_max=True|False` to switch between the two behaviors. The empty-folder branch passes `False`; the exception handler passes `True` (the default).

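To make the effect of the flag concrete, here is a minimal in-memory sketch. It is not the code in `database.py`: the `RetryState` class is a hypothetical stand-in for the `a1_*` columns, and whether the real method stores a failure reason on the empty-folder path is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RETRIES = 3  # documented value in database.py


@dataclass
class RetryState:
    """Hypothetical in-memory stand-in for the a1_* columns."""
    a1_retry_count: int = 0
    a1_permanently_failed: bool = False
    a1_failure_reason: Optional[str] = None


def increment_a1_retry(state: RetryState, reason: str,
                       mark_failed_at_max: bool = True) -> RetryState:
    """Bump the counter; flag permanent failure only when asked to."""
    state.a1_retry_count += 1
    state.a1_failure_reason = reason  # assumption: reason is always recorded
    if mark_failed_at_max and state.a1_retry_count >= MAX_RETRIES:
        state.a1_permanently_failed = True
    return state


# Empty-folder branch: counter grows, flag stays False.
empty = RetryState()
for _ in range(5):
    increment_a1_retry(empty, "No master assets yet", mark_failed_at_max=False)

# Exception-handler branch: the third failure sets the flag.
broken = RetryState()
for _ in range(3):
    increment_a1_retry(broken, "Assets folder not found (tried Master Assets)")

print(empty.a1_retry_count, empty.a1_permanently_failed)    # 5 False
print(broken.a1_retry_count, broken.a1_permanently_failed)  # 3 True
```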
### Queue-slot filter
The A1→A2 script processes up to 2 campaigns per run (`campaigns[:2]`). Permanently-failed campaigns are filtered out **before** the slot cap so they no longer block the queue (`scripts/a1_to_a2_box_uploader.py:652`).

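The ordering matters: if the cap were applied first, two permanently-failed campaigns at the head of the list would consume both slots on every run. A rough sketch of filter-then-cap follows; `select_campaigns` and the dictionary shape are hypothetical and chosen only for illustration.

```python
MAX_CAMPAIGNS_PER_RUN = 2  # mirrors the documented campaigns[:2] cap


def select_campaigns(campaigns: list[dict]) -> list[dict]:
    """Drop permanently-failed campaigns first, then apply the slot cap."""
    eligible = [c for c in campaigns if not c.get("a1_permanently_failed")]
    return eligible[:MAX_CAMPAIGNS_PER_RUN]


queue = [
    {"campaign_number": "C-001", "a1_permanently_failed": True},
    {"campaign_number": "C-002", "a1_permanently_failed": True},
    {"campaign_number": "C-003", "a1_permanently_failed": False},
    {"campaign_number": "C-004", "a1_permanently_failed": False},
]
print([c["campaign_number"] for c in select_campaigns(queue)])
# ['C-003', 'C-004']
```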
### Database tracking

Four fields on the `campaign_status` table:
- `a1_retry_count` (INTEGER): Number of polls where the folder was empty or errored. For empty-folder cases this can grow unbounded; it is reset on success.
- `a1_last_retry_at` (TIMESTAMP): When the last attempt occurred
- `a1_permanently_failed` (BOOLEAN): TRUE only via the genuine-error path (after 3 failures), never via the empty-folder path
- `a1_failure_reason` (TEXT): Why it failed (e.g., "Assets folder not found (tried Master Assets)")


---

## Configuration

### Empty-folder warning threshold
`scripts/a1_to_a2_box_uploader.py`:
```python
EMPTY_FOLDER_WARNING_THRESHOLD = 20  # ~1 hour at 3-min poll cadence
```
Adjust this constant to send the one-time warning sooner or later.

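Because the threshold counts polls rather than minutes, a new value can be derived from the delay you want. The helper below is a hypothetical convenience, not part of the script, and assumes the 3-minute orchestrator cadence stated in this document.

```python
import math

POLL_INTERVAL_MINUTES = 3  # orchestrator cadence stated in this document


def threshold_for_delay(target_minutes: float,
                        poll_interval: float = POLL_INTERVAL_MINUTES) -> int:
    """Smallest poll count whose approximate elapsed time reaches target_minutes."""
    return max(1, math.ceil(target_minutes / poll_interval))


print(threshold_for_delay(60))   # 20
print(threshold_for_delay(120))  # 40
```

If the orchestrator cadence changes, the same threshold corresponds to a different wall-clock delay, so both values should be reviewed together.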
### Genuine-error retry attempts
`scripts/shared/database.py` → `increment_a1_retry()`:
```python
MAX_RETRIES = 3
```
Applies only when the caller passes `mark_failed_at_max=True` (the default), i.e. the exception handler in `process_campaign()`. The empty-folder branch passes `False` and is unaffected.


---

## Email Notifications

### Empty-folder warning (one-time, at poll 20)
**Template:** `a1_to_a2_no_assets_warning`
**Subject:** ⚠️ Campaign in A1 with no assets yet - {campaign_name}
**Recipients:** Error notification list
**Sent:** exactly once per stuck campaign, when `a1_retry_count == 20`. The counter resets on success, so a campaign that gets stuck again later would trigger a new warning.

### Genuine-error retry email (attempts 1–2)
**Template:** `a1_to_a2_no_assets_retry`
**Subject:** ⚠️ No Assets Found (Attempt X/3) - Campaign {name}
**Recipients:** Error notification list
**Trigger:** non-empty-folder errors caught by `process_campaign()`'s exception handler.

### Genuine-error final failure (attempt 3)
**Template:** `a1_to_a2_permanently_failed`
**Subject:** ❌ PERMANENTLY FAILED - Campaign {name} (No Assets After 3 Attempts)
**Recipients:** Error notification list
**Content:**
- Campaign marked as permanently failed (campaign filtered from future queue runs)
- Required actions to fix
- SQL command to manually reset

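For the genuine-error path, the choice between the two templates depends only on the attempt number. The sketch below illustrates that mapping; `pick_error_email` is a hypothetical helper and the real notifier may build subjects differently. The template names and subject wording are copied from the entries above.

```python
MAX_RETRIES = 3  # documented value in database.py


def pick_error_email(attempt: int, campaign_name: str) -> tuple[str, str]:
    """Return (template, subject) for a genuine-error attempt number."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if attempt < MAX_RETRIES:
        # Attempts 1..MAX_RETRIES-1 use the retry template.
        subject = (f"⚠️ No Assets Found (Attempt {attempt}/{MAX_RETRIES}) "
                   f"- Campaign {campaign_name}")
        return "a1_to_a2_no_assets_retry", subject
    # The final attempt uses the permanently-failed template.
    subject = (f"❌ PERMANENTLY FAILED - Campaign {campaign_name} "
               f"(No Assets After {MAX_RETRIES} Attempts)")
    return "a1_to_a2_permanently_failed", subject


template, subject = pick_error_email(2, "Spring Launch")
print(template)  # a1_to_a2_no_assets_retry
```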

---

## Manual Operations

### Check Campaign Retry Status

```sql
SELECT campaign_number, campaign_name, status,
       a1_retry_count, a1_last_retry_at,
       a1_permanently_failed, a1_failure_reason
FROM campaign_status
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
```

### Reset Single Campaign

```sql
UPDATE campaign_status
SET a1_retry_count = 0,
    a1_last_retry_at = NULL,
    a1_permanently_failed = FALSE,
    a1_failure_reason = NULL
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
```

**Or using psql command:**
```bash
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking <<EOF
UPDATE campaign_status
SET a1_retry_count = 0,
    a1_last_retry_at = NULL,
    a1_permanently_failed = FALSE,
    a1_failure_reason = NULL
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
EOF
```

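If the reset is ever scripted rather than typed by hand, a parameterized query avoids pasting the campaign ID into the SQL string. The sketch below uses an in-memory SQLite table purely so it runs anywhere; the production database is PostgreSQL, and `reset_campaign_retry` and the simplified table definition are hypothetical.

```python
import sqlite3

# SQLite stand-in for the documented UPDATE; booleans are stored as 0/1 here.
RESET_SQL = """
UPDATE campaign_status
SET a1_retry_count = 0,
    a1_last_retry_at = NULL,
    a1_permanently_failed = 0,
    a1_failure_reason = NULL
WHERE campaign_id = ?
"""


def reset_campaign_retry(conn: sqlite3.Connection, campaign_id: str) -> int:
    """Reset retry tracking for one campaign; returns the number of rows updated."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(RESET_SQL, (campaign_id,))
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE campaign_status (
    campaign_id TEXT PRIMARY KEY, a1_retry_count INTEGER,
    a1_last_retry_at TEXT, a1_permanently_failed INTEGER,
    a1_failure_reason TEXT)""")
conn.execute("INSERT INTO campaign_status VALUES "
             "('demo-1', 3, '2026-04-28 10:00', 1, 'Assets folder not found')")

print(reset_campaign_retry(conn, "demo-1"))  # 1
```

A return value of 0 means no row matched the given ID, which is worth checking before assuming the reset took effect.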
### Reset All Failed Campaigns

```sql
UPDATE campaign_status
SET a1_retry_count = 0,
    a1_last_retry_at = NULL,
    a1_permanently_failed = FALSE,
    a1_failure_reason = NULL
WHERE a1_permanently_failed = TRUE;
```

### View All Failed Campaigns

```sql
SELECT campaign_number, campaign_name,
       a1_retry_count, a1_last_retry_at, a1_failure_reason
FROM campaign_status
WHERE a1_permanently_failed = TRUE
ORDER BY a1_last_retry_at DESC;
```


---

## Failure Scenarios

### Scenario 1: Temporary Empty Folder
**What Happens:**
- Early polls: folder is empty, the script skips silently and increments `a1_retry_count`; no email is sent
- Assets added to the folder before poll 20
- Next run finds assets, processes successfully
- Retry counter automatically reset to 0

**Result:** Problem self-resolves with no notifications

### Scenario 2: Persistent Empty Folder
**What Happens:**
- Polls 1–19: silent skip, retry counter increments each poll
- Poll 20 (~1 hour): single `a1_to_a2_no_assets_warning` email sent
- Poll 21 onward: silent skip continues, counter keeps growing
- Campaign is never marked permanently failed
- When assets arrive, the next run processes them and resets the counter

**Result:** Support team alerted once, repeated emails prevented

### Scenario 3: Genuine Folder Error
**What Happens:**
- Attempt 1 (0 min): Retry email sent, retry counter = 1
- Attempt 2 (3 min): Retry email sent, retry counter = 2
- Attempt 3 (6 min): Permanently-failed email sent, retry counter = 3
- Campaign marked permanently failed
- Processing stops, no more emails

**Result:** Support team alerted, infinite emails prevented; manual reset required after the folder is fixed

### Scenario 4: Wrong Status Assignment
**What Happens:**
- Campaign set to A1 by mistake (no assets intended)
- Script skips it silently each poll; a single warning is sent if it reaches poll 20
- Admin realizes mistake, changes status to different value
- Campaign no longer appears in A1 search results

**Result:** No reset needed, campaign excluded from processing

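The empty-folder rules described under "How It Works" can be replayed in a few lines to check how many emails a campaign would generate. This is a simulation sketch only: `simulate_empty_folder` is a hypothetical function that models the documented rules and does not call the real script.

```python
from typing import Optional, Tuple

EMPTY_FOLDER_WARNING_THRESHOLD = 20  # documented threshold


def simulate_empty_folder(total_polls: int,
                          assets_arrive_at: Optional[int] = None
                          ) -> Tuple[int, int, bool]:
    """Replay polls; return (warning_emails, final_retry_count, permanently_failed)."""
    retry_count = 0
    emails = 0
    for poll in range(1, total_polls + 1):
        if assets_arrive_at is not None and poll >= assets_arrive_at:
            retry_count = 0  # success path: the counter is reset
            break
        retry_count += 1
        if retry_count == EMPTY_FOLDER_WARNING_THRESHOLD:
            emails += 1  # the one-time warning
    # The empty-folder path never sets the permanently-failed flag.
    return emails, retry_count, False


print(simulate_empty_folder(30, assets_arrive_at=5))  # (0, 0, False)
print(simulate_empty_folder(30))                      # (1, 30, False)
```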

---

## Testing

### Test Retry Logic

1. Create a test campaign in DAM with A1 status
2. Ensure the Master Assets folder is empty
3. Run the A1→A2 script manually 3 times
4. Verify database state and emails: with an empty folder, `a1_retry_count` should increase by one per run, `a1_permanently_failed` should stay FALSE, and no email should be sent before poll 20

To exercise the 3-strikes path instead, repeat the runs against a campaign that triggers a genuine folder error (e.g. `Assets folder not found (tried Master Assets)`); the third run should set `a1_permanently_failed = TRUE` and send the permanently-failed email.

```bash
# Run 1
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2

# Check database
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed FROM campaign_status WHERE status = 'A1';"

# Run 2 (wait 3 minutes or run immediately for testing)
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2

# Check again
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed FROM campaign_status WHERE status = 'A1';"

# Run 3
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2

# Verify permanent-failure state (rows expected only for the genuine-error path)
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed, a1_failure_reason FROM campaign_status WHERE a1_permanently_failed = TRUE;"
```

### Test Reset Logic

```bash
# Reset the test campaign
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "UPDATE campaign_status SET a1_retry_count = 0, a1_permanently_failed = FALSE WHERE campaign_number = 'TEST_CAMPAIGN';"

# Run again
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2

# Verify it retries
```


---

## Monitoring

### Dashboard Query: Current Retry Status

Because `a1_retry_count` can grow past 3 for empty folders without the campaign failing, the permanently-failed bucket is based on the `a1_permanently_failed` flag rather than the counter.

```sql
SELECT
    COUNT(*) FILTER (WHERE a1_retry_count = 0) AS "No Issues",
    COUNT(*) FILTER (WHERE a1_retry_count = 1) AS "Attempt 1",
    COUNT(*) FILTER (WHERE a1_retry_count = 2) AS "Attempt 2",
    COUNT(*) FILTER (WHERE a1_retry_count >= 3 AND a1_permanently_failed = FALSE) AS "3+ Polls (Not Failed)",
    COUNT(*) FILTER (WHERE a1_permanently_failed = TRUE) AS "Permanently Failed"
FROM campaign_status
WHERE status = 'A1';
```

### Alert Query: Campaigns Near Failure

This query also matches campaigns that are simply waiting on an empty folder; those will not be marked permanently failed.

```sql
SELECT campaign_number, campaign_name, a1_retry_count, a1_last_retry_at
FROM campaign_status
WHERE status = 'A1'
  AND a1_retry_count >= 2
  AND a1_permanently_failed = FALSE
ORDER BY a1_retry_count DESC, a1_last_retry_at DESC;
```


---

## Troubleshooting

### Q: Campaign keeps failing even after adding assets
**A:** Check whether the campaign was marked permanently failed (this happens only on the genuine-error path). If so, reset it using the SQL command under Manual Operations.

### Q: How do I change from 3 to 5 retry attempts?
**A:** Edit `MAX_RETRIES = 3` in `database.py` (line ~567). Also update the email templates to reflect the new maximum. This affects only the genuine-error path.

### Q: How to disable retry logic completely?
**A:** Not recommended, but you can:
1. Set `MAX_RETRIES = 999` (effectively infinite)
2. Or revert to the old `a1_to_a2_no_assets` template without retry tracking

### Q: Can I set different retry counts for different campaigns?
**A:** No, it's a global setting. All campaigns use the same `MAX_RETRIES` value.

### Q: What if I want to delete permanently failed campaigns from the database?
**A:** Don't delete them. Instead, change their status to something other than A1. They'll be excluded from processing automatically.


---

## Future Enhancements

Potential improvements for future versions:

1. **Configurable retry timing:**
   - Instead of relying on cron frequency (3 min)
   - Check `a1_last_retry_at` and skip if too recent
   - Allow exponential backoff (3 min, 10 min, 30 min)

2. **Campaign-specific retry limits:**
   - Add optional `a1_max_retries` column
   - Allow different campaigns to have different thresholds
   - Default to global `MAX_RETRIES` if not set

3. **Automatic cleanup:**
   - After 30 days, auto-reset permanently failed campaigns
   - Or send weekly digest of stuck campaigns

4. **Webhook notifications:**
   - Send to external system when campaign permanently fails
   - Integrate with ticketing system

5. **Admin UI:**
   - Web interface to view/reset retry status
   - Bulk reset operations

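As an illustration of the first idea (configurable retry timing), a backoff check based on `a1_last_retry_at` might look like the sketch below. Nothing here exists in the codebase: `should_retry_now` and `BACKOFF_MINUTES` are hypothetical, and the schedule simply reuses the 3/10/30 minute example from the list.

```python
from datetime import datetime, timedelta
from typing import Optional

BACKOFF_MINUTES = [3, 10, 30]  # example schedule from the list above


def should_retry_now(retry_count: int, last_retry_at: Optional[datetime],
                     now: datetime) -> bool:
    """Return True if enough time has passed since the last attempt."""
    if last_retry_at is None or retry_count == 0:
        return True  # never attempted: go ahead
    # Clamp so attempts beyond the schedule reuse the longest wait.
    wait = BACKOFF_MINUTES[min(retry_count, len(BACKOFF_MINUTES)) - 1]
    return now - last_retry_at >= timedelta(minutes=wait)


now = datetime(2026, 5, 1, 12, 0)
print(should_retry_now(1, now - timedelta(minutes=2), now))   # False
print(should_retry_now(2, now - timedelta(minutes=10), now))  # True
```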

---

## Code Locations

**Quick reference for developers:**

| Component | File | Line Range |
|-----------|------|------------|
| Retry check logic | `a1_to_a2_box_uploader.py` | ~176-186 |
| Empty folder detection | `a1_to_a2_box_uploader.py` | ~193-231 |
| Success reset | `a1_to_a2_box_uploader.py` | ~354-356 |
| `get_a1_retry_status()` | `database.py` | ~522-558 |
| `increment_a1_retry()` | `database.py` | ~560-620 |
| `reset_a1_retry()` | `database.py` | ~622-655 |
| Email templates | `notifier.py` | ~593-687 |
| Database migration | `migrations/003_add_a1_retry_tracking.sql` | All |


---

## Change Log

**January 31, 2026:**
- Initial implementation
- 3-attempt retry mechanism
- Permanent failure tracking
- Two new email templates
- This documentation created

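**April 28, 2026:**
- Empty folders treated as expected workflow: silent skip, one-time warning at poll 20, no automatic permanent failure
- `mark_failed_at_max` flag on `increment_a1_retry()` preserves the 3-attempt behavior for genuine folder errors

**Future updates will be logged here.**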