ferrero-opentext/Python-Version/MARKDOWN_DOCS/A1_RETRY_LOGIC.md
nickviljoen 28586308d7 Docs: Refresh A1 empty-folder doc and LTD asset type notes
A1_RETRY_LOGIC.md updated to reflect the 2026-04-28 rework: empty
folders are now treated as expected workflow (silent skip + one-time
warning at poll 20, no auto permanent-fail), while the original
3-strikes-then-permanently-fail behavior is preserved for genuine
folder errors via the mark_failed_at_max flag.

README.md adds LTD (Licensing Translation Document) to the asset type
override section alongside EOL, and notes that empty overrides remove
fields while non-empty overrides on non-MVP fields are appended.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 18:19:06 +02:00


A1→A2 Empty Folder Handling

Purpose: Avoid spam emails and false-positive permanent failures for the common workflow where campaign managers create an A1 campaign before uploading the master assets.

Initial implementation: January 31, 2026
Reworked: April 28, 2026. Empty folders are now treated as expected client workflow rather than failures.

Related files:

  • scripts/a1_to_a2_box_uploader.py (main script)
  • scripts/shared/database.py (retry tracking methods)
  • database/migrations/003_add_a1_retry_tracking.sql (schema)

How It Works (current behavior)

The empty-folder case (most common)

When a campaign is at A1 in DAM but the Master Assets folder is empty, the script treats this as a normal pre-asset state, not a failure.

Flow:

  1. Every poll: a1_retry_count is incremented for visibility, the script logs No master assets yet (poll N) - skipping until assets appear, and exits silently.
  2. At poll 20 (~1 hour at the 3-minute orchestrator cadence) the script sends a single a1_to_a2_no_assets_warning email so genuinely-stuck campaigns still surface.
  3. After poll 20, the script keeps skipping silently. a1_permanently_failed is never auto-set for empty folders.
  4. When assets eventually appear and A1→A2 succeeds, db.reset_a1_retry() clears the counter automatically.

The threshold lives in scripts/a1_to_a2_box_uploader.py as EMPTY_FOLDER_WARNING_THRESHOLD = 20.
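The flow above can be sketched as a single per-poll decision. This is a minimal illustration, not the script's actual code: `decide_empty_folder` and `EmptyFolderDecision` are hypothetical names, and only the threshold value and the log message come from this document.

```python
from dataclasses import dataclass

EMPTY_FOLDER_WARNING_THRESHOLD = 20  # value documented above


@dataclass
class EmptyFolderDecision:
    retry_count: int          # counter value after this poll
    send_warning: bool        # True only on the poll that reaches the threshold
    permanently_failed: bool  # always False on the empty-folder path
    log_message: str


def decide_empty_folder(previous_count: int) -> EmptyFolderDecision:
    """Hypothetical helper: decide what one empty-folder poll should do."""
    count = previous_count + 1
    return EmptyFolderDecision(
        retry_count=count,
        # Equality (not >=) is what keeps the warning to a single email.
        send_warning=(count == EMPTY_FOLDER_WARNING_THRESHOLD),
        permanently_failed=False,
        log_message=f"No master assets yet (poll {count}) - skipping until assets appear",
    )
```

Because the check is an equality, polls 21 and later keep incrementing the counter without sending another email, and a reset to 0 on success re-arms the warning for any later stuck period.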

The genuine-error case

The 3-retries-then-permanently-fail behavior still exists for actual folder-level errors (e.g. Assets folder not found (tried Master Assets)), which are caught by the script's exception handler. These DO mark a1_permanently_failed=TRUE after 3 failures and DO send the retry / permanently-failed emails.

db.increment_a1_retry() accepts mark_failed_at_max=True|False to switch between the two behaviors. The empty-folder branch passes False; the exception handler passes True (default).
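As a rough sketch of how the flag separates the two paths, the function below mimics the behavior on an in-memory dict. It is an assumption-laden stand-in: the real `db.increment_a1_retry()` signature and SQL are not shown in this document, and whether the empty-folder path stores a failure reason is a guess.

```python
MAX_RETRIES = 3  # genuine-error limit documented in the Configuration section


def increment_a1_retry(row: dict, reason: str, mark_failed_at_max: bool = True) -> dict:
    """Hypothetical in-memory stand-in for db.increment_a1_retry().

    `row` mimics one campaign_status record.
    """
    row["a1_retry_count"] = row.get("a1_retry_count", 0) + 1
    row["a1_failure_reason"] = reason  # assumption: reason is stored on every call
    # Only the genuine-error path (flag left at True) may flip the permanent flag.
    if mark_failed_at_max and row["a1_retry_count"] >= MAX_RETRIES:
        row["a1_permanently_failed"] = True
    return row


# Empty-folder branch: the counter grows, the campaign never fails permanently.
empty_row = {}
for _ in range(25):
    increment_a1_retry(empty_row, "No master assets yet", mark_failed_at_max=False)

# Exception-handler branch: the third failure marks the campaign.
error_row = {}
for _ in range(3):
    increment_a1_retry(error_row, "Assets folder not found (tried Master Assets)")
```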

Queue-slot filter

The A1→A2 script processes up to 2 campaigns per run (campaigns[:2]). Permanently-failed campaigns are filtered out before the slot cap so they no longer block the queue (scripts/a1_to_a2_box_uploader.py:652).
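The order of the two steps is what matters here. The sketch below is illustrative only: `select_campaigns`, the `campaign_id` key, and the `failed_ids` set are hypothetical, and the actual filter at the referenced line may look different.

```python
MAX_CAMPAIGNS_PER_RUN = 2  # matches the campaigns[:2] cap described above


def select_campaigns(campaigns: list[dict], failed_ids: set[str]) -> list[dict]:
    """Drop permanently-failed campaigns first, then apply the slot cap."""
    eligible = [c for c in campaigns if c["campaign_id"] not in failed_ids]
    return eligible[:MAX_CAMPAIGNS_PER_RUN]


campaigns = [{"campaign_id": cid} for cid in ("A", "B", "C", "D")]
failed = {"A", "B"}

# Filter-then-slice: C and D get the two slots.
selected = select_campaigns(campaigns, failed)

# Slice-then-filter (the old ordering): both slots are wasted on A and B.
blocked = [c for c in campaigns[:2] if c["campaign_id"] not in failed]
```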

Database tracking

Four fields on the campaign_status table:

  • a1_retry_count (INTEGER): Number of polls where the folder was empty / errored. For empty-folder cases this can grow unbounded; reset on success.
  • a1_last_retry_at (TIMESTAMP): When the last attempt occurred
  • a1_permanently_failed (BOOLEAN): TRUE only via the genuine-error path (after 3 failures), never via the empty-folder path
  • a1_failure_reason (TEXT): Why it failed (e.g., "Assets folder not found (tried Master Assets)")

Configuration

Empty-folder warning threshold

scripts/a1_to_a2_box_uploader.py:

EMPTY_FOLDER_WARNING_THRESHOLD = 20  # ~1 hour at 3-min poll cadence

Send the one-time warning sooner/later by adjusting this constant.
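To pick a new value, the relationship between the threshold and wall-clock time is just polls times cadence. A small helper, assuming the 3-minute cadence stated above (`polls_for_delay` is a hypothetical name, not part of the script):

```python
import math

POLL_INTERVAL_MINUTES = 3  # orchestrator cadence stated in this document


def polls_for_delay(delay_minutes: float, poll_interval_minutes: float = POLL_INTERVAL_MINUTES) -> int:
    """Return the threshold (in polls) that approximates the desired delay."""
    return math.ceil(delay_minutes / poll_interval_minutes)
```

For example, `polls_for_delay(60)` gives the current value of 20, and `polls_for_delay(120)` gives 40 for a warning after roughly two hours.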

Genuine-error retry attempts

scripts/shared/database.py, in increment_a1_retry():

MAX_RETRIES = 3

Applies only when the caller passes mark_failed_at_max=True (default), i.e. the exception handler in process_campaign(). The empty-folder branch passes False and is unaffected.


Email Notifications

Empty-folder warning (one-time, at poll 20)

Template: a1_to_a2_no_assets_warning
Subject: ⚠️ Campaign in A1 with no assets yet - {campaign_name}
Recipients: Error notification list
Sent: exactly once per stuck campaign, when a1_retry_count == 20. The counter resets on success, so a campaign that gets stuck again later would trigger a new warning.

Genuine-error retry email (attempts 1-2)

Template: a1_to_a2_no_assets_retry
Subject: ⚠️ No Assets Found (Attempt X/3) - Campaign {name}
Recipients: Error notification list
Trigger: non-empty-folder errors caught by process_campaign()'s exception handler.

Genuine-error final failure (attempt 3)

Template: a1_to_a2_permanently_failed
Subject: PERMANENTLY FAILED - Campaign {name} (No Assets After 3 Attempts)
Recipients: Error notification list
Content:

  • Campaign marked as permanently failed (campaign filtered from future queue runs)
  • Required actions to fix
  • SQL command to manually reset

Manual Operations

Check Campaign Retry Status

SELECT campaign_number, campaign_name, status,
       a1_retry_count, a1_last_retry_at,
       a1_permanently_failed, a1_failure_reason
FROM campaign_status
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';

Reset Single Campaign

UPDATE campaign_status
SET a1_retry_count = 0,
    a1_last_retry_at = NULL,
    a1_permanently_failed = FALSE,
    a1_failure_reason = NULL
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';

Or using psql command:

PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking <<EOF
UPDATE campaign_status
SET a1_retry_count = 0,
    a1_last_retry_at = NULL,
    a1_permanently_failed = FALSE,
    a1_failure_reason = NULL
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
EOF
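If the reset is issued from Python instead, binding the campaign ID as a parameter avoids pasting it into the SQL string. The sketch below uses the standard-library `sqlite3` module purely as a stand-in so it runs without a server; the real database is PostgreSQL, the table here is cut down to the retry columns, and `reset_campaign`, `demo`, and `DEMO_ID` are hypothetical names.

```python
import sqlite3

RESET_SQL = """
UPDATE campaign_status
SET a1_retry_count = 0,
    a1_last_retry_at = NULL,
    a1_permanently_failed = ?,
    a1_failure_reason = NULL
WHERE campaign_id = ?
"""


def reset_campaign(conn: sqlite3.Connection, campaign_id: str) -> int:
    """Reset one campaign's retry state; returns the number of rows changed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(RESET_SQL, (False, campaign_id))
    return cur.rowcount


def demo() -> tuple:
    """Build a throwaway table, fail one campaign, reset it, return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE campaign_status ("
        "campaign_id TEXT PRIMARY KEY, a1_retry_count INTEGER, "
        "a1_last_retry_at TEXT, a1_permanently_failed INTEGER, "
        "a1_failure_reason TEXT)"
    )
    conn.execute(
        "INSERT INTO campaign_status VALUES (?, ?, ?, ?, ?)",
        ("DEMO_ID", 3, "2026-04-28 10:00:00", True,
         "Assets folder not found (tried Master Assets)"),
    )
    changed = reset_campaign(conn, "DEMO_ID")
    row = conn.execute(
        "SELECT a1_retry_count, a1_last_retry_at, a1_permanently_failed, "
        "a1_failure_reason FROM campaign_status WHERE campaign_id = ?",
        ("DEMO_ID",),
    ).fetchone()
    conn.close()
    return changed, row
```

A returned row count of 0 is a quick signal that the campaign ID did not match any record. With a PostgreSQL driver the placeholder style differs (psycopg uses `%s` rather than `?`).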

Reset All Failed Campaigns

UPDATE campaign_status
SET a1_retry_count = 0,
    a1_last_retry_at = NULL,
    a1_permanently_failed = FALSE,
    a1_failure_reason = NULL
WHERE a1_permanently_failed = TRUE;

View All Failed Campaigns

SELECT campaign_number, campaign_name,
       a1_retry_count, a1_last_retry_at, a1_failure_reason
FROM campaign_status
WHERE a1_permanently_failed = TRUE
ORDER BY a1_last_retry_at DESC;

Failure Scenarios

Scenario 1: Temporary Empty Folder

What Happens:

  • Early polls: folder is empty, retry counter increments, no email sent (silent skip)
  • Assets added to folder before poll 20
  • Next run finds assets, processes successfully
  • Retry counter automatically reset to 0

Result: Problem self-resolves with no notifications

Scenario 2: Persistent Empty Folder

What Happens:

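  • Polls 1-19 (0 to 54 min): silent skip, retry counter increments each poll
  • Poll 20 (~1 hour): one a1_to_a2_no_assets_warning email sent
  • Poll 21 onward: silent skip continues, no further emails
  • Campaign is never marked permanently failed; processing resumes as soon as assets appear

Result: Support team alerted once, infinite emails prevented

For a genuine folder error (e.g. Assets folder not found (tried Master Assets)), the original sequence still applies: attempts 1 and 2 send the retry email, attempt 3 sends the permanently-failed email, and the campaign is marked permanently failed.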

Scenario 3: Wrong Status Assignment

What Happens:

  • Campaign set to A1 by mistake (no assets intended)
  • Folder stays empty: polls are skipped silently, with one warning email if it reaches poll 20
  • Admin realizes mistake, changes status to different value
  • Campaign no longer appears in A1 search results

Result: No reset needed, campaign excluded from processing


Testing

Test Retry Logic

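  1. Create test campaign in DAM with A1 status
  2. Remove or rename the Master Assets folder so the script hits the genuine folder error (an empty folder only increments the counter and never permanently fails)
  3. Run A1→A2 script manually 3 times
  4. Verify emails received and database state

# Run 1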
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2

# Check database
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed FROM campaign_status WHERE status = 'A1';"

# Run 2 (wait 3 minutes or run immediately for testing)
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2

# Check again
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed FROM campaign_status WHERE status = 'A1';"

# Run 3
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2

# Verify permanently failed
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed, a1_failure_reason FROM campaign_status WHERE a1_permanently_failed = TRUE;"

Test Reset Logic

# Reset the test campaign
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "UPDATE campaign_status SET a1_retry_count = 0, a1_permanently_failed = FALSE WHERE campaign_number = 'TEST_CAMPAIGN';"

# Run again
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2

# Verify it retries

Monitoring

Dashboard Query: Current Retry Status

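SELECT
    COUNT(*) FILTER (WHERE a1_retry_count = 0) as "No Issues",
    COUNT(*) FILTER (WHERE a1_retry_count = 1) as "Attempt 1",
    COUNT(*) FILTER (WHERE a1_retry_count = 2) as "Attempt 2",
    -- Empty-folder campaigns can exceed 3 polls without failing, so count them separately
    COUNT(*) FILTER (WHERE a1_retry_count >= 3 AND a1_permanently_failed = FALSE) as "3+ Polls (Not Failed)",
    COUNT(*) FILTER (WHERE a1_permanently_failed = TRUE) as "Permanently Failed"
FROM campaign_status
WHERE status = 'A1';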

Alert Query: Campaigns Near Failure

SELECT campaign_number, campaign_name, a1_retry_count, a1_last_retry_at
FROM campaign_status
WHERE status = 'A1'
  AND a1_retry_count >= 2
  AND a1_permanently_failed = FALSE
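ORDER BY a1_retry_count DESC, a1_last_retry_at DESC;

Note: a1_retry_count is shared by both paths, so this query also returns campaigns that are only waiting for assets. Those are not at risk of permanent failure.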

Troubleshooting

Q: Campaign keeps failing even after adding assets

A: Check whether the campaign was marked permanently failed (a1_permanently_failed = TRUE). If so, reset it using the Reset Single Campaign SQL above.

Q: Want to change from 3 to 5 retry attempts

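A: Edit MAX_RETRIES = 3 in database.py (line ~567) and update the email templates to reflect the new maximum. This limit applies only to the genuine-error path; the empty-folder warning is controlled by EMPTY_FOLDER_WARNING_THRESHOLD.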

Q: How to disable retry logic completely?

A: Not recommended, but you can:

  1. Set MAX_RETRIES = 999 (effectively infinite)
  2. Or revert to old a1_to_a2_no_assets template without retry tracking

Q: Can I set different retry counts for different campaigns?

A: No, it's a global setting. All campaigns use the same MAX_RETRIES value.

Q: What if I want to delete permanently failed campaigns from database?

A: Don't delete. Instead, change their status to something other than A1. They'll be excluded from processing automatically.


Future Enhancements

Potential improvements for future versions:

  1. Configurable retry timing:

    • Instead of relying on cron frequency (3 min)
    • Check a1_last_retry_at and skip if too recent
    • Allow exponential backoff (3 min, 10 min, 30 min)
  2. Campaign-specific retry limits:

    • Add optional a1_max_retries column
    • Allow different campaigns to have different thresholds
    • Default to global MAX_RETRIES if not set
  3. Automatic cleanup:

    • After 30 days, auto-reset permanently failed campaigns
    • Or send weekly digest of stuck campaigns
  4. Webhook notifications:

    • Send to external system when campaign permanently fails
    • Integrate with ticketing system
  5. Admin UI:

    • Web interface to view/reset retry status
    • Bulk reset operations
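As an illustration of the first idea (configurable retry timing), a skip check based on a1_last_retry_at might look like the sketch below. Nothing here exists in the codebase: `BACKOFF_MINUTES`, `next_allowed_attempt`, and `should_skip` are hypothetical, and the schedule simply reuses the 3/10/30-minute example from the list.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical schedule taken from the example above: 3, 10, then 30 minutes.
BACKOFF_MINUTES = (3, 10, 30)


def next_allowed_attempt(last_retry_at: datetime, retry_count: int) -> datetime:
    """Earliest time the next attempt may run, given attempts made so far."""
    # Clamp so the first attempt uses the first delay and later ones reuse the last.
    index = min(max(retry_count, 1), len(BACKOFF_MINUTES)) - 1
    return last_retry_at + timedelta(minutes=BACKOFF_MINUTES[index])


def should_skip(last_retry_at: Optional[datetime], retry_count: int, now: datetime) -> bool:
    """True when the previous attempt is too recent to retry yet."""
    if last_retry_at is None:  # never attempted, so run immediately
        return False
    return now < next_allowed_attempt(last_retry_at, retry_count)
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, keeps the check easy to unit-test against fixed timestamps.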

Code Locations

Quick reference for developers:

| Component | File | Line Range |
|---|---|---|
| Retry check logic | a1_to_a2_box_uploader.py | ~176-186 |
| Empty folder detection | a1_to_a2_box_uploader.py | ~193-231 |
| Success reset | a1_to_a2_box_uploader.py | ~354-356 |
| get_a1_retry_status() | database.py | ~522-558 |
| increment_a1_retry() | database.py | ~560-620 |
| reset_a1_retry() | database.py | ~622-655 |
| Email templates | notifier.py | ~593-687 |
| Database migration | migrations/003_add_a1_retry_tracking.sql | All |

Change Log

January 31, 2026:

  • Initial implementation
  • 3-attempt retry mechanism
  • Permanent failure tracking
  • Two new email templates
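  • This documentation created

April 28, 2026:

  • Empty folders treated as expected workflow: silent skip plus a one-time warning email at poll 20, no automatic permanent failure
  • Original 3-attempt permanent-failure behavior kept for genuine folder errors via the mark_failed_at_max flag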

Future updates will be logged here.