title: Sync HTTP + SQLite Outbox Pattern
tags: concept, architecture, resilience, cost-tracking
created: 2026-04-27
updated: 2026-04-27

Sync HTTP + SQLite Outbox Pattern

The cost-tracker SDK uses this pattern so that usage events are not lost when the cost-tracker service is temporarily unavailable.

The problem

The AI pipeline (Celery worker) calls ct.record(...) after each AI API call. If the cost-tracker service is down, a naive implementation would either:

  • Silently drop the event (cost data lost)
  • Raise an exception (AI pipeline fails — unacceptable)

The solution

record() → try POST /v1/usage/record
              ├── success → done
              └── failure (timeout / 5xx / network) → save to SQLite outbox
                                                           ↓
                                              background flusher (every 30s)
                                              retries all pending events with
                                              exponential backoff
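The control flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: `send` and `enqueue` are hypothetical callables standing in for the POST to /v1/usage/record and the SQLite outbox write, and the handling of non-5xx error responses is an assumption, since the diagram only covers timeouts, 5xx responses, and network errors.

```python
from typing import Callable


def record(
    event: dict,
    send: Callable[[dict], int],      # hypothetical: performs the POST, returns HTTP status
    enqueue: Callable[[dict], None],  # hypothetical: writes the event to the outbox
) -> str:
    """Try one synchronous send; fall back to the outbox instead of raising."""
    try:
        status = send(event)
    except OSError:
        # TimeoutError and ConnectionError are OSError subclasses in Python 3,
        # so this covers the "timeout / network" branch of the diagram.
        enqueue(event)
        return "queued"
    if status >= 500:
        # Server-side failure: keep the event for the background flusher.
        enqueue(event)
        return "queued"
    if 200 <= status < 300:
        return "sent"
    # Assumption: other responses (e.g. 4xx) are not retried in this sketch.
    return "rejected"
```

The caller never sees an exception from the transport; it only learns whether the event was delivered immediately or parked for later.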

Implementation details

SQLite outbox (one file per worker, default /tmp/cost_outbox.sqlite):

  • Schema: (id, ts, payload_json, attempts, last_attempt_at, status)
  • Written synchronously on the failure path, before record() returns
  • The pipeline never waits for the service to recover: the only extra work on failure is one local SQLite write
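One possible shape for the outbox table and the failure-path write is sketched below. The column names come from the schema above; the column types, the table name `outbox`, the helper names `open_outbox` and `enqueue`, and the choice to start `attempts` at 0 are assumptions made for illustration.

```python
import json
import sqlite3
import time

# Column names follow the schema listed above; types are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS outbox (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    ts              REAL    NOT NULL,
    payload_json    TEXT    NOT NULL,
    attempts        INTEGER NOT NULL DEFAULT 0,
    last_attempt_at REAL,
    status          TEXT    NOT NULL DEFAULT 'pending'
)
"""


def open_outbox(path: str) -> sqlite3.Connection:
    """Open (or create) the outbox file and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, payload: dict) -> int:
    """Store one event as a pending row and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO outbox (ts, payload_json) VALUES (?, ?)",
            (time.time(), json.dumps(payload)),
        )
    return cur.lastrowid
```

Because the insert is committed before `enqueue` returns, a worker crash after a failed POST does not lose the event, as long as the outbox file itself survives.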

Background flusher (asyncio background task):

  • Starts when CostTracker is initialised
  • Every 30 seconds: reads all status='pending' rows, retries POST /v1/usage/record
  • On success: marks status='sent'
  • After 10 failed attempts: marks status='dead', logs warning → human investigation needed
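A single flusher pass might look like the sketch below. It is synchronous and leaves out the scheduling loop that would call it every 30 seconds. The backoff formula (30 s base, doubling per attempt, capped at one hour), the `outbox` table name, and the `send` callable (returns True on success) are assumptions; the source only states that retries use exponential backoff and stop after 10 attempts.

```python
import json
import logging
import sqlite3
import time
from typing import Callable, Optional

log = logging.getLogger("cost_tracker.outbox")


def backoff_delay(attempts: int, base: float = 30.0, cap: float = 3600.0) -> float:
    """Assumed policy: wait base * 2**attempts seconds, capped at `cap`."""
    return min(cap, base * 2 ** attempts)


def flush_once(
    conn: sqlite3.Connection,
    send: Callable[[dict], bool],  # hypothetical: True if the POST succeeded
    now: Optional[float] = None,
    max_attempts: int = 10,
) -> dict:
    """Retry every pending row that is due; return counts per outcome."""
    now = time.time() if now is None else now
    counts = {"sent": 0, "retry": 0, "dead": 0}
    rows = conn.execute(
        "SELECT id, payload_json, attempts, last_attempt_at "
        "FROM outbox WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for row_id, payload_json, attempts, last_at in rows:
        if last_at is not None and now - last_at < backoff_delay(attempts):
            continue  # still inside its backoff window
        try:
            ok = send(json.loads(payload_json))
        except OSError:
            ok = False
        attempts += 1
        if ok:
            status, outcome = "sent", "sent"
        elif attempts >= max_attempts:
            status, outcome = "dead", "dead"
            log.warning("outbox row %s is dead after %s attempts", row_id, attempts)
        else:
            status, outcome = "pending", "retry"
        with conn:  # one committed update per row
            conn.execute(
                "UPDATE outbox SET attempts = ?, last_attempt_at = ?, status = ? "
                "WHERE id = ?",
                (attempts, now, status, row_id),
            )
        counts[outcome] += 1
    return counts
```

Committing per row means a crash in the middle of a pass cannot roll back rows that were already delivered, at the cost of one small transaction per event.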

Graceful degradation:

  • record() never raises CostTrackerUnavailable; failed events go to the outbox, so the call is fire-and-forget from the caller's point of view
  • preflight() returns allow=True on connectivity failure by default (fail_open=True). This is configurable via fail_open.
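The fail-open behaviour of preflight() can be illustrated with a small sketch. Here `check` is a hypothetical callable that asks the service for a decision and raises on connectivity problems. Returning a bare boolean is a simplification, and what the real SDK does when fail_open is False is not stated in this note, so this sketch simply returns the flag.

```python
from typing import Callable


def preflight(check: Callable[[], bool], fail_open: bool = True) -> bool:
    """Return the service's allow decision, or `fail_open` if it is unreachable."""
    try:
        return check()
    except OSError:
        # Timeouts and connection errors are OSError subclasses in Python 3.
        # With fail_open=True the caller is allowed to proceed.
        return fail_open
```

A real denial from a reachable service is passed through unchanged; only connectivity failures fall back to the configured default.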

Configuration

ct = CostTracker(
    ...
    outbox_path="/tmp/cost_outbox.sqlite",
    flush_interval_seconds=30,
    max_retry_attempts=10,
    fail_open=True,    # preflight returns allow=True when service unreachable
)

Monitoring

  • Outbox depth is reported on the SDK's /metrics endpoint (if enabled)
  • Rows with status='dead' require manual review; add an alert on them to monitoring
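For ad-hoc checks or a custom alert, the outbox file can also be queried directly. The sketch below assumes the table is named `outbox` and uses the status values described above; `outbox_stats` is a hypothetical helper, not part of the SDK.

```python
import sqlite3


def outbox_stats(conn: sqlite3.Connection) -> dict:
    """Count outbox rows per status; missing statuses are reported as 0."""
    counts = {"pending": 0, "sent": 0, "dead": 0}
    query = "SELECT status, COUNT(*) FROM outbox GROUP BY status"
    for status, n in conn.execute(query):
        counts[status] = n
    return counts
```

An alert could fire whenever the `dead` count is non-zero, or when `pending` keeps growing across several flush intervals.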

Where this pattern applies

  • Any Oliver project using the oliver-cost-tracker SDK
  • Generally applicable to any fire-and-forget side-effect call where data loss is unacceptable but the call must not block or fail the main flow