--- title: Preflight + Record Pattern tags: [concept, ai, cost-tracking, patterns] created: 2026-04-27 updated: 2026-04-27 --- # Preflight + Record Pattern The core usage-tracking pattern used by the AI Cost Tracker SDK. Every paid AI call follows the same three steps. ## The pattern ``` preflight(estimated_units) → call AI → record(actual_units) ``` 1. **Preflight** — before the AI call, ask the cost-tracker: "Is this workspace/project within budget?" - Input: model name + estimated units (tokens, chars, etc.) - Output: `allow=true/false`, estimated cost, `request_id` - If `allow=false` → raise `BudgetExceeded` before calling the AI API 2. **AI call** — the actual paid API call (unmodified) 3. **Record** — after the call, report actual usage - Input: `request_id` from preflight + actual units from response - Output: `event_id`, `cost_usd` - If cost-tracker is unavailable → SDK saves to SQLite outbox and retries in background (see [[wiki/concepts/sync-with-outbox|sync-with-outbox]]) ## Why two steps? - **Preflight** enables hard budget enforcement **before** money is spent - **Record** captures accurate actual usage (estimated ≠ actual for output tokens) - Decoupling protects the AI pipeline: if cost-tracker goes down after preflight, `record()` still succeeds via outbox ## Estimation accuracy Preflight uses estimated units because output token count is unknown before the call: | Provider | What we estimate | Accuracy | |---|---|---| | Gemini text | input tokens (`len/4`), output tokens (caller hint) | ±30% | | Gemini video | input tokens (file-size table), output tokens (hint) | ±50% | | ElevenLabs | chars (exact — `len(text)`) | 100% | | Google TTS | chars (exact — `len(text)`) | 100% | Over-estimation is better than under-estimation for budget enforcement. If you consistently over-estimate by 50%, tune the default `estimated_output_tokens` hint downward. ## Hard limit mechanics - Preflight computes `current_month_spend + estimated_cost` - If this exceeds `budget.amount_usd` AND `budget.hard_limit=True` → `allow=false` - The budget check is **eventual** (reads from pre-aggregated rollups + today's raw events), not transactional — brief overage is possible under high concurrency - This is acceptable for AI cost tracking: exact-to-the-cent enforcement would require distributed locks and add unacceptable latency ## Projects using this pattern - [[wiki/projects-overview/ai-cost-tracker|ai-cost-tracker]] (defines the pattern) - video-accessibility (first consumer, Phase 1)