---
title: Preflight + Record Pattern
tags: [concept, ai, cost-tracking, patterns]
created: 2026-04-27
updated: 2026-04-27
---

# Preflight + Record Pattern

The core usage-tracking pattern used by the AI Cost Tracker SDK. Every paid AI call follows the same three steps.

## The pattern

```
preflight(estimated_units) → call AI → record(actual_units)
```

1. **Preflight** — before the AI call, ask the cost-tracker: "Is this workspace/project within budget?"
   - Input: model name + estimated units (tokens, chars, etc.)
   - Output: `allow=true/false`, estimated cost, `request_id`
   - If `allow=false` → raise `BudgetExceeded` before calling the AI API

2. **AI call** — the actual paid API call (unmodified)

3. **Record** — after the call, report actual usage
   - Input: `request_id` from preflight + actual units from response
   - Output: `event_id`, `cost_usd`
   - If cost-tracker is unavailable → SDK saves to SQLite outbox and retries in background (see [[wiki/concepts/sync-with-outbox|sync-with-outbox]])

## Why two steps?

- **Preflight** enables hard budget enforcement **before** money is spent
- **Record** captures accurate actual usage (estimated ≠ actual for output tokens)
- Decoupling protects the AI pipeline: if cost-tracker goes down after preflight, `record()` still succeeds via outbox

## Estimation accuracy

Preflight uses estimated units because output token count is unknown before the call:

| Provider | What we estimate | Accuracy |
|---|---|---|
| Gemini text | input tokens (`len/4`), output tokens (caller hint) | ±30% |
| Gemini video | input tokens (file-size table), output tokens (hint) | ±50% |
| ElevenLabs | chars (exact — `len(text)`) | 100% |
| Google TTS | chars (exact — `len(text)`) | 100% |

Over-estimation is better than under-estimation for budget enforcement. If you consistently over-estimate by 50%, tune the default `estimated_output_tokens` hint downward.

## Hard limit mechanics

- Preflight computes `current_month_spend + estimated_cost`
- If this exceeds `budget.amount_usd` AND `budget.hard_limit=True` → `allow=false`
- The budget check is **eventual** (reads from pre-aggregated rollups + today's raw events), not transactional — brief overage is possible under high concurrency
- This is acceptable for AI cost tracking: exact-to-the-cent enforcement would require distributed locks and add unacceptable latency

## Projects using this pattern

- [[wiki/projects-overview/ai-cost-tracker|ai-cost-tracker]] (defines the pattern)
- video-accessibility (first consumer, Phase 1)