obsidian/wiki/concepts/preflight-record-pattern.md
2026-04-27 11:11:54 +01:00

59 lines
2.5 KiB
Markdown

---
title: Preflight + Record Pattern
tags: [concept, ai, cost-tracking, patterns]
created: 2026-04-27
updated: 2026-04-27
---
# Preflight + Record Pattern
The core usage-tracking pattern used by the AI Cost Tracker SDK. Every paid AI call follows the same three steps.
## The pattern
```
preflight(estimated_units) → call AI → record(actual_units)
```
1. **Preflight** — before the AI call, ask the cost-tracker: "Is this workspace/project within budget?"
- Input: model name + estimated units (tokens, chars, etc.)
- Output: `allow=true/false`, estimated cost, `request_id`
- If `allow=false` → raise `BudgetExceeded` before calling the AI API
2. **AI call** — the actual paid API call (unmodified)
3. **Record** — after the call, report actual usage
- Input: `request_id` from preflight + actual units from response
- Output: `event_id`, `cost_usd`
- If cost-tracker is unavailable → SDK saves to SQLite outbox and retries in background (see [[wiki/concepts/sync-with-outbox|sync-with-outbox]])
## Why two steps?
- **Preflight** enables hard budget enforcement **before** money is spent
- **Record** captures accurate actual usage (estimated ≠ actual for output tokens)
- Decoupling protects the AI pipeline: if cost-tracker goes down after preflight, `record()` still succeeds via outbox
## Estimation accuracy
Preflight uses estimated units because output token count is unknown before the call:
| Provider | What we estimate | Accuracy |
|---|---|---|
| Gemini text | input tokens (`len/4`), output tokens (caller hint) | ±30% |
| Gemini video | input tokens (file-size table), output tokens (hint) | ±50% |
| ElevenLabs | chars (exact — `len(text)`) | 100% |
| Google TTS | chars (exact — `len(text)`) | 100% |
Over-estimation is better than under-estimation for budget enforcement. If you consistently over-estimate by 50%, tune the default `estimated_output_tokens` hint downward.
## Hard limit mechanics
- Preflight computes `current_month_spend + estimated_cost`
- If this exceeds `budget.amount_usd` AND `budget.hard_limit=True``allow=false`
- The budget check is **eventual** (reads from pre-aggregated rollups + today's raw events), not transactional — brief overage is possible under high concurrency
- This is acceptable for AI cost tracking: exact-to-the-cent enforcement would require distributed locks and add unacceptable latency
## Projects using this pattern
- [[wiki/projects-overview/ai-cost-tracker|ai-cost-tracker]] (defines the pattern)
- video-accessibility (first consumer, Phase 1)