vault backup: 2026-04-17 13:19:25
parent 2020a36116
commit 8f6e6108c2
7 changed files with 141 additions and 1 deletions
@@ -23,6 +23,10 @@ created: 2026-04-17
- **Local path:** `/Volumes/SSD/Projects/Oliver/Barclays-banner-builder`

## Sessions
### 2026-04-17 – Create an idempotent deployment script for
**Asked:** Create an idempotent deployment script for Ubuntu with Docker, database initialization, and migrations without WebSockets.
**Done:** Implemented Docker deployment script with cache management, database migrations, and frontend file cleanup; created admin user management API endpoints.

### 2026-04-17 – Create an idempotent deployment script for
**Asked:** Create an idempotent deployment script for Ubuntu server with Docker containers, database initialization, and migrations.
**Done:** Added `delete` and `patch` methods to `apiClient` and committed changes.
@@ -175,6 +179,7 @@ created: 2026-04-17
## Change Log
| Date | Requested | Changed | Files |
|------|-----------|---------|-------|
| 2026-04-17 | Deployment & admin API | deploy.sh script, GET/POST/PATCH/DELETE user endpoints, frontend cleanup | deploy.sh, backend API routes, frontend build config |
| 2026-04-17 | Deployment script | Docker build with caching, DB init, Alembic migrations, frontend cleanup | deploy.sh, apiClient.ts |
| 2026-04-17 | Verified TypeScript build and pushed changes to git | Confirmed clean TypeScript compilation and committed code to repository | All project files |
| 2026-04-17 | Deployment script | Docker build with caching, database init, Alembic migrations, frontend cleanup | deploy.sh, docker-compose.yml |
@@ -290,3 +290,12 @@ tags: [daily]
- 13:17 (<1min) | `memory-compiler`
  - **Asked:** How should a duplicate Tabby terminal article from the raw inbox be handled in the knowledge base?
  - **Done:** Added the new raw file to the sources field of the existing tabby-terminal.md article to track provenance without duplicating content.
- 13:18 (<1min) | `memory-compiler`
  - **Asked:** Compile a raw todo list article into the wiki knowledge base with structured formatting.
  - **Done:** Migrated todo tracking documentation to wiki with TodoTracker class example and updated both index files.
- 13:18 (<1min) | `Barclays-banner-builder`
  - **Asked:** Create an idempotent deployment script for Ubuntu with Docker, database initialization, and migrations without WebSockets.
  - **Done:** Implemented Docker deployment script with cache management, database migrations, and frontend file cleanup; created admin user management API endpoints.
- 13:19 | `memory-compiler`
  - **Asked:** Compile a new article on cost tracking into the agent-sdk wiki section.
  - **Done:** Created cost-tracking.md with scoping, deduplication, per-model breakdown, and edge cases; updated both index files.
@@ -30,7 +30,7 @@ This 3-hop pattern works for hundreds of articles without vector search.
| [[wiki/web-agency/_index\|web-agency/]] | AI-assisted website building & selling: Claude Code, Nanobanana 2, Kling, LaunchPath MCP | 1 |
| [[wiki/dotfiles/_index\|dotfiles/]] | Linux terminal ricing: Kitty, Fish, WezTerm CLI, modern Rust CLI tools, LazyVim, unified themes, Tabby | 19 |
| [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 26 |
| [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 27 |
| [[wiki/llm-models/_index\|llm-models/]] | OpenAI model catalog — GPT-5.x, o-series reasoning, audio/realtime, embeddings, moderation | 1 |
| [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery | 11 |
@@ -40,3 +40,4 @@ Build production AI agents using the same tools, agent loop, and context managem
| [[wiki/agent-sdk/streaming-input\|streaming-input]] | Two input modes: AsyncGenerator (streaming, default) vs string (single message); capabilities, limitations, session resuming | raw/Streaming Input.md | 2026-04-17 |
| [[wiki/agent-sdk/subagents\|subagents]] | Spawn subagents programmatically or via filesystem; context isolation, parallelization, tool restrictions, resuming, detection | raw/Subagents in the SDK.md | 2026-04-17 |
| [[wiki/agent-sdk/todo-tracking\|todo-tracking]] | Built-in TodoWrite tool: lifecycle states, automatic triggers, real-time progress display, activeForm field | raw/Todo Lists.md | 2026-04-17 |
| [[wiki/agent-sdk/cost-tracking\|cost-tracking]] | Token usage tracking, deduplicating parallel tool calls, per-model cost breakdown, accumulating across sessions | raw/Track cost and usage.md | 2026-04-17 |
125 wiki/agent-sdk/cost-tracking.md Normal file
@@ -0,0 +1,125 @@
---
title: "Cost and Token Usage Tracking"
aliases: [cost-tracking, token-usage, usage-tracking]
tags: [agent-sdk, cost, tokens, billing, observability]
sources: [raw/Track cost and usage.md]
created: 2026-04-17
updated: 2026-04-17
---

# Cost and Token Usage Tracking

The Claude Agent SDK exposes per-step and per-model token usage through the message stream. All cost figures are **client-side estimates** — not authoritative billing data.

## Key Takeaways

- `total_cost_usd` / `costUSD` are estimates computed from a bundled price table; use the [Usage and Cost API](https://platform.claude.com/docs/en/build-with-claude/usage-cost-api) or Console for billing truth
- Cost is scoped to a single `query()` call — sessions do **not** auto-accumulate; sum manually
- Parallel tool calls produce multiple assistant messages sharing the same `id` — **deduplicate by ID** to avoid inflated token counts
- The `result` message is the most reliable place to read cost; prefer `total_cost_usd` there over summing per-step values
- Costs are tracked even on failed/error result subtypes — tokens were consumed up to the failure point
- Prompt caching is automatic; two extra fields `cache_creation_input_tokens` / `cache_read_input_tokens` track cache economics
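
The cache fields in the last takeaway can be folded into a rough per-step estimate. This is a minimal sketch, assuming the snake_case usage field names used in this article; the `StepUsage` and `Rates` types, the `estimateStepCost` helper, and every rate value are hypothetical placeholders and not real prices.

```typescript
// Hypothetical shape of a per-step usage object. Field names follow the
// snake_case names in this article; the type itself is an assumption.
interface StepUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

// Placeholder USD rates per million tokens. These are NOT real prices.
interface Rates {
  input: number;
  output: number;
  cacheWrite: number;
  cacheRead: number;
}

// Price each token class at its own rate; missing cache fields count as 0.
function estimateStepCost(usage: StepUsage, rates: Rates): number {
  const cost = (tokens: number, ratePerMillion: number) =>
    (tokens * ratePerMillion) / 1e6;
  return (
    cost(usage.input_tokens, rates.input) +
    cost(usage.output_tokens, rates.output) +
    cost(usage.cache_creation_input_tokens ?? 0, rates.cacheWrite) +
    cost(usage.cache_read_input_tokens ?? 0, rates.cacheRead)
  );
}

const exampleRates: Rates = { input: 1, output: 2, cacheWrite: 4, cacheRead: 0.5 };
const exampleCost = estimateStepCost(
  { input_tokens: 1000000, output_tokens: 500000, cache_read_input_tokens: 2000000 },
  exampleRates,
);
console.log(exampleCost.toFixed(2)); // prints "3.00"
```

Keeping the four token classes separate makes it easy to see how much of a step's estimate comes from cache reads versus fresh input, which a single blended rate would hide.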

## Scoping: query / step / session

| Scope | What it is | Cost reported? |
|-------|------------|----------------|
| `query()` call | One invocation; may involve multiple steps | Yes — in `result` message |
| Step | Single request/response cycle within a `query()` | Yes — on each `AssistantMessage` |
| Session | Multiple `query()` calls linked by session ID | No built-in total; accumulate yourself |
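
Because a session has no built-in total, one option is a small ledger keyed by session ID. This is a sketch under assumptions: `SessionCostLedger` and `ResultLike` are hypothetical names, and the `session_id` field on the result is assumed here rather than taken from this article (only `total_cost_usd` is).

```typescript
// Hypothetical minimal view of a result message. Only `total_cost_usd` is
// named in this article; `session_id` is an assumption for illustration.
interface ResultLike {
  session_id: string;
  total_cost_usd: number;
}

class SessionCostLedger {
  private totals: { [sessionId: string]: number | undefined } = {};

  // Add the cost of one finished query() call to its session's total.
  record(result: ResultLike): void {
    const previous = this.totals[result.session_id] ?? 0;
    this.totals[result.session_id] = previous + result.total_cost_usd;
  }

  // Running total for a session, or 0 if nothing has been recorded.
  totalFor(sessionId: string): number {
    return this.totals[sessionId] ?? 0;
  }
}

const ledger = new SessionCostLedger();
ledger.record({ session_id: "session-a", total_cost_usd: 0.25 });
ledger.record({ session_id: "session-a", total_cost_usd: 0.5 });
ledger.record({ session_id: "session-b", total_cost_usd: 0.125 });
console.log(ledger.totalFor("session-a")); // prints 0.75
```

In practice `record` would be called once per `result` message, including error subtypes, since those also carry a cost.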

## Get Total Cost of a Query

Read `total_cost_usd` from the `result` message:

```typescript
for await (const message of query({ prompt: "Summarize this project" })) {
  if (message.type === "result") {
    console.log(`Total cost: $${message.total_cost_usd}`);
  }
}
```

Python equivalent: `message.total_cost_usd` on `ResultMessage`.

## Track Per-Step Usage (with Deduplication)

Parallel tool calls share the same `message.message.id`. Always deduplicate:

```typescript
const seenIds = new Set<string>();
let totalInputTokens = 0;
let totalOutputTokens = 0;

for await (const message of query({ prompt: "..." })) {
  if (message.type === "assistant") {
    const msgId = message.message.id;
    if (!seenIds.has(msgId)) {
      seenIds.add(msgId);
      totalInputTokens += message.message.usage.input_tokens;
      totalOutputTokens += message.message.usage.output_tokens;
    }
  }
}
```

Python fields: `message.usage`, `message.message_id`.

## Break Down Usage Per Model

`result.modelUsage` (TS) / `result.model_usage` (Python) maps model name → tokens + cost. Useful for multi-model setups (e.g., Haiku subagents + Opus main agent):

```typescript
for await (const message of query({ prompt: "..." })) {
  if (message.type !== "result") continue;
  for (const [model, usage] of Object.entries(message.modelUsage)) {
    console.log(`${model}: $${usage.costUSD.toFixed(4)}`);
    console.log(` Input: ${usage.inputTokens}, Output: ${usage.outputTokens}`);
    console.log(` Cache read: ${usage.cacheReadInputTokens}, Cache create: ${usage.cacheCreationInputTokens}`);
  }
}
```

## Accumulate Costs Across Multiple Calls

```typescript
let totalSpend = 0;
for (const prompt of prompts) {
  for await (const message of query({ prompt })) {
    if (message.type === "result") {
      totalSpend += message.total_cost_usd;
    }
  }
}
console.log(`Total spend: $${totalSpend.toFixed(4)}`);
```

## Edge Cases

| Scenario | Guidance |
|----------|----------|
| Output token discrepancy for same ID | Use the highest value; prefer `total_cost_usd` from `result` |
| Failed/error conversations | Always read cost from `result` regardless of `subtype` |
| Cache tokens | Track `cache_creation_input_tokens` and `cache_read_input_tokens` separately; charged at different rates |
| Price drift | Re-install SDK or use Usage API when accuracy matters |
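
The first row of this table suggests a variant of the deduplication loop that keeps the highest `output_tokens` seen for each ID instead of the first one. This is a minimal sketch: `AssistantStep`, `dedupeSteps`, and `sumTokens` are hypothetical names, and the flattened shape stands in for `message.message.id` and `message.message.usage`.

```typescript
// Hypothetical flattened view of an assistant message: `id` mirrors
// `message.message.id`, the token fields mirror `message.message.usage`.
interface AssistantStep {
  id: string;
  input_tokens: number;
  output_tokens: number;
}

// Collapse steps that share an ID, keeping the one with the highest
// output_tokens value.
function dedupeSteps(steps: AssistantStep[]): Record<string, AssistantStep> {
  const byId: Record<string, AssistantStep> = {};
  for (const step of steps) {
    const existing: AssistantStep | undefined = byId[step.id];
    if (existing === undefined || step.output_tokens > existing.output_tokens) {
      byId[step.id] = step;
    }
  }
  return byId;
}

// Sum tokens over the deduplicated steps only.
function sumTokens(steps: AssistantStep[]): { input: number; output: number } {
  const byId = dedupeSteps(steps);
  let input = 0;
  let output = 0;
  for (const id of Object.keys(byId)) {
    input += byId[id].input_tokens;
    output += byId[id].output_tokens;
  }
  return { input, output };
}

const totals = sumTokens([
  { id: "msg_1", input_tokens: 100, output_tokens: 20 },
  { id: "msg_1", input_tokens: 100, output_tokens: 35 }, // same ID, larger output
  { id: "msg_2", input_tokens: 50, output_tokens: 10 },
]);
console.log(`${totals.input} ${totals.output}`); // prints "150 45"
```

Even with this in place, the table's guidance still holds: `total_cost_usd` on the `result` message remains the preferred figure for cost.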

## TypeScript vs Python Field Names

| Concept | TypeScript | Python |
|---------|-----------|--------|
| Per-step usage | `message.message.usage` | `message.usage` |
| Per-step ID | `message.message.id` | `message.message_id` |
| Per-model breakdown | `result.modelUsage` | `result.model_usage` |
| Total cost | `result.total_cost_usd` | `result.total_cost_usd` |
| Cache fields | `usage.cacheReadInputTokens` | `message.usage.get("cache_read_input_tokens", 0)` |

## Related Articles

- [[wiki/agent-sdk/agent-loop|Agent Loop]] — how steps and `query()` calls are structured
- [[wiki/agent-sdk/observability-opentelemetry|Observability with OpenTelemetry]] — exporting traces and metrics to OTLP backends
- [[wiki/agent-sdk/subagents|Subagents]] — multi-model setups where per-model cost breakdown matters
- [[wiki/agent-sdk/streaming-output|Streaming Output]] — real-time message stream that carries usage events

## Sources

- `raw/Track cost and usage.md` — source: https://code.claude.com/docs/en/agent-sdk/cost-tracking