vault backup: 2026-04-17 13:19:25

Vadym Samoilenko 2026-04-17 13:19:25 +01:00
parent 2020a36116
commit 8f6e6108c2
7 changed files with 141 additions and 1 deletions

@ -23,6 +23,10 @@ created: 2026-04-17
- **Local path:** `/Volumes/SSD/Projects/Oliver/Barclays-banner-builder`
## Sessions
### 2026-04-17 Create an idempotent deployment script for
**Asked:** Create an idempotent deployment script for Ubuntu with Docker, database initialization, and migrations without WebSockets.
**Done:** Implemented Docker deployment script with cache management, database migrations, and frontend file cleanup; created admin user management API endpoints.
### 2026-04-17 Create an idempotent deployment script for
**Asked:** Create an idempotent deployment script for Ubuntu server with Docker containers, database initialization, and migrations.
**Done:** Added `delete` and `patch` methods to `apiClient` and committed changes.
@ -175,6 +179,7 @@ created: 2026-04-17
## Change Log
| Date | Requested | Changed | Files |
|------|-----------|---------|-------|
| 2026-04-17 | Deployment & admin API | deploy.sh script, GET/POST/PATCH/DELETE user endpoints, frontend cleanup | deploy.sh, backend API routes, frontend build config |
| 2026-04-17 | Deployment script | Docker build with caching, DB init, Alembic migrations, frontend cleanup | deploy.sh, apiClient.ts |
| 2026-04-17 | Verified TypeScript build and pushed changes to git | Confirmed clean TypeScript compilation and committed code to repository | All project files |
| 2026-04-17 | Deployment script | Docker build with caching, database init, Alembic migrations, frontend cleanup | deploy.sh, docker-compose.yml |

@ -290,3 +290,12 @@ tags: [daily]
- 13:17 (<1min) | `memory-compiler`
- **Asked:** How should a duplicate Tabby terminal article from the raw inbox be handled in the knowledge base?
- **Done:** Added the new raw file to the sources field of the existing tabby-terminal.md article to track provenance without duplicating content.
- 13:18 (<1min) | `memory-compiler`
- **Asked:** Compile a raw todo list article into the wiki knowledge base with structured formatting.
- **Done:** Migrated todo tracking documentation to wiki with TodoTracker class example and updated both index files.
- 13:18 (<1min) | `Barclays-banner-builder`
- **Asked:** Create an idempotent deployment script for Ubuntu with Docker, database initialization, and migrations without WebSockets.
- **Done:** Implemented Docker deployment script with cache management, database migrations, and frontend file cleanup; created admin user management API endpoints.
- 13:19 | `memory-compiler`
- **Asked:** Compile a new article on cost tracking into the agent-sdk wiki section.
- **Done:** Created cost-tracking.md with scoping, deduplication, per-model breakdown, and edge cases; updated both index files.

@ -30,7 +30,7 @@ This 3-hop pattern works for hundreds of articles without vector search.
| [[wiki/web-agency/_index\|web-agency/]] | AI-assisted website building & selling: Claude Code, Nanobanana 2, Kling, LaunchPath MCP | 1 |
| [[wiki/dotfiles/_index\|dotfiles/]] | Linux terminal ricing: Kitty, Fish, WezTerm CLI, modern Rust CLI tools, LazyVim, unified themes, Tabby | 19 |
| [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 26 |
| [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 27 |
| [[wiki/llm-models/_index\|llm-models/]] | OpenAI model catalog — GPT-5.x, o-series reasoning, audio/realtime, embeddings, moderation | 1 |
| [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery | 11 |

@ -40,3 +40,4 @@ Build production AI agents using the same tools, agent loop, and context managem
| [[wiki/agent-sdk/streaming-input\|streaming-input]] | Two input modes: AsyncGenerator (streaming, default) vs string (single message); capabilities, limitations, session resuming | raw/Streaming Input.md | 2026-04-17 |
| [[wiki/agent-sdk/subagents\|subagents]] | Spawn subagents programmatically or via filesystem; context isolation, parallelization, tool restrictions, resuming, detection | raw/Subagents in the SDK.md | 2026-04-17 |
| [[wiki/agent-sdk/todo-tracking\|todo-tracking]] | Built-in TodoWrite tool: lifecycle states, automatic triggers, real-time progress display, activeForm field | raw/Todo Lists.md | 2026-04-17 |
| [[wiki/agent-sdk/cost-tracking\|cost-tracking]] | Token usage tracking, deduplicating parallel tool calls, per-model cost breakdown, accumulating across sessions | raw/Track cost and usage.md | 2026-04-17 |

@ -0,0 +1,125 @@
---
title: "Cost and Token Usage Tracking"
aliases: [cost-tracking, token-usage, usage-tracking]
tags: [agent-sdk, cost, tokens, billing, observability]
sources: [raw/Track cost and usage.md]
created: 2026-04-17
updated: 2026-04-17
---
# Cost and Token Usage Tracking
The Claude Agent SDK exposes per-step and per-model token usage through the message stream. All cost figures are **client-side estimates** — not authoritative billing data.
## Key Takeaways
- `total_cost_usd` / `costUSD` are estimates computed from a bundled price table; use the [Usage and Cost API](https://platform.claude.com/docs/en/build-with-claude/usage-cost-api) or Console for billing truth
- Cost is scoped to a single `query()` call — sessions do **not** auto-accumulate; sum manually
- Parallel tool calls produce multiple assistant messages sharing the same `id`; **deduplicate by ID** to avoid inflated token counts
- The `result` message is the most reliable place to read cost; prefer `total_cost_usd` there over summing per-step values
- Costs are tracked even on failed/error result subtypes — tokens were consumed up to the failure point
- Prompt caching is automatic; two extra fields `cache_creation_input_tokens` / `cache_read_input_tokens` track cache economics
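
The cache fields in the last takeaway can be folded into a rough estimate along these lines. This is a minimal sketch, not SDK code: `estimateCostUSD` and the `Rates` shape are hypothetical helpers, and the rates passed in are placeholders rather than real prices. Only the four snake_case token field names come from this article.

```typescript
// Usage shape built from the snake_case field names mentioned in this article.
interface StepUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

// Hypothetical per-million-token rates in USD. Placeholders, not real prices.
interface Rates {
  input: number;
  output: number;
  cacheWrite: number;
  cacheRead: number;
}

// Illustrative helper (not an SDK function): price each token class separately,
// because cache writes and cache reads are charged at different rates.
function estimateCostUSD(usage: StepUsage, rates: Rates): number {
  const perToken = (ratePerMillion: number): number => ratePerMillion / 1e6;
  return (
    usage.input_tokens * perToken(rates.input) +
    usage.output_tokens * perToken(rates.output) +
    (usage.cache_creation_input_tokens ?? 0) * perToken(rates.cacheWrite) +
    (usage.cache_read_input_tokens ?? 0) * perToken(rates.cacheRead)
  );
}
```

Missing cache fields default to zero, mirroring the `.get(..., 0)` access shown for Python in the field-name table. For billing truth, the Usage and Cost API remains the reference.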
## Scoping: query / step / session
| Scope | What it is | Cost reported? |
|-------|------------|----------------|
| `query()` call | One invocation; may involve multiple steps | Yes — in `result` message |
| Step | Single request/response cycle within a `query()` | Yes — on each `AssistantMessage` |
| Session | Multiple `query()` calls linked by session ID | No built-in total; accumulate yourself |
## Get Total Cost of a Query
Read `total_cost_usd` from the `result` message:
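```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({ prompt: "Summarize this project" })) {
  // The final `result` message carries the estimated cost of the whole call
  if (message.type === "result") {
    console.log(`Total cost: $${message.total_cost_usd}`);
  }
}
```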
Python equivalent: `message.total_cost_usd` on `ResultMessage`.
## Track Per-Step Usage (with Deduplication)
Assistant messages produced by parallel tool calls share the same `message.message.id`. Deduplicate by that ID before summing:
```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

const seenIds = new Set<string>(); // message IDs already counted
let totalInputTokens = 0;
let totalOutputTokens = 0;
for await (const message of query({ prompt: "..." })) {
  if (message.type === "assistant") {
    const msgId = message.message.id;
    // Count each ID once; parallel tool calls repeat it
    if (!seenIds.has(msgId)) {
      seenIds.add(msgId);
      totalInputTokens += message.message.usage.input_tokens;
      totalOutputTokens += message.message.usage.output_tokens;
    }
  }
}
```
Python fields: `message.usage`, `message.message_id`.
## Break Down Usage Per Model
`result.modelUsage` (TS) / `result.model_usage` (Python) maps model name → tokens + cost. Useful for multi-model setups (e.g., Haiku subagents + Opus main agent):
```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({ prompt: "..." })) {
  // Only the final `result` message carries the per-model breakdown
  if (message.type !== "result") continue;
  for (const [model, usage] of Object.entries(message.modelUsage)) {
    console.log(`${model}: $${usage.costUSD.toFixed(4)}`);
    console.log(`  Input: ${usage.inputTokens}, Output: ${usage.outputTokens}`);
    console.log(`  Cache read: ${usage.cacheReadInputTokens}, Cache create: ${usage.cacheCreationInputTokens}`);
  }
}
```
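
Because the breakdown is reported per `query()` call, a multi-call run needs its own per-model total. A minimal sketch, assuming each entry has the camelCase fields printed above; `mergeModelUsage` and the two type names are hypothetical helpers, not part of the SDK:

```typescript
// Per-model usage entry, using the camelCase fields shown in this article.
interface ModelUsageEntry {
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
  costUSD: number;
}

type ModelUsageMap = Record<string, ModelUsageEntry>;

// Hypothetical helper: fold one call's breakdown into a running per-model total.
// Returns a new map and leaves both inputs untouched.
function mergeModelUsage(total: ModelUsageMap, next: ModelUsageMap): ModelUsageMap {
  const merged: ModelUsageMap = { ...total };
  for (const [model, usage] of Object.entries(next)) {
    const prev = merged[model];
    merged[model] = prev
      ? {
          inputTokens: prev.inputTokens + usage.inputTokens,
          outputTokens: prev.outputTokens + usage.outputTokens,
          cacheReadInputTokens: prev.cacheReadInputTokens + usage.cacheReadInputTokens,
          cacheCreationInputTokens:
            prev.cacheCreationInputTokens + usage.cacheCreationInputTokens,
          costUSD: prev.costUSD + usage.costUSD,
        }
      : { ...usage };
  }
  return merged;
}
```

In use, start from an empty object and call `mergeModelUsage(runningTotal, message.modelUsage)` each time a `result` message arrives.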
## Accumulate Costs Across Multiple Calls
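```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

// Placeholder prompts for illustration; substitute your own
const prompts = ["First task", "Second task"];

let totalSpend = 0;
for (const prompt of prompts) {
  for await (const message of query({ prompt })) {
    // Each query() call reports its own cost; sum them manually
    if (message.type === "result") {
      totalSpend += message.total_cost_usd;
    }
  }
}
console.log(`Total spend: $${totalSpend.toFixed(4)}`);
```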
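
When several sessions run side by side, the same running total can be keyed by session ID. A minimal sketch: `SessionCostTracker` and `ResultLike` are hypothetical names, and the caller is assumed to know which session each result belongs to. Only `total_cost_usd` comes from the SDK fields described here.

```typescript
// Minimal shape of a result message; only `total_cost_usd` comes from the article.
interface ResultLike {
  type: "result";
  total_cost_usd: number;
}

// Hypothetical helper: keeps a running estimated total per session ID.
class SessionCostTracker {
  private totals = new Map<string, number>();

  // Record the cost of one finished query() call under its session.
  record(sessionId: string, result: ResultLike): void {
    const previous = this.totals.get(sessionId) ?? 0;
    this.totals.set(sessionId, previous + result.total_cost_usd);
  }

  // Estimated spend for a session so far (0 if nothing was recorded).
  total(sessionId: string): number {
    return this.totals.get(sessionId) ?? 0;
  }
}
```

Results with an error subtype should be recorded too, since tokens were consumed up to the failure point.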
## Edge Cases
| Scenario | Guidance |
|----------|----------|
| Output token discrepancy for same ID | Use the highest value; prefer `total_cost_usd` from `result` |
| Failed/error conversations | Always read cost from `result` regardless of `subtype` |
| Cache tokens | Track `cache_creation_input_tokens` and `cache_read_input_tokens` separately; charged at different rates |
| Price drift | Reinstall or update the SDK to refresh the bundled price table, or use the Usage and Cost API when accuracy matters |
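
The first row (use the highest value when records with the same ID disagree) can be sketched as a small post-processing step. This is an illustrative helper under stated assumptions, not SDK behavior: `StepRecord`, `dedupeSteps`, and `sumOutputTokens` are hypothetical names, and the records are assumed to have been collected from assistant messages beforehand.

```typescript
// One per-step usage record; `id` is the message ID shared by parallel tool calls.
interface StepRecord {
  id: string;
  input_tokens: number;
  output_tokens: number;
}

// Hypothetical helper: collapse records that share an ID,
// keeping the one with the highest output_tokens.
function dedupeSteps(records: StepRecord[]): StepRecord[] {
  const byId = new Map<string, StepRecord>();
  for (const record of records) {
    const existing = byId.get(record.id);
    if (!existing || record.output_tokens > existing.output_tokens) {
      byId.set(record.id, record);
    }
  }
  return Array.from(byId.values());
}

// Sum output tokens over already-deduplicated records.
function sumOutputTokens(records: StepRecord[]): number {
  return records.reduce((sum, record) => sum + record.output_tokens, 0);
}
```

Compared with the first-seen `Set` approach, this variant tolerates a later record reporting more output tokens for the same ID. `total_cost_usd` on the `result` message is still the preferred figure.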
## TypeScript vs Python Field Names
| Concept | TypeScript | Python |
|---------|-----------|--------|
| Per-step usage | `message.message.usage` | `message.usage` |
| Per-step ID | `message.message.id` | `message.message_id` |
| Per-model breakdown | `result.modelUsage` | `result.model_usage` |
| Total cost | `result.total_cost_usd` | `result.total_cost_usd` |
| Cache fields | `usage.cacheReadInputTokens` | `message.usage.get("cache_read_input_tokens", 0)` |
## Related Articles
- [[wiki/agent-sdk/agent-loop|Agent Loop]] — how steps and `query()` calls are structured
- [[wiki/agent-sdk/observability-opentelemetry|Observability with OpenTelemetry]] — exporting traces and metrics to OTLP backends
- [[wiki/agent-sdk/subagents|Subagents]] — multi-model setups where per-model cost breakdown matters
- [[wiki/agent-sdk/streaming-output|Streaming Output]] — real-time message stream that carries usage events
## Sources
- `raw/Track cost and usage.md` — source: https://code.claude.com/docs/en/agent-sdk/cost-tracking