From 8f6e6108c2b920bc49531914ec3a8ded89da37d0 Mon Sep 17 00:00:00 2001
From: Vadym Samoilenko
Date: Fri, 17 Apr 2026 13:19:25 +0100
Subject: [PATCH] vault backup: 2026-04-17 13:19:25

---
 .../Barclays Banner Builder.md               |   5 +
 99 Daily/2026-04-17.md                       |   9 ++
 raw/{ => _processed}/Todo Lists.md           |   0
 raw/{ => _processed}/Track cost and usage.md |   0
 wiki/_master-index.md                        |   2 +-
 wiki/agent-sdk/_index.md                     |   1 +
 wiki/agent-sdk/cost-tracking.md              | 125 ++++++++++++++++++
 7 files changed, 141 insertions(+), 1 deletion(-)
 rename raw/{ => _processed}/Todo Lists.md (100%)
 rename raw/{ => _processed}/Track cost and usage.md (100%)
 create mode 100644 wiki/agent-sdk/cost-tracking.md

diff --git a/01 Projects/Barclays-banner-builder/Barclays Banner Builder.md b/01 Projects/Barclays-banner-builder/Barclays Banner Builder.md
index 73d6f45..669ed44 100644
--- a/01 Projects/Barclays-banner-builder/Barclays Banner Builder.md
+++ b/01 Projects/Barclays-banner-builder/Barclays Banner Builder.md
@@ -23,6 +23,10 @@ created: 2026-04-17
 - **Local path:** `/Volumes/SSD/Projects/Oliver/Barclays-banner-builder`
 ## Sessions
+### 2026-04-17 – Create an idempotent deployment script for
+**Asked:** Create an idempotent deployment script for Ubuntu with Docker, database initialization, and migrations without WebSockets.
+**Done:** Implemented Docker deployment script with cache management, database migrations, and frontend file cleanup; created admin user management API endpoints.
+
 ### 2026-04-17 – Create an idempotent deployment script for
 **Asked:** Create an idempotent deployment script for Ubuntu server with Docker containers, database initialization, and migrations.
 **Done:** Added `delete` and `patch` methods to `apiClient` and committed changes.
@@ -175,6 +179,7 @@ created: 2026-04-17
 ## Change Log
 | Date | Requested | Changed | Files |
 |------|-----------|---------|-------|
+| 2026-04-17 | Deployment & admin API | deploy.sh script, GET/POST/PATCH/DELETE user endpoints, frontend cleanup | deploy.sh, backend API routes, frontend build config |
 | 2026-04-17 | Deployment script | Docker build with caching, DB init, Alembic migrations, frontend cleanup | deploy.sh, apiClient.ts |
 | 2026-04-17 | Verified TypeScript build and pushed changes to git | Confirmed clean TypeScript compilation and committed code to repository | All project files |
 | 2026-04-17 | Deployment script | Docker build with caching, database init, Alembic migrations, frontend cleanup | deploy.sh, docker-compose.yml |
diff --git a/99 Daily/2026-04-17.md b/99 Daily/2026-04-17.md
index 05e9fbd..b8ee66f 100644
--- a/99 Daily/2026-04-17.md
+++ b/99 Daily/2026-04-17.md
@@ -290,3 +290,12 @@ tags: [daily]
 - 13:17 (<1min) | `memory-compiler`
   - **Asked:** How should a duplicate Tabby terminal article from the raw inbox be handled in the knowledge base?
   - **Done:** Added the new raw file to the sources field of the existing tabby-terminal.md article to track provenance without duplicating content.
+- 13:18 (<1min) | `memory-compiler`
+  - **Asked:** Compile a raw todo list article into the wiki knowledge base with structured formatting.
+  - **Done:** Migrated todo tracking documentation to wiki with TodoTracker class example and updated both index files.
+- 13:18 (<1min) | `Barclays-banner-builder`
+  - **Asked:** Create an idempotent deployment script for Ubuntu with Docker, database initialization, and migrations without WebSockets.
+  - **Done:** Implemented Docker deployment script with cache management, database migrations, and frontend file cleanup; created admin user management API endpoints.
+- 13:19 | `memory-compiler`
+  - **Asked:** Compile a new article on cost tracking into the agent-sdk wiki section.
+  - **Done:** Created cost-tracking.md with scoping, deduplication, per-model breakdown, and edge cases; updated both index files.
diff --git a/raw/Todo Lists.md b/raw/_processed/Todo Lists.md
similarity index 100%
rename from raw/Todo Lists.md
rename to raw/_processed/Todo Lists.md
diff --git a/raw/Track cost and usage.md b/raw/_processed/Track cost and usage.md
similarity index 100%
rename from raw/Track cost and usage.md
rename to raw/_processed/Track cost and usage.md
diff --git a/wiki/_master-index.md b/wiki/_master-index.md
index fe3a559..ded2101 100644
--- a/wiki/_master-index.md
+++ b/wiki/_master-index.md
@@ -30,7 +30,7 @@ This 3-hop pattern works for hundreds of articles without vector search.
 | [[wiki/web-agency/_index\|web-agency/]] | AI-assisted website building & selling: Claude Code, Nanobanana 2, Kling, LaunchPath MCP | 1 |
 | [[wiki/dotfiles/_index\|dotfiles/]] | Linux terminal ricing: Kitty, Fish, WezTerm CLI, modern Rust CLI tools, LazyVim, unified themes, Tabby | 19 |
-| [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 26 |
+| [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 27 |
 | [[wiki/llm-models/_index\|llm-models/]] | OpenAI model catalog — GPT-5.x, o-series reasoning, audio/realtime, embeddings, moderation | 1 |
 | [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery | 11 |
diff --git a/wiki/agent-sdk/_index.md b/wiki/agent-sdk/_index.md
index 2097988..9b9f228 100644
--- a/wiki/agent-sdk/_index.md
+++ b/wiki/agent-sdk/_index.md
@@ -40,3 +40,4 @@ Build production AI agents using the same tools, agent loop, and context managem
 | [[wiki/agent-sdk/streaming-input\|streaming-input]] | Two input modes: AsyncGenerator (streaming, default) vs string (single message); capabilities, limitations, session resuming | raw/Streaming Input.md | 2026-04-17 |
 | [[wiki/agent-sdk/subagents\|subagents]] | Spawn subagents programmatically or via filesystem; context isolation, parallelization, tool restrictions, resuming, detection | raw/Subagents in the SDK.md | 2026-04-17 |
 | [[wiki/agent-sdk/todo-tracking\|todo-tracking]] | Built-in TodoWrite tool: lifecycle states, automatic triggers, real-time progress display, activeForm field | raw/Todo Lists.md | 2026-04-17 |
+| [[wiki/agent-sdk/cost-tracking\|cost-tracking]] | Token usage tracking, deduplicating parallel tool calls, per-model cost breakdown, accumulating across sessions | raw/Track cost and usage.md | 2026-04-17 |
diff --git a/wiki/agent-sdk/cost-tracking.md b/wiki/agent-sdk/cost-tracking.md
new file mode 100644
index 0000000..7f2d1e3
--- /dev/null
+++ b/wiki/agent-sdk/cost-tracking.md
@@ -0,0 +1,125 @@
+---
+title: "Cost and Token Usage Tracking"
+aliases: [cost-tracking, token-usage, usage-tracking]
+tags: [agent-sdk, cost, tokens, billing, observability]
+sources: [raw/Track cost and usage.md]
+created: 2026-04-17
+updated: 2026-04-17
+---
+
+# Cost and Token Usage Tracking
+
+The Claude Agent SDK exposes per-step and per-model token usage through the message stream. All cost figures are **client-side estimates** — not authoritative billing data.
+
+## Key Takeaways
+
+- `total_cost_usd` / `costUSD` are estimates computed from a bundled price table; use the [Usage and Cost API](https://platform.claude.com/docs/en/build-with-claude/usage-cost-api) or Console for billing truth
+- Cost is scoped to a single `query()` call — sessions do **not** auto-accumulate; sum manually
+- Parallel tool calls produce multiple assistant messages sharing the same `id` — **deduplicate by ID** to avoid inflated token counts
+- The `result` message is the most reliable place to read cost; prefer `total_cost_usd` there over summing per-step values
+- Costs are tracked even on failed/error result subtypes — tokens were consumed up to the failure point
+- Prompt caching is automatic; two extra fields `cache_creation_input_tokens` / `cache_read_input_tokens` track cache economics
+
+## Scoping: query / step / session
+
+| Scope | What it is | Cost reported? |
+|-------|------------|----------------|
+| `query()` call | One invocation; may involve multiple steps | Yes — in `result` message |
+| Step | Single request/response cycle within a `query()` | Yes — on each `AssistantMessage` |
+| Session | Multiple `query()` calls linked by session ID | No built-in total; accumulate yourself |
+
+## Get Total Cost of a Query
+
+Read `total_cost_usd` from the `result` message:
+
+```typescript
+for await (const message of query({ prompt: "Summarize this project" })) {
+  if (message.type === "result") {
+    console.log(`Total cost: $${message.total_cost_usd}`);
+  }
+}
+```
+
+Python equivalent: `message.total_cost_usd` on `ResultMessage`.
+
+## Track Per-Step Usage (with Deduplication)
+
+Parallel tool calls share the same `message.message.id`. Always deduplicate:
+
+```typescript
+const seenIds = new Set();
+let totalInputTokens = 0;
+let totalOutputTokens = 0;
+
+for await (const message of query({ prompt: "..." })) {
+  if (message.type === "assistant") {
+    const msgId = message.message.id;
+    if (!seenIds.has(msgId)) {
+      seenIds.add(msgId);
+      totalInputTokens += message.message.usage.input_tokens;
+      totalOutputTokens += message.message.usage.output_tokens;
+    }
+  }
+}
+```
+
+Python fields: `message.usage`, `message.message_id`.
+
+## Break Down Usage Per Model
+
+`result.modelUsage` (TS) / `result.model_usage` (Python) maps model name → tokens + cost. Useful for multi-model setups (e.g., Haiku subagents + Opus main agent):
+
+```typescript
+for await (const message of query({ prompt: "..." })) {
+  if (message.type !== "result") continue;
+  for (const [model, usage] of Object.entries(message.modelUsage)) {
+    console.log(`${model}: $${usage.costUSD.toFixed(4)}`);
+    console.log(`  Input: ${usage.inputTokens}, Output: ${usage.outputTokens}`);
+    console.log(`  Cache read: ${usage.cacheReadInputTokens}, Cache create: ${usage.cacheCreationInputTokens}`);
+  }
+}
+```
+
+## Accumulate Costs Across Multiple Calls
+
+```typescript
+let totalSpend = 0;
+for (const prompt of prompts) {
+  for await (const message of query({ prompt })) {
+    if (message.type === "result") {
+      totalSpend += message.total_cost_usd;
+    }
+  }
+}
+console.log(`Total spend: $${totalSpend.toFixed(4)}`);
+```
+
+## Edge Cases
+
+| Scenario | Guidance |
+|----------|----------|
+| Output token discrepancy for same ID | Use the highest value; prefer `total_cost_usd` from `result` |
+| Failed/error conversations | Always read cost from `result` regardless of `subtype` |
+| Cache tokens | Track `cache_creation_input_tokens` and `cache_read_input_tokens` separately; charged at different rates |
+| Price drift | Re-install SDK or use Usage API when accuracy matters |
+
+## TypeScript vs Python Field Names
+
+| Concept | TypeScript | Python |
+|---------|-----------|--------|
+| Per-step usage | `message.message.usage` | `message.usage` |
+| Per-step ID | `message.message.id` | `message.message_id` |
+| Per-model breakdown | `result.modelUsage` | `result.model_usage` |
+| Total cost | `result.total_cost_usd` | `result.total_cost_usd` |
+| Cache fields | `usage.cacheReadInputTokens` | `message.usage.get("cache_read_input_tokens", 0)` |
+
+## Related Articles
+
+- [[wiki/agent-sdk/agent-loop|Agent Loop]] — how steps and `query()` calls are structured
+- [[wiki/agent-sdk/observability-opentelemetry|Observability with OpenTelemetry]] — exporting traces and metrics to OTLP backends
+- [[wiki/agent-sdk/subagents|Subagents]] — multi-model setups where per-model cost breakdown matters
+- [[wiki/agent-sdk/streaming-output|Streaming Output]] — real-time message stream that carries usage events
+
+## Sources
+
+- `raw/Track cost and usage.md` — source: https://code.claude.com/docs/en/agent-sdk/cost-tracking
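
The Edge Cases table in the new article advises using the highest value when output token counts differ for the same message ID, whereas the article's deduplication snippet keeps whichever message arrives first. A minimal sketch of the "highest value" rule follows. It assumes a simplified `StepUsage` shape and an `aggregateUsage` helper, both hypothetical names introduced here rather than SDK types; applying the same rule to input tokens is also an assumption.

```typescript
// Hypothetical shape for illustration only; not an SDK type.
// Field names mirror the per-step usage fields named in the article.
interface StepUsage {
  id: string;
  input_tokens: number;
  output_tokens: number;
}

interface UsageTotals {
  inputTokens: number;
  outputTokens: number;
}

// Collapse steps that share an ID into one entry, keeping the highest
// value seen for each counter, then sum across the distinct IDs.
function aggregateUsage(steps: StepUsage[]): UsageTotals {
  const byId = new Map<string, StepUsage>();
  for (const step of steps) {
    const prev = byId.get(step.id);
    if (prev === undefined) {
      byId.set(step.id, { ...step });
    } else {
      prev.input_tokens = Math.max(prev.input_tokens, step.input_tokens);
      prev.output_tokens = Math.max(prev.output_tokens, step.output_tokens);
    }
  }

  const totals: UsageTotals = { inputTokens: 0, outputTokens: 0 };
  byId.forEach((step) => {
    totals.inputTokens += step.input_tokens;
    totals.outputTokens += step.output_tokens;
  });
  return totals;
}

// Made-up sample: two messages share "msg_a" but report different output counts.
const sampleSteps: StepUsage[] = [
  { id: "msg_a", input_tokens: 100, output_tokens: 20 },
  { id: "msg_a", input_tokens: 100, output_tokens: 35 },
  { id: "msg_b", input_tokens: 50, output_tokens: 10 },
];
console.log(aggregateUsage(sampleSteps));
```

Keeping the aggregation separate from the `query()` loop lets it be checked without calling the SDK. The article still recommends preferring `total_cost_usd` from the `result` message whenever a cost figure is needed.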