From 8f6e6108c2b920bc49531914ec3a8ded89da37d0 Mon Sep 17 00:00:00 2001
From: Vadym Samoilenko <vadymsamoilenko@oliver.agency>
Date: Fri, 17 Apr 2026 13:19:25 +0100
Subject: [PATCH] vault backup: 2026-04-17 13:19:25

---
 .../Barclays Banner Builder.md                |   5 +
 99 Daily/2026-04-17.md                        |   9 ++
 raw/{ => _processed}/Todo Lists.md            |   0
 raw/{ => _processed}/Track cost and usage.md  |   0
 wiki/_master-index.md                         |   2 +-
 wiki/agent-sdk/_index.md                      |   1 +
 wiki/agent-sdk/cost-tracking.md               | 125 ++++++++++++++++++
 7 files changed, 141 insertions(+), 1 deletion(-)
 rename raw/{ => _processed}/Todo Lists.md (100%)
 rename raw/{ => _processed}/Track cost and usage.md (100%)
 create mode 100644 wiki/agent-sdk/cost-tracking.md

diff --git a/01 Projects/Barclays-banner-builder/Barclays Banner Builder.md b/01 Projects/Barclays-banner-builder/Barclays Banner Builder.md
index 73d6f45..669ed44 100644
--- a/01 Projects/Barclays-banner-builder/Barclays Banner Builder.md	
+++ b/01 Projects/Barclays-banner-builder/Barclays Banner Builder.md	
@@ -23,6 +23,10 @@ created: 2026-04-17
 - **Local path:** `/Volumes/SSD/Projects/Oliver/Barclays-banner-builder`
 
 ## Sessions
+### 2026-04-17 – Create an idempotent deployment script for
+**Asked:** Create an idempotent deployment script for Ubuntu with Docker, database initialization, and migrations without WebSockets.
+**Done:** Implemented Docker deployment script with cache management, database migrations, and frontend file cleanup; created admin user management API endpoints.
+
 ### 2026-04-17 – Create an idempotent deployment script for
 **Asked:** Create an idempotent deployment script for Ubuntu server with Docker containers, database initialization, and migrations.
 **Done:** Added `delete` and `patch` methods to `apiClient` and committed changes.
@@ -175,6 +179,7 @@ created: 2026-04-17
 ## Change Log
 | Date | Requested | Changed | Files |
 |------|-----------|---------|-------|
+| 2026-04-17 | Deployment & admin API | deploy.sh script, GET/POST/PATCH/DELETE user endpoints, frontend cleanup | deploy.sh, backend API routes, frontend build config |
 | 2026-04-17 | Deployment script | Docker build with caching, DB init, Alembic migrations, frontend cleanup | deploy.sh, apiClient.ts |
 | 2026-04-17 | Verified TypeScript build and pushed changes to git | Confirmed clean TypeScript compilation and committed code to repository | All project files |
 | 2026-04-17 | Deployment script | Docker build with caching, database init, Alembic migrations, frontend cleanup | deploy.sh, docker-compose.yml |
diff --git a/99 Daily/2026-04-17.md b/99 Daily/2026-04-17.md
index 05e9fbd..b8ee66f 100644
--- a/99 Daily/2026-04-17.md	
+++ b/99 Daily/2026-04-17.md	
@@ -290,3 +290,12 @@ tags: [daily]
 - 13:17 (<1min) | `memory-compiler`
   - **Asked:** How should a duplicate Tabby terminal article from the raw inbox be handled in the knowledge base?
   - **Done:** Added the new raw file to the sources field of the existing tabby-terminal.md article to track provenance without duplicating content.
+- 13:18 (<1min) | `memory-compiler`
+  - **Asked:** Compile a raw todo list article into the wiki knowledge base with structured formatting.
+  - **Done:** Migrated todo tracking documentation to wiki with TodoTracker class example and updated both index files.
+- 13:18 (<1min) | `Barclays-banner-builder`
+  - **Asked:** Create an idempotent deployment script for Ubuntu with Docker, database initialization, and migrations without WebSockets.
+  - **Done:** Implemented Docker deployment script with cache management, database migrations, and frontend file cleanup; created admin user management API endpoints.
+- 13:19 | `memory-compiler`
+  - **Asked:** Compile a new article on cost tracking into the agent-sdk wiki section.
+  - **Done:** Created cost-tracking.md with scoping, deduplication, per-model breakdown, and edge cases; updated both index files.
diff --git a/raw/Todo Lists.md b/raw/_processed/Todo Lists.md
similarity index 100%
rename from raw/Todo Lists.md
rename to raw/_processed/Todo Lists.md
diff --git a/raw/Track cost and usage.md b/raw/_processed/Track cost and usage.md
similarity index 100%
rename from raw/Track cost and usage.md
rename to raw/_processed/Track cost and usage.md
diff --git a/wiki/_master-index.md b/wiki/_master-index.md
index fe3a559..ded2101 100644
--- a/wiki/_master-index.md
+++ b/wiki/_master-index.md
@@ -30,7 +30,7 @@ This 3-hop pattern works for hundreds of articles without vector search.
 | [[wiki/web-agency/_index\|web-agency/]] | AI-assisted website building & selling: Claude Code, Nanobanana 2, Kling, LaunchPath MCP | 1 |
 | [[wiki/dotfiles/_index\|dotfiles/]] | Linux terminal ricing: Kitty, Fish, WezTerm CLI, modern Rust CLI tools, LazyVim, unified themes, Tabby | 19 |
 
-| [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 26 |
+| [[wiki/agent-sdk/_index\|agent-sdk/]] | Claude Agent SDK (formerly Claude Code SDK) — build autonomous AI agents in Python and TypeScript | 27 |
 | [[wiki/llm-models/_index\|llm-models/]] | OpenAI model catalog — GPT-5.x, o-series reasoning, audio/realtime, embeddings, moderation | 1 |
 | [[wiki/claude-code/_index\|claude-code/]] | Claude Code product docs — install, capabilities, surfaces, MCP, hooks, scheduling, multi-agent, plugins, skills, channels, error recovery | 11 |
 
diff --git a/wiki/agent-sdk/_index.md b/wiki/agent-sdk/_index.md
index 2097988..9b9f228 100644
--- a/wiki/agent-sdk/_index.md
+++ b/wiki/agent-sdk/_index.md
@@ -40,3 +40,4 @@ Build production AI agents using the same tools, agent loop, and context managem
 | [[wiki/agent-sdk/streaming-input\|streaming-input]] | Two input modes: AsyncGenerator (streaming, default) vs string (single message); capabilities, limitations, session resuming | raw/Streaming Input.md | 2026-04-17 |
 | [[wiki/agent-sdk/subagents\|subagents]] | Spawn subagents programmatically or via filesystem; context isolation, parallelization, tool restrictions, resuming, detection | raw/Subagents in the SDK.md | 2026-04-17 |
 | [[wiki/agent-sdk/todo-tracking\|todo-tracking]] | Built-in TodoWrite tool: lifecycle states, automatic triggers, real-time progress display, activeForm field | raw/Todo Lists.md | 2026-04-17 |
+| [[wiki/agent-sdk/cost-tracking\|cost-tracking]] | Token usage tracking, deduplicating parallel tool calls, per-model cost breakdown, accumulating across sessions | raw/Track cost and usage.md | 2026-04-17 |
diff --git a/wiki/agent-sdk/cost-tracking.md b/wiki/agent-sdk/cost-tracking.md
new file mode 100644
index 0000000..7f2d1e3
--- /dev/null
+++ b/wiki/agent-sdk/cost-tracking.md
@@ -0,0 +1,125 @@
+---
+title: "Cost and Token Usage Tracking"
+aliases: [cost-tracking, token-usage, usage-tracking]
+tags: [agent-sdk, cost, tokens, billing, observability]
+sources: [raw/Track cost and usage.md]
+created: 2026-04-17
+updated: 2026-04-17
+---
+
+# Cost and Token Usage Tracking
+
+The Claude Agent SDK exposes per-step and per-model token usage through the message stream. All cost figures are **client-side estimates** — not authoritative billing data.
+
+## Key Takeaways
+
+- `total_cost_usd` / `costUSD` are estimates computed from a price table bundled with the SDK; for authoritative billing figures, use the [Usage and Cost API](https://platform.claude.com/docs/en/build-with-claude/usage-cost-api) or the Console
+- Cost is scoped to a single `query()` call — sessions do **not** auto-accumulate; sum manually
+- Parallel tool calls produce multiple assistant messages sharing the same `id` — **deduplicate by ID** to avoid inflated token counts
+- The `result` message is the most reliable place to read cost; prefer `total_cost_usd` there over summing per-step values
+- Costs are tracked even on failed/error result subtypes — tokens were consumed up to the failure point
+- Prompt caching is automatic; `cache_creation_input_tokens` and `cache_read_input_tokens` report the tokens written to and read from the cache
+
+## Scoping: query / step / session
+
+| Scope | What it is | Cost reported? |
+|-------|------------|----------------|
+| `query()` call | One invocation; may involve multiple steps | Yes — in `result` message |
+| Step | Single request/response cycle within a `query()` | Yes — on each `AssistantMessage` |
+| Session | Multiple `query()` calls linked by session ID | No built-in total; accumulate yourself |
+
+## Get Total Cost of a Query
+
+Read `total_cost_usd` from the `result` message:
+
+```typescript
+for await (const message of query({ prompt: "Summarize this project" })) {
+  if (message.type === "result") {
+    console.log(`Total cost: $${message.total_cost_usd}`);
+  }
+}
+```
+
+Python equivalent: `message.total_cost_usd` on `ResultMessage`.
+
+## Track Per-Step Usage (with Deduplication)
+
+When Claude runs tool calls in parallel, the resulting assistant messages all share one `message.message.id`. Count each ID only once:
+
+```typescript
+const seenIds = new Set<string>();
+let totalInputTokens = 0;
+let totalOutputTokens = 0;
+
+for await (const message of query({ prompt: "..." })) {
+  if (message.type === "assistant") {
+    const msgId = message.message.id; // shared by every message from one parallel-tool turn
+    if (!seenIds.has(msgId)) { // count each ID's usage only once
+      seenIds.add(msgId);
+      totalInputTokens += message.message.usage.input_tokens;
+      totalOutputTokens += message.message.usage.output_tokens;
+    }
+  }
+}
+```
+
+In Python, read usage from `message.usage` and the ID from `message.message_id` on each `AssistantMessage`.
+
+## Break Down Usage Per Model
+
+`result.modelUsage` (TS) / `result.model_usage` (Python) maps each model name to its token counts and estimated cost. This is useful for multi-model setups (e.g., Haiku subagents with an Opus main agent):
+
+```typescript
+for await (const message of query({ prompt: "..." })) {
+  if (message.type !== "result") continue;
+  for (const [model, usage] of Object.entries(message.modelUsage)) {
+    console.log(`${model}: $${usage.costUSD.toFixed(4)}`);
+    console.log(`  Input: ${usage.inputTokens}, Output: ${usage.outputTokens}`);
+    console.log(`  Cache read: ${usage.cacheReadInputTokens}, Cache create: ${usage.cacheCreationInputTokens}`);
+  }
+}
+```
+
+## Accumulate Costs Across Multiple Calls
+
+```typescript
+let totalSpend = 0;
+for (const prompt of prompts) {
+  for await (const message of query({ prompt })) {
+    if (message.type === "result") {
+      totalSpend += message.total_cost_usd;
+    }
+  }
+}
+console.log(`Total spend: $${totalSpend.toFixed(4)}`);
+```
+
+## Edge Cases
+
+| Scenario | Guidance |
+|----------|----------|
+| Output token discrepancy for same ID | Use the highest value; prefer `total_cost_usd` from `result` |
+| Failed/error conversations | Always read cost from `result` regardless of `subtype` |
+| Cache tokens | Track `cache_creation_input_tokens` and `cache_read_input_tokens` separately; they are billed at different rates |
+| Price drift | The bundled price table can lag current pricing; upgrade the SDK or use the Usage and Cost API when accuracy matters |
+
+## TypeScript vs Python Field Names
+
+| Concept | TypeScript | Python |
+|---------|-----------|--------|
+| Per-step usage | `message.message.usage` | `message.usage` |
+| Per-step ID | `message.message.id` | `message.message_id` |
+| Per-model breakdown | `result.modelUsage` | `result.model_usage` |
+| Total cost | `result.total_cost_usd` | `result.total_cost_usd` |
+| Cache read tokens | `usage.cacheReadInputTokens` (on each `modelUsage` entry) | `message.usage.get("cache_read_input_tokens", 0)` |
+
+## Related Articles
+
+- [[wiki/agent-sdk/agent-loop|Agent Loop]] — how steps and `query()` calls are structured
+- [[wiki/agent-sdk/observability-opentelemetry|Observability with OpenTelemetry]] — exporting traces and metrics to OTLP backends
+- [[wiki/agent-sdk/subagents|Subagents]] — multi-model setups where per-model cost breakdown matters
+- [[wiki/agent-sdk/streaming-output|Streaming Output]] — real-time message stream that carries usage events
+
+## Sources
+
+- `raw/Track cost and usage.md` — source: https://code.claude.com/docs/en/agent-sdk/cost-tracking