semblance-dev

Vadym Samoilenko d0ad8e67be Fix backfill: use accumulated conversation context for prompt estimation		2026-04-24 19:11:01 +01:00

Old logic used output text length as a proxy for prompt tokens, which is completely wrong. Real Gemini calls send the full conversation history as context, so prompt grows with every turn.

New logic:
- completion_tokens = len(response_text) / 3.8 (what was generated)
- prompt_tokens = base_template + sum(all_prior_messages_in_fg) / 3.8
- persona_response base: 1500 tok (template + persona details + topic)
- moderator base: 1200 tok (moderator template + fg context)
- persona_generate base: 2500 tok (persona-detailed-generation.md template)

Also:
- Sorts messages chronologically per focus group before processing
- Accumulates context correctly so turn N includes turns 0..N-1 as context
- Idempotency via pre-fetched set instead of per-doc find_one queries
- cost_usd breakdown now has correct input/output split (not 40/60 guess)
- Dry-run prints per-focus-group cost estimates for sanity checking

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
backfill_usage.py	Fix backfill: use accumulated conversation context for prompt estimation	2026-04-24 19:11:01 +01:00
generate_architecture_doc.py	Fix domain typo: oliver.solution → oliver.solutions across all files	2026-03-20 13:40:00 +00:00
populate_db.py	Apply Jintech security audit remediation (sprint 3) — 87/92 findings fixed	2026-03-20 12:51:18 +00:00
populate_db_direct.py	Apply Jintech security audit remediation (sprint 3) — 87/92 findings fixed	2026-03-20 12:51:18 +00:00
seed_model_pricing.py	Add LLM usage tracking infrastructure (Phases A-C)	2026-04-24 18:08:27 +01:00
setup_mongodb.sh	Apply Jintech security audit remediation (sprint 3) — 87/92 findings fixed	2026-03-20 12:51:18 +00:00
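
The estimation rule described in the latest commit message for backfill_usage.py can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `estimate_usage`, the message fields (`id`, `focus_group_id`, `created_at`, `kind`, `text`), and the rounding to whole tokens are all assumptions made for the example. Only the 3.8 characters-per-token ratio and the three base sizes come from the commit message.

```python
from collections import defaultdict

# Values taken from the commit message.
CHARS_PER_TOKEN = 3.8
BASE_PROMPT_TOKENS = {
    "persona_response": 1500,
    "moderator": 1200,
    "persona_generate": 2500,
}


def estimate_usage(messages, already_done=frozenset()):
    """Estimate prompt/completion tokens per message (hypothetical helper).

    `messages` is an iterable of dicts with the assumed keys
    id, focus_group_id, created_at, kind, and text.
    `already_done` is a pre-fetched set of message ids that already have
    usage records, so no per-document lookup is needed.
    """
    by_group = defaultdict(list)
    for msg in messages:
        by_group[msg["focus_group_id"]].append(msg)

    estimates = []
    for fg_id, group in by_group.items():
        # Chronological order, so that turn N sees turns 0..N-1 as context.
        group.sort(key=lambda m: m["created_at"])
        context_chars = 0
        for msg in group:
            if msg["id"] not in already_done:
                prompt = BASE_PROMPT_TOKENS[msg["kind"]] + context_chars / CHARS_PER_TOKEN
                completion = len(msg["text"]) / CHARS_PER_TOKEN
                estimates.append({
                    "id": msg["id"],
                    "focus_group_id": fg_id,
                    "prompt_tokens": round(prompt),
                    "completion_tokens": round(completion),
                })
            # Skipped messages still count as context for later turns.
            context_chars += len(msg["text"])
    return estimates


if __name__ == "__main__":
    demo = [
        {"id": "b", "focus_group_id": "fg1", "created_at": 2,
         "kind": "persona_response", "text": "y" * 76},
        {"id": "a", "focus_group_id": "fg1", "created_at": 1,
         "kind": "moderator", "text": "x" * 38},
    ]
    for row in estimate_usage(demo):
        print(row)
```

Passing the ids of already-recorded messages as `already_done` skips them in the output while still counting their text toward the context of later turns, which is one plausible reading of the "pre-fetched set" idempotency note.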