cohorta/backend/scripts
Vadym Samoilenko d0ad8e67be Fix backfill: use accumulated conversation context for prompt estimation
The old logic used output text length as a proxy for prompt tokens, which is
wrong: real Gemini calls send the full conversation history as context, so the
prompt grows with every turn.

New logic:
- completion_tokens = len(response_text) / 3.8 (what was generated)
- prompt_tokens = base_template + sum(len(m) for m in all_prior_messages_in_fg) / 3.8
  - persona_response base: 1500 tok (template + persona details + topic)
  - moderator base: 1200 tok (moderator template + fg context)
  - persona_generate base: 2500 tok (persona-detailed-generation.md template)
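
The estimation rules above can be sketched roughly as follows. This is an illustration only, not the code in backfill_usage.py: the function names, the `BASE_PROMPT_TOKENS` dict, and the rounding to whole tokens are assumptions; only the 3.8 divisor and the three base values come from the commit message.

```python
# Hypothetical sketch of the token estimation described in the commit message.
CHARS_PER_TOKEN = 3.8  # divisor stated in the commit message

# Base prompt sizes (in tokens) per call type, as listed in the commit message.
BASE_PROMPT_TOKENS = {
    "persona_response": 1500,
    "moderator": 1200,
    "persona_generate": 2500,
}


def estimate_completion_tokens(response_text: str) -> int:
    """Estimate generated tokens from the length of the response text."""
    # Rounding to an integer is an assumption; the source does not specify it.
    return round(len(response_text) / CHARS_PER_TOKEN)


def estimate_prompt_tokens(call_type: str, prior_messages: list[str]) -> int:
    """Estimate prompt tokens: base template plus all prior messages as context."""
    context_chars = sum(len(m) for m in prior_messages)
    return BASE_PROMPT_TOKENS[call_type] + round(context_chars / CHARS_PER_TOKEN)
```

With these assumptions, a moderator call preceded by two 19-character messages would be estimated at 1200 + 38 / 3.8 = 1210 prompt tokens.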

Also:
- Sorts messages chronologically per focus group before processing
- Accumulates context correctly so turn N includes turns 0..N-1 as context
- Checks idempotency against a pre-fetched set instead of per-doc find_one queries
- cost_usd breakdown now has the correct input/output split (not a 40/60 guess)
- Dry-run prints per-focus-group cost estimates for sanity checking
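
A minimal sketch of how the per-focus-group accumulation, the idempotency set, and the input/output cost split might fit together. Everything named here is hypothetical: the message keys (`id`, `focus_group_id`, `created_at`, `text`), the `build_usage_records` function, and the per-million-token prices are assumptions for illustration, not the script's actual schema or pricing.

```python
from collections import defaultdict

CHARS_PER_TOKEN = 3.8  # divisor stated in the commit message


def build_usage_records(messages, already_backfilled, base_tokens,
                        price_in_per_mtok, price_out_per_mtok):
    """Build estimated usage records for messages not yet backfilled.

    messages: iterable of dicts with hypothetical keys
        id, focus_group_id, created_at, text.
    already_backfilled: set of message ids fetched once up front,
        replacing a per-document lookup.
    """
    by_group = defaultdict(list)
    for msg in messages:
        by_group[msg["focus_group_id"]].append(msg)

    records = []
    for fg_id, group in by_group.items():
        context_chars = 0  # characters of turns 0..N-1 seen so far
        for msg in sorted(group, key=lambda m: m["created_at"]):
            if msg["id"] not in already_backfilled:
                prompt = base_tokens + round(context_chars / CHARS_PER_TOKEN)
                completion = round(len(msg["text"]) / CHARS_PER_TOKEN)
                records.append({
                    "message_id": msg["id"],
                    "focus_group_id": fg_id,
                    "prompt_tokens": prompt,
                    "completion_tokens": completion,
                    # Cost is split by token type rather than a fixed ratio.
                    "input_usd": prompt * price_in_per_mtok / 1_000_000,
                    "output_usd": completion * price_out_per_mtok / 1_000_000,
                })
            # Skipped messages still count as context for later turns.
            context_chars += len(msg["text"])
    return records
```

Note that a message already present in the pre-fetched set is skipped for output but still added to the running context, so later turns keep the correct prompt size on a re-run.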

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 19:11:01 +01:00
backfill_usage.py Fix backfill: use accumulated conversation context for prompt estimation 2026-04-24 19:11:01 +01:00
generate_architecture_doc.py Fix domain typo: oliver.solution → oliver.solutions across all files 2026-03-20 13:40:00 +00:00
populate_db.py Apply Jintech security audit remediation (sprint 3) — 87/92 findings fixed 2026-03-20 12:51:18 +00:00
populate_db_direct.py Apply Jintech security audit remediation (sprint 3) — 87/92 findings fixed 2026-03-20 12:51:18 +00:00
seed_model_pricing.py Add LLM usage tracking infrastructure (Phases A-C) 2026-04-24 18:08:27 +01:00
setup_mongodb.sh Apply Jintech security audit remediation (sprint 3) — 87/92 findings fixed 2026-03-20 12:51:18 +00:00