semblance-dev

Author	SHA1	Message	Date
Vadym Samoilenko	d0ad8e67be	Fix backfill: use accumulated conversation context for prompt estimation. Old logic used output text length as a proxy for prompt tokens, which is completely wrong: real Gemini calls send the full conversation history as context, so the prompt grows with every turn. New logic: - completion_tokens = len(response_text) / 3.8 (what was generated) - prompt_tokens = base_template + sum(all_prior_messages_in_fg) / 3.8 - persona_response base: 1500 tok (template + persona details + topic) - moderator base: 1200 tok (moderator template + fg context) - persona_generate base: 2500 tok (persona-detailed-generation.md template). Also: - Sorts messages chronologically per focus group before processing - Accumulates context correctly so turn N includes turns 0..N-1 as context - Idempotency via pre-fetched set instead of per-doc find_one queries - cost_usd breakdown now has correct input/output split (not 40/60 guess) - Dry-run prints per-focus-group cost estimates for sanity checking. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-24 19:11:01 +01:00
Vadym Samoilenko	d7ee22e557	Fix backfill pricing: read from model_pricing collection + --delete-existing-estimates flag	2026-04-24 18:57:25 +01:00
Vadym Samoilenko	66c8e1762e	Fix backfill: handle list-type persona fields	2026-04-24 18:53:41 +01:00
Vadym Samoilenko	539c5eaaee	Fix backfill script: use focus_group_messages collection + correct field names	2026-04-24 18:49:59 +01:00
Vadym Samoilenko	915c81b8f1	Complete phases D–G: quota enforcement, token invalidation, admin writes, backfill. Backend: - token_version in JWT (bump_token_version, get_token_version on User model); jwt_required checks tv claim → 401 on mismatch; login routes embed version - Quota pre-flight in all 3 LLM public methods (QuotaExceededError bubbles up) - AI runner catches QuotaExceededError → sets status paused_quota + emits WS event - Admin routes: POST /users (create), POST /users/<id>/reset-password, POST /pricing, GET /focus-groups with aggregated cost; PUT /users/<id> now bumps token_version on disable or role change - backfill_usage.py: idempotent estimated-event generator for historical data, tiktoken for GPT models, char/3.8 for Gemini, --dry-run flag. Frontend: - 402 interceptor dispatches quota_exceeded CustomEvent - adminApi: createUser, resetPassword, createPricing, listFocusGroups - UsersTab: New User dialog + Reset Password in edit dialog - PricingTab: New Price dialog (model, provider, input/output/cached prices) - FocusGroupsTab: focus groups table sorted by total cost - Admin.tsx: 4th tab (Focus Groups) - FocusGroupSession: admin-only cost badge + dismissable quota exceeded banner. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-24 18:34:48 +01:00
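
Commit 915c81b8f1 describes token invalidation through a `token_version` stored on the user and embedded in the JWT as a `tv` claim, with a 401 on mismatch. The sketch below illustrates only that comparison. It is not the repository's code: the in-memory `User` class, `build_claims`, and `check_claims` are hypothetical names, and signature verification and database persistence are left out.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical in-memory stand-in for the real User model."""
    id: str
    token_version: int = 0

    def get_token_version(self) -> int:
        return self.token_version

    def bump_token_version(self) -> int:
        # Called on disable, role change, or password reset so that
        # every previously issued token stops matching.
        self.token_version += 1
        return self.token_version


def build_claims(user: User) -> dict:
    """Claims a login route would embed in the JWT (assumed shape)."""
    return {"sub": user.id, "tv": user.get_token_version()}


def check_claims(claims: dict, user: User) -> int:
    """Return the HTTP status the auth check would produce."""
    if claims.get("tv") != user.get_token_version():
        return 401  # stale token: version was bumped after issuance
    return 200


user = User(id="u1")
old_claims = build_claims(user)
assert check_claims(old_claims, user) == 200

user.bump_token_version()  # e.g. an admin disables the account
assert check_claims(old_claims, user) == 401
assert check_claims(build_claims(user), user) == 200
```

Storing a single integer per user keeps revocation cheap: no token blocklist is needed, at the cost of one user lookup per authenticated request.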

5 commits
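
The estimation logic described in commit d0ad8e67be can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `backfill_usage.py`: the message field names (`focus_group_id`, `created_at`, `kind`, `text`), the function name `estimate_usage`, and the use of `round()` are hypothetical. Only the 3.8 characters-per-token ratio and the three base sizes come from the commit message.

```python
from collections import defaultdict

CHARS_PER_TOKEN = 3.8  # heuristic quoted in the commit message

# Base prompt sizes in tokens, as listed in the commit message.
BASE_PROMPT_TOKENS = {
    "persona_response": 1500,
    "moderator": 1200,
    "persona_generate": 2500,
}


def estimate_usage(messages):
    """Estimate prompt/completion tokens per message.

    `messages` is an iterable of dicts with assumed keys:
    focus_group_id, created_at, kind, text.
    """
    by_group = defaultdict(list)
    for msg in messages:
        by_group[msg["focus_group_id"]].append(msg)

    events = []
    for fg_id, msgs in by_group.items():
        context_chars = 0  # characters of all prior turns in this group
        # Process chronologically so turn N sees turns 0..N-1 as context.
        for msg in sorted(msgs, key=lambda m: m["created_at"]):
            prompt = BASE_PROMPT_TOKENS[msg["kind"]] + round(
                context_chars / CHARS_PER_TOKEN
            )
            completion = round(len(msg["text"]) / CHARS_PER_TOKEN)
            events.append(
                {
                    "focus_group_id": fg_id,
                    "prompt_tokens": prompt,
                    "completion_tokens": completion,
                }
            )
            context_chars += len(msg["text"])
    return events


sample = [
    # Deliberately out of order to show the chronological sort.
    {"focus_group_id": "fg1", "created_at": 2, "kind": "persona_response", "text": "b" * 76},
    {"focus_group_id": "fg1", "created_at": 1, "kind": "moderator", "text": "a" * 38},
]
for event in estimate_usage(sample):
    print(event)
```

In this sample the moderator turn is charged only its base prompt, while the persona turn that follows is charged its base plus the moderator's text, which is the growth with every turn that the commit message describes.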