semblance-dev/backend/app/services
Vadym Samoilenko 3e9ccafad2 Add LLM usage tracking infrastructure (Phases A-C)
- Model renames: gpt-5.2 → gpt-5.4-2026-03-05, gemini-3-pro-preview → gemini-3.1-pro-preview; retire gpt-4.1 via alias fallback
- New: llm_usage_context.py (ContextVar-based attribution), model_pricing.py (tiered pricing + 60s cache), usage_event.py (append-only telemetry), quota.py (per-user and per-focus-group quota enforcement with a warning at 80%)
- Wire _record_usage into all 3 LLM methods; set_llm_context at every service entry point
- Fix admin_required decorator (was sync, never awaited User.find_by_id); add active_required and with_user_context decorators
- Inject user_id into ContextVar from JWT on every authenticated request
- Add DB indexes for usage_events, model_pricing, users collections
- Add seed script for model pricing (gpt-5.4 single-tier, gemini-3.1 two-tier with a 200k threshold)
- Fix parse_json_response NameError (logger undefined at module level)
- 70 passing tests: conftest.py with sys.modules stubs, test_usage_infrastructure.py (52 tests), rewrite stale test_llm_service.py (18 tests)
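
The ContextVar-based attribution mentioned in the list above can be sketched roughly as follows. This is a minimal illustration, not the contents of llm_usage_context.py: the function signatures, the `feature` field, the in-memory `usage_events` list, and the token counts are all assumptions made for the example. It shows why a ContextVar suits this job: each asyncio task gets its own copy of the context, so concurrent requests do not overwrite each other's attribution.

```python
import asyncio
from contextvars import ContextVar
from typing import Dict, List, Optional

# Hypothetical stand-in for llm_usage_context.py; real names and fields may differ.
_llm_context: ContextVar[Optional[Dict[str, str]]] = ContextVar(
    "llm_context", default=None
)

# Stand-in for the append-only usage store (the real one is presumably a DB collection).
usage_events: List[Dict[str, object]] = []


def set_llm_context(user_id: str, feature: str) -> None:
    """Attach attribution data to the current task's context."""
    _llm_context.set({"user_id": user_id, "feature": feature})


def get_llm_context() -> Dict[str, Optional[str]]:
    """Return the current attribution, or empty values if none was set."""
    return _llm_context.get() or {"user_id": None, "feature": None}


def record_usage(model: str, prompt_tokens: int, completion_tokens: int) -> None:
    """Append one usage event, attributed via the ContextVar (never mutated later)."""
    ctx = get_llm_context()
    usage_events.append(
        {
            "model": model,
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "user_id": ctx["user_id"],
            "feature": ctx["feature"],
        }
    )


async def handle_request(user_id: str, feature: str) -> None:
    """Simulate a service entry point that sets context, then calls the LLM."""
    set_llm_context(user_id, feature)
    await asyncio.sleep(0)  # yield so the two requests interleave
    record_usage("gpt-5.4-2026-03-05", prompt_tokens=100, completion_tokens=20)


async def main() -> None:
    # gather() wraps each coroutine in its own task with a copied context.
    await asyncio.gather(
        handle_request("u1", "moderator"),
        handle_request("u2", "persona"),
    )


if __name__ == "__main__":
    asyncio.run(main())
    for event in usage_events:
        print(event["user_id"], event["feature"], event["model"])
```

Even though both requests interleave at the `await`, each event keeps the user and feature set by its own task, and the context outside the tasks stays unset.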

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 18:08:27 +01:00
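
The admin_required fix described in the commit message (a sync decorator that never awaited User.find_by_id) can be illustrated with a small sketch. Everything here is hypothetical: the `User` class is an in-memory stub, the `role` field and the `user_id` argument are assumptions, and the real decorator presumably reads the user from the request rather than from a parameter. The point is only that the wrapper itself must be `async` so the lookup can be awaited; in a sync wrapper the call would return a coroutine object, which is always truthy, so a check like `if not user` could never reject anyone.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable, Dict, Optional


class User:
    """In-memory stub standing in for the real async user model."""

    _db: Dict[str, Dict[str, str]] = {
        "1": {"role": "admin"},
        "2": {"role": "member"},
    }

    @staticmethod
    async def find_by_id(user_id: str) -> Optional[Dict[str, str]]:
        await asyncio.sleep(0)  # simulate an async database round trip
        return User._db.get(user_id)


def admin_required(
    handler: Callable[..., Awaitable[Any]]
) -> Callable[..., Awaitable[Any]]:
    """Reject the call unless the user exists and has the admin role."""

    @functools.wraps(handler)
    async def wrapper(user_id: str, *args: Any, **kwargs: Any) -> Any:
        # Awaiting here is the fix: without it, `user` would be a coroutine object.
        user = await User.find_by_id(user_id)
        if user is None or user.get("role") != "admin":
            raise PermissionError("admin access required")
        return await handler(user_id, *args, **kwargs)

    return wrapper


@admin_required
async def usage_report(user_id: str) -> str:
    """Hypothetical admin-only endpoint."""
    return f"report for {user_id}"


if __name__ == "__main__":
    print(asyncio.run(usage_report("1")))
    try:
        asyncio.run(usage_report("2"))
    except PermissionError as exc:
        print("rejected:", exc)
```

The same pattern would apply to the active_required and with_user_context decorators, assuming they also perform async lookups.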
File | Last commit | Date
ai_moderator_service.py | Add LLM usage tracking infrastructure (Phases A-C) | 2026-04-24 18:08:27 +01:00
ai_persona_service.py | Add LLM usage tracking infrastructure (Phases A-C) | 2026-04-24 18:08:27 +01:00
ai_runner_service.py | changed permissions | 2025-12-19 19:26:16 +00:00
autonomous_conversation_controller.py | Fix AI loop hanging: add asyncio.wait_for timeouts on LLM calls | 2026-03-23 19:17:36 +00:00
bulk_persona_export_service.py | Migrate task result delivery from WebSocket to HTTP polling | 2026-03-23 16:46:58 +00:00
conversation_context_service.py | Fix root cause: naive vs aware datetime crash + stuck AI mode indicator | 2026-03-23 19:30:04 +00:00
conversation_decision_service.py | Add LLM usage tracking infrastructure (Phases A-C) | 2026-04-24 18:08:27 +01:00
conversation_state_manager.py | Apply Jintech security audit remediation (sprint 3) - 87/92 findings fixed | 2026-03-20 12:51:18 +00:00
customer_data_service.py | changed permissions | 2025-12-19 19:26:16 +00:00
focus_group_response_service.py | Add LLM usage tracking infrastructure (Phases A-C) | 2026-04-24 18:08:27 +01:00
focus_group_service.py | Apply Jintech security audit remediation (sprint 3) - 87/92 findings fixed | 2026-03-20 12:51:18 +00:00
focus_group_summary_service.py | Add LLM usage tracking infrastructure (Phases A-C) | 2026-04-24 18:08:27 +01:00
image_description_service.py | Allow document uploads (PDF, DOCX, TXT, etc.) as focus group assets | 2026-03-23 17:08:30 +00:00
key_theme_service.py | Add LLM usage tracking infrastructure (Phases A-C) | 2026-04-24 18:08:27 +01:00
llm_service.py | Add LLM usage tracking infrastructure (Phases A-C) | 2026-04-24 18:08:27 +01:00
llm_usage_context.py | Add LLM usage tracking infrastructure (Phases A-C) | 2026-04-24 18:08:27 +01:00
msal_service.py | Apply Jintech security audit remediation (sprint 3) - 87/92 findings fixed | 2026-03-20 12:51:18 +00:00
persona_export_service.py | changed permissions | 2025-12-19 19:26:16 +00:00
persona_modification_service.py | Add LLM usage tracking infrastructure (Phases A-C) | 2026-04-24 18:08:27 +01:00
task_manager.py | Migrate task result delivery from WebSocket to HTTP polling | 2026-03-23 16:46:58 +00:00