- Add WARNING log when usage_metadata/usage is None so zero-cost events
are visible in logs instead of silently disappearing
- Capture thoughts_token_count from Gemini thinking models into reasoning field
(already included in candidates_token_count for billing, now also tracked separately)
- Add same warning for OpenAI missing usage object
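The usage handling described above could look roughly like the sketch below. The function name extract_gemini_usage and the keys of the returned dict are illustrative assumptions, not names from the codebase; only the usage_metadata, prompt_token_count, candidates_token_count, and thoughts_token_count attribute names come from the Gemini response shape.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("llm_usage")


def extract_gemini_usage(response, model):
    """Return token counts from a Gemini-style response, warning when usage is absent."""
    usage = getattr(response, "usage_metadata", None)
    if usage is None:
        # Make the zero-cost event visible in logs instead of dropping it silently.
        logger.warning("No usage_metadata for model %s; recording zero usage", model)
        return {"input": 0, "output": 0, "reasoning": 0}
    return {
        "input": getattr(usage, "prompt_token_count", 0) or 0,
        "output": getattr(usage, "candidates_token_count", 0) or 0,
        # Thinking tokens are kept in their own field for reporting.
        "reasoning": getattr(usage, "thoughts_token_count", 0) or 0,
    }


# Stand-in response object (assumption: attribute access mirrors the SDK response).
fake = SimpleNamespace(
    usage_metadata=SimpleNamespace(
        prompt_token_count=10, candidates_token_count=20, thoughts_token_count=5
    )
)
counts = extract_gemini_usage(fake, "example-model")
```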
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend:
- @active_required + @with_user_context applied to all LLM-invoking routes
in personas.py, focus_group_ai.py, ai_personas.py
- backend/app/routes/usage.py: GET /api/usage/me (MTD summary by feature),
GET /api/usage/focus-groups/<id> (owner or admin)
- Registered usage_bp in app/__init__.py
- llm_service._record_usage now emits usage_update WS event to focus group room
Frontend:
- useMyUsage + useFocusGroupUsage hooks
- MyUsage.tsx: personal billing dashboard (cost cards + per-feature table)
- /billing route (ProtectedRoute) + Billing nav link
- FocusGroupSession: quota_warning amber banner with Progress bar,
quota_exceeded + quota_warning WS events wired via websocketServiceNew
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend:
- token_version in JWT (bump_token_version, get_token_version on User model);
jwt_required checks tv claim → 401 on mismatch; login routes embed version
- Quota pre-flight in all 3 LLM public methods (QuotaExceededError bubbles up)
- AI runner catches QuotaExceededError → sets status paused_quota + emits WS event
- Admin routes: POST /users (create), POST /users/<id>/reset-password,
POST /pricing, GET /focus-groups with aggregated cost; PUT /users/<id>
now bumps token_version on disable or role change
- backfill_usage.py: idempotent estimated-event generator for historical data,
tiktoken for GPT models, char/3.8 for Gemini, --dry-run flag
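A minimal sketch of the token_version check from the first Backend item, assuming an in-memory User stand-in and plain dict claims rather than a real JWT library. Only the names bump_token_version, token_version, and the tv claim come from the commit; issue_claims and is_token_current are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical stand-in for the User model."""
    id: str
    token_version: int = 0

    def bump_token_version(self):
        # Called on disable or role change so older tokens stop validating.
        self.token_version += 1
        return self.token_version


def issue_claims(user):
    # Login embeds the current version in the token as the "tv" claim.
    return {"sub": user.id, "tv": user.token_version}


def is_token_current(claims, user):
    # A mismatch means the token predates a bump; the route would answer 401.
    return claims.get("tv") == user.token_version


user = User(id="u1")
claims = issue_claims(user)
valid_before = is_token_current(claims, user)
user.bump_token_version()
valid_after = is_token_current(claims, user)
```

The appeal of this design is that revocation needs no token blocklist: a single integer on the user record invalidates every token issued before the bump.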
Frontend:
- 402 interceptor dispatches quota_exceeded CustomEvent
- adminApi: createUser, resetPassword, createPricing, listFocusGroups
- UsersTab: New User dialog + Reset Password in edit dialog
- PricingTab: New Price dialog (model, provider, input/output/cached prices)
- FocusGroupsTab: focus groups table sorted by total cost
- Admin.tsx: 4th tab (Focus Groups)
- FocusGroupSession: admin-only cost badge + dismissable quota exceeded banner
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The autonomous loop was crashing on every decision with:
TypeError: can't subtract offset-naive and offset-aware datetimes
because MongoDB stores created_at without timezone info but the code
compared it against datetime.now(timezone.utc).
- conversation_context_service: make created_at timezone-aware before
subtraction (created_at.replace(tzinfo=timezone.utc) when it is naive)
- DiscussionPanel: fix sync effect — when server reports AI mode is
inactive, always clear localAiModeActive regardless of its value,
so the "AI is generating..." spinner doesn't get stuck when the
backend fails/stops before the frontend has confirmed AI mode started
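The created_at fix in conversation_context_service can be illustrated with the sketch below. The helper name ensure_utc is hypothetical, and the sketch assumes the naive values read from MongoDB represent UTC.

```python
from datetime import datetime, timezone


def ensure_utc(dt):
    """Attach UTC to a naive datetime; leave aware datetimes unchanged."""
    if dt.tzinfo is None:
        # Assumption: naive values from the database are already in UTC.
        return dt.replace(tzinfo=timezone.utc)
    return dt


created_at = datetime(2025, 1, 1, 12, 0, 0)  # naive, as read from the database
now = datetime(2025, 1, 1, 12, 5, 0, tzinfo=timezone.utc)

# Without ensure_utc() this subtraction raises the TypeError quoted above.
age = now - ensure_utc(created_at)
print(age.total_seconds())  # 300.0
```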
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The autonomous conversation loop could hang indefinitely because
self.response_timeout=30 was defined but never used in wait_for().
- autonomous_conversation_controller: wrap generate_persona_response()
in asyncio.wait_for() with a 120 s timeout (the unused 30 s value was
too short for production LLMs); on TimeoutError, return an error dict
so the loop can continue or count the turn toward the
consecutive_silence limit
- conversation_decision_service: add asyncio.wait_for(timeout=60s)
around LLMService.generate_content() for the decision call; add
asyncio import and explicit TimeoutError handling
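The timeout pattern used in both services might be sketched as follows. call_with_timeout and the two stand-in coroutines are hypothetical, and the shape of the error dict is an assumption.

```python
import asyncio


async def call_with_timeout(coro, timeout_s):
    """Await coro, returning an error dict instead of hanging past timeout_s."""
    try:
        result = await asyncio.wait_for(coro, timeout=timeout_s)
        return {"ok": True, "result": result}
    except asyncio.TimeoutError:
        # wait_for cancels the inner task before raising, so nothing is left running.
        return {"ok": False, "error": "timeout", "timeout_s": timeout_s}


async def slow_llm_call():
    await asyncio.sleep(1)  # stand-in for a stalled provider call
    return "response"


async def fast_llm_call():
    return "response"


timed_out = asyncio.run(call_with_timeout(slow_llm_call(), 0.05))
completed = asyncio.run(call_with_timeout(fast_llm_call(), 0.05))
```

Returning a dict rather than re-raising lets the caller decide whether a timeout counts as silence or ends the loop.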
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
google-genai SDK uses aiohttp when it's available in the environment
(installed via llama-index-core), causing AssertionError (connector is None)
on async requests. Pass httpx_async_client in HttpOptions to bypass aiohttp.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix missing await on FocusGroup.get_messages() (N-L1)
- Replace time.sleep with asyncio.sleep in key_theme_service and focus_group_service (N-P10)
- Replace flask import with quart in focus_groups.py (N-S3)
- Add logger.error before all 500 returns in focus_groups.py (N-P6)
- Add logging to silent except blocks across routes (N-M10, N-M11)
- Add @rate_limit to 6 remaining AI endpoints (N-H4)
- Add --confirm flag to populate scripts before delete_many (S-H2)
- Remove hardcoded Azure ID fallbacks from msal_service.py and msalConfig.ts (A-M2, F-H4)
- Centralize make_serializable() in utils.py, remove duplicates from 3 route files (N-P7)
- Replace all datetime.utcnow() with datetime.now(timezone.utc) across entire backend (M-L2)
- AuthContext.tsx: only mark token validated on 200 success, not on non-401 errors (F-H2)
- Rename authType → auth_type in auth.py (N-S4)
- Add security_report.md and security_report.pdf with full 92-finding status
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Focus groups created before the gpt-5.2 rename have llm_model='gpt-5'
stored in MongoDB. Without an alias, the backend falls through to the
Gemini provider and fails with an aiohttp AssertionError.
Adds MODEL_ALIASES mapping and _resolve_model() helper so gpt-5 is
transparently resolved to gpt-5.2. Also updates all llm_model checks
to accept both values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Swap model ID from gpt-5 to gpt-5.2 across all backend services,
frontend components, and documentation. Change default reasoning
effort from medium to low for faster responses.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Stage 2 (detailed persona generation) was ignoring the audience brief and
research objective, causing the LLM to guess research context from demographics
alone. Now passes both values through to generate_persona() in all three
endpoints (generate-personas-full, complete-and-save-persona, complete-persona)
and auto-generates prompt customization via customize_persona_prompt() when
they are provided.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
logger is not defined at module level where get_gemini_client() lives.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This will help identify where exactly the AssertionError is occurring
in the google-genai SDK and what version is installed on the server.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Log full exception details: type, module, str, repr, args, and __dict__
to diagnose why Gemini errors are producing empty messages.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Catch genai_errors.APIError specifically and extract e.code and e.message
attributes for proper error logging. The generic str(e) was returning empty
strings for Google API errors, making debugging impossible.
- Import google.genai.errors for specific exception handling
- Add APIError catch before generic Exception in generate_content()
- Add APIError catch before generic Exception in generate_contextual_response()
- Categorize errors by HTTP code for retry logic (429 and 500+ are retryable)
- Replace blocking time.sleep with await asyncio.sleep in the contextual response handler
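The retry categorization and non-blocking backoff listed above could be sketched like this. FakeAPIError is a hypothetical stand-in for an API error exposing code and message, and is_retryable, call_with_retry, and the delay values are assumptions for illustration.

```python
import asyncio


class FakeAPIError(Exception):
    """Hypothetical stand-in for an API error carrying code and message."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code
        self.message = message


def is_retryable(code):
    # Rate limits (429) and server errors (500 and above) are worth retrying.
    return code is not None and (code == 429 or code >= 500)


async def call_with_retry(fn, max_attempts=3, base_delay=0.01):
    for attempt in range(1, max_attempts + 1):
        try:
            return await fn()
        except FakeAPIError as e:
            if not is_retryable(e.code) or attempt == max_attempts:
                raise
            # await asyncio.sleep yields to the event loop; time.sleep would block it.
            await asyncio.sleep(base_delay * attempt)


calls = {"n": 0}


async def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeAPIError(503, "unavailable")
    return "ok"


outcome = asyncio.run(call_with_retry(flaky))
```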
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The previous event loop tracking approach still caused issues: when a cached
client is replaced, its garbage collection triggers aclose(), which tries to
close the aiohttp session on the wrong event loop.
Simplest fix: create a fresh client for each call. The overhead is minimal
compared to the actual LLM API call, and this completely avoids all event
loop mismatch issues in ASGI environments.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The previous lazy initialization fix wasn't sufficient: genai.Client
internally caches async structures bound to the event loop that was running
when it was created. With ASGI servers like Hypercorn, subsequent requests may
run on a different event loop, causing "Future attached to a different loop"
errors.
Now tracks which event loop the client was created on and recreates it if
the loop has changed.
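The loop-tracking idea can be sketched as below. make_client() is a hypothetical factory standing in for the real SDK constructor, and the module-level variable names are assumptions.

```python
import asyncio

_client = None
_client_loop = None


def make_client():
    """Hypothetical factory; the real code would construct the SDK client here."""
    return object()


def get_client():
    """Return a client bound to the running loop, recreating it if the loop changed."""
    global _client, _client_loop
    loop = asyncio.get_running_loop()
    if _client is None or _client_loop is not loop:
        _client = make_client()
        _client_loop = loop
    return _client


async def grab_twice():
    return get_client(), get_client()


# Each asyncio.run() call creates a fresh event loop.
first_a, first_b = asyncio.run(grab_twice())
second_a, _ = asyncio.run(grab_twice())
```

Within one loop the cached client is reused; a new loop yields a new client.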
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
These files are already in .gitignore but were committed previously.
Removing them from tracking to prevent future conflicts.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The genai.Client and AsyncOpenAI clients were being created at module
import time, before the Quart/Hypercorn event loop existed. This caused
"Future attached to a different loop" errors when async calls were made,
resulting in autonomous focus group conversations stopping with
"excessive_silence".
Changed to lazy initialization: clients are now created on first use
within the running event loop context via the get_gemini_client() and
get_openai_client() helper functions.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Create focus_group_summary_service.py to generate concise summaries
- Add prompt template for summary generation
- Integrate summary generation after discussion guide creation
- Display summary under focus group title in list view with fallback to description
Co-Authored-By: Claude <noreply@anthropic.com>
- Replace suggestions panel with auto-apply: enhanced text now populates form fields automatically
- Add modal showing assumptions made during enhancement (3-5 bullet points)
- Reduce prompt scope by 50% for more focused, impactful enhancements
- Update backend to return enhanced text + assumptions instead of suggestions array
Co-Authored-By: Claude <noreply@anthropic.com>
- Upgraded google-genai package from 1.31.0 to 1.52.0
- Updated DEFAULT_MODEL in llm_service.py to gemini-3-pro-preview
- Updated all backend routes, services, and models with new model string
- Updated all frontend components with new model string and display labels
- Updated CLAUDE.md documentation
Co-Authored-By: Claude <noreply@anthropic.com>
- Persona generation now retries up to 2 times on failure
- Missing persona fields are filled in based on context
- Expanded required field validation from 5 to 9 fields
- Prompt now explicitly lists all required fields with validation instructions
- Fixed discussion guide cancellation check to handle object responses
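A rough sketch of the retry-plus-validation flow, under stated assumptions: REQUIRED_FIELDS here is a hypothetical three-field subset (the commit validates nine fields, which are not listed), and generate_with_retries and missing_fields are illustrative names.

```python
REQUIRED_FIELDS = ("name", "age", "occupation")  # hypothetical subset


def missing_fields(persona):
    # Empty or absent values both count as missing.
    return [f for f in REQUIRED_FIELDS if not persona.get(f)]


def generate_with_retries(generate, max_retries=2):
    """Call generate() once plus up to max_retries retries; report missing fields."""
    last_error = None
    for _ in range(max_retries + 1):
        try:
            persona = generate()
        except Exception as exc:
            last_error = exc
            continue
        return persona, missing_fields(persona)
    raise RuntimeError(f"generation failed after {max_retries + 1} attempts") from last_error


attempts = {"n": 0}


def flaky_generate():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ValueError("malformed LLM output")
    return {"name": "Ada", "age": 34}


persona, missing = generate_with_retries(flaky_generate)
```

The returned list of missing fields is what a completion step would then fill in.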
Co-Authored-By: Claude <noreply@anthropic.com>