On server restart, stale active jobs are automatically resumed rather
than failed. Docs already parsed in a prior run are skipped (resume from
cache); docs stuck at 'parsing' are reset to 'pending' and re-parsed.
- Repository: add get_all_stale_active_jobs() and reset_stuck_parsing_docs()
- Service: skip already-parsed docs in _parse_doc(), reset stuck docs on start
- Main: recover stale jobs via asyncio.create_task() in lifespan startup
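A minimal sketch of the recovery rule described above, assuming an
in-memory stand-in for the repository. The Doc and Job classes and
docs_to_parse() are hypothetical; only the status names and the
reset_stuck_parsing_docs() name come from this message.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    doc_id: str
    status: str  # 'pending', 'parsing', or 'parsed'


@dataclass
class Job:
    job_id: str
    docs: List[Doc] = field(default_factory=list)


def reset_stuck_parsing_docs(job: Job) -> int:
    """Reset docs left at 'parsing' back to 'pending'; return how many changed."""
    stuck = [d for d in job.docs if d.status == "parsing"]
    for doc in stuck:
        doc.status = "pending"
    return len(stuck)


def docs_to_parse(job: Job) -> List[str]:
    """Docs already 'parsed' are skipped; everything else is parsed again."""
    return [d.doc_id for d in job.docs if d.status != "parsed"]


# A job interrupted mid-run: one doc done, one stuck, one not started.
job = Job("job-1", [Doc("a", "parsed"), Doc("b", "parsing"), Doc("c", "pending")])
reset_count = reset_stuck_parsing_docs(job)
remaining = docs_to_parse(job)
```

Only the stuck and pending docs are parsed again on resume; the parsed
doc is left alone.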
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
knowledge_base_service and analysis_service were local variables inside
the lifespan() function — not module-level exports. Importing them via
'from app.main import ...' always failed with ImportError → 500.
Use request.app.state (same pattern as analysis_routes.py) instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Updates all display labels (PDF report, campaign page, Knowledge Base card, analytics, status dashboard, checks overview) and aligns internal agent name in backend. Adds migration 010 to update the knowledge base display_name in production DB.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
legalAgentReview, brandAgentReview, channelBestPracticesAgentReview,
channelTechSpecsAgentReview, and leadAgentSummary are the actual keys
stored in the JSONB column. The export was reading legalAgent,
brandAgent, channelAgent.*, and leadAgent.summary, which do not exist
there and were causing empty CSV values.
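A hedged sketch of reading the statuses by the real keys. The key names
are from this message; the "ragStatus" field inside each entry and the
output column names are assumptions made for illustration.

```python
from typing import Dict, Optional

# Output column (hypothetical) -> key actually stored in the JSONB column.
AGENT_REVIEW_KEYS = {
    "legal": "legalAgentReview",
    "brand": "brandAgentReview",
    "channel_best_practices": "channelBestPracticesAgentReview",
    "channel_tech_specs": "channelTechSpecsAgentReview",
    "lead": "leadAgentSummary",
}


def extract_rag_statuses(agent_review: Optional[dict]) -> Dict[str, str]:
    """Return one status per agent, or '' when the key or field is absent."""
    review = agent_review or {}
    statuses = {}
    for column, key in AGENT_REVIEW_KEYS.items():
        entry = review.get(key)
        # Assumption: each entry is an object carrying a 'ragStatus' field.
        entry = entry if isinstance(entry, dict) else {}
        statuses[column] = entry.get("ragStatus", "")
    return statuses


row = extract_rag_statuses(
    {
        "legalAgentReview": {"ragStatus": "Red"},
        "legalAgent": {"ragStatus": "Green"},  # wrong key, ignored
    }
)
```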
Adds a server-side CSV export covering all campaign, proof, and version
data including agent RAG statuses. The export respects the active agency
filter so oversight admins can scope the download to a single agency.
- backend: `CampaignRepository.get_export_rows()` — flat join across
Campaign → Proof → ProofVersion with Agency and User, extracts agent
RAG statuses from the `agent_review` JSONB column
- backend: `GET /api/export/campaigns-csv` endpoint gated to
super_admin / oversight_admin, streams a dated CSV file
- frontend: `apiService.downloadCampaignsCsv(agencyId?)` — fetches blob
and triggers browser download
- frontend: threads `selectedAgencyId` prop from App → Campaigns →
CampaignList so the export uses the active filter
- frontend: Export CSV button in CampaignList header, visible only to
super_admin / oversight_admin, with spinner while downloading
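For illustration, the server-side part could look roughly like this
using only the standard library. The row shape, the filename pattern,
and the helper names are assumptions; the real endpoint streams rows
returned by `CampaignRepository.get_export_rows()`.

```python
import csv
import io
from datetime import date
from typing import List, Optional


def filter_by_agency(rows: List[dict], agency_id: Optional[str]) -> List[dict]:
    """No agency selected means no filter; otherwise keep matching rows only."""
    if agency_id is None:
        return rows
    return [r for r in rows if r.get("agency_id") == agency_id]


def rows_to_csv(rows: List[dict], fieldnames: List[str]) -> str:
    """Serialise flat rows to CSV text, ignoring keys that are not exported."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def export_filename(today: date) -> str:
    """Hypothetical dated filename for the Content-Disposition header."""
    return f"campaigns-export-{today.isoformat()}.csv"


rows = [
    {"campaign": "Spring", "agency_id": "a1", "legal": "Green"},
    {"campaign": "Autumn", "agency_id": "a2", "legal": "Red"},
]
csv_text = rows_to_csv(filter_by_agency(rows, "a1"), ["campaign", "legal"])
```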
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Now that REST polling removes the 30s GCP LB timeout constraint,
gemini-3.1-pro-preview is restored as primary and gemini-3-flash-preview
is used only when Pro fails or times out.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
POST /api/analyze submits an analysis job and returns job_id instantly.
GET /api/analyze/{job_id} returns progress + result; frontend polls every 2s.
Analysis runs as asyncio.create_task in the background — each HTTP request
completes in milliseconds, well within the 30s GCP Load Balancer limit.
- Add backend/app/services/job_store.py: in-memory AnalysisJob store with
30-min TTL cleanup
- Add backend/app/api/analysis_routes.py: POST + GET /api/analyze endpoints
with full analysis pipeline (hash check, DB persistence, PDF pages, etc.)
- Remove backend/app/websocket/: handlers.py, manager.py, __init__.py
- Update backend/app/main.py: wire analysis_router, store analysis_service
in app.state, drop all WebSocket imports and endpoint
- Update frontend/services/geminiService.ts: replace WS with fetch+poll;
function signatures unchanged so App.tsx / WIPReviewer.tsx need no edits
- Remove VITE_BACKEND_WS_URL from vite.config.ts, deploy.sh, .env.deploy.example
- Update cloudrun.yaml: remove WebSocket-specific session affinity annotation
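A rough sketch of the submit/poll pattern, not the actual job_store.py.
AnalysisJob, the 30-min TTL, and asyncio.create_task come from this
message; the JobStore class, its method names, and the job fields are
assumptions.

```python
import asyncio
import time
import uuid
from dataclasses import dataclass, field
from typing import Optional

JOB_TTL_SECONDS = 30 * 60


@dataclass
class AnalysisJob:
    job_id: str
    status: str = "running"
    result: Optional[dict] = None
    created_at: float = field(default_factory=time.monotonic)


class JobStore:
    def __init__(self) -> None:
        self._jobs = {}
        self._tasks = set()  # keep references so tasks are not garbage collected

    def submit(self, work) -> str:
        """Start `work` (an async callable) in the background; return at once."""
        job = AnalysisJob(job_id=uuid.uuid4().hex)
        self._jobs[job.job_id] = job
        task = asyncio.create_task(self._run(job, work))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return job.job_id

    async def _run(self, job: AnalysisJob, work) -> None:
        try:
            job.result = await work()
            job.status = "done"
        except Exception as exc:
            job.result = {"error": str(exc)}
            job.status = "failed"

    def get(self, job_id: str) -> Optional[AnalysisJob]:
        return self._jobs.get(job_id)

    def cleanup(self, now: Optional[float] = None) -> int:
        """Drop jobs older than the TTL; return how many were removed."""
        now = time.monotonic() if now is None else now
        expired = [k for k, j in self._jobs.items() if now - j.created_at > JOB_TTL_SECONDS]
        for key in expired:
            del self._jobs[key]
        return len(expired)


async def demo():
    store = JobStore()

    async def fake_analysis():
        await asyncio.sleep(0.01)
        return {"rag": "Green"}

    job_id = store.submit(fake_analysis)   # what POST would do
    first_status = store.get(job_id).status
    while store.get(job_id).status == "running":  # what the polling GET sees
        await asyncio.sleep(0.005)
    return first_status, store.get(job_id), store


first_status, finished, store = asyncio.run(demo())
```

Being in-memory, a store like this loses its jobs on restart and is not
shared between instances.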
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gemini-3.1-pro-preview takes ~25s per call, hitting the GCP load
balancer's 30s hard timeout before analysis completes. Flash model
returns in ~5-8s, fitting comfortably within the limit. Pro model
kept as fallback.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Frontend now sends a client→server ping every 15s during analysis so the
GCP LB idle timer is reset from both directions. Backend responds
with pong. Previously only server→client heartbeats were sent, which
didn't reset the proxy's client-side idle timer.
Also updates favicon to Oliver brand mark (gold M).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Upstream SSL terminator closes idle WS connections at ~26s. The heartbeat
at T+25 was racing with the close. A 10s interval keeps the connection
alive through any proxy with up to ~20s idle timeout.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add 25s heartbeat ping from backend to prevent Apache/proxy idle-timeout
killing the connection during 1-3 min analysis runs
- Handle heartbeat silently in both analyzeProof and analyzeWIPProof frontend handlers
- Run PDF rasterization via asyncio.to_thread so heartbeats aren't blocked
- Wrap analyze_proof with asyncio.wait_for(timeout=300) for a hard 5-min cap
- Log dropped send_message calls in ConnectionManager instead of swallowing silently
- cloudrun.yaml: add sessionAffinity, startup probe, raise containerConcurrency 4→10,
document DISABLE_AUTH option
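The heartbeat, to_thread, and wait_for items above fit together roughly
as follows. This is a sketch under assumptions: send() stands in for the
WebSocket send, rasterize_pdf_blocking() for the real rasterization, and
the short intervals are chosen only so the demo finishes quickly.

```python
import asyncio
import time


async def heartbeat(send, interval: float) -> None:
    """Send a heartbeat message every `interval` seconds until cancelled."""
    while True:
        await asyncio.sleep(interval)
        await send({"type": "heartbeat"})


def rasterize_pdf_blocking() -> str:
    """Placeholder for CPU/IO-bound work that would block the event loop."""
    time.sleep(0.1)
    return "pages"


async def analyze_with_heartbeat(send, interval: float = 0.01, hard_cap: float = 300):
    beat = asyncio.create_task(heartbeat(send, interval))
    try:
        # to_thread keeps the loop free so heartbeats keep flowing;
        # wait_for enforces the hard cap on the whole step.
        return await asyncio.wait_for(
            asyncio.to_thread(rasterize_pdf_blocking), timeout=hard_cap
        )
    finally:
        beat.cancel()
        await asyncio.gather(beat, return_exceptions=True)


async def demo():
    sent = []

    async def send(message: dict) -> None:
        sent.append(message)

    result = await analyze_with_heartbeat(send)
    return result, sent


result, sent = asyncio.run(demo())
```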
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- knowledge_base_service.py: wrap Gemini distillation call in try/except
to fall back to fallback_client/fallback_model if primary times out,
matching the fallback behaviour in GeminiService._generate_content()
- models.py: fix SpecVersion.source_document_ids ORM type annotation from
Mapped[Optional[dict]] to Mapped[Optional[list]] — the field stores a
JSON array of document ID strings, not an object
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace self.gemini.client with self.gemini.primary_client on line 295 of
knowledge_base_service.py. GeminiService only exposes primary_client and
fallback_client — there is no client attribute. This caused all processing
jobs to fail at the distillation step, which is also why Version History
was always blank (no SpecVersion records were ever created).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add LLAMA_CLOUD_BASE_URL config option so the LlamaCloud regional
endpoint can be set without code changes (fixes 401/region errors
on production); pass it through to AsyncLlamaCloud client init
- Document LLAMA_CLOUD_BASE_URL in .env.deploy.example with EU endpoint
- Copy BAR-ModComms-logo-v5.png to frontend/public
- Sidebar: update logo reference v4 → v5
- PDF header: update logo v4 → v5, wrap in black (#000) band for
legibility, remove duplicate "Oliver" wordmark
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add VITE_BASE_PATH support to vite.config.ts so assets resolve correctly under /modcomms/ subpath
- Fix home URL in urlState.ts to use BASE_URL instead of hardcoded '/'
- Fix sidebar logo src to use BASE_URL prefix (Vite doesn't rewrite TSX src attributes)
- Fix Azure AD redirect/logout URIs to include BASE_URL subpath in authConfig.ts and App.tsx
- Add migration 009 to remove Mindshare/Zenith and add Rapp agency
- Update .env.deploy.example with production values for baic.oliver.solutions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Azure AD v1 access tokens (sts.windows.net issuer) use the 'upn' claim
for the user principal name/email, not 'email' or 'preferred_username'.
Add 'upn' as a fallback so email is correctly resolved on login.
Also add debug logging to show which claims are present.
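The claim fallback amounts to something like the following. The claim
names are from this message; the function name and the relative order of
'email' and 'preferred_username' are assumptions.

```python
def resolve_email(claims: dict) -> str:
    """Return the first non-empty claim, trying 'upn' last as the fallback."""
    for name in ("email", "preferred_username", "upn"):
        value = claims.get(name)
        if value:
            return value
    return ""


# A v1 token carries only 'upn'; a v2 token may carry 'preferred_username'.
v1_email = resolve_email({"iss": "https://sts.windows.net/tenant/", "upn": "a@example.com"})
v2_email = resolve_email({"preferred_username": "b@example.com", "upn": "ignored@example.com"})
```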
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a user already exists in the DB, get_or_create_from_azure was
returning early without updating their email from Azure AD claims.
Users created before email sync was in place would permanently show
empty emails in User Management.
Now syncs email from Azure AD claims on each login if the stored
email is empty.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Oversight admins can now create campaigns, upload proofs, and
flag/resolve issues when they have an agency assigned. They retain
all existing cross-agency read access for analytics, auditing, and
user management. Oversight admins without an agency see a read-only
campaigns view.
Changes:
- Add oversight_admin to canWrite permission in UserContext
- Guard readOnly for oversight_admin without agency in App.tsx
- Remove oversight_admin block from require_write_access dependency
- Remove WebSocket oversight_admin upload block in main.py
- Require agency for oversight_admin campaign creation in routes.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend: thread on_fallback callback through analysis chain
(gemini_service → agents → analysis_service → handlers). The handler
sends a 'model_fallback' WebSocket message exactly once per analysis
when the primary model is unavailable.
Frontend: handle 'model_fallback' WS message and show a dismissible
yellow toast at the bottom of the screen with an 8-second auto-dismiss.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
google-genai SDK expects http_options 'timeout' in milliseconds.
Passing 45 (seconds) was interpreted as 45ms, giving a ~1s deadline,
which Google API rejected with 400 INVALID_ARGUMENT
'Manually set deadline 1s is too short. Minimum allowed deadline is 10s.'
Primary: 45_000ms (45s), Fallback: 150_000ms (150s)
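One way to keep the unit explicit is to convert at a single point. The
helper below is hypothetical and only builds the options dict; it does
not import or call the SDK.

```python
PRIMARY_TIMEOUT_SECONDS = 45
FALLBACK_TIMEOUT_SECONDS = 150


def http_options_for(timeout_seconds: int) -> dict:
    """Build http_options with 'timeout' expressed in milliseconds."""
    return {"timeout": timeout_seconds * 1000}


primary_options = http_options_for(PRIMARY_TIMEOUT_SECONDS)
fallback_options = http_options_for(FALLBACK_TIMEOUT_SECONDS)
```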
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
asyncio.wait_for cannot reliably cancel SDK-internal HTTP connections.
Replace with two genai.Client instances — one per model — each configured
with http_options={'timeout': N} so the TCP connection is actually torn
down when the deadline is reached.
Primary model: 45s, Fallback model: 150s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Log analysis showed fallback model responses up to 154s under parallel
load. 60s was too aggressive and would cause false timeouts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Primary model (gemini-3.1-pro-preview): 45s timeout
Fallback model (gemini-3-flash-preview): 60s timeout
Without timeouts, the fallback model under high load would wait
indefinitely, causing analysis to hang for 10+ minutes per file.
asyncio.TimeoutError from the primary model is now handled the same
as other exceptions (falls through to fallback).
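A simplified sketch of that control flow, assuming primary and fallback
are async callables standing in for the two model calls. The function
name is hypothetical and the demo uses tiny timeouts so it runs quickly.

```python
import asyncio


async def generate_with_fallback(
    primary, fallback, primary_timeout: float = 45, fallback_timeout: float = 60
):
    """Try the primary call; on any error or timeout, try the fallback."""
    try:
        return await asyncio.wait_for(primary(), timeout=primary_timeout)
    except Exception:
        # asyncio.TimeoutError lands here too, so a slow primary is treated
        # like any other primary failure.
        return await asyncio.wait_for(fallback(), timeout=fallback_timeout)


async def slow_primary():
    await asyncio.sleep(1)
    return "pro"


async def quick_fallback():
    return "flash"


answer = asyncio.run(
    generate_with_fallback(slow_primary, quick_fallback, primary_timeout=0.01)
)
```

If the fallback also times out, the TimeoutError propagates to the
caller instead of hanging.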
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a client disconnects (navigates away, closes tab) while analysis is
still running, the result send raises RuntimeError "WebSocket is not
connected". Catch this specifically as INFO rather than ERROR, and guard
the fallback send_message in the general Exception handler so it doesn't
raise a second uncaught error.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add selectinload(Campaign.agency) to get_with_proof_counts query so the
agency relationship is eagerly loaded. Without it, accessing campaign.agency.name
in the route triggered a lazy load in an async context, raising
sqlalchemy.exc.MissingGreenlet and returning HTTP 500.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gemini_service.py: if the primary model (gemini-3.1-pro-preview) is
unavailable or returns a permission error, all three call sites now
automatically retry with gemini-3-flash-preview before propagating failure.
cloudrun.yaml: new Cloud Run service definition that ensures stable
WebSocket operation — 10-minute request timeout (vs 60s default),
2 vCPU / 4Gi RAM for PDF rasterisation, min 1 warm instance to prevent
cold-start disconnects, and GEMINI_API_KEY sourced from Secret Manager
so the service can actually reach the Gemini API.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The authenticated user's DB ID was fetched in main.py for a role check
but never forwarded to handle_analyze_message, so Proof.created_by was
always NULL. This caused submitter_name and submitter_agency to resolve
to None on the Errors tab.
Fix: capture current_user_id from the role-check session in main.py,
pass it to handle_analyze_message, and forward it to
add_version_with_review as created_by. Newly submitted proofs will now
have their submitter recorded and visible in all three Auditing tabs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Allow oversight_admin users to view the User Management screen with
read-only access. They can see users, roles, agencies, and change
history but cannot edit roles, assign agencies, or create agencies.
Backend: open GET /users and GET /users/{id}/change-history to
oversight_admin (PUT /users stays super_admin only).
Frontend: add oversight_admin to sidebar nav and context permission,
render static text instead of dropdowns and hide the add-agency form
for non-super-admin users.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New GET /analytics/by-agency endpoint groups review metrics by agency.
The Analytics page now shows a sortable agency performance table with
pass rates, failures, errors, and legal review counts for each agency.
Only visible to super_admin and oversight_admin users. Selected agency
row is highlighted when the AgencyFilterBar is active.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Expire the SQLAlchemy cached user object after flush() so the
subsequent get_by_id() reloads the agency relationship with fresh
data. Previously the identity map returned the same Python object
with the old .agency, causing audit logs to record identical old
and new values on agency changes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Backend: Expose created_by field on CampaignResponse schema and all
response constructors in routes.py
- Frontend API layer: Add created_by to CampaignResponse interface and
createdBy to the frontend campaign converter
- Campaign list: Add column sorting (click headers to toggle asc/desc),
per-column text filter inputs below headers, and a "My Campaigns Only"
toggle that filters to campaigns created by the current user
- Default sort is lastModified descending to match existing behavior
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The selectinload for FlaggedItem.submitter and ResolvedItem.submitter
was not chaining .selectinload(User.agency), so the submitter's agency
was always None in the API response. This caused the "Submit Agency"
column to be empty in the Flags and Resolutions tabs of the Auditing
page.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a user_change_logs table to track all role and agency changes made
to users by super admins. Includes a change history modal in the User
Management screen (clock icon per row) showing timestamped, human-readable
change descriptions with the actor who made each change.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Unassigned (no agency) non-admin users previously saw ALL campaigns due to
a truthiness check that treated None agency_id as "no filter". This was a
security bug — they should see NO campaigns and be blocked from creating them.
Backend: Add _NO_AGENCY sentinel to distinguish "no filter" from "no agency",
add early-returns at all 5 list/analytics endpoints, fix _check_campaign_access
to explicitly reject unassigned users, and block campaign creation with 403.
Frontend: Add isUnassigned boolean to UserContext, show informational empty
state on Campaigns view, and reinforce readOnly for defense-in-depth.
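The sentinel idea in isolation, as a hedged sketch. _NO_AGENCY is named
in this message; the helper functions, the campaign dict shape, and the
choice of which roles skip the filter are assumptions.

```python
_NO_AGENCY = object()  # "user has no agency", distinct from None ("no filter")

CROSS_AGENCY_ROLES = ("super_admin", "oversight_admin")  # assumption


def resolve_agency_filter(role: str, agency_id):
    """Return None for no filter, _NO_AGENCY for unassigned users, else the id."""
    if role in CROSS_AGENCY_ROLES:
        return None
    if agency_id is None:
        return _NO_AGENCY
    return agency_id


def list_campaigns(campaigns: list, agency_filter) -> list:
    if agency_filter is _NO_AGENCY:
        return []  # early return: unassigned users see nothing
    if agency_filter is None:
        return list(campaigns)
    return [c for c in campaigns if c["agency_id"] == agency_filter]


campaigns = [{"name": "Spring", "agency_id": 1}, {"name": "Autumn", "agency_id": 2}]
unassigned_view = list_campaigns(campaigns, resolve_agency_filter("basic_user", None))
admin_view = list_campaigns(campaigns, resolve_agency_filter("super_admin", None))
agency_view = list_campaigns(campaigns, resolve_agency_filter("agency_admin", 2))
```

A plain truthiness check cannot tell the first two cases apart, which is
the bug this change fixes.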
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace single-line bullet format with a structured two-part format
(**Issue:** / **Recommendation:**) in all specialist and lead agent
prompts. Update Gemini response schema description to match. Update
frontend formatFeedbackText and formatFeedbackTextForPDF to parse
**bold** markdown and preserve line breaks within multi-line bullets.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agents now show example corrections in the format they're recommending
(e.g. sentence case examples when recommending sentence case) to avoid
contradictions between advice and examples.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace all "verdict" language in the lead agent prompt with "status/summary"
and add prescriptive opening-line templates so the LLM produces consistent
output without a "Verdict:" prefix.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agents now spell out acronyms in full on first use (e.g. "Web Content
Accessibility Guidelines (WCAG)") for clarity. The instruction covers
common acronyms like WCAG, FSCS, GDE, APR, CTA, FCA, PRA, and T&Cs,
and applies to any acronym encountered in output.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds an IMPORTANT instruction block to all 5 agent prompt templates
(legal, brand, channel best practices, channel tech specs, lead) that
enforces: capitalisation after full stops and in labels, consistent
bullet-point ending style, and "e.g." without a trailing comma.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instructs all five agents (legal, brand, channel best practices,
channel tech specs, lead) to prefer simple vocabulary over complex
alternatives (e.g. "add" over "incorporate", "about" over "regarding").
Also fixes "constitute" → "qualify as" in the legal agent prompt itself.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
UAT feedback flagged the use of "violations", "violates", etc. as feeling
accusatory. Replaced all instances with constructive terms ("issues",
"doesn't align with", "doesn't meet") and added an explicit instruction
to all 5 agent prompt templates to avoid this language in output.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added explicit UK English instruction to the Response Format section of
all five agents (legal, brand, channel best practices, channel tech specs,
lead) so output uses spellings like "authorised", "colour", "capitalise".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add CHECK constraint migration for users.role (super_admin, oversight_admin, agency_admin, basic_user)
- Add get_current_db_user dependency resolving Azure claims to User ORM with agency
- Add require_role() factory and require_write_access() dependency
- Auto-promote dev user to super_admin when DISABLE_AUTH=true
- Add /api/me, PUT /api/users/{id}, POST /api/agencies endpoints
- Apply agency-based data filtering on campaigns, analytics, audit routes
- Block oversight_admin from all mutation routes (campaigns, proofs, flags, resolves)
- Restrict dropdown option mutations to super_admin only
- Add role check in WebSocket handler to block oversight_admin from analysis
- Add CurrentUserResponse, UserUpdate, AgencyCreate schemas
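A framework-free sketch of the require_role() factory idea. The role
names and the two dependency names are from this list; raising
PermissionError (where the real dependency would return an HTTP 403) and
the exact set of roles with write access are assumptions.

```python
ROLES = ("super_admin", "oversight_admin", "agency_admin", "basic_user")


def require_role(*allowed: str):
    """Return a checker that accepts only the given roles."""
    unknown = set(allowed) - set(ROLES)
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")

    def check(user_role: str) -> str:
        if user_role not in allowed:
            raise PermissionError(f"role {user_role!r} is not permitted")
        return user_role

    return check


# Assumption: every role except oversight_admin may mutate data.
require_write_access = require_role("super_admin", "agency_admin", "basic_user")
require_super_admin = require_role("super_admin")

writer_role = require_write_access("agency_admin")
```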
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add explicit no-citations instructions to all 5 agent prompts to prevent
Gemini from including page numbers, document names, or source citations
in analysis feedback. These references were unhelpful since the system
doesn't use RAG and users cannot action them.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous prompts instructed Gemini to "remove redundancy, marketing
fluff, or content not relevant to..." which caused salient details —
especially unusual, granular, or edge-case instructions — to be lost
from spec output. Rewrote all 5 agent prompts (legal, brand_barclays,
brand_barclaycard, channel_best_practices, channel_tech_specs) to:
- Reframe the task as "restructure and organise" rather than "distil
and filter"
- Add a zero-tolerance detail-loss instruction with concrete examples
of unconventional rules that must be preserved
- Explicitly forbid omitting, summarising away, or paraphrasing
specific rules/values/conditions
- Allow merging only exact duplicates while keeping all unique content
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instruct Gemini to begin its feedback with a "Specifications checked"
line recapping the channel, sub-channel, and proof type metadata so
reviewers can confirm the correct specs were applied.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove proof_version_id from FlaggedItemCreate and ResolvedItemCreate
request schemas — the backend already derives it from URL path params.
The frontend was sending an empty string which caused Pydantic to reject
the request with 422, silently preventing flags/resolves from saving.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>