The ratecard summary tab now includes:
- Tier row (showing client_tier A/B/C per asset column) below the header
- Match Summary row (per-match caveat text) — split from combined caveats
- GMAL Standard Caveats row — split from combined caveats
Match summary and GMAL standard caveats were previously merged into a
single row, which made it hard to tell what came from the AI match vs
the standard GMAL clause. Splitting them surfaces both clearly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ratecard lines now store total_hours as per-1-asset hours (= base_hours,
linked to the GMAL row), with volume tracked separately. Aggregators
(team_shape, ratecard summary, Excel matrix, in-app ratecard tab) multiply
by volume themselves when computing total effort. Display behavior is
preserved; storage semantics are clean.
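
A minimal sketch of the storage/aggregation split described above. The
names RatecardLine and total_effort are hypothetical, not the project's
actual model or aggregator:

```python
from dataclasses import dataclass


@dataclass
class RatecardLine:
    # Hypothetical stand-in for a stored ratecard line.
    role: str
    total_hours: float  # hours for ONE asset (= base_hours from the GMAL row)
    volume: int         # number of assets, tracked separately


def total_effort(lines: list[RatecardLine]) -> dict[str, float]:
    """Sum per-role effort, multiplying by volume at aggregation time."""
    effort: dict[str, float] = {}
    for line in lines:
        effort[line.role] = effort.get(line.role, 0.0) + line.total_hours * line.volume
    return effort
```

Keeping the multiplication in the aggregator means a volume change never
requires rewriting the stored per-asset hours.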
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Building the ratecard raised MultipleResultsFound because some assets had
duplicate selected matches (from re-running matching or YOLO).
Changed scalar_one_or_none() to scalars().first() so the first selected
match is used instead of raising.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: "name 'name' is not defined" error on line 300.
The f-string example {name:"KV 360", tier:"Tier B"} was parsed as a
replacement field (the expression name followed by a format spec), not
as literal JSON text. Changed the braces to parentheses.
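
For illustration, a small sketch of the failure and of one alternative
to the parentheses used in this commit: doubling the braces so the
f-string emits them literally. The function names are hypothetical:

```python
def example_line() -> str:
    # Doubled braces produce literal { and } inside an f-string.
    tier = "Tier B"
    return f'Example entry: {{name: "KV 360", tier: "{tier}"}}'


def broken_example() -> str:
    # Single braces make `name` an expression to evaluate, so this
    # raises NameError when no variable called `name` is in scope.
    source = "f'{name:\"KV 360\", tier:\"Tier B\"}'"
    try:
        return eval(source, {})
    except NameError as exc:
        return f"NameError: {exc}"
```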
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend:
- AppUser model with email, name, role (viewer/editor/admin), azure_oid
- Users API: GET /users/me (current user + role), GET /users (admin: list all),
PUT /users/{id}/role (admin: change role)
- Auto-create user on first login: first user = admin, rest = editor
- get_or_create_user helper for role lookup
- require_role helper for permission checks
Frontend:
- UserRoleContext provides role to all components
- useUserRole() hook: isAdmin, isEditor, isViewer
- Nav items filtered by role: GMAL Editor + Users only for admin
- Dashboard: Ingest button admin-only, New Project editor-only
- User Management page: list all users, change roles via dropdown
- Role badges: admin (red), editor (gold), viewer (grey)
Roles:
- Viewer: view projects, download exports
- Editor: create/edit projects, upload, match, build ratecards
- Admin: all + GMAL Editor, data ingest, user management
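
A rough sketch of how the role rules above could be checked. It assumes
the three roles form a strict hierarchy (viewer < editor < admin); the
function bodies and the PermissionDenied exception are illustrative, not
the actual require_role or get_or_create_user implementation:

```python
ROLE_RANK = {"viewer": 0, "editor": 1, "admin": 2}


class PermissionDenied(Exception):
    """Raised when a user's role is below the required minimum."""


def default_role(existing_user_count: int) -> str:
    # The first user to log in becomes admin; later users start as editor.
    return "admin" if existing_user_count == 0 else "editor"


def require_role(user_role: str, minimum: str) -> None:
    # Assumption: a higher role includes every permission of a lower one.
    if ROLE_RANK[user_role] < ROLE_RANK[minimum]:
        raise PermissionDenied(f"{user_role} cannot perform a {minimum} action")
```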
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tier fix (reverses previous "extract once" mistake):
- SEPARATE entry for EACH tier where volume > 0
- "KV 360" Tier A=No/0, Tier B=Yes/1, Tier C=Yes/1 → TWO entries
- Tier field matches column header exactly ("Tier B", "Tier C")
- Tiers with volume=0 or status=No are skipped
- Applied to both normal and deep extraction prompts
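
The rule is enforced through the prompt, not in code, but the expected
output can be sketched as follows. The function name and the input shape
(column header mapped to a status/volume pair) are assumptions made for
the example:

```python
def expand_tiers(asset_name: str, tiers: dict[str, tuple[str, int]]) -> list[dict]:
    """Return one entry per tier column whose status is not No and volume > 0."""
    entries = []
    for header, (status, volume) in tiers.items():
        if status.strip().lower() == "no" or volume <= 0:
            continue  # skipped tier: produces no entry
        # The tier field copies the column header exactly.
        entries.append({"name": asset_name, "tier": header, "volume": volume})
    return entries
```

With the "KV 360" row above (Tier A=No/0, Tier B=Yes/1, Tier C=Yes/1)
this yields two entries, one for "Tier B" and one for "Tier C".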
User context box (new Step 3 on Upload tab):
- Textarea where users give hints before extraction runs
- Examples: "Focus on Toolbox sheet", "Tier columns are D/F/H"
- Context prepended to Claude prompt in both normal and deep modes
- Passed through upload endpoint → background parse → AI calls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Problem: Claude extracted the same asset 3 times (once per tier A/B/C),
creating duplicate entries like "Toolbox presentation deck" x3.
Fix: Both normal and deep extraction prompts now say:
- Extract each UNIQUE asset ONCE only
- Do NOT create duplicates for same asset at different tiers
- Use the "tier" field to record the tier label
- Skip assets with volume 0 across all tiers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: stop_reason=max_tokens - Claude ran out of output tokens
before finishing the tool call JSON for 50+ assets.
Fix:
- Bump max_tokens from 16000 to 32000 for both normal and deep extraction
- Tell Claude to keep descriptions SHORT (1 sentence max)
- Reduce input data to 35k chars (from 40k) to leave more room for output
- Better stop_reason logging on normal extraction too
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Log structure analysis length and data length before Pass 2
- Log stop_reason from Claude response
- If no assets returned, log the text response for debugging
- Truncate structure analysis to 4k chars if too long (leaves room for data)
- Reduce data to 40k chars (was 45k, combined with analysis was too large)
- Add instruction: "You MUST call extract_assets with at least one asset"
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Problem: Header detection picked data rows (with Yes/No/numbers) as headers
because they had more filled cells than the actual header row (which had
merged cells with gaps). Result: data values became column labels, deep
extraction failed.
Fix:
- Header values must be text-like (not numbers, Yes/No, 0/1, ü, x, -)
- Only consecutive header rows count - stop scanning at first data row
- Multi-row headers combined (row 1 + row 2 both contribute)
- Tested against Wella Job Routes 2: correctly identifies row 2 as header
with "Buckets | Categories | Top 10 deliverables | Tier A | Tier B | Tier C"
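
A simplified sketch of the header heuristic described above. All names
are hypothetical and the real detector may differ, for example in how it
treats merged cells or leading blank rows:

```python
import re

# Values that mark a data row, per the list above.
_DATA_TOKENS = {"yes", "no", "0", "1", "ü", "x", "-"}
_NUMBER = re.compile(r"^-?\d+(\.\d+)?$")


def is_text_like(value) -> bool:
    text = "" if value is None else str(value).strip()
    if not text or text.lower() in _DATA_TOKENS:
        return False
    return not _NUMBER.match(text)


def is_header_row(row) -> bool:
    filled = [c for c in row if c is not None and str(c).strip()]
    return bool(filled) and all(is_text_like(c) for c in filled)


def detect_header_rows(rows) -> list[int]:
    """Indices of the consecutive header rows; stops at the first data row."""
    header: list[int] = []
    for index, row in enumerate(rows):
        if is_header_row(row):
            header.append(index)
        elif header or any(cell not in (None, "") for cell in row):
            break  # a data row (or a gap after the header) ends the scan
    return header


def combine_headers(rows, header_indices) -> list[str]:
    """Join the header cells of each column across all header rows."""
    width = max((len(rows[i]) for i in header_indices), default=0)
    labels = []
    for col in range(width):
        parts = [
            str(rows[i][col]).strip()
            for i in header_indices
            if col < len(rows[i]) and rows[i][col] not in (None, "")
        ]
        labels.append(" ".join(parts))
    return labels
```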
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New "Asset Summary" table at top of Ratecard tab showing:
#, Client Asset name, Tier, Matched GMAL, Volume, Total Hours
- Grand total row with volume sum and hours sum
- Appears above the existing "Hours by Role" detail table
- Section labels ("Asset Summary" / "Hours by Role") for clarity
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Step 1: Select files (shown with gold bullet points after selection)
- Step 2: Tier mapping is REQUIRED - must click "None" or a preset
before extraction is enabled. Red asterisk indicates required.
- Step 3: Choose Normal/Deep extraction mode, then click "Extract Assets"
- Upload no longer auto-triggers extraction on file select
- Uploaded filenames shown persistently on the Upload tab
- Tier confirmation state tracked (tierConfirmed)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Client Tier Mapping: how it works, common mappings, when to set it
- Brief Analysis: what it extracts, discovery questions (Red/Amber/Green), how it feeds into matching
- Team Shape: FTE formula, efficiency profiles (Conservative/Moderate/Aggressive), BTG tool stacking, AI model warning, Excel export tabs
- Refine Chat: example instructions and how it works
- GMAL Browser & Editor: what each does, AI description generation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New section: "Preparing Documents for Upload"
- Excel: delete hidden/blank/internal sheets, clean up random data,
check merged cells
- Word: remove cover pages, accept tracked changes
- General: smaller is better, upload multiple files, set tiers first,
try Normal first then Deep
- Updated Upload step: explains Normal vs Deep extraction with use cases
- Tips on when to use Deep Extraction
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Split deep extraction into two separate functions (pass1 + pass2)
so the background task can update DB between them
- Progress now shows:
"Pass 1/2: Analyzing structure... (this takes 20-40 seconds)"
"Pass 1 complete (23s). Pass 2/2: Extracting assets..."
"Deep extraction complete (52s total). Found 45 assets."
- Live elapsed timer (seconds) shown in the upload spinner
- Timer ticks every second so user knows it's not hung
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Upload accepts multiple files at once (hold Ctrl/Cmd to select)
- All files extracted and combined into one document for AI parsing
- Each file clearly labelled with filename separator in combined text
- Progress shows "Extracting text from file1.xlsx..." per file
- Source filename stores comma-separated list of all uploaded files
- Works with both Normal and Deep extraction modes
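
A minimal sketch of the combine step. The separator format and the
function name are assumptions; only the behaviour (labelled sections,
comma-separated source filename) comes from the notes above:

```python
def combine_documents(extracted: list[tuple[str, str]]) -> tuple[str, str]:
    """Merge (filename, text) pairs into one document for AI parsing."""
    sections = [
        f"===== FILE: {filename} =====\n{text.strip()}"
        for filename, text in extracted
    ]
    combined = "\n\n".join(sections)
    # Stored on the project so the origin of every upload stays visible.
    source_filename = ", ".join(filename for filename, _ in extracted)
    return combined, source_filename
```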
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Problem: Complex client Excel files (30+ columns, merged cells, Q&A columns,
tier data) produced zero assets because the extraction was a raw
pipe-delimited dump that lost all column context.
Fix:
- Smart Excel extraction: detects header rows, labels each value with its
column name, skips empty sheets, handles merged cells. Claude now sees
"Top 10 deliverables: Toolbox presentation deck | Tier A: Yes | 1"
instead of "Toolbox | Base | Toolbox presentation deck | ü' | Yes | 1"
- Two extraction modes on Upload tab:
- Normal: fast single-pass extraction (~$0.05)
- Deep Extraction: two-pass AI analysis (~$0.15-0.30)
Pass 1: Claude analyzes the spreadsheet structure
Pass 2: Claude extracts assets using the structural understanding
- Upload endpoint accepts ?mode=normal|deep query parameter
- Background parse shows "Deep extraction: analyzing structure (Pass 1 of 2)"
- Tested against both Wella files - header-aware extraction produces
clear labelled output
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Model type labels now show "(AI rates built-in)" for AI variants
- New Project page: warning when AI model type is selected
- Team Shape tab: amber warning banner when project uses AI model type
explaining that hours already include AI efficiency
- AI_MODEL_TYPES constant exported for reuse
- Prevents users from applying efficiency profiles on top of AI model rates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Tier mapping now on Upload tab (set BEFORE matching, not after)
- Added presets: S/M/L, Small/Medium/Large
- Full preset list: None, A/B/C, 1/2/3, S/M/L, Small/Med/Large, Gold/Silver/Bronze
- "None" button to clear tier mapping
- Removed "Expand to Tiers" button from Match Review (redundant)
- Helper text explains to set tiers before matching
- Matching uses pre-set tiers to pick correct GMAL complexity variant
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Doc parser now extracts tier labels (Tier A, A, Gold, etc.) per asset
- Matching uses tier to find the correct GMAL complexity variant:
- Claude matches to the GMAL family (asset type)
- Post-match lookup: (asset_name + target_complexity_level) finds exact variant
- e.g. "Banner - Tier A" with A=Complex → finds Complex variant by asset_name query
- Tier hint passed to Claude prompt for better matching
- No blind expansion - only the tier-appropriate GMAL is matched
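
The post-match lookup can be sketched like this. The catalog rows, the
GMAL IDs and the function name are made up for illustration; only the
lookup key (asset_name + complexity level from the tier mapping) comes
from the notes above:

```python
from typing import Optional

# Illustrative catalog: one family sharing asset_name, non-sequential IDs.
CATALOG = [
    {"gmal_id": "G-014", "asset_name": "Banner", "complexity_level": "Simple"},
    {"gmal_id": "G-052", "asset_name": "Banner", "complexity_level": "Medium"},
    {"gmal_id": "G-107", "asset_name": "Banner", "complexity_level": "Complex"},
]


def find_variant(
    catalog: list[dict],
    asset_name: str,
    client_tier: str,
    tier_mapping: dict[str, str],
) -> Optional[dict]:
    """Pick the family member whose complexity matches the client's tier."""
    target = tier_mapping.get(client_tier)
    if target is None:
        return None  # no mapping for this tier: leave the family match as-is
    for row in catalog:
        if row["asset_name"] == asset_name and row["complexity_level"] == target:
            return row
    return None
```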
- Expand to Tiers button still available for when client doesn't specify tiers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Tier mapping on projects: configurable label→complexity mapping
- Presets: A/B/C, 1/2/3, Gold/Silver/Bronze
- Stored as JSON on project.tier_mapping
- ClientAsset.client_tier field for tracking which tier an asset belongs to
- GMAL family endpoint: GET /gmal/assets/{id}/family returns all complexity variants
- Looks up by asset_name (NOT by GMAL number increment)
- Verified: families share asset_name across non-sequential GMAL IDs
- Expand to Tiers: POST /projects/{id}/expand-tiers
- Splits each matched asset into N tier variants (one per tier)
- Finds correct GMAL variant by asset_name + complexity_level query
- Creates new ClientAsset + Match per tier with correct GMAL
- Removes original un-tiered asset after expansion
- Frontend: tier preset buttons + expand button on Match Review tab
- Tier tags shown with label → complexity mapping
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Brief Analysis now accepts pasted text OR uploaded file
- Textarea for typing/pasting brief directly (no upload required)
- Re-analyze button returns to input screen
- Team Shape Excel sheets now use formulas:
- FTE = Hours/1800 (formula)
- Adjusted Hours = Original * (1-eff%) (formula)
- Hours Saved = Original - Adjusted (formula)
- Headcount = IF/CEILING formula
- Base team shape also uses FTE + headcount formulas
- All sheets are now formula-driven, so Finance can edit hours and see
the dependent values recalculate
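
As an illustration, the formula strings for one row might be generated
like this. The column layout and the exact IF/CEILING expression are
assumptions; the notes above only give the shape of each formula:

```python
def team_shape_formulas(row: int) -> dict[str, str]:
    # Assumed layout: B = original hours, C = efficiency as a fraction,
    # D = adjusted hours, E = hours saved, F = FTE, G = headcount.
    return {
        f"D{row}": f"=B{row}*(1-C{row})",
        f"E{row}": f"=B{row}-D{row}",
        f"F{row}": f"=D{row}/1800",
        f"G{row}": f"=IF(F{row}>0,CEILING(F{row},1),0)",
    }
```

Writing strings that start with "=" into the cells, instead of computed
numbers, is what lets the spreadsheet recalculate after a manual edit.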
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Iterative Prompting:
- Chat box on Match Review tab for natural language refinement
- "re-run under 70%" / "ignore zero volume" / "set all volumes to 1"
- Claude interprets instruction into structured actions
- Actions: rematch_below_threshold, rematch_specific, delete_assets, set_volume
- Re-matches affected assets automatically after refinement
- Chat log shows instruction history
RFP/Brief Analysis:
- New "Brief Analysis" tab between Upload and Match Review
- Extracts: summary, objectives, KPIs, channels, audiences, deliverable categories,
constraints, timeline, budget, complexity assessment
- Generates prioritized discovery questions (Red/Amber/Green)
- Questions include category, rationale, and priority level
- Stored as JSON in project.brief_analysis field
- Uploaded files now saved to data dir for re-analysis
- Re-analyze button to refresh analysis
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Ratecard Summary: Total Hours column now uses =SUM() formulas
- Grand total row uses =SUM() formulas per column
- New "Assumptions & Rates" sheet with editable inputs:
- Global: Hours per FTE, Margin %, Overhead %
- Per-role: Day Rate (£), Annual Salary (£)
- Yellow highlighted input cells for Finance to edit
- Foundation for full formula-linked financial model
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Team Shape tab: profile selector (Conservative/Moderate/Aggressive)
- BTG tool toggles (Pencil, OMG, Creative X, Cortex, Semblance, Share of Model)
- Per-discipline rates shown inline with combined profile+tool percentages
- Efficiency % column in table showing rate per role
- Flat rate fallback still available (10/25/50/75/90%)
- Match feedback endpoint: POST /matches/{id}/feedback (confirm/reject)
- Feedback learning: confirmed matches stored, checked before AI calls
- Known matches applied instantly (no API call, $0 cost)
- Remaining unknowns sent to Claude as before
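
A sketch of the "check confirmed matches first" step. Keying the lookup
on a lower-cased asset name is an assumption, as are the names used:

```python
def split_known(
    asset_names: list[str],
    confirmed: dict[str, str],
) -> tuple[dict[str, str], list[str]]:
    """Separate assets with a confirmed match from those needing an AI call."""
    known: dict[str, str] = {}
    unknown: list[str] = []
    for name in asset_names:
        key = name.strip().lower()
        if key in confirmed:
            known[name] = confirmed[key]  # applied instantly, no API call
        else:
            unknown.append(name)          # sent to Claude as before
    return known, unknown
```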
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ratecard Summary caveats row now combines AI match caveats with the
original GMAL asset caveats (labelled "GMAL Standard Caveats:") below.
Asset Detail sheet splits these into two separate columns.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Inserts an "Assumptions / Caveats" row (row 2) in the Ratecard Summary
sheet so users can see each asset's AI-matched caveats without switching
to the Asset Detail tab. Uses the same amber colour scheme as the PDF report.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Assets with quantity 0 are meaningless downstream (produce 0 ratecard hours)
and clutter the review stage — filter them out at parse time.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Azure AD has the redirect URI registered as /gsb/ (with a trailing slash),
but msalConfig used /gsb without it. The mismatch caused AADSTS50011.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Access tokens for User.Read scope have audience=graph.microsoft.com,
but the backend validates audience=CLIENT_ID. ID tokens always have
audience=CLIENT_ID so they validate correctly.
Also add upn claim fallback for email extraction from ID token.
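
A small sketch of the two checks on already-decoded claims. The function
names, the claim order and the preferred_username step are assumptions;
the commit only states the audience rule and the upn fallback:

```python
from typing import Optional


def audience_ok(claims: dict, client_id: str) -> bool:
    # ID tokens carry aud == CLIENT_ID; a Graph access token does not.
    aud = claims.get("aud")
    return aud == client_id or (isinstance(aud, list) and client_id in aud)


def email_from_claims(claims: dict) -> Optional[str]:
    # Fall back through the claims that may hold the user's address.
    for key in ("email", "preferred_username", "upn"):
        value = claims.get(key)
        if value:
            return value.lower()
    return None
```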
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
onRedirectNavigate was removed from EndSessionRequest in msal-browser v3+.
Sign out now uses clearCache(), which clears local tokens without
redirecting to Microsoft logout.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- auth.py: replace synchronous httpx.get (blocked event loop) with
async httpx.AsyncClient; add key-rotation refresh on unknown kid
- App.tsx: use onRedirectNavigate: false so Sign out clears only the
local MSAL session without redirecting to Microsoft logout endpoint
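
The key-rotation refresh can be sketched as a small cache. The class
name is hypothetical and the fetch coroutine is injected to keep the
example self-contained; in auth.py it would wrap an httpx.AsyncClient
GET of the JWKS endpoint:

```python
import asyncio
from typing import Awaitable, Callable, Optional


class JwksCache:
    """Caches signing keys by kid and refetches once on an unknown kid."""

    def __init__(self, fetch: Callable[[], Awaitable[dict]]) -> None:
        self._fetch = fetch
        self._keys: dict[str, dict] = {}

    async def _refresh(self) -> None:
        jwks = await self._fetch()  # non-blocking, unlike a sync httpx.get
        self._keys = {key["kid"]: key for key in jwks.get("keys", [])}

    async def get_key(self, kid: str) -> Optional[dict]:
        if kid not in self._keys:
            # Unknown kid: the provider may have rotated its keys.
            await self._refresh()
        return self._keys.get(kid)
```

A production version would probably also rate-limit refreshes so that a
token with a bogus kid cannot trigger a fetch on every request.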
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Move AI parsing and matching into BackgroundTasks so both endpoints
return immediately instead of blocking until Claude finishes (~60s+)
- Frontend now polls project status after upload/match POST returns,
keeping the spinner/progress UI working as before
- Replace <a href> export links with programmatic Axios downloads to fix
missing /gsb base path and missing auth token (401 in production)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- docker-compose: DATA_DIR env var controls data volume mount
(defaults to ./data for local, /var/www/html/gmal-scope-builder/data on server)
- deploy.sh: resolve DATA_DIR from .env, default to persistent web dir
- deploy.sh: rm only frontend files from WEB_DIR, preserve data/ subdir
- .env.example: document DATA_DIR variable
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- MSAL.js (PKCE) browser-side auth against Azure Entra ID
- Bearer token interceptor on all API calls
- Backend JWT validation middleware (python-jose + JWKS)
- All API routes protected; /api/health stays public
- vite base set to /gsb/, BrowserRouter basename=/gsb
- docker-compose: remove frontend service, lock backend to 127.0.0.1:8002, remove dev volumes
- backend: 2 workers, no --reload
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Efficiency preview: toggle 10/25/50/75/90% to see adjusted FTE live
- Programme roles NOT reduced (they don't scale with AI)
- Excel export: select multiple efficiency levels, each gets its own tab
showing original vs adjusted hours/FTE/headcount with hours saved
- Export buttons on both Ratecard and Team Shape tabs
- team_shape service accepts efficiency_pct parameter
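
A minimal sketch of the efficiency adjustment with the programme-role
exemption. The role dictionaries and the function name are illustrative,
not the team_shape service's actual signature:

```python
HOURS_PER_FTE = 1800


def adjusted_fte(roles: list[dict], efficiency_pct: float) -> dict[str, float]:
    """FTE per role after efficiency; programme roles keep their hours."""
    result: dict[str, float] = {}
    for role in roles:
        hours = role["hours"]
        if not role["is_programme"]:
            hours = hours * (1 - efficiency_pct / 100)
        result[role["name"]] = hours / HOURS_PER_FTE
    return result
```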
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New ai_descriptions service: generates rich brief-friendly descriptions
per GMAL asset via Claude, grouped by category (135/243 generated)
- Descriptions include client synonyms, inclusions/exclusions, use cases,
channel/format info, complexity differentiators
- GMAL Browser shows AI descriptions with green/amber status indicators
- GMAL Editor: editable AI descriptions, per-asset regenerate, batch generate all
- Matching catalog now includes AI descriptions for better semantic matching
- Fixed ORM session expiry bug: snapshot asset data before batch commits
- Fixed enum issue: removed unused UPLOADING/EXTRACTING statuses
- Added app-level logging (basicConfig) so service logs show in docker
- YOLO now batches 20 selections in parallel
- Matching returns 1 best match by default, extras only within 5% of top
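
The "extras only within 5% of top" rule might look like the sketch
below. It assumes scores on a 0-100 scale and reads "5%" as five
percentage points; the function name is hypothetical:

```python
def select_matches(
    scored: list[tuple[str, float]],
    window: float = 5.0,
) -> list[tuple[str, float]]:
    """Keep the best match plus any candidate within `window` points of it."""
    if not scored:
        return []
    ranked = sorted(scored, key=lambda match: match[1], reverse=True)
    top_score = ranked[0][1]
    return [match for match in ranked if top_score - match[1] <= window]
```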
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Team shape service: total_hours / 1800 = FTE per role
- Programme roles (6) flagged separately from delivery roles
- New API endpoint GET /projects/{id}/team-shape
- Team Shape tab in frontend with summary stats and role breakdown
- Sheet 3 "Team Shape" in Excel export with discipline grouping,
delivery vs programme split, FTE, rounded headcount, and summary
- Full GMAL catalog matching (replaced pre-filter with compact catalog)
- Upload progress stages with live polling
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Upload now shows live stage progress (uploading -> extracting -> AI parsing -> done)
- Fix match group collapse: proper React state instead of DOM manipulation
- Replace pre-filter with full GMAL catalog sent to Claude (~3k tokens, <$0.01)
- FTS and keyword matching missed too many semantic matches
- Claude now sees all 243 assets and uses semantic understanding
- Improved system prompt with terminology bridges for better scoring
- Per-project AI cost tracking persisted to DB
- Parallel matching with cancel support
- Auto-select matches >= 80%, YOLO button for rest
- Debug panel for AI call inspection
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>