# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

GMAL Scope Builder is a Dockerized AI-powered scoping tool that matches client deliverables (from uploaded Word/Excel documents) against a standardized GMAL asset database, then builds team ratecards and FTE models. The AI layer uses Claude Opus 4.6 for both document parsing and asset matching.

## Development Commands

### Docker (primary workflow)
```bash
docker compose build                          # Build all images
docker compose up -d                          # Start all services in the background
docker compose logs backend --tail 50         # Backend logs
docker compose logs frontend --tail 20        # Frontend logs
docker compose down                           # Stop all services
```

### Service endpoints
- Frontend: http://localhost:3010
- Backend API: http://localhost:8001
- PostgreSQL: localhost:5433

### One-time setup
```bash
cp .env.example .env                          # Then set ANTHROPIC_API_KEY in .env
# Place the GMAL Excel file in the data/ directory
# With the services running, populate the GMAL catalog:
curl -X POST http://localhost:8001/api/gmal/ingest
```

### Frontend (without Docker)
```bash
cd frontend && npm install
npm run dev       # Vite dev server with HMR
npm run build     # TypeScript compile + production bundle
```

### Backend (without Docker)
```bash
cd backend && pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000     # Local port 8000; the Docker setup serves the API on 8001
```

### Database operations
```bash
# Backup
# -T disables pseudo-TTY allocation so the redirected dump is not altered
docker compose exec -T db pg_dump -U scope_user -d scope_builder > backups/dump.sql

# Restore
docker compose exec -T db psql -U scope_user -d scope_builder < backups/dump.sql
```

## Architecture

### Stack
- **Frontend**: React 18 + TypeScript + Vite + React Router + Axios
- **Backend**: FastAPI + SQLAlchemy (async) + asyncpg + Uvicorn
- **Database**: PostgreSQL 16
- **AI**: Claude Opus 4.6 via Anthropic SDK (tool_use for structured output)
- **Document parsing**: openpyxl, python-docx

### Backend structure (`backend/app/`)
- **`main.py`**: FastAPI app, CORS config, router registration, AI usage/debug endpoints
- **`models/`**: SQLAlchemy ORM — `gmal.py` (catalog: GmalAsset, Role, GmalHours, ServiceLine) and `project.py` (workflow: Project, ClientAsset, Match, RatecardLine)
- **`services/`**: Core business logic — see flow below
- **`api/`**: Route handlers for `gmal`, `ingest`, `projects`, `matching`, `ratecard`
- **`schemas/`**: Pydantic request/response models
- **`utils/claude_client.py`**: Wraps Anthropic SDK with per-project + global token/cost tracking and a 50-call debug log

### Frontend structure (`frontend/src/`)
- **`App.tsx`**: Router, navigation bar, live AI cost tracker, expandable debug panel
- **`pages/`**: Dashboard, NewProject, ProjectView (main workflow), GmalBrowser, GmalEditor, Help
- **`api/client.ts`**: Axios instance pointing to backend
- **`types/index.ts`**: Shared TS interfaces + `MODEL_TYPE_LABELS` / `CONFIDENCE_COLORS` constants

### Core data flow

1. **Ingestion** — Excel file → `excel_parser.py` → `GmalAsset` + `Role` + `GmalHours` (per model type) in PostgreSQL

2. **Project creation** — User selects one of 5 **model types** (Current, AI-Enhanced, Offshore+, Local, Factory); this key drives which `GmalHours` rows are used throughout

3. **Document parsing** — Uploaded `.docx`/`.xlsx` → `doc_parser.py` extracts raw text → Claude with `extract_assets` tool returns structured `ClientAsset` list (name, description, volume, complexity hint)

4. **AI matching** — Each `ClientAsset` → `ai_matching.py` → Claude with `submit_matches` tool → ranked GMAL matches with confidence (`exact`/`close`/`multiple`/`none`), score 0–1, reasoning, caveats. Processed in batches of 10 with cancellation support.

5. **Ratecard building** — User selects a match per asset → `ratecard_builder.py` looks up `GmalHours[gmal_asset, model_type]`, multiplies by `ClientAsset.volume` → `RatecardLine` rows (one per role per asset)

6. **Team shape** — `team_shape.py` aggregates hours per role → FTE = total / 1800; efficiency slider (0–90%) is applied to **delivery roles only** (programme roles are not reduced)

7. **Export** — `export_excel.py` produces multi-tab workbook (ratecard, asset detail, team shape, efficiency); `export_pdf.py` produces caveats report
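
Steps 5 and 6 come down to simple arithmetic. The following is a minimal sketch, not the code in `ratecard_builder.py` or `team_shape.py`: the role names, hour values, the `is_delivery` flag, and the reading of the efficiency slider as a fractional reduction in delivery-role hours are all illustrative assumptions.

```python
from dataclasses import dataclass

HOURS_PER_FTE = 1800  # divisor stated in step 6


@dataclass(frozen=True)
class RatecardLine:
    """One row per role per asset (hypothetical shape, not the ORM model)."""
    asset: str
    role: str
    hours: float
    is_delivery: bool  # assumed flag separating delivery from programme roles


def build_lines(asset, volume, hours_per_unit, delivery_roles):
    """Step 5: multiply per-unit GMAL hours by the client asset volume."""
    return [
        RatecardLine(asset, role, hours * volume, role in delivery_roles)
        for role, hours in hours_per_unit.items()
    ]


def team_shape(lines, efficiency=0.0):
    """Step 6: aggregate hours per role and convert them to FTE.

    Assumes the slider value is a fraction (0.0 to 0.9) by which
    delivery-role hours are reduced; programme roles are left untouched.
    """
    if not 0.0 <= efficiency <= 0.9:
        raise ValueError("efficiency must be between 0.0 and 0.9")
    totals = {}
    for line in lines:
        factor = (1.0 - efficiency) if line.is_delivery else 1.0
        totals[line.role] = totals.get(line.role, 0.0) + line.hours * factor
    return {role: hours / HOURS_PER_FTE for role, hours in totals.items()}


# Made-up data: 100 banners at 9 designer hours and 1.8 PM hours each.
lines = build_lines(
    asset="Social banner",
    volume=100,
    hours_per_unit={"Designer": 9.0, "Project Manager": 1.8},
    delivery_roles={"Designer"},
)
fte = team_shape(lines, efficiency=0.5)
print(fte)
```

With these made-up numbers, the designer's 900 hours are halved and come out at 0.25 FTE, while the project manager's 180 hours are unaffected by the slider and come out at 0.1 FTE.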

### Project status lifecycle
`draft` → `parsing` → `matching` → `review` → `building` → `finalized`
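
The lifecycle above can be modelled as an ordered enum. This is a sketch only: `ProjectStatus` and `can_advance` are hypothetical names, and it assumes transitions are strictly linear, which the repository may not enforce (a cancelled matching job, for example, might return to an earlier status).

```python
from enum import Enum


class ProjectStatus(str, Enum):
    """Statuses in lifecycle order (hypothetical class; values from this file)."""
    DRAFT = "draft"
    PARSING = "parsing"
    MATCHING = "matching"
    REVIEW = "review"
    BUILDING = "building"
    FINALIZED = "finalized"


# Iterating an Enum yields its members in definition order.
_ORDER = list(ProjectStatus)


def can_advance(current: ProjectStatus, target: ProjectStatus) -> bool:
    """Return True only when target is the status directly after current."""
    return _ORDER.index(target) == _ORDER.index(current) + 1


print(can_advance(ProjectStatus.DRAFT, ProjectStatus.PARSING))
print(can_advance(ProjectStatus.DRAFT, ProjectStatus.REVIEW))
```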

### AI cost tracking
Every Claude call records input/output tokens and USD cost (`$3/M` input, `$15/M` output) via `claude_client.py`. Costs are stored on the `Project` model and surfaced globally via `GET /ai/usage`. The frontend polls this and shows a live cost tracker + expandable debug panel.
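
The cost arithmetic can be sketched as follows. The rates and the 50-entry debug log come from this file; `UsageTracker`, its method names, and the sample token counts are hypothetical and are not taken from `claude_client.py`.

```python
from collections import deque

INPUT_USD_PER_MTOK = 3.0    # input rate quoted above
OUTPUT_USD_PER_MTOK = 15.0  # output rate quoted above


def call_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single call at the per-million-token rates above."""
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


class UsageTracker:
    """Hypothetical accumulator for per-project and global totals."""

    def __init__(self, log_size: int = 50):
        self.global_cost = 0.0
        self.per_project = {}
        # A bounded deque drops the oldest entry once log_size is reached.
        self.debug_log = deque(maxlen=log_size)

    def record(self, project_id, input_tokens, output_tokens):
        cost = call_cost_usd(input_tokens, output_tokens)
        self.global_cost += cost
        self.per_project[project_id] = self.per_project.get(project_id, 0.0) + cost
        self.debug_log.append((project_id, input_tokens, output_tokens, cost))
        return cost


tracker = UsageTracker()
cost = tracker.record(project_id=1, input_tokens=12_000, output_tokens=2_000)
print(f"${cost:.3f}")
```

For the sample call, 12,000 input tokens cost $0.036 and 2,000 output tokens cost $0.030, for a total of $0.066.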

## Key design decisions
- **All Claude calls use `tool_use`** for structured output — no fragile JSON parsing from free-text responses
- **`model_type` is set at project creation** and cannot change — it filters all `GmalHours` lookups
- **Programme roles are exempt from efficiency reduction** in team shape calculations (they don't scale with AI productivity)
- **Matching is async/batched** — supports cancellation mid-job; poll `/api/projects/{id}/status` for progress
- **The GMAL catalog (390 assets)** is ingested from a single Excel file in `data/`; re-run `/api/gmal/ingest` to reload after updating the file
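
To illustrate the `tool_use` decision, here is a minimal sketch of what an `extract_assets` tool definition and the matching result extraction might look like. The schema fields are inferred from the `ClientAsset` description above, the forced `tool_choice` is an assumption about how the calls are made, and `tool_input` is a hypothetical helper. No API call is made: content blocks are shown as plain dicts in the shape of the raw Messages API JSON, whereas the Python SDK returns objects with the same fields as attributes.

```python
# Tool definition in the Messages API format: name, description, input_schema.
EXTRACT_ASSETS_TOOL = {
    "name": "extract_assets",
    "description": "Return the client deliverables found in the document.",
    "input_schema": {
        "type": "object",
        "properties": {
            "assets": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "description": {"type": "string"},
                        "volume": {"type": "integer"},
                        "complexity_hint": {"type": "string"},
                    },
                    "required": ["name"],
                },
            }
        },
        "required": ["assets"],
    },
}

# Assumed setting: force the model to answer through the tool, not free text.
TOOL_CHOICE = {"type": "tool", "name": "extract_assets"}


def tool_input(content_blocks, tool_name):
    """Return the structured input of the first matching tool_use block."""
    for block in content_blocks:
        if block.get("type") == "tool_use" and block.get("name") == tool_name:
            return block["input"]
    raise ValueError(f"no tool_use block for {tool_name!r}")


# Hand-written stand-in for a response's content list.
sample_content = [
    {"type": "text", "text": "Extracting assets."},
    {
        "type": "tool_use",
        "id": "toolu_example",
        "name": "extract_assets",
        "input": {"assets": [{"name": "Social banner", "volume": 100}]},
    },
]
assets = tool_input(sample_content, "extract_assets")["assets"]
print(assets[0]["name"], assets[0]["volume"])
```

Because the result arrives as an already-parsed object under `input`, there is no need to locate and decode JSON inside free-text output, which is the fragility the first bullet refers to.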