Add README with per-stage agent reference

A single document a human can read to understand the whole pipeline:

- High-level: what the platform is + a 17-row stage table.
- Quick start (local dev) + smoke test commands.
- Deployment to optical-dev: deploy.sh flow, port-pick logic, Apache include line, what to do on first boot vs re-run.
- Architecture summary (state machine, artifacts, approvals, cost).
- Per-stage agent reference (the heart of the doc). For each of the 9 Claude agents: what it's for, what it reads, output schema, and the system-prompt rules in plain English. Plus the non-Claude stages (qualification scorecard, Q&A export, ratecard build, efficiency profile, team shape) explained the same way.
- Cross-cutting concerns: cost tracking, approvals + Mailgun, auth paths (dev-bypass vs Azure JWT), exact stage-machine rules, destructive-cascade rules per stage.
- Testing: how to run the 118-test suite + what each file covers.
- Repo layout.

Goal: someone landing on the repo can read this front-to-back in 10 minutes and know what every stage does, what each agent is told, and how to run / deploy / test the thing.
2026-04-27 18:31:21 -04:00 · 2026-04-27 18:31:21 -04:00 · 7254d95f32
commit 7254d95f32
parent b553bef43a
1 changed files with 618 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,618 @@
+# OLIVER Sales Operations Platform
+
+End-to-end RFP → mobilization pipeline for OLIVER's commercial team. Drop in a brief, walk it through 17 stages (intake → qualification → asset matching → ratecard → delivery model → team shape → caveats → approval gates → pitch → post-win planning), and produce a defensible proposal.
+
+V2 of the GMAL Scope Builder. Phase 1 covers stages 1–16; stage 17 (downstream system push to Salesforce / SharePoint) is deferred to Phase 2.
+
+---
+
+## Table of contents
+
+- [What it does](#what-it-does)
+- [The 17 stages](#the-17-stages)
+- [Quick start (local dev)](#quick-start-local-dev)
+- [Deployment (optical-dev)](#deployment-optical-dev)
+- [Architecture](#architecture)
+- [Per-stage agent reference](#per-stage-agent-reference)
+- [Cross-cutting concerns](#cross-cutting-concerns)
+- [Testing](#testing)
+- [Repo layout](#repo-layout)
+
+---
+
+## What it does
+
+A single Opportunity (the V1 "Project") progresses through a 17-stage state machine. Each stage either:
+
+- runs a specialised Claude agent over upstream artifacts and produces structured JSON,
+- captures a human decision (qualification scorecard, approval, deal status), or
+- exports a deliverable (Q&A pack, ratecard, pitch deck markdown).
+
+Stages cannot be skipped — `/stages/{n}/complete` enforces the order. Two stages (3 — Qualification, 14 — Approval Gate) require explicit `Approval` rows to be approved before they'll close. Approvals fan out via in-app notifications and Mailgun email.
+
+Every Claude call records its cost on the produced `stage_artifact` (tokens in / out / USD) so the per-stage spend is visible in the UI.
+
+---
+
+## The 17 stages
+
+| # | Stage | Status | Driver |
+|--|--|--|--|
+| 1 | Intake Opportunity | ✅ | Claude (Intake Agent) |
+| 2 | Read & Diagnose Brief | ✅ | Claude (Diagnosis Agent) |
+| 3 | Qualification Assessment | ✅ gated | Human (TROWLS) + approval |
+| 4 | Generate Client Q&A Pack | ✅ | Excel/Word export |
+| 5 | Ingest Client Answers | ✅ | Human edit |
+| 6 | Normalize Asset List | ✅ | Claude (Asset Normalizer) |
+| 7 | Match Assets to Job Routes | ✅ | Claude (Match Agent) |
+| 8 | Build Asset-Level Rate Card | ✅ | Pure-Python (GMAL hours × volume) |
+| 9 | Recommend Delivery Model | ✅ | Claude (Delivery Model Agent) |
+| 10 | Apply Efficiency Logic | ✅ | Human (sliders, persisted) |
+| 11 | Create Draft Team Shape | ✅ | Pure-Python (FTE calc) |
+| 12 | Identify Capability Gaps | ✅ | Claude (Capability Gap Agent) |
+| 13 | Generate Support Docs | ✅ | Claude (Support Docs Agent) |
+| 14 | Validation & Approval Gates | ✅ gated | Approval flow |
+| 15 | Build Pitch Materials | ✅ | Claude (Pitch Deck Agent) + markdown stub |
+| 16 | Delivery Planning (post-win) | ✅ | Claude (Implementation Plan Agent) |
+| 17 | Trigger Downstream Systems | ⏳ Phase 2 | Salesforce / SharePoint push |
+
+---
+
+## Quick start (local dev)
+
+```bash
+git clone git@bitbucket.org:zlalani/oliver-sales-ops-platform.git
+cd oliver-sales-ops-platform
+cp .env.example .env
+$EDITOR .env                    # set ANTHROPIC_API_KEY at minimum
+
+# Local dev mode: brings up db + redis + backend + Vite dev container
+COMPOSE_PROFILES=dev docker compose up -d
+
+# Runs at:
+#   http://localhost:3011/oliver-sales-ops-platform/   ← frontend (Vite HMR)
+#   http://localhost:8003/api/health                   ← backend
+#   localhost:5435 / localhost:6380                    ← postgres / redis
+```
+
+`.env` defaults give you `DEV_AUTH_BYPASS=true` so you skip the MSAL login gate and land as the admin user (`admin@oliver.agency`). The backend's auth middleware reads `DEV_AUTH_EMAIL` / `DEV_AUTH_NAME` / `DEV_AUTH_ROLE`.
+
+### Smoke test
+
+```bash
+curl http://localhost:8003/api/health
+# {"status":"ok","db":"ok"}
+
+curl http://localhost:8003/api/users/me
+# {"id":N,"email":"admin@oliver.agency","name":"OSOP Admin","role":"admin",...}
+```
+
+### Tests
+
+```bash
+python3 -m venv /tmp/osop_test_venv
+/tmp/osop_test_venv/bin/pip install -r backend/requirements-dev.txt psycopg2-binary
+cd backend && /tmp/osop_test_venv/bin/pytest tests/ -v
+# 118 tests, 109 pass, 9 skip (real-Anthropic), 0 fail
+```
+
+---
+
+## Deployment (optical-dev)
+
+The dev server hosts a stack of internal apps under one Apache vhost, each at its own URL prefix and backend port. This app sits at `/oliver-sales-ops-platform/`.
+
+```bash
+sudo git clone git@bitbucket.org:zlalani/oliver-sales-ops-platform.git /opt/oliver-sales-ops-platform
+cd /opt/oliver-sales-ops-platform
+sudo cp .env.example .env && sudo $EDITOR .env
+#  → set ANTHROPIC_API_KEY
+#  → set APP_PUBLIC_URL=https://optical-dev.oliver.solutions
+#  → leave DEV_AUTH_BYPASS=true until SSO is wired
+
+sudo ./deploy/deploy.sh
+```
+
+**What `deploy.sh` does:**
+
+1. Auto-picks free host ports (`OSOP_DB_PORT` / `OSOP_REDIS_PORT` / `OSOP_BACKEND_PORT`). If 8003 is taken, it scans 8004→8099 and persists the chosen port back to `.env`.
+2. Renders `deploy/apache-osop.conf` from `apache-osop.conf.tmpl` with the chosen backend port substituted in.
+3. `git pull && docker compose build && docker compose up -d` (db + redis + backend).
+4. Builds the Vite SPA in a one-shot `node:20` container, syncs the `dist/` to `/var/www/html/oliver-sales-ops-platform/`. Pipes `DEV_AUTH_BYPASS` through to `VITE_DEV_AUTH_BYPASS` so the SPA matches the backend's auth setting.
+5. Polls `/api/health` until ready, prints URLs + admin email.
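
The port-pick rule in step 1 can be sketched as follows. `deploy.sh` itself is a shell script, so this Python version is illustrative only: `pick_port` and the `is_free` callback are hypothetical names, and how the real script probes a port is not described in this README.

```python
from typing import Callable, Optional


def pick_port(
    preferred: int,
    scan_start: int,
    scan_end: int,
    is_free: Callable[[int], bool],
) -> Optional[int]:
    """Return `preferred` if it is free, else the first free port in
    scan_start..scan_end (inclusive), else None."""
    if is_free(preferred):
        return preferred
    for port in range(scan_start, scan_end + 1):
        if is_free(port):
            return port
    return None


# Example: 8003 and 8004 are already taken by other apps on the host.
taken = {8003, 8004}
backend_port = pick_port(8003, 8004, 8099, lambda p: p not in taken)
```

Passing the availability check in as a callback keeps the scan order separate from the probing mechanism, which makes the rule easy to test without opening sockets.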
+
+Flags: `--no-pull`, `--no-build`, `--no-frontend`, `--logs`.
+
+**Apache wiring (one-time):**
+
+```bash
+echo 'Include /opt/oliver-sales-ops-platform/deploy/apache-osop.conf' \
+  | sudo tee -a /etc/apache2/sites-enabled/optical-dev.oliver.solutions.conf
+sudo apachectl configtest && sudo systemctl reload apache2
+```
+
+The conf is regenerated on every deploy. If the backend port changes (because something else grabbed 8003), `deploy.sh` will tell you to reload Apache again.
+
+---
+
+## Architecture
+
+**Backend.** FastAPI + async SQLAlchemy + Alembic on Python 3.12. Postgres 16 for state, Redis for future Celery work. Anthropic SDK against Claude Opus 4.7. Azure JWT validation via python-jose; `DEV_AUTH_BYPASS=true` swaps the JWT path for a configurable identity (used until SSO is wired on the dev server).
+
+**Frontend.** React 18 + Vite + TypeScript. TanStack Query for server state, React Router for the per-stage views. MSAL (Azure SSO) is in the box but bypassed on the dev server. Mermaid renders the 17-stage flowchart on the About page.
+
+**Stages as a state machine.** Every Opportunity has 17 `stage_states` rows (created on opportunity creation; stage 1 starts `in_progress`, the rest `not_started`). `/stages/{n}/complete` validates the predecessor is completed and, for stages 3 and 14, that all linked `Approval` rows are `approved`.
+
+**Artifacts.** Each agent run persists a `stage_artifact` row carrying its JSON output + cost stamp (`cost_usd` / `input_tokens` / `output_tokens`). Re-runs produce new artifacts; the UI shows the latest per stage.
+
+**Approvals.** A user with EDITOR or ADMIN role can request approvals on stages 3 / 14. Each approval gets a unique `email_token` and fires a Mailgun email with a deeplink. The approver clicks through, lands on `/approvals/:id`, decides (with optional notes), and the opportunity owner gets notified.
+
+**Cost tracking.** Per-call usage is rolled up onto the `Opportunity` (cumulative) and stamped on the `stage_artifact` (per run). The Stage 8 panel shows the running total; every agent panel header shows its run cost.
+
+---
+
+## Per-stage agent reference
+
+This section is the operator's manual for each stage — what the agent does, what it reads, what it produces, and the rules baked into its prompt.
+
+### Stage 1 — Intake Opportunity
+
+**Agent:** Intake Agent — `backend/app/services/intake_agent.py`
+**Tool name:** `submit_intake_metadata`
+**Endpoint:** `POST /api/opportunities/{id}/intake`
+**Inputs:** all uploaded `OpportunityFile` rows (concat to ~150k char cap).
+**Output:** structured opportunity metadata.
+
+**Output schema:** `client_name`, `region`, `brands[]`, `service_types[]`, `deadline_iso`, `go_live_iso`, `summary`.
+
+**What the agent is told:**
+
+> You are the Intake Agent for the OLIVER Sales Ops Platform. Your job is to read the documents an account team has uploaded for a new opportunity and extract a tight structured summary so the rest of the pipeline can plan the response.
+>
+> Be conservative — only fill fields you can ground in the documents. Omit a field rather than guessing. Brands should be specific consumer-facing brands, not the umbrella client. Service types should be concrete categories (Content, eCommerce, Social, CRM, Production, Strategy, etc.) not vague phrases. Dates must be ISO (YYYY-MM-DD); convert phrases like "end of Q2 2026" to a sensible explicit date and mention the original phrase in the summary.
+
+**Side effect:** populates `Opportunity.client_name` / `region` / `brands` / `service_types` / `deadline` / `go_live` / `description` ONLY when those fields are blank — manual entries are never overwritten.
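
A minimal sketch of the "fill only when blank" rule, assuming that blank means `None`, an empty string, or an empty list. The `fill_blank_fields` helper and the plain-dict representation are hypothetical; the real service works on the `Opportunity` ORM row.

```python
def is_blank(value) -> bool:
    """Assumed definition of 'blank': None, empty string, or empty list."""
    return value is None or value == "" or value == []


def fill_blank_fields(current: dict, extracted: dict) -> dict:
    """Copy agent-extracted values into fields that are still blank.

    Fields the user has already filled in are left untouched, and fields
    the agent omitted (or returned blank) are skipped.
    """
    merged = dict(current)
    for field, value in extracted.items():
        if is_blank(value):
            continue
        if is_blank(merged.get(field)):
            merged[field] = value
    return merged


opportunity = {"client_name": "Acme (entered by hand)", "region": "", "brands": []}
agent_output = {"client_name": "ACME Corp", "region": "EMEA", "brands": ["BrandX"]}
updated = fill_blank_fields(opportunity, agent_output)
```

Here `client_name` keeps the manual entry, while `region` and `brands` take the agent's values.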
+
+---
+
+### Stage 2 — Read & Diagnose Brief
+
+**Agent:** Diagnosis Agent — `backend/app/services/diagnosis_agent.py`
+**Tool name:** `submit_brief_diagnosis`
+**Endpoint:** `POST /api/opportunities/{id}/diagnose`
+**Inputs:** all uploaded files (same as Stage 1).
+**Output:** structured diagnosis + ≥6 client clarification questions.
+
+**Output schema:** `deliverables[]` (name / category / volume_estimate / complexity_hint), `channels[]`, `markets[]`, `capabilities_required[]`, `kpis_slas[]`, `tech_asks[]`, `timelines[]`, `ambiguities[]`, `contradictions[]`, `complexity_assessment` (low/medium/high), `summary`, `clarifications[]` (category / question / rationale / priority red|amber|green).
+
+**What the agent is told:**
+
+> You are the Diagnosis Agent for the OLIVER Sales Ops Platform. The account team has uploaded an RFP / brief and you must read it like a senior strategist preparing a scope.
+>
+> For deliverables: name specific assets, not vague phrases. "Paid social statics" is better than "social content". Include volume even when fuzzy — capture the brief's own words ("TBC", "~50/month", "on demand").
+>
+> For ambiguities: anything the brief leaves open that affects effort. "Multiple markets" without a list is an ambiguity. "Standard turnaround" without days is an ambiguity. "Feedback iterations" without a count is an ambiguity.
+>
+> For contradictions: explicit conflicts (timeline says Q2, go-live says Q1; volumes say 200 in one section and 500 in another).
+>
+> **For clarifications (MANDATORY — minimum 6, target 8–15):** ambiguities and contradictions are observations; clarifications are the actual *questions* you'd send the client to resolve them. Every ambiguity you list MUST also appear here as an actionable, question-marked sentence with a category + priority + rationale. This is what Stage 4 packages into the client Q&A pack — if you leave it empty, the rest of the pipeline breaks.
+
+**Side effect:** wipes existing `source_stage=2` clarification rows and inserts fresh ones. Stage 5 ingests answers against these.
+
+---
+
+### Stage 3 — Qualification Assessment (gated)
+
+**Driver:** human — no Claude call.
+**Endpoints:**
+`POST /api/opportunities/{id}/qualification` (save TROWLS scorecard)
+`GET /api/opportunities/{id}/qualification`
+`POST /api/opportunities/{id}/stages/3/approvals` (request approval)
+`POST /api/approvals/{id}/decision` (approve / reject)
+
+**TROWLS dimensions** (each scored 0–10):
+
+- **T**iming — do we have time to win and deliver?
+- **R**elationship — how well do we know the client / decision-makers?
+- **O**pportunity — deal size, strategic value, future pipeline.
+- **W**hat We Know — quality of brief, market data, prior knowledge.
+- **L**ocation — entity availability, hiring viability, labour law fit.
+- **S**ector — sector experience + conflict-list checks.
+
+The six scores sum to a total of 0–60, which is converted to a percentage (0–100) and mapped to a recommendation: `proceed` (≥60), `slt_review` (≥50 and <60), `no_go` (<50). Stamped on `Opportunity.qualification_score`.
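
The scoring arithmetic can be sketched like this. The dimension keys and the `score_trowls` function are hypothetical names chosen for illustration; only the 0–10 range, the 0–60 total, and the 60 / 50 thresholds come from the description above.

```python
TROWLS_DIMENSIONS = (
    "timing", "relationship", "opportunity",
    "what_we_know", "location", "sector",
)


def score_trowls(scores: dict[str, int]) -> dict:
    """Validate six 0-10 scores and derive total, percentage, recommendation."""
    for dim in TROWLS_DIMENSIONS:
        if not 0 <= scores[dim] <= 10:
            raise ValueError(f"{dim} must be between 0 and 10")
    total = sum(scores[dim] for dim in TROWLS_DIMENSIONS)   # 0-60
    percentage = total * 100 / 60                           # 0-100
    if percentage >= 60:
        recommendation = "proceed"
    elif percentage >= 50:
        recommendation = "slt_review"
    else:
        recommendation = "no_go"
    return {
        "total": total,
        "percentage": percentage,
        "recommendation": recommendation,
    }


result = score_trowls({dim: 6 for dim in TROWLS_DIMENSIONS})
```

With every dimension scored 6, the total is 36 out of 60, which is exactly 60% and lands on `proceed`.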
+
+**Gate.** `/stages/3/complete` 400s unless at least one `Approval` row exists for stage 3 AND every approval is `approved`.
+
+---
+
+### Stage 4 — Generate Client Q&A Pack
+
+**Driver:** pure-Python export — no Claude call.
+**Endpoints:**
+`GET /api/opportunities/{id}/qa-pack/excel` → .xlsx
+`GET /api/opportunities/{id}/qa-pack/word` → .docx
+
+Reads the `clarification_questions` table seeded by Stage 2, sorts by priority (RED → AMBER → GREEN), groups by category, and exports a populated client-facing pack. Excel has a colour-coded priority column + free-text answer column. Word groups by category with rationale + answer slot per question.
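
One way to read "sort by priority, then group by category" is sketched below. It assumes questions are plain dicts with lowercase `priority` values, and `group_for_pack` is a hypothetical helper; the real exporter may order categories differently.

```python
PRIORITY_ORDER = {"red": 0, "amber": 1, "green": 2}


def group_for_pack(questions: list[dict]) -> dict[str, list[dict]]:
    """Sort RED -> AMBER -> GREEN, then bucket by category.

    sorted() is stable, so each category keeps its questions in priority
    order, and categories appear in order of their most urgent question.
    """
    ranked = sorted(questions, key=lambda q: PRIORITY_ORDER[q["priority"]])
    grouped: dict[str, list[dict]] = {}
    for question in ranked:
        grouped.setdefault(question["category"], []).append(question)
    return grouped


pack = group_for_pack([
    {"category": "Volumes", "priority": "green", "question": "Is seasonal uplift expected?"},
    {"category": "Markets", "priority": "red", "question": "Which markets are in scope?"},
    {"category": "Volumes", "priority": "red", "question": "What is the monthly asset volume?"},
    {"category": "Markets", "priority": "amber", "question": "Which languages per market?"},
])
```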
+
+---
+
+### Stage 5 — Ingest Client Answers
+
+**Driver:** human — no Claude call.
+**Endpoint:** `PUT /api/opportunities/{id}/clarifications/{cid}`
+
+Per-question editor on the frontend. The user pastes the client's reply into each `client_answer` field; status auto-flips to `answered` and `answered_at` is stamped. Items can also be `dismissed` (out of scope) or `pending` again.
+
+---
+
+### Stage 6 — Normalize Asset List
+
+**Agent:** Asset Normalizer — `backend/app/services/asset_normalizer.py`
+**Tool name:** `submit_normalized_assets`
+**Endpoint:** `POST /api/opportunities/{id}/assets/normalize`
+**Inputs:** uploaded files + Stage 2 diagnosis (when present).
+**Output:** clean `ClientAsset[]` list.
+
+**Output schema:** `assets[]` of `{ raw_name, raw_description, client_tier, volume }`.
+
+**What the agent is told:**
+
+> You are the Asset Normalizer for the OLIVER Sales Ops Platform. Each row is something a creative agency can scope hours against.
+>
+> Rules:
+> - One asset per row. If the brief says "toolbox A/B/C", emit three rows (one per tier).
+> - Use specific names ('PDP hero banner', not 'web content').
+> - Capture tier letters / bands when the brief uses them. Leave client_tier blank if it doesn't.
+> - Volume is integer. If the brief says ranges ("100-200"), pick the midpoint. If TBC, set 1 and put "Volume TBC — confirm with client" in the description.
+> - Do NOT invent assets the brief doesn't mention. Be exhaustive but honest.
+
+**Side effect:** wipes existing `ClientAsset` rows for the opportunity (cascading to matches + ratecard_lines), then inserts fresh ones.
+
+---
+
+### Stage 7 — Match Assets to Job Routes
+
+**Agent:** Match Agent — `backend/app/services/ai_matching.py`
+**Tool name:** `submit_matches`
+**Endpoint:** `POST /api/opportunities/{id}/match` (background task) → `GET /matches`
+**Inputs:** each ClientAsset + the **full GMAL catalog** (~243 hour-route entries, ~3-20k tokens depending on AI-enhanced descriptions) + Stage 2 brief context.
+**Output:** up to 3 candidate `Match` rows per ClientAsset, ranked.
+
+**Output schema:** `matches[]` of `{ gmal_id, confidence (exact|close|multiple|none), confidence_score (0-1), reasoning, caveats }`.
+
+**What the agent is told:**
+
+> You are a GMAL asset matching specialist for a creative production agency. You match client-described assets to the closest entry in the GMAL catalog.
+>
+> Guidelines:
+> - Match on the TYPE of deliverable first, then complexity.
+> - Bridge terminology: "KV" / "Key Visual" = Photography GMALs; "PDP" / "product listing" = eCommerce / Copywriting GMALs; "launch video" = Campaign Video GMALs; "social post" = Social GMALs; "banner" / "display" = Display / Standard Banner GMALs.
+> - Return your single best match. Only add a 2nd/3rd if within 5% of the top score.
+> - exact: 0.9–1.0. close: 0.6–0.89. none: <0.3.
+> - Always state caveats — what the GMAL covers vs what the client described.
+> - Match complexity literally — "simple banner" → Simple GMAL, not Complex.
+
+**Side effects:**
+
+- Wipes prior matches for this opportunity's assets first.
+- Auto-selects the rank-1 match when its score ≥ 0.8.
+- Saves a `matching_run` `stage_artifact` with the run total cost + counts.
+- Toggling `is_selected` to true on one match auto-deselects siblings (one selection per asset).
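
The two selection rules above can be sketched as follows, assuming each match is a dict carrying `rank`, `confidence_score` and `is_selected`. The function names and the string `gmal_id` values are hypothetical.

```python
AUTO_SELECT_THRESHOLD = 0.8


def auto_select(matches: list[dict]) -> None:
    """For one asset's candidates: select rank 1 only if its score >= 0.8."""
    for match in matches:
        match["is_selected"] = False
    if not matches:
        return
    top = min(matches, key=lambda m: m["rank"])
    if top["confidence_score"] >= AUTO_SELECT_THRESHOLD:
        top["is_selected"] = True


def select_match(matches: list[dict], gmal_id: str) -> None:
    """Manual selection: choosing one candidate deselects its siblings."""
    for match in matches:
        match["is_selected"] = match["gmal_id"] == gmal_id


candidates = [
    {"gmal_id": "G-101", "rank": 1, "confidence_score": 0.85, "is_selected": False},
    {"gmal_id": "G-207", "rank": 2, "confidence_score": 0.82, "is_selected": False},
]
auto_select(candidates)              # rank 1 clears the threshold
select_match(candidates, "G-207")    # user overrides; G-101 is deselected
```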
+
+---
+
+### Stage 8 — Build Asset-Level Rate Card
+
+**Driver:** pure-Python — no Claude call.
+**Service:** `backend/app/services/ratecard_builder.py`
+**Endpoints:**
+`POST /api/opportunities/{id}/ratecard/build`
+`GET /api/opportunities/{id}/ratecard`
+
+For each `ClientAsset` with a selected match, looks up `GmalHours[gmal_asset, model_type]` and creates a `RatecardLine` per role.
+
+**Bug-4 invariant (carried forward from V1).** `RatecardLine.total_hours` stores hours **per 1 asset** (= `base_hours`); `volume` lives on the row. Aggregators (this endpoint, team_shape, Excel matrix, frontend) multiply by `volume` themselves when computing total effort. Tests assert this directly.
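
To make the invariant concrete, here is a minimal sketch of an aggregator. The `RatecardLine` dataclass is a simplified stand-in for the ORM model (only the fields needed here), and `total_effort_hours` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class RatecardLine:
    role: str
    total_hours: float  # hours for ONE asset (the invariant: not pre-multiplied)
    volume: int         # number of assets on this line


def total_effort_hours(lines: list[RatecardLine]) -> float:
    """Aggregators multiply by volume themselves; stored hours stay per-asset."""
    return sum(line.total_hours * line.volume for line in lines)


lines = [
    RatecardLine(role="Designer", total_hours=2.5, volume=100),
    RatecardLine(role="Copywriter", total_hours=1.0, volume=100),
]
effort = total_effort_hours(lines)  # 2.5*100 + 1.0*100
```

If `total_hours` were ever stored pre-multiplied, an aggregator like this would count the volume twice, which is the class of error the invariant guards against.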
+
+---
+
+### Stage 9 — Recommend Delivery Model
+
+**Agent:** Delivery Model Agent — `backend/app/services/delivery_model_agent.py`
+**Tool name:** `submit_delivery_model`
+**Endpoint:** `POST /api/opportunities/{id}/delivery-model`
+**Inputs:** Stage 2 diagnosis + opportunity's GMAL `model_type`.
+**Output:** headline + per-workflow-stage breakdown.
+
+**Output schema:** `headline` (`traditional` | `ai_supported` | `hybrid`), `summary`, `workflow_stages[]` of `{ stage, approach (manual|ai_supported|fully_automated), tooling[], rationale }`, `tooling_caveats[]`, `risks[]`.
+
+**What the agent is told:**
+
+> You are the Delivery Model Agent for the OLIVER Sales Ops Platform. The team has diagnosed the brief; now you recommend HOW the work should be delivered. Be honest about where AI tooling is genuinely productive vs. where it would dent quality.
+>
+> Tool capability cheat-sheet (as of 2026):
+> - Pencil: paid-social statics + simple static digital ads. Not motion. Not print.
+> - Creative-X: brand-policy automated checks across digital assets.
+> - Semblance: video tooling, mostly cutdowns/edits.
+> - OMG: media/programmatic dynamic creative.
+> - Google Vids / Synthesia: simple internal/eLearning motion only.
+> - Photography / TVC origination / Print mastering: manual.
+>
+> Mastering and origination are typically manual. Adaptation, localisation, social statics, and digital ad cut-downs are good AI candidates. Approvals, brand QA, and client review stay manual. If the brief includes motion at scale or print, the headline should be "hybrid" or "traditional" — don't oversell AI.
+
+---
+
+### Stage 10 — Apply Efficiency Logic
+
+**Driver:** human — no Claude call.
+**Endpoints:**
+`POST /api/opportunities/{id}/efficiency-profile` (save)
+`GET /api/opportunities/{id}/efficiency-profile`
+
+UI offers a scenario picker (Conservative / Moderate / Aggressive), a blanket-percentage slider, per-discipline override sliders (capped at 90%), tools-applied chips, and a notes field. Saved as a `stage_artifact` for the audit trail. Stage 11 reads these to compute team shape.
+
+Programme roles (Programme Director, Head of Project Management, etc.) are never reduced by efficiency, regardless of discipline-level overrides.
+
+---
+
+### Stage 11 — Create Draft Team Shape
+
+**Driver:** pure-Python — no Claude call.
+**Service:** `backend/app/services/team_shape.py`
+**Endpoint:** `GET /api/opportunities/{id}/team-shape?efficiency_pct=N&discipline_overrides={"Creative":40}`
+
+Aggregates `RatecardLine` hours per role (multiplying `base_hours × volume` per the bug-4 invariant), divides by `HOURS_PER_FTE = 1800`, and applies the efficiency percentage: per-discipline overrides take precedence over the blanket percentage, programme roles always get 0%, and the applied percentage is capped at `MAX_EFFICIENCY = 90`. Returns the FTE table grouped by discipline.
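
A sketch of the per-role calculation under one stated assumption: that "applying efficiency" means reducing the raw FTE by the given percentage. The two constants come from the text above; the function names and the short `PROGRAMME_ROLES` set are illustrative only.

```python
HOURS_PER_FTE = 1800
MAX_EFFICIENCY = 90
PROGRAMME_ROLES = {"Programme Director", "Head of Project Management"}  # partial, illustrative


def effective_efficiency(role: str, discipline: str,
                         blanket_pct: float, overrides: dict[str, float]) -> float:
    """Pick the efficiency percentage that applies to one role."""
    if role in PROGRAMME_ROLES:
        return 0.0                                  # never reduced
    pct = overrides.get(discipline, blanket_pct)    # override beats blanket
    return max(0.0, min(float(pct), MAX_EFFICIENCY))


def role_fte(total_hours: float, role: str, discipline: str,
             blanket_pct: float, overrides: dict[str, float]) -> float:
    """total_hours is already base_hours * volume summed for the role."""
    raw_fte = total_hours / HOURS_PER_FTE
    pct = effective_efficiency(role, discipline, blanket_pct, overrides)
    return raw_fte * (1 - pct / 100)   # assumption: efficiency reduces FTE


designer_fte = role_fte(3600, "Designer", "Creative", 20, {"Creative": 40})
```

In this example 3600 hours is 2.0 raw FTE, and the 40% Creative override (rather than the 20% blanket value) brings it to roughly 1.2 FTE.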
+
+---
+
+### Stage 12 — Identify Capability Gaps
+
+**Agent:** Capability Gap Agent — `backend/app/services/capability_gap_agent.py`
+**Tool name:** `submit_capability_gaps`
+**Endpoint:** `POST /api/opportunities/{id}/capability-gaps`
+**Inputs:** Stage 2 diagnosis + opportunity context.
+**Output:** in-scope core list + gaps with sourcing recommendations.
+
+**Output schema:** `core_in_scope[]`, `gaps[]` of `{ capability, criticality (red|amber|green), suggested_source (internal_sme|brandtech_partner|external_vendor), suggested_partner, rationale }`, `summary`.
+
+**What the agent is told:**
+
+> You are the Capability Gap Agent for the OLIVER Sales Ops Platform. OLIVER core capability is: in-house creative + content production (statics, motion, social, eCom, CRM, retail, basic strategy). Things OLIVER typically partners on:
+>
+> - SEO + organic search → Jellyfish.
+> - Performance media buying → media-specialist Brandtech partners.
+> - Social strategy at depth / community management → Gravity Road.
+> - TVC production at scale → external production company.
+> - Influencer/talent management → external agencies.
+> - Localisation in long-tail markets → external translators.
+>
+> Be specific. If the brief doesn't ask for it, don't list it. Mark criticality honestly: RED = we can't deliver without solving this, AMBER = we should partner but could limp, GREEN = nice-to-have.
+
+---
+
+### Stage 13 — Generate Support Docs
+
+**Agent:** Support Docs Agent — `backend/app/services/support_docs_agent.py`
+**Tool name:** `submit_support_docs`
+**Endpoint:** `POST /api/opportunities/{id}/support-docs`
+**Inputs:** Stages 2 (diagnosis), 9 (delivery model), 12 (capability gaps).
+**Output:** caveats / assumptions / SLAs / KPIs / governance.
+
+**Output schema:** `caveats[]`, `assumptions[]`, `slas[]` (≥3, of `{ deliverable, v1_days, v2_days, v3_days, responsible_party, notes }`), `kpis[]` (≥3, of `{ metric, target, measurement }`), `governance[]`, `summary`.
+
+**What the agent is told:**
+
+> You are the Support Docs Agent. Your job is to author the caveats, assumptions, SLAs, KPIs and governance clauses that go into the proposal.
+>
+> Default SLA pattern (mirrors PARASOL Studio SLAs 2024):
+> - 24-48h to acknowledge a brief.
+> - 2 rounds of amends standard.
+> - Static digital ad (5 assets): V1 5 days, V2 2 days, V3 1 day.
+> - eCom hero set: V1 7 days, V2 3 days, V3 1 day.
+> - CRM email template: V1 7 days, V2 3 days, V3 1 day.
+> - Motion social (one master + adapts): V1 14 days, V2 5 days, V3 2 days.
+> Adjust based on stated timeline pressure and the delivery model.
+>
+> Caveats default exclusions (carry forward unless brief explicitly includes): third-party fees, stock photography, talent fees, music licensing, shoot/production costs, localisation beyond N markets, languages beyond N. Be specific.
+>
+> Assumptions: anchor the price. "Brand templates supplied", "masters supplied", "feedback received within 48h", "final approval from a single named decision-maker", etc.
+>
+> KPIs: be honest about what's measurable. Quality, on-time delivery, % of jobs within SLA, first-time-right rate. Avoid vague aspirations.
+
+Phase 2 will swap the prompt-baked defaults for a real `template_library` sourced from PARASOL_Studio_SLAs_2024_V1.pptx.
+
+---
+
+### Stage 14 — Validation & Approval Gates (gated)
+
+**Driver:** approval flow only — no agent.
+**Endpoints:**
+`POST /api/opportunities/{id}/stages/14/approvals` (admin requests an approval per role)
+`POST /api/approvals/{id}/decision` (approver decides)
+`POST /api/opportunities/{id}/stages/14/complete`
+
+Same approval mechanism as Stage 3. Roles requested per the EMEA-DOA chain (commercial / delivery / solution / regional / deal_desk). Each request fires an in-app notification + Mailgun email with a deeplink to the approval page (`/approvals/:id` or `/approvals/by-token/:token`). The opportunity owner gets notified back when each decision lands.
+
+---
+
+### Stage 15 — Build Pitch Materials
+
+**Agent:** Pitch Deck Agent — `backend/app/services/pitch_deck_agent.py`
+**Tool name:** `submit_pitch_deck_outline`
+**Endpoint:** `POST /api/opportunities/{id}/pitch-deck`
+**Inputs:** every upstream artifact (intake, diagnosis, qualification, delivery model, capability gaps, support docs) + normalized assets list.
+**Output:** structured slide-by-slide outline.
+
+**Output schema:** `headline`, `deck_summary`, `slides[]` of `{ section (cover|context|approach|scope|team|commercials|governance|next_steps), title, key_points[] (3-6), speaker_notes, data_callout }`, `appendix[]`.
+
+**What the agent is told:**
+
+> You are the Pitch Deck Agent. The team has run the full intake → diagnosis → qualification → match → ratecard → delivery model → team shape → capability → caveats pipeline. Compose a tight, client-facing pitch deck outline that turns the platform's structured outputs into a slide-by-slide flow.
+>
+> Rules:
+> - 8-12 slides for a sales deck. Be ruthless. Don't pad.
+> - Section flow: cover → context → approach → scope → team → commercials → governance → next steps.
+> - Every slide has 3-6 punchy bullets, not paragraphs. Speaker notes carry the prose.
+> - Use real numbers from the platform: FTE, total hours, asset counts, cost, AI savings %, qualification score. Surface them as data_callouts.
+> - Don't invent. If the platform hasn't produced something, omit the slide rather than fabricate.
+
+**Also exposes** `GET /pitch-deck/markdown` (auto-composed quick deck from raw artifacts, no Claude) and `GET /pitch-deck/outline-markdown` (renders the agent's structured outline to markdown for download).
+
+Phase 2 will render the outline to a real `.pptx` via python-pptx + a branded template library.
+
+---
+
+### Stage 16 — Delivery Planning (post-win)
+
+**Agent:** Implementation Plan Agent — `backend/app/services/implementation_plan_agent.py`
+**Tool name:** `submit_implementation_plan`
+**Endpoint:** `POST /api/opportunities/{id}/implementation-plan`
+**Precondition:** `Opportunity.deal_status == 'won'` (otherwise 400).
+**Inputs:** Stages 2, 9, 12, 13.
+**Output:** phased rollout plan.
+
+**Output schema:** `summary`, `phases[]` (3-6) of `{ name, timeframe, objectives[], milestones[], owner }`, `market_rollout[]` of `{ market, go_live, dependencies[] }`, `training_and_adoption[]`, `compliance_and_policy[]`, `in_flight_metrics[]` of `{ metric, target, review_cadence }`, `risks[]`.
+
+**What the agent is told:**
+
+> You are the Implementation Plan Agent. The pitch WAS WON. Your job is to turn the proposal into an executable post-win rollout plan the delivery team can run from day 1.
+>
+> Rules:
+> - 3-6 phases. Be concrete with weeks / months / quarters.
+> - Use the brief's market list. If 12 EMEA markets, propose a sane market_rollout sequence (lead-market first, fast-followers, long tail).
+> - Training & adoption: include client-side enablement (briefing standards, asset library training, tool access, AI tooling onboarding when delivery model is AI-supported/hybrid).
+> - Compliance: legal sign-off, brand guideline rollout, DAM access, tool policy approval, data-handling, IP & talent rights for any shoots.
+> - in_flight_metrics: pull from Stage 13 KPIs where present, add operational ones (on-time %, first-time-right %, brief-to-final cycle time, % work in SLA).
+> - Owner: real role names. Programme Director / Account Lead / Delivery Lead / Client Lead / Client Marketing Lead.
+
+The frontend's Stage 16 panel also handles the deal-status switcher (Active / Won / Lost / Deprioritized) and a lessons-learned textarea for Lost / Deprioritized deals.
+
+---
+
+### Stage 17 — Trigger Downstream Systems
+
+**Phase 2 deliverable.** Will push the approved opportunity to Salesforce, link out to SharePoint Sales Encyclopedia, and notify regional approvers + talent/recruitment via email/Slack.
+
+---
+
+## Cross-cutting concerns
+
+### Cost tracking
+
+Every Claude call records `input_tokens / output_tokens / cost_usd` (priced at $3/M input, $15/M output for Opus 4.7). The cost is rolled up onto the `Opportunity` (cumulative spend) and stamped on the produced `stage_artifact` (per run). The Stage 8 panel shows the cumulative total; every Claude-driven stage panel header shows the `AgentRunCost` pill for its latest run.
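
The per-call arithmetic, using the rates quoted above, looks roughly like this. `call_cost_usd` and the constant names are hypothetical; the actual logic lives in the `claude_client` utility and may round or store values differently.

```python
INPUT_USD_PER_MTOK = 3.0    # rate quoted in this README
OUTPUT_USD_PER_MTOK = 15.0  # rate quoted in this README


def call_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one call: tokens times the per-million-token rate."""
    return (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


# A call with 100k input tokens and 10k output tokens:
# 100_000 * 3 / 1e6 = 0.30 USD input, 10_000 * 15 / 1e6 = 0.15 USD output.
run_cost = call_cost_usd(100_000, 10_000)

# The cumulative figure is then the sum of the per-run costs.
opportunity_total = sum([run_cost, call_cost_usd(20_000, 2_000)])
```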
+
+A typical full walkthrough on a complex brief (Versuni-class, 25+ assets, 10+ markets) runs about $1.40–$1.50 of Anthropic spend across 26 calls.
+
+### Approvals + email
+
+`Approval` rows belong to a stage and a role (`commercial` / `delivery` / `solution` / `regional` / `deal_desk`). Creating one fires an in-app notification + Mailgun email with a unique `email_token` deeplink (`/approvals/by-token/:token`). The approver decides with optional notes; the opportunity owner is notified back.
+
+Mailgun is optional — when `MAILGUN_API_KEY` is empty the service logs the would-be payload and returns success, so dev environments work without credentials.
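
The "log and return success" fallback might look like the sketch below. The function name, payload shape, and injected `transport` callable are all assumptions made to keep the example self-contained; no real Mailgun API call is shown.

```python
import logging
import os
from typing import Callable

logger = logging.getLogger("osop.email")


def send_approval_email(
    to: str,
    subject: str,
    body: str,
    transport: Callable[[str, dict], bool],
) -> bool:
    """Send via `transport` when a key is configured, else log and succeed."""
    payload = {"to": to, "subject": subject, "text": body}
    api_key = os.environ.get("MAILGUN_API_KEY", "")
    if not api_key:
        # Dev environments: no credentials, so record what would be sent.
        logger.info("MAILGUN_API_KEY empty; would send: %s", payload)
        return True
    return transport(api_key, payload)
```

Returning `True` in the no-key branch means callers such as the approval-request flow do not need a separate code path for dev environments.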
+
+### Auth
+
+Two paths in `backend/app/middleware/auth.py`:
+
+- **DEV_AUTH_BYPASS=true:** every request is treated as the configured `DEV_AUTH_EMAIL` / `DEV_AUTH_NAME` / `DEV_AUTH_ROLE`. The auth middleware upserts an `AppUser` row with that email + role on first hit. `scripts/seed_admin.py` runs at container start to ensure the admin row exists idempotently.
+- **DEV_AUTH_BYPASS unset (production with SSO):** validates the bearer token as an Azure Entra ID JWT. Audience is `AZURE_CLIENT_ID`, issuer is `https://login.microsoftonline.com/{AZURE_TENANT_ID}/v2.0`. JWKS is cached in process memory; the cache is invalidated and refetched on a key miss.
+
+Same upsert path either way — first authenticated visit creates an `AppUser` with role=editor (or whatever `DEV_AUTH_ROLE` says under bypass). Admins are promoted via `/api/users/{id}/role` (when the user-management UI lands) or by direct DB update.
+
+### Stage-machine rules (exact)
+
+- Creating an Opportunity inserts 17 `stage_states` rows: stage 1 `in_progress`, stages 2-17 `not_started`.
+- `/stages/{n}/complete`:
+  - 400 if stage is already `completed`.
+  - 400 if status is `not_started` (i.e. predecessor not done).
+  - For stage `n ∈ {3, 14}`: 400 if no `Approval` rows exist or any are not `approved`.
+  - On success: status → `completed`, `completed_at` stamped. Stage `n+1` flips to `in_progress`. `Opportunity.current_stage` advances.
+- Agent endpoints (`/intake`, `/diagnose`, `/match`, etc.) intentionally don't enforce stage state — the user can iterate freely until they "close" a stage.
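
The rules above can be condensed into a small in-memory sketch. It models stage states as a dict and approvals as a list of status strings; `StageError` stands in for the HTTP 400 response, and all names are hypothetical rather than taken from the backend code.

```python
from typing import Optional

GATED_STAGES = {3, 14}


class StageError(Exception):
    """Stands in for an HTTP 400 from /stages/{n}/complete."""


def new_stage_states() -> dict[int, str]:
    """17 rows: stage 1 in_progress, stages 2-17 not_started."""
    return {n: ("in_progress" if n == 1 else "not_started") for n in range(1, 18)}


def complete_stage(states: dict[int, str], n: int,
                   approval_statuses: list[str]) -> Optional[int]:
    """Close stage n and return the new current stage (None after stage 17)."""
    if states[n] == "completed":
        raise StageError("stage already completed")
    if states[n] == "not_started":
        raise StageError("predecessor not completed")
    if n in GATED_STAGES:
        if not approval_statuses or any(s != "approved" for s in approval_statuses):
            raise StageError("all approvals must be approved")
    states[n] = "completed"
    if n + 1 in states:
        states[n + 1] = "in_progress"
        return n + 1
    return None


states = new_stage_states()
current = complete_stage(states, 1, [])   # stage 2 becomes in_progress
```

Because only an `in_progress` stage can be completed, and a stage only becomes `in_progress` when its predecessor closes, ordering is enforced without an explicit predecessor lookup.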
+
+### Per-stage destructive cascades
+
+Re-running these stages wipes downstream artifacts intentionally:
+
+- **Stage 6 normalize** — wipes `ClientAsset` rows, which CASCADE-deletes `Match` rows and `RatecardLine` rows.
+- **Stage 7 match** — wipes existing `Match` rows for the opportunity's assets first.
+- **Stage 8 ratecard build** — wipes `RatecardLine` rows for the opportunity first.
+- **Stage 2 diagnose** — wipes `clarification_questions` rows where `source_stage=2` (preserving any that were added manually or seeded by other stages).
+
+Tests assert these cascade rules.
+
+---
+
+## Testing
+
+```bash
+python3 -m venv /tmp/osop_test_venv
+/tmp/osop_test_venv/bin/pip install -r backend/requirements-dev.txt psycopg2-binary
+cd backend && /tmp/osop_test_venv/bin/pytest tests/ -v
+```
+
+Current state: **118 tests / 109 pass / 9 skipped (real-Anthropic, marked `@pytest.mark.requires_anthropic`) / 0 failures.**
+
+Test files in `backend/tests/`:
+
+- `test_opportunity_crud.py` — CRUD + defaults + model_type round-trip
+- `test_stage_machine.py` — 17-stage init, advance, double-complete, out-of-order
+- `test_stage_gating.py` — stages 3 and 14 reject without approvals
+- `test_files.py` — upload/extract for `.txt / .md / .docx / .xlsx`, `.exe` rejected
+- `test_intake_agent.py` — Stage 1 endpoints (real-Claude path skipped)
+- `test_diagnosis.py` — Stage 2 + clarification CRUD
+- `test_qualification.py` — TROWLS scoring, thresholds, validation
+- `test_qa_pack.py` — Stage 4 Excel + Word exports
+- `test_assets.py` — Stage 6 ClientAsset CRUD + normalize errors
+- `test_matching.py` — Stage 7 selection rules + 400/404 paths
+- `test_ratecard.py` — Stage 8 + bug-4 invariant explicitly asserted
+- `test_team_shape.py` — Stage 11 FTE math + override precedence
+- `test_agents_stages_9_12_13.py` — error paths for the 3 simple agents
+- `test_approvals.py` — request/decide/token flow + cross-opp 404
+- `test_notifications.py` — bell + mark-read + unread count
+- `test_schema_sanity.py` — list endpoint shapes
+
+The harness hits the live backend on `http://localhost:8003` (override with `OSOP_BASE_URL`), so it exercises the deployed surface rather than an in-process app. `OSOP_TEST_DSN` overrides the psycopg2 DSN used for direct DB seeding.
+
+---
+
+## Repo layout
+
+```
+oliver-sales-ops-platform/
+├── backend/                  # FastAPI + SQLAlchemy + Alembic
+│   ├── alembic/versions/     # 6 migrations: 0001 → 0006
+│   ├── app/
+│   │   ├── api/              # routers per domain (opportunities, approvals, …)
+│   │   ├── middleware/       # auth (Azure JWT + dev bypass + AppUser upsert)
+│   │   ├── models/           # SQLAlchemy ORM
+│   │   ├── schemas/          # Pydantic request/response models
+│   │   ├── services/         # 9 Claude agents + utility services
+│   │   └── utils/            # claude_client (cost tracking, debug log)
+│   ├── scripts/seed_admin.py # idempotent admin user seed (runs at boot)
+│   ├── start.sh              # alembic upgrade head → seed_admin → uvicorn
+│   ├── requirements.txt
+│   ├── requirements-dev.txt
+│   └── tests/                # 16 pytest files, 118 tests
+│
+├── frontend/                 # React 18 + Vite + TS
+│   └── src/
+│       ├── api/              # axios + TanStack Query hooks per domain
+│       ├── auth/             # MSAL setup + AuthProvider
+│       ├── components/       # StageStepper, NotificationBell, AgentRunCost,
+│       │                     # Stage1Intake … Stage16Delivery (one per stage)
+│       └── pages/            # Dashboard, NewOpportunity, OpportunityView,
+│                             # ApprovalView, About
+│
+├── deploy/
+│   ├── deploy.sh                  # idempotent deploy: ports, build, sync,
+│   │                              # apache render, health probe
+│   ├── apache-osop.conf.tmpl      # Apache reverse-proxy template
+│   └── apache-osop.conf           # generated per-deploy (gitignored)
+│
+├── docker-compose.yml        # db + redis + backend (frontend behind dev profile)
+├── .env.example              # all env vars with safe defaults
+└── README.md                 # this file
+```
+
+---
+
+**License:** internal OLIVER tool. Not for redistribution.
+**Repo:** https://bitbucket.org/zlalani/oliver-sales-ops-platform