Add README with per-stage agent reference

A single document a human can read to understand the whole pipeline:

- High-level: what the platform is + a 17-row stage table.
- Quick start (local dev) + smoke test commands.
- Deployment to optical-dev: deploy.sh flow, port-pick logic,
  Apache include line, what to do on first boot vs re-run.
- Architecture summary (state machine, artifacts, approvals, cost).
- Per-stage agent reference (the heart of the doc): for each of the
  9 Claude agents — what it's for, what it reads, output schema, and
  the system-prompt rules in plain English. Plus the non-Claude
  stages (qualification scorecard, Q&A export, ratecard build,
  efficiency profile, team shape) explained the same way.
- Cross-cutting concerns: cost tracking, approvals + Mailgun, auth
  paths (dev-bypass vs Azure JWT), exact stage-machine rules,
  destructive-cascade rules per stage.
- Testing: how to run the 118-test suite + what each file covers.
- Repo layout.

Goal: someone landing on the repo can read this front-to-back in 10
minutes and know what every stage does, what each agent is told, and
how to run / deploy / test the thing.
DJP 2026-04-27 18:31:21 -04:00
parent b553bef43a
commit 7254d95f32

README.md Normal file

@@ -0,0 +1,618 @@
# OLIVER Sales Operations Platform
End-to-end RFP → mobilization pipeline for OLIVER's commercial team. Drop in a brief, walk it through 17 stages (intake → qualification → asset matching → ratecard → delivery model → team shape → caveats → approval gates → pitch → post-win planning), and produce a defensible proposal.
V2 of the GMAL Scope Builder. Phase 1 covers stages 1-16; stage 17 (downstream system push to Salesforce / SharePoint) is deferred to Phase 2.
---
## Table of contents
- [What it does](#what-it-does)
- [The 17 stages](#the-17-stages)
- [Quick start (local dev)](#quick-start-local-dev)
- [Deployment (optical-dev)](#deployment-optical-dev)
- [Architecture](#architecture)
- [Per-stage agent reference](#per-stage-agent-reference)
- [Cross-cutting concerns](#cross-cutting-concerns)
- [Testing](#testing)
- [Repo layout](#repo-layout)
---
## What it does
A single Opportunity (the V1 "Project") progresses through a 17-stage state machine. Each stage either:
- runs a specialised Claude agent over upstream artifacts and produces structured JSON,
- captures a human decision (qualification scorecard, approval, deal status), or
- exports a deliverable (Q&A pack, ratecard, pitch deck markdown).
Stages cannot be skipped — `/stages/{n}/complete` enforces the order. Two stages (3 — Qualification, 14 — Approval Gate) require explicit `Approval` rows to be approved before they'll close. Approvals fan out via in-app notifications and Mailgun email.
Every Claude call records its cost on the produced `stage_artifact` (tokens in / out / USD) so the per-stage spend is visible in the UI.
---
## The 17 stages
| # | Stage | Status | Driver |
|--|--|--|--|
| 1 | Intake Opportunity | ✅ | Claude (Intake Agent) |
| 2 | Read & Diagnose Brief | ✅ | Claude (Diagnosis Agent) |
| 3 | Qualification Assessment | ✅ gated | Human (TROWLS) + approval |
| 4 | Generate Client Q&A Pack | ✅ | Excel/Word export |
| 5 | Ingest Client Answers | ✅ | Human edit |
| 6 | Normalize Asset List | ✅ | Claude (Asset Normalizer) |
| 7 | Match Assets to Job Routes | ✅ | Claude (Match Agent) |
| 8 | Build Asset-Level Rate Card | ✅ | Pure-Python (GMAL hours × volume) |
| 9 | Recommend Delivery Model | ✅ | Claude (Delivery Model Agent) |
| 10 | Apply Efficiency Logic | ✅ | Human (sliders, persisted) |
| 11 | Create Draft Team Shape | ✅ | Pure-Python (FTE calc) |
| 12 | Identify Capability Gaps | ✅ | Claude (Capability Gap Agent) |
| 13 | Generate Support Docs | ✅ | Claude (Support Docs Agent) |
| 14 | Validation & Approval Gates | ✅ gated | Approval flow |
| 15 | Build Pitch Materials | ✅ | Claude (Pitch Deck Agent) + markdown stub |
| 16 | Delivery Planning (post-win) | ✅ | Claude (Implementation Plan Agent) |
| 17 | Trigger Downstream Systems | ⏳ Phase 2 | Salesforce / SharePoint push |
---
## Quick start (local dev)
```bash
git clone git@bitbucket.org:zlalani/oliver-sales-ops-platform.git
cd oliver-sales-ops-platform
cp .env.example .env
$EDITOR .env # set ANTHROPIC_API_KEY at minimum
# Local dev mode: brings up db + redis + backend + Vite dev container
COMPOSE_PROFILES=dev docker compose up -d
# Runs at:
# http://localhost:3011/oliver-sales-ops-platform/ ← frontend (Vite HMR)
# http://localhost:8003/api/health ← backend
# localhost:5435 / localhost:6380 ← postgres / redis
```
`.env` defaults give you `DEV_AUTH_BYPASS=true` so you skip the MSAL login gate and land as the admin user (`admin@oliver.agency`). The backend's auth middleware reads `DEV_AUTH_EMAIL` / `DEV_AUTH_NAME` / `DEV_AUTH_ROLE`.
### Smoke test
```bash
curl http://localhost:8003/api/health
# {"status":"ok","db":"ok"}
curl http://localhost:8003/api/users/me
# {"id":N,"email":"admin@oliver.agency","name":"OSOP Admin","role":"admin",...}
```
### Tests
```bash
python3 -m venv /tmp/osop_test_venv
/tmp/osop_test_venv/bin/pip install -r backend/requirements-dev.txt psycopg2-binary
cd backend && /tmp/osop_test_venv/bin/pytest tests/ -v
# 118 tests, 109 pass, 9 skip (real-Anthropic), 0 fail
```
---
## Deployment (optical-dev)
The dev server hosts a stack of internal apps under one Apache vhost, each at its own URL prefix and backend port. This app sits at `/oliver-sales-ops-platform/`.
```bash
sudo git clone git@bitbucket.org:zlalani/oliver-sales-ops-platform.git /opt/oliver-sales-ops-platform
cd /opt/oliver-sales-ops-platform
sudo cp .env.example .env && sudo $EDITOR .env
# → set ANTHROPIC_API_KEY
# → set APP_PUBLIC_URL=https://optical-dev.oliver.solutions
# → leave DEV_AUTH_BYPASS=true until SSO is wired
sudo ./deploy/deploy.sh
```
**What `deploy.sh` does:**
1. Auto-picks free host ports (`OSOP_DB_PORT` / `OSOP_REDIS_PORT` / `OSOP_BACKEND_PORT`). If 8003 is taken, it scans 8004→8099 and persists the chosen port back to `.env`.
2. Renders `deploy/apache-osop.conf` from `apache-osop.conf.tmpl` with the chosen backend port substituted in.
3. `git pull && docker compose build && docker compose up -d` (db + redis + backend).
4. Builds the Vite SPA in a one-shot `node:20` container, syncs the `dist/` to `/var/www/html/oliver-sales-ops-platform/`. Pipes `DEV_AUTH_BYPASS` through to `VITE_DEV_AUTH_BYPASS` so the SPA matches the backend's auth setting.
5. Polls `/api/health` until ready, prints URLs + admin email.
Flags: `--no-pull`, `--no-build`, `--no-frontend`, `--logs`.
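
The port-pick step in item 1 can be pictured with the sketch below. It is an illustrative Python rendering of the scan, not the contents of `deploy.sh`; the names `port_is_free` and `pick_port` and the injectable `is_free` hook are assumptions made for this example.

```python
import socket
from typing import Callable


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def pick_port(preferred: int = 8003, scan_end: int = 8099,
              is_free: Callable[[int], bool] = port_is_free) -> int:
    """Try the preferred port first, then scan upward through scan_end."""
    for port in range(preferred, scan_end + 1):
        if is_free(port):
            return port
    raise RuntimeError(f"no free port in {preferred}-{scan_end}")
```

In this sketch the caller would write the returned port back to `.env`, which is what lets a re-run reuse the same port.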
**Apache wiring (one-time):**
```bash
echo 'Include /opt/oliver-sales-ops-platform/deploy/apache-osop.conf' \
| sudo tee -a /etc/apache2/sites-enabled/optical-dev.oliver.solutions.conf
sudo apachectl configtest && sudo systemctl reload apache2
```
The conf is regenerated on every deploy. If the backend port changes (because something else grabbed 8003), `deploy.sh` will tell you to reload Apache again.
---
## Architecture
**Backend.** FastAPI + async SQLAlchemy + Alembic on Python 3.12. Postgres 16 for state, Redis for future Celery work. Anthropic SDK against Claude Opus 4.7. Azure JWT validation via python-jose; `DEV_AUTH_BYPASS=true` swaps the JWT path for a configurable identity (used until SSO is wired on the dev server).
**Frontend.** React 18 + Vite + TypeScript. TanStack Query for server state, React Router for the per-stage views. MSAL (Azure SSO) is in the box but bypassed on the dev server. Mermaid renders the 17-stage flowchart on the About page.
**Stages as a state machine.** Every Opportunity has 17 `stage_states` rows (created on opportunity creation; stage 1 starts `in_progress`, the rest `not_started`). `/stages/{n}/complete` validates the predecessor is completed and, for stages 3 and 14, that all linked `Approval` rows are `approved`.
**Artifacts.** Each agent run persists a `stage_artifact` row carrying its JSON output + cost stamp (`cost_usd` / `input_tokens` / `output_tokens`). Re-runs produce new artifacts; the UI shows the latest per stage.
**Approvals.** A user with EDITOR or ADMIN role can request approvals on stages 3 / 14. Each approval gets a unique `email_token` and fires a Mailgun email with a deeplink. The approver clicks through, lands on `/approvals/:id`, decides (with optional notes), and the opportunity owner gets notified.
**Cost tracking.** Per-call usage is rolled up onto the `Opportunity` (cumulative) and stamped on the `stage_artifact` (per run). The Stage 8 panel shows the running total; every agent panel header shows its run cost.
---
## Per-stage agent reference
This section is the operator's manual for each stage — what the agent does, what it reads, what it produces, and the rules baked into its prompt.
### Stage 1 — Intake Opportunity
**Agent:** Intake Agent — `backend/app/services/intake_agent.py`
**Tool name:** `submit_intake_metadata`
**Endpoint:** `POST /api/opportunities/{id}/intake`
**Inputs:** all uploaded `OpportunityFile` rows, concatenated and capped at ~150k characters.
**Output:** structured opportunity metadata.
**Output schema:** `client_name`, `region`, `brands[]`, `service_types[]`, `deadline_iso`, `go_live_iso`, `summary`.
**What the agent is told:**
> You are the Intake Agent for the OLIVER Sales Ops Platform. Your job is to read the documents an account team has uploaded for a new opportunity and extract a tight structured summary so the rest of the pipeline can plan the response.
>
> Be conservative — only fill fields you can ground in the documents. Omit a field rather than guessing. Brands should be specific consumer-facing brands, not the umbrella client. Service types should be concrete categories (Content, eCommerce, Social, CRM, Production, Strategy, etc.) not vague phrases. Dates must be ISO (YYYY-MM-DD); convert phrases like "end of Q2 2026" to a sensible explicit date and mention the original phrase in the summary.
**Side effect:** populates `Opportunity.client_name` / `region` / `brands` / `service_types` / `deadline` / `go_live` / `description` ONLY when those fields are blank — manual entries are never overwritten.
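
A minimal sketch of that fill-only-when-blank rule, using plain dicts instead of the ORM model. The function name `apply_intake` and the definition of "blank" (`None`, empty string, or empty list) are assumptions for illustration.

```python
from typing import Any

BLANK_VALUES = (None, "", [])


def apply_intake(opportunity: dict[str, Any], extracted: dict[str, Any]) -> dict[str, Any]:
    """Copy extracted values into fields that are still blank; never overwrite."""
    merged = dict(opportunity)
    for field_name, value in extracted.items():
        # Only fill when the current value is blank and the agent found something.
        if merged.get(field_name) in BLANK_VALUES and value not in BLANK_VALUES:
            merged[field_name] = value
    return merged
```

With this shape, a manually entered `client_name` survives a re-run of the agent, while an empty `region` picks up the extracted value.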
---
### Stage 2 — Read & Diagnose Brief
**Agent:** Diagnosis Agent — `backend/app/services/diagnosis_agent.py`
**Tool name:** `submit_brief_diagnosis`
**Endpoint:** `POST /api/opportunities/{id}/diagnose`
**Inputs:** all uploaded files (same as Stage 1).
**Output:** structured diagnosis + ≥6 client clarification questions.
**Output schema:** `deliverables[]` (name / category / volume_estimate / complexity_hint), `channels[]`, `markets[]`, `capabilities_required[]`, `kpis_slas[]`, `tech_asks[]`, `timelines[]`, `ambiguities[]`, `contradictions[]`, `complexity_assessment` (low/medium/high), `summary`, `clarifications[]` (category / question / rationale / priority red|amber|green).
**What the agent is told:**
> You are the Diagnosis Agent for the OLIVER Sales Ops Platform. The account team has uploaded an RFP / brief and you must read it like a senior strategist preparing a scope.
>
> For deliverables: name specific assets, not vague phrases. "Paid social statics" is better than "social content". Include volume even when fuzzy — capture the brief's own words ("TBC", "~50/month", "on demand").
>
> For ambiguities: anything the brief leaves open that affects effort. "Multiple markets" without a list is an ambiguity. "Standard turnaround" without days is an ambiguity. "Feedback iterations" without a count is an ambiguity.
>
> For contradictions: explicit conflicts (timeline says Q2, go-live says Q1; volumes say 200 in one section and 500 in another).
>
> **For clarifications (MANDATORY — minimum 6, target 8-15):** ambiguities and contradictions are observations; clarifications are the actual *questions* you'd send the client to resolve them. Every ambiguity you list MUST also appear here as an actionable, question-marked sentence with a category + priority + rationale. This is what Stage 4 packages into the client Q&A pack — if you leave it empty, the rest of the pipeline breaks.
**Side effect:** wipes existing `source_stage=2` clarification rows and inserts fresh ones. Stage 5 ingests answers against these.
---
### Stage 3 — Qualification Assessment (gated)
**Driver:** human — no Claude call.
**Endpoints:**
`POST /api/opportunities/{id}/qualification` (save TROWLS scorecard)
`GET /api/opportunities/{id}/qualification`
`POST /api/opportunities/{id}/stages/3/approvals` (request approval)
`POST /api/approvals/{id}/decision` (approve / reject)
**TROWLS dimensions** (each scored 0-10):
- **T**iming — do we have time to win and deliver?
- **R**elationship — how well do we know the client / decision-makers?
- **O**pportunity — deal size, strategic value, future pipeline.
- **W**hat We Know — quality of brief, market data, prior knowledge.
- **L**ocation — entity availability, hiring viability, labour law fit.
- **S**ector — sector experience + conflict-list checks.
Total 0-60 → percentage 0-100 → recommendation: `proceed` (≥60), `slt_review` (≥50), `no_go` (<50). Stamped on `Opportunity.qualification_score`.
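
The scoring arithmetic can be sketched as follows. The dimension key names and the function name `score_trowls` are assumptions for this example; the thresholds come from the paragraph above.

```python
TROWLS_DIMENSIONS = ("timing", "relationship", "opportunity",
                     "what_we_know", "location", "sector")


def score_trowls(scores: dict[str, int]) -> tuple[int, float, str]:
    """Return (total out of 60, percentage, recommendation)."""
    for dim in TROWLS_DIMENSIONS:
        if not 0 <= scores[dim] <= 10:
            raise ValueError(f"{dim} must be between 0 and 10")
    total = sum(scores[dim] for dim in TROWLS_DIMENSIONS)
    percentage = total * 100 / 60
    if percentage >= 60:
        recommendation = "proceed"
    elif percentage >= 50:
        recommendation = "slt_review"
    else:
        recommendation = "no_go"
    return total, percentage, recommendation
```

For instance, six scores of 5 total 30 out of 60, which is 50% and lands in `slt_review`.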
**Gate.** `/stages/3/complete` 400s unless at least one `Approval` row exists for stage 3 AND every approval is `approved`.
---
### Stage 4 — Generate Client Q&A Pack
**Driver:** pure-Python export — no Claude call.
**Endpoints:**
`GET /api/opportunities/{id}/qa-pack/excel` → .xlsx
`GET /api/opportunities/{id}/qa-pack/word` → .docx
Reads the `clarification_questions` table seeded by Stage 2, sorts by priority (RED → AMBER → GREEN), groups by category, and exports a populated client-facing pack. Excel has a colour-coded priority column + free-text answer column. Word groups by category with rationale + answer slot per question.
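
The sort-then-group step might look like the sketch below. It works on plain dicts rather than ORM rows, and the function name `group_for_pack` is hypothetical; the exporter's real structure is not shown in this document.

```python
from collections import defaultdict

PRIORITY_ORDER = {"red": 0, "amber": 1, "green": 2}


def group_for_pack(questions: list[dict]) -> dict[str, list[dict]]:
    """Sort by priority (red first), then group by category, keeping that order."""
    # sorted() is stable, so questions with equal priority keep their input order.
    ordered = sorted(questions, key=lambda q: PRIORITY_ORDER[q["priority"]])
    grouped: dict[str, list[dict]] = defaultdict(list)
    for question in ordered:
        grouped[question["category"]].append(question)
    return dict(grouped)
```

Because grouping happens after sorting, each category lists its red questions first, and categories containing a red question appear ahead of those without one.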
---
### Stage 5 — Ingest Client Answers
**Driver:** human — no Claude call.
**Endpoint:** `PUT /api/opportunities/{id}/clarifications/{cid}`
Per-question editor on the frontend. The user pastes the client's reply into each `client_answer` field; status auto-flips to `answered` and `answered_at` is stamped. Items can also be `dismissed` (out of scope) or `pending` again.
---
### Stage 6 — Normalize Asset List
**Agent:** Asset Normalizer — `backend/app/services/asset_normalizer.py`
**Tool name:** `submit_normalized_assets`
**Endpoint:** `POST /api/opportunities/{id}/assets/normalize`
**Inputs:** uploaded files + Stage 2 diagnosis (when present).
**Output:** clean `ClientAsset[]` list.
**Output schema:** `assets[]` of `{ raw_name, raw_description, client_tier, volume }`.
**What the agent is told:**
> You are the Asset Normalizer for the OLIVER Sales Ops Platform. Each row is something a creative agency can scope hours against.
>
> Rules:
> - One asset per row. If the brief says "toolbox A/B/C", emit three rows (one per tier).
> - Use specific names ('PDP hero banner', not 'web content').
> - Capture tier letters / bands when the brief uses them. Leave client_tier blank if it doesn't.
> - Volume is integer. If the brief says ranges ("100-200"), pick the midpoint. If TBC, set 1 and put "Volume TBC — confirm with client" in the description.
> - Do NOT invent assets the brief doesn't mention. Be exhaustive but honest.
**Side effect:** wipes existing `ClientAsset` rows for the opportunity (cascading to matches + ratecard_lines), then inserts fresh ones.
---
### Stage 7 — Match Assets to Job Routes
**Agent:** Match Agent — `backend/app/services/ai_matching.py`
**Tool name:** `submit_matches`
**Endpoint:** `POST /api/opportunities/{id}/match` (background task) → `GET /matches`
**Inputs:** each ClientAsset + the **full GMAL catalog** (~243 hour-route entries, ~3-20k tokens depending on AI-enhanced descriptions) + Stage 2 brief context.
**Output:** up to 3 candidate `Match` rows per ClientAsset, ranked.
**Output schema:** `matches[]` of `{ gmal_id, confidence (exact|close|multiple|none), confidence_score (0-1), reasoning, caveats }`.
**What the agent is told:**
> You are a GMAL asset matching specialist for a creative production agency. You match client-described assets to the closest entry in the GMAL catalog.
>
> Guidelines:
> - Match on the TYPE of deliverable first, then complexity.
> - Bridge terminology: "KV" / "Key Visual" = Photography GMALs; "PDP" / "product listing" = eCommerce / Copywriting GMALs; "launch video" = Campaign Video GMALs; "social post" = Social GMALs; "banner" / "display" = Display / Standard Banner GMALs.
> - Return your single best match. Only add a 2nd/3rd if within 5% of the top score.
> - exact: 0.9-1.0. close: 0.6-0.89. none: <0.3.
> - Always state caveats — what the GMAL covers vs what the client described.
> - Match complexity literally — "simple banner" → Simple GMAL, not Complex.
**Side effects:**
- Wipes prior matches for this opportunity's assets first.
- Auto-selects the rank-1 match when its score ≥ 0.8.
- Saves a `matching_run` `stage_artifact` with the run total cost + counts.
- Toggling `is_selected` to true on one match auto-deselects siblings (one selection per asset).
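
The two selection rules above (auto-select at ≥ 0.8, one selection per asset) can be sketched in a few lines. The `Match` dataclass here is a simplified stand-in for the ORM model, and the function names are assumptions.

```python
from dataclasses import dataclass

AUTO_SELECT_THRESHOLD = 0.8


@dataclass
class Match:
    gmal_id: str
    confidence_score: float
    is_selected: bool = False


def select(matches: list[Match], gmal_id: str) -> None:
    """Select one match and deselect its siblings (one selection per asset)."""
    for match in matches:
        match.is_selected = (match.gmal_id == gmal_id)


def auto_select(ranked: list[Match]) -> None:
    """Select the rank-1 match only when its score clears the threshold."""
    if ranked and ranked[0].confidence_score >= AUTO_SELECT_THRESHOLD:
        select(ranked, ranked[0].gmal_id)
```

Here `ranked` is assumed to hold the candidates for a single `ClientAsset`, best first.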
---
### Stage 8 — Build Asset-Level Rate Card
**Driver:** pure-Python — no Claude call.
**Service:** `backend/app/services/ratecard_builder.py`
**Endpoints:**
`POST /api/opportunities/{id}/ratecard/build`
`GET /api/opportunities/{id}/ratecard`
For each `ClientAsset` with a selected match, looks up `GmalHours[gmal_asset, model_type]` and creates a `RatecardLine` per role.
**Bug-4 invariant (carried forward from V1).** `RatecardLine.total_hours` stores hours **per 1 asset** (= `base_hours`); `volume` lives on the row. Aggregators (this endpoint, team_shape, Excel matrix, frontend) multiply by `volume` themselves when computing total effort. Tests assert this directly.
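
To make the invariant concrete, here is a small sketch with a simplified `RatecardLine`. It assumes `volume` is stored on the line itself, and `total_effort` is a hypothetical aggregator, not the service's actual function.

```python
from dataclasses import dataclass


@dataclass
class RatecardLine:
    role: str
    total_hours: float  # hours for ONE asset (equals base_hours)
    volume: int         # number of assets; never pre-multiplied into total_hours


def total_effort(lines: list[RatecardLine]) -> float:
    """Aggregators multiply by volume themselves."""
    return sum(line.total_hours * line.volume for line in lines)
```

A line of 2.5 hours per asset at volume 40 therefore contributes 100 hours to the total, while the stored `total_hours` stays 2.5.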
---
### Stage 9 — Recommend Delivery Model
**Agent:** Delivery Model Agent — `backend/app/services/delivery_model_agent.py`
**Tool name:** `submit_delivery_model`
**Endpoint:** `POST /api/opportunities/{id}/delivery-model`
**Inputs:** Stage 2 diagnosis + opportunity's GMAL `model_type`.
**Output:** headline + per-workflow-stage breakdown.
**Output schema:** `headline` (`traditional` | `ai_supported` | `hybrid`), `summary`, `workflow_stages[]` of `{ stage, approach (manual|ai_supported|fully_automated), tooling[], rationale }`, `tooling_caveats[]`, `risks[]`.
**What the agent is told:**
> You are the Delivery Model Agent for the OLIVER Sales Ops Platform. The team has diagnosed the brief; now you recommend HOW the work should be delivered. Be honest about where AI tooling is genuinely productive vs. where it would dent quality.
>
> Tool capability cheat-sheet (as of 2026):
> - Pencil: paid-social statics + simple static digital ads. Not motion. Not print.
> - Creative-X: brand-policy automated checks across digital assets.
> - Semblance: video tooling, mostly cutdowns/edits.
> - OMG: media/programmatic dynamic creative.
> - Google Vids / Synthesia: simple internal/eLearning motion only.
> - Photography / TVC origination / Print mastering: manual.
>
> Mastering and origination are typically manual. Adaptation, localisation, social statics, and digital ad cut-downs are good AI candidates. Approvals, brand QA, and client review stay manual. If the brief includes motion at scale or print, the headline should be "hybrid" or "traditional" — don't oversell AI.
---
### Stage 10 — Apply Efficiency Logic
**Driver:** human — no Claude call.
**Endpoints:**
`POST /api/opportunities/{id}/efficiency-profile` (save)
`GET /api/opportunities/{id}/efficiency-profile`
UI offers a scenario picker (Conservative / Moderate / Aggressive), a blanket-percentage slider, per-discipline override sliders (capped at 90%), tools-applied chips, and a notes field. Saved as a `stage_artifact` for the audit trail. Stage 11 reads these to compute team shape.
Programme roles (Programme Director, Head of Project Management, etc.) are never reduced by efficiency, regardless of discipline-level overrides.
---
### Stage 11 — Create Draft Team Shape
**Driver:** pure-Python — no Claude call.
**Service:** `backend/app/services/team_shape.py`
**Endpoint:** `GET /api/opportunities/{id}/team-shape?efficiency_pct=N&discipline_overrides={"Creative":40}`
Aggregates `RatecardLine` hours per role (multiplying `base_hours × volume` per the bug-4 invariant), divides by `HOURS_PER_FTE = 1800`, applies efficiency (per-discipline overrides take precedence over the blanket percentage; programme roles always 0%), caps at `MAX_EFFICIENCY = 90`. Returns the FTE table grouped by discipline.
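
The per-role arithmetic described above can be sketched like this. The constants come from the text; the function signature, and the assumption that efficiency is applied as a straight percentage reduction, are illustrative.

```python
HOURS_PER_FTE = 1800
MAX_EFFICIENCY = 90  # percent


def fte_for_role(hours: float, discipline: str, is_programme_role: bool,
                 efficiency_pct: float,
                 discipline_overrides: dict[str, float]) -> float:
    """FTE for one role after efficiency is applied.

    `hours` is already aggregated as base_hours x volume across ratecard lines.
    """
    if is_programme_role:
        pct = 0.0  # programme roles are never reduced
    else:
        # A per-discipline override wins over the blanket percentage.
        pct = discipline_overrides.get(discipline, efficiency_pct)
        pct = min(pct, MAX_EFFICIENCY)
    return hours / HOURS_PER_FTE * (1 - pct / 100)
```

For example, 3,600 Creative hours with a blanket 20% and a Creative override of 50% comes out at 1.0 FTE, because the override takes precedence.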
---
### Stage 12 — Identify Capability Gaps
**Agent:** Capability Gap Agent — `backend/app/services/capability_gap_agent.py`
**Tool name:** `submit_capability_gaps`
**Endpoint:** `POST /api/opportunities/{id}/capability-gaps`
**Inputs:** Stage 2 diagnosis + opportunity context.
**Output:** in-scope core list + gaps with sourcing recommendations.
**Output schema:** `core_in_scope[]`, `gaps[]` of `{ capability, criticality (red|amber|green), suggested_source (internal_sme|brandtech_partner|external_vendor), suggested_partner, rationale }`, `summary`.
**What the agent is told:**
> You are the Capability Gap Agent for the OLIVER Sales Ops Platform. OLIVER core capability is: in-house creative + content production (statics, motion, social, eCom, CRM, retail, basic strategy). Things OLIVER typically partners on:
>
> - SEO + organic search → Jellyfish.
> - Performance media buying → media-specialist Brandtech partners.
> - Social strategy at depth / community management → Gravity Road.
> - TVC production at scale → external production company.
> - Influencer/talent management → external agencies.
> - Localisation in long-tail markets → external translators.
>
> Be specific. If the brief doesn't ask for it, don't list it. Mark criticality honestly: RED = we can't deliver without solving this, AMBER = we should partner but could limp, GREEN = nice-to-have.
---
### Stage 13 — Generate Support Docs
**Agent:** Support Docs Agent — `backend/app/services/support_docs_agent.py`
**Tool name:** `submit_support_docs`
**Endpoint:** `POST /api/opportunities/{id}/support-docs`
**Inputs:** Stages 2 (diagnosis), 9 (delivery model), 12 (capability gaps).
**Output:** caveats / assumptions / SLAs / KPIs / governance.
**Output schema:** `caveats[]`, `assumptions[]`, `slas[]` (≥3, of `{ deliverable, v1_days, v2_days, v3_days, responsible_party, notes }`), `kpis[]` (≥3, of `{ metric, target, measurement }`), `governance[]`, `summary`.
**What the agent is told:**
> You are the Support Docs Agent. Your job is to author the caveats, assumptions, SLAs, KPIs and governance clauses that go into the proposal.
>
> Default SLA pattern (mirrors PARASOL Studio SLAs 2024):
> - 24-48h to acknowledge a brief.
> - 2 rounds of amends standard.
> - Static digital ad (5 assets): V1 5 days, V2 2 days, V3 1 day.
> - eCom hero set: V1 7 days, V2 3 days, V3 1 day.
> - CRM email template: V1 7 days, V2 3 days, V3 1 day.
> - Motion social (one master + adapts): V1 14 days, V2 5 days, V3 2 days.
> Adjust based on stated timeline pressure and the delivery model.
>
> Caveats default exclusions (carry forward unless brief explicitly includes): third-party fees, stock photography, talent fees, music licensing, shoot/production costs, localisation beyond N markets, languages beyond N. Be specific.
>
> Assumptions: anchor the price. "Brand templates supplied", "masters supplied", "feedback received within 48h", "final approval from a single named decision-maker", etc.
>
> KPIs: be honest about what's measurable. Quality, on-time delivery, % of jobs within SLA, first-time-right rate. Avoid vague aspirations.
Phase 2 will swap the prompt-baked defaults for a real `template_library` sourced from PARASOL_Studio_SLAs_2024_V1.pptx.
---
### Stage 14 — Validation & Approval Gates (gated)
**Driver:** approval flow only — no agent.
**Endpoints:**
`POST /api/opportunities/{id}/stages/14/approvals` (admin requests an approval per role)
`POST /api/approvals/{id}/decision` (approver decides)
`POST /api/opportunities/{id}/stages/14/complete`
Same approval mechanism as Stage 3. Roles requested per the EMEA-DOA chain (commercial / delivery / solution / regional / deal_desk). Each request fires an in-app notification + Mailgun email with a deeplink to the approval page (`/approvals/:id` or `/approvals/by-token/:token`). The opportunity owner gets notified back when each decision lands.
---
### Stage 15 — Build Pitch Materials
**Agent:** Pitch Deck Agent — `backend/app/services/pitch_deck_agent.py`
**Tool name:** `submit_pitch_deck_outline`
**Endpoint:** `POST /api/opportunities/{id}/pitch-deck`
**Inputs:** every upstream artifact (intake, diagnosis, qualification, delivery model, capability gaps, support docs) + normalized assets list.
**Output:** structured slide-by-slide outline.
**Output schema:** `headline`, `deck_summary`, `slides[]` of `{ section (cover|context|approach|scope|team|commercials|governance|next_steps), title, key_points[] (3-6), speaker_notes, data_callout }`, `appendix[]`.
**What the agent is told:**
> You are the Pitch Deck Agent. The team has run the full intake → diagnosis → qualification → match → ratecard → delivery model → team shape → capability → caveats pipeline. Compose a tight, client-facing pitch deck outline that turns the platform's structured outputs into a slide-by-slide flow.
>
> Rules:
> - 8-12 slides for a sales deck. Be ruthless. Don't pad.
> - Section flow: cover → context → approach → scope → team → commercials → governance → next steps.
> - Every slide has 3-6 punchy bullets, not paragraphs. Speaker notes carry the prose.
> - Use real numbers from the platform: FTE, total hours, asset counts, cost, AI savings %, qualification score. Surface them as data_callouts.
> - Don't invent. If the platform hasn't produced something, omit the slide rather than fabricate.
**Also exposes** `GET /pitch-deck/markdown` (auto-composed quick deck from raw artifacts, no Claude) and `GET /pitch-deck/outline-markdown` (renders the agent's structured outline to markdown for download).
Phase 2 will render the outline to a real `.pptx` via python-pptx + a branded template library.
---
### Stage 16 — Delivery Planning (post-win)
**Agent:** Implementation Plan Agent — `backend/app/services/implementation_plan_agent.py`
**Tool name:** `submit_implementation_plan`
**Endpoint:** `POST /api/opportunities/{id}/implementation-plan`
**Precondition:** `Opportunity.deal_status == 'won'` (otherwise 400).
**Inputs:** Stages 2, 9, 12, 13.
**Output:** phased rollout plan.
**Output schema:** `summary`, `phases[]` (3-6) of `{ name, timeframe, objectives[], milestones[], owner }`, `market_rollout[]` of `{ market, go_live, dependencies[] }`, `training_and_adoption[]`, `compliance_and_policy[]`, `in_flight_metrics[]` of `{ metric, target, review_cadence }`, `risks[]`.
**What the agent is told:**
> You are the Implementation Plan Agent. The pitch WAS WON. Your job is to turn the proposal into an executable post-win rollout plan the delivery team can run from day 1.
>
> Rules:
> - 3-6 phases. Be concrete with weeks / months / quarters.
> - Use the brief's market list. If 12 EMEA markets, propose a sane market_rollout sequence (lead-market first, fast-followers, long tail).
> - Training & adoption: include client-side enablement (briefing standards, asset library training, tool access, AI tooling onboarding when delivery model is AI-supported/hybrid).
> - Compliance: legal sign-off, brand guideline rollout, DAM access, tool policy approval, data-handling, IP & talent rights for any shoots.
> - in_flight_metrics: pull from Stage 13 KPIs where present, add operational ones (on-time %, first-time-right %, brief-to-final cycle time, % work in SLA).
> - Owner: real role names. Programme Director / Account Lead / Delivery Lead / Client Lead / Client Marketing Lead.
The frontend's Stage 16 panel also handles the deal-status switcher (Active / Won / Lost / Deprioritized) and a lessons-learned textarea for Lost / Deprioritized deals.
---
### Stage 17 — Trigger Downstream Systems
**Phase 2 deliverable.** Will push the approved opportunity to Salesforce, link out to SharePoint Sales Encyclopedia, and notify regional approvers + talent/recruitment via email/Slack.
---
## Cross-cutting concerns
### Cost tracking
Every Claude call records `input_tokens / output_tokens / cost_usd` (priced at $3/M input, $15/M output for Opus 4.7). The cost is rolled up onto the `Opportunity` (cumulative spend) and stamped on the produced `stage_artifact` (per run). The Stage 8 panel shows the cumulative total; every Claude-driven stage panel header shows the `AgentRunCost` pill for its latest run.
A typical full walkthrough on a complex brief (Versuni-class, 25+ assets, 10+ markets) runs about $1.40-$1.50 of Anthropic spend across 26 calls.
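
As a worked example of the pricing arithmetic, the sketch below computes the cost of a single call from the per-million-token rates quoted above. The constant and function names are assumptions; the real logic lives in the `claude_client` utility and may differ.

```python
INPUT_USD_PER_MTOK = 3.0    # rate quoted in this README
OUTPUT_USD_PER_MTOK = 15.0  # rate quoted in this README


def call_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one call at the per-million-token rates above."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000
```

A call with 20,000 input tokens and 2,000 output tokens costs $0.06 + $0.03 = $0.09 at these rates.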
### Approvals + email
`Approval` rows belong to a stage and a role (`commercial` / `delivery` / `solution` / `regional` / `deal_desk`). Creating one fires an in-app notification + Mailgun email with a unique `email_token` deeplink (`/approvals/by-token/:token`). The approver decides with optional notes; the opportunity owner is notified back.
Mailgun is optional — when `MAILGUN_API_KEY` is empty the service logs the would-be payload and returns success, so dev environments work without credentials.
### Auth
Two paths in `backend/app/middleware/auth.py`:
- **DEV_AUTH_BYPASS=true:** every request is treated as the configured `DEV_AUTH_EMAIL` / `DEV_AUTH_NAME` / `DEV_AUTH_ROLE`. The auth middleware upserts an `AppUser` row with that email + role on first hit. `scripts/seed_admin.py` runs at container start to ensure the admin row exists idempotently.
- **DEV_AUTH_BYPASS unset (production with SSO):** validates the bearer token as an Azure Entra ID JWT. Audience is `AZURE_CLIENT_ID`, issuer is `https://login.microsoftonline.com/{AZURE_TENANT_ID}/v2.0`. JWKS is cached in process memory; the cache is invalidated and refetched on a key miss.
Same upsert path either way — first authenticated visit creates an `AppUser` with role=editor (or whatever `DEV_AUTH_ROLE` says under bypass). Admins are promoted via `/api/users/{id}/role` (when the user-management UI lands) or by direct DB update.
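
The "refetch on a key miss" behaviour of the JWKS cache can be sketched independently of any JWT library. The class below is a hypothetical illustration: `fetch_jwks` stands in for whatever downloads the tenant's key set, and signature, audience, and issuer validation are not shown.

```python
from typing import Callable, Optional

Jwk = dict[str, str]


class JwksCache:
    """In-memory JWKS cache that refetches when a key ID is missing."""

    def __init__(self, fetch_jwks: Callable[[], list[Jwk]]):
        self._fetch_jwks = fetch_jwks  # hypothetical: returns the JWKS "keys" list
        self._keys_by_kid: Optional[dict[str, Jwk]] = None

    def get_key(self, kid: str) -> Jwk:
        if self._keys_by_kid is None or kid not in self._keys_by_kid:
            # Cold cache or key miss: drop what we have and fetch again.
            self._keys_by_kid = {key["kid"]: key for key in self._fetch_jwks()}
        if kid not in self._keys_by_kid:
            raise KeyError(f"unknown signing key: {kid}")
        return self._keys_by_kid[kid]
```

Refetching on a miss is what lets the process pick up a rotated signing key without a restart, at the cost of one extra fetch whenever a token carries an unknown `kid`.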
### Stage-machine rules (exact)
- Creating an Opportunity inserts 17 `stage_states` rows: stage 1 `in_progress`, stages 2-17 `not_started`.
- `/stages/{n}/complete`:
  - 400 if stage is already `completed`.
  - 400 if status is `not_started` (i.e. predecessor not done).
  - For stage `n ∈ {3, 14}`: 400 if no `Approval` rows exist or any are not `approved`.
  - On success: status → `completed`, `completed_at` stamped. Stage `n+1` flips to `in_progress`. `Opportunity.current_stage` advances.
- Agent endpoints (`/intake`, `/diagnose`, `/match`, etc.) intentionally don't enforce stage state — the user can iterate freely until they "close" a stage.
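
The rules above can be condensed into a small in-memory sketch. It uses a dataclass rather than the real `stage_states` and `Approval` tables, raises a hypothetical `StageError` where the API returns a 400, and leaves out the `completed_at` timestamp.

```python
from dataclasses import dataclass, field

GATED_STAGES = {3, 14}


class StageError(Exception):
    """Stands in for an HTTP 400 response."""


@dataclass
class Opportunity:
    current_stage: int = 1
    # stage number -> status; stage 1 starts in_progress, the rest not_started
    stages: dict[int, str] = field(default_factory=lambda: {
        n: ("in_progress" if n == 1 else "not_started") for n in range(1, 18)})
    # stage number -> statuses of the Approval rows requested for that stage
    approvals: dict[int, list[str]] = field(default_factory=dict)


def complete_stage(opp: Opportunity, n: int) -> None:
    status = opp.stages[n]
    if status == "completed":
        raise StageError("stage already completed")
    if status == "not_started":
        raise StageError("predecessor not completed")
    if n in GATED_STAGES:
        rows = opp.approvals.get(n, [])
        if not rows or any(row != "approved" for row in rows):
            raise StageError("approvals missing or not all approved")
    opp.stages[n] = "completed"
    if n + 1 in opp.stages:  # assumption: nothing advances after stage 17
        opp.stages[n + 1] = "in_progress"
        opp.current_stage = n + 1
```

Because only the current stage is ever `in_progress`, the `not_started` check is enough to block out-of-order completion.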
### Per-stage destructive cascades
Re-running these stages wipes downstream artifacts intentionally:
- **Stage 6 normalize** — wipes `ClientAsset` rows, which CASCADE-deletes `Match` rows and `RatecardLine` rows.
- **Stage 7 match** — wipes existing `Match` rows for the opportunity's assets first.
- **Stage 8 ratecard build** — wipes `RatecardLine` rows for the opportunity first.
- **Stage 2 diagnose** — wipes `clarification_questions` rows where `source_stage=2` (preserving any that were added manually or seeded by other stages).
Tests assert these cascade rules.
---
## Testing
```bash
python3 -m venv /tmp/osop_test_venv
/tmp/osop_test_venv/bin/pip install -r backend/requirements-dev.txt psycopg2-binary
cd backend && /tmp/osop_test_venv/bin/pytest tests/ -v
```
Current state: **118 tests / 109 pass / 9 skipped (real-Anthropic, marked `@pytest.mark.requires_anthropic`) / 0 failures.**
Test files in `backend/tests/`:
- `test_opportunity_crud.py` — CRUD + defaults + model_type round-trip
- `test_stage_machine.py` — 17-stage init, advance, double-complete, out-of-order
- `test_stage_gating.py` — stages 3 and 14 reject without approvals
- `test_files.py` — upload/extract for `.txt / .md / .docx / .xlsx`, `.exe` rejected
- `test_intake_agent.py` — Stage 1 endpoints (real-Claude path skipped)
- `test_diagnosis.py` — Stage 2 + clarification CRUD
- `test_qualification.py` — TROWLS scoring, thresholds, validation
- `test_qa_pack.py` — Stage 4 Excel + Word exports
- `test_assets.py` — Stage 6 ClientAsset CRUD + normalize errors
- `test_matching.py` — Stage 7 selection rules + 400/404 paths
- `test_ratecard.py` — Stage 8 + bug-4 invariant explicitly asserted
- `test_team_shape.py` — Stage 11 FTE math + override precedence
- `test_agents_stages_9_12_13.py` — error paths for the 3 simple agents
- `test_approvals.py` — request/decide/token flow + cross-opp 404
- `test_notifications.py` — bell + mark-read + unread count
- `test_schema_sanity.py` — list endpoint shapes
The harness hits the live backend on `http://localhost:8003` (override with `OSOP_BASE_URL`), so it tests the deployed surface rather than an in-process app. `OSOP_TEST_DSN` overrides the psycopg2 DSN used for direct DB seeding.
---
## Repo layout
```
oliver-sales-ops-platform/
├── backend/ # FastAPI + SQLAlchemy + Alembic
│ ├── alembic/versions/ # 6 migrations: 0001 → 0006
│ ├── app/
│ │ ├── api/ # routers per domain (opportunities, approvals, …)
│ │ ├── middleware/ # auth (Azure JWT + dev bypass + AppUser upsert)
│ │ ├── models/ # SQLAlchemy ORM
│ │ ├── schemas/ # Pydantic request/response models
│ │ ├── services/ # 9 Claude agents + utility services
│ │ └── utils/ # claude_client (cost tracking, debug log)
│ ├── scripts/seed_admin.py # idempotent admin user seed (runs at boot)
│ ├── start.sh # alembic upgrade head → seed_admin → uvicorn
│ ├── requirements.txt
│ ├── requirements-dev.txt
│ └── tests/ # 16 pytest files, 118 tests
├── frontend/ # React 18 + Vite + TS
│ └── src/
│ ├── api/ # axios + TanStack Query hooks per domain
│ ├── auth/ # MSAL setup + AuthProvider
│ ├── components/ # StageStepper, NotificationBell, AgentRunCost,
│ │ # Stage1Intake … Stage16Delivery (one per stage)
│ └── pages/ # Dashboard, NewOpportunity, OpportunityView,
│ # ApprovalView, About
├── deploy/
│ ├── deploy.sh # idempotent deploy: ports, build, sync,
│ │ # apache render, health probe
│ ├── apache-osop.conf.tmpl # Apache reverse-proxy template
│ └── apache-osop.conf # generated per-deploy (gitignored)
├── docker-compose.yml # db + redis + backend (frontend behind dev profile)
├── .env.example # all env vars with safe defaults
└── README.md # this file
```
---
**License:** internal OLIVER tool. Not for redistribution.
**Repo:** https://bitbucket.org/zlalani/oliver-sales-ops-platform