Full-stack app that turns HP customer briefs (master asset + regional supporting docs) into a set of branded Word deliverables via a RAG + agent pipeline. Stack - FastAPI + SQLAlchemy + pgvector + RQ (backend, Python 3.12) - React + Vite + TypeScript + Tailwind + TanStack Query (frontend) - Claude Opus 4.7 (generation) + Haiku 4.5 (translation/OCR) - Voyage voyage-3 or OpenAI text-embedding-3-small (embeddings) - python-docx (branded Word output, Montserrat + HP blue) - Docker Compose (5 services) Features - 6 built-in deliverable types (leadership themes, regional enrichment, LinkedIn posts, webinar spec, infographic specs, ABM enablement) - Data-driven deliverable types: admins add new types at runtime via prompt + JSON schema + template_json — no code, no deploy - Generic schema-driven review form + generic Word template renderer - Document ingestion pipeline with translation, chunking, pgvector RAG - Pluggable auth provider (password now, Entra SSO later); admin/user roles - Re-roll / retry on every deliverable; cascading delete; brief editing; inline document upload; progress hints; router-level ErrorBoundary - Admin panel with test-render preview for new deliverable types - Help page at /help with architecture overview and usage guide 82 backend tests passing, 18 skipped (gated live-API tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5.5 KiB
5.5 KiB
HP Content Agent App — Shared Build Spec
This is the contract between parallel builders. Honour it exactly so the pieces fit.
Monorepo root
/Users/daveporter/Desktop/CODING-2024/HP-TASKS/app/
Python
Python 3.12. Backend package root: backend/app/. Use pyproject.toml in backend/.
Deliverable types (enum — use these exact strings everywhere)
leadership_themesregional_enrichmentlinkedin_postswebinar_specinfographic_specsabm_enablement
Module interfaces (cross-component contracts)
app.schemas (per deliverable)
from app.schemas.leadership_themes import LeadershipThemes- One top-level pydantic
BaseModelper deliverable type, file named after the type. - Each exposes
.model_json_schema()for Claude tool schemas. - Also export a map:
app.schemas.registry.SCHEMA_BY_TYPE: dict[str, type[BaseModel]]
app.hp_branding.render
render_to_bytes(deliverable_type: str, content: dict) -> bytes— returns.docxbytes.- Internally dispatches to
renderers/<type>.pywith arender(doc_content) -> Documentor similar.
app.agents.generate
generate_deliverable(brief_id: UUID, deliverable_type: str, session: Session) -> dict— returns validated structured content (already dict-form of the pydantic schema).- Internally: loads brief + retrieved chunks + system prompt, calls Claude with tool_choice, parses, validates.
app.ingestion.orchestrator
ingest_document(document_id: UUID, session: Session) -> None— synchronous; called by RQ worker. Extracts text, translates, chunks, embeds, persists.
app.workers.tasks
ingest_document_task(document_id)— RQ task wrapping ingestion.generate_deliverable_task(generation_id)— RQ task wrapping agent. Updatesgenerations.statusandstructured_content.
DB schema (SQLAlchemy + pgvector)
Tables:
users(id uuid pk, email unique, name, password_hash, role text check role in ('admin','user'), created_at)briefs(id uuid pk, name, region, audience, brief_text, created_by fk users, created_at)documents(id uuid pk, brief_id fk briefs cascade, kind text check kind in ('master','supporting'), filename, storage_path, mime_type, language, page_count, extracted_text text, translated_text_en text, uploaded_at, ingestion_status text default 'pending')doc_chunks(id uuid pk, document_id fk documents cascade, chunk_index int, text text, embedding vector(1024))— use Voyage voyage-3 (1024 dims) OR Anthropic-compatible. If Voyage not available, usevector(1536)and OpenAI text-embedding-3-small.generations(id uuid pk, brief_id fk briefs, deliverable_type text, status text, structured_content jsonb, tokens_used int, cost_usd numeric, error text, started_at, completed_at)exports(id uuid pk, generation_id fk generations, docx_path text, generated_at)
REST API (stable paths for frontend)
All JSON unless noted. Auth via Authorization: Bearer <jwt> OR access_token httpOnly cookie.
POST /auth/login{email, password} → {access_token, user}POST /auth/logoutGET /auth/me→ current userPOST /users(admin only) {email, name, password, role}GET /users(admin only)GET /briefs— user sees own; admin sees allPOST /briefs{name, region, audience, brief_text}GET /briefs/:idPOST /briefs/:id/duplicate→ new brief idDELETE /briefs/:idPOST /briefs/:id/documents(multipart) file, kind → documentGET /briefs/:id/documentsDELETE /documents/:idPOST /briefs/:id/generations{deliverable_types: string[]} → array of generation ids, jobs enqueuedGET /briefs/:id/generations→ status + structured_content per deliverableGET /generations/:idPATCH /generations/:id{structured_content} — user editsPOST /generations/:id/export→ {download_url}GET /exports/:id/download→ file
Frontend env
Base API URL from VITE_API_URL (default http://localhost:8000).
Docker services
postgresimagepgvector/pgvector:pg16port 5432redisimageredis:7-alpineport 6379apibuild./backendport 8000workerbuild./backendcommandrq worker defaultfrontendbuild./frontendport 5173 (dev) or 80 (prod)
Secrets (via .env)
ANTHROPIC_API_KEY, VOYAGE_API_KEY (optional), OPENAI_API_KEY (optional, fallback embeddings), POSTGRES_USER=hp, POSTGRES_PASSWORD=hp, POSTGRES_DB=hp_content_agent, JWT_SECRET, REDIS_URL=redis://redis:6379/0.
Golden fixtures (for verification)
Existing files in /Users/daveporter/Desktop/CODING-2024/HP-TASKS/:
IT_moment_to_lead.pdf— master docPOLISH/— 6 supporting docsHP_Four_Leadership_Themes.docx,HP_Poland_Enriched_Leadership_Themes.docx,HP_Task3_Content_Repurposing_Deliverables.docx— golden output references
Existing scripts (reference for porting branding, DO NOT re-import)
build_poland_enriched.py— Montserrat renderer with shade_cell, styled_table, add_body, add_quote, bullet, sub_bullet, add_scenario_box (lines ~53-134)build_task3_deliverables.py— same palette +post_blockhelper for LinkedIn posts (lines ~89-118)convert_to_docx.py— Calibri variant (older; use for structure only, standardise on Montserrat)- System prompt:
HP_Content_Agent_System_Instructions.md— copy verbatim tobackend/app/agents/prompts/system.md
Conventions
- No secrets committed.
.envgitignored. - All Python files typed; use
from __future__ import annotations. - Frontend strict mode on.
- Each component README-free unless truly needed; root README covers setup.