Multi-Agent AI Systems

A pattern in which multiple specialized AI agents run in parallel and a lead (orchestrator) agent synthesizes their results.

Key Takeaways

Parallel specialist agents + lead synthesizer = reliable, multi-perspective analysis
HTTP polling (not WebSocket) for delivering async agent results on GCP
Each agent should have a focused, single-concern prompt (Legal, Brand, Tone, Channel)
Autonomous orchestration needs explicit "next speaker" logic to prevent infinite loops
Background task execution (ai_runner_service.py) keeps agents non-blocking
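
The polling and background-task takeaways fit together: the request that starts the agents returns a job ID immediately, and the client then polls a cheap status call until the work is done. The sketch below illustrates that shape with plain asyncio. It is not the contents of ai_runner_service.py; the names `JOBS`, `submit_job`, `get_status`, and `run_agents` are hypothetical, and the in-memory dict stands in for whatever store a real deployment would use.

```python
import asyncio
import uuid

# Hypothetical in-memory job store (assumption: a real service would use a DB or cache).
JOBS: dict[str, dict] = {}


async def run_agents(job_id: str) -> None:
    """Stand-in for the slow agent pipeline; records the outcome when finished."""
    try:
        await asyncio.sleep(0.05)  # placeholder for real model calls
        JOBS[job_id].update(status="done", result={"verdict": "approved"})
    except Exception as exc:  # record failures so pollers do not wait forever
        JOBS[job_id].update(status="failed", error=str(exc))


def submit_job() -> str:
    """What a 'start review' handler might do: register, schedule, return at once."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "running", "result": None}
    # Keep a reference to the task so it is not garbage collected mid-run.
    JOBS[job_id]["task"] = asyncio.create_task(run_agents(job_id))
    return job_id


def get_status(job_id: str) -> dict:
    """What a 'poll' handler might return: small, fast, no long-lived connection."""
    job = JOBS[job_id]
    return {"status": job["status"], "result": job["result"]}


async def demo() -> dict:
    """Client-side view: submit once, then poll until the job leaves 'running'."""
    job_id = submit_job()
    while True:
        state = get_status(job_id)
        if state["status"] != "running":
            return state
        await asyncio.sleep(0.01)  # polling interval


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because every poll is a short request, no single connection has to outlive the load balancer timeout.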

When to Use

Content review requiring multiple compliance dimensions (legal + brand + tone)
Focus group simulation (multi-persona conversations)
Any task benefiting from multiple independent perspectives before synthesis

Architecture Patterns

Pattern 1: Parallel Specialists + Lead (Mod Comms)

Input (proof image/PDF)
    ↓
┌──────────────────────────────────┐
│ Agent 1: Legal compliance        │
│ Agent 2: Brand adherence         │  ← run in PARALLEL
│ Agent 3: Tone of voice           │
│ Agent 4: Channel suitability     │
└──────────────────────────────────┘
    ↓
Lead Agent: synthesize verdict
    ↓
Result (via HTTP polling)
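
The fan-out and synthesis steps in the diagram can be sketched with `asyncio.gather`. This is a minimal illustration, not the Mod Comms code: `call_model` is a hypothetical stand-in for a real LLM call, the prompts are invented examples of single-concern prompts, and the lead step is reduced to a trivial rule where a real system would make another model call.

```python
import asyncio

# Illustrative single-concern prompts (assumed wording, one per specialist).
SPECIALISTS = {
    "legal": "Check the proof for legal compliance issues only.",
    "brand": "Check the proof for brand guideline adherence only.",
    "tone": "Check the proof for tone of voice only.",
    "channel": "Check the proof for channel suitability only.",
}


async def call_model(prompt: str, content: str) -> str:
    """Hypothetical stand-in for a real model call."""
    await asyncio.sleep(0.01)
    return f"no issues found ({len(content)} chars reviewed)"


async def run_specialist(name: str, prompt: str, content: str) -> dict:
    """Run one specialist; a failure becomes a report instead of an exception."""
    try:
        finding = await call_model(prompt, content)
        return {"agent": name, "ok": True, "finding": finding}
    except Exception as exc:
        return {"agent": name, "ok": False, "finding": f"agent failed: {exc}"}


async def review(content: str) -> dict:
    # Fan out: all specialists run concurrently; gather preserves input order.
    reports = await asyncio.gather(
        *(run_specialist(name, prompt, content) for name, prompt in SPECIALISTS.items())
    )
    # Lead step (simplified): pass only if every specialist completed cleanly.
    verdict = "pass" if all(r["ok"] for r in reports) else "needs review"
    return {"verdict": verdict, "reports": list(reports)}


if __name__ == "__main__":
    print(asyncio.run(review("sample proof text")))
```

Catching exceptions inside each specialist means one failing agent degrades the verdict rather than cancelling the whole review.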

Pattern 2: Autonomous Multi-Persona (Semblance)

Input (discussion brief)
    ↓
Persona generator (Gemini) → N personas
    ↓
Conversation controller
    ├── conversation_decision_service.py  ← next speaker logic
    ├── conversation_context_service.py   ← shared state + history
    └── ai_runner_service.py              ← background task execution
    ↓
Socket.IO → frontend (real-time)
    ↓
Theme extraction + analytics
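
A sketch of next-speaker logic with explicit termination conditions follows. It does not reproduce conversation_decision_service.py: the `Conversation` class, the `[END]` marker, and the round-robin choice are all assumptions made for illustration, and a real controller would likely ask a model who should speak next.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Conversation:
    personas: list[str]
    max_turns: int = 12  # hard cap: the loop cannot outlive this
    history: list[tuple[str, str]] = field(default_factory=list)


def next_speaker(conv: Conversation, last_message: str) -> Optional[str]:
    """Return the next persona, or None when the conversation must stop."""
    if not conv.personas:
        return None  # nobody to speak
    if len(conv.history) >= conv.max_turns:
        return None  # termination condition 1: turn budget exhausted
    if "[END]" in last_message:
        return None  # termination condition 2: a speaker signalled completion
    # Placeholder policy: simple round-robin over the personas.
    return conv.personas[len(conv.history) % len(conv.personas)]


def run(conv: Conversation, reply_fn: Callable[[str, list], str]) -> list[tuple[str, str]]:
    """Drive the conversation until next_speaker declines to pick anyone."""
    last = ""
    while (speaker := next_speaker(conv, last)) is not None:
        last = reply_fn(speaker, conv.history)
        conv.history.append((speaker, last))
    return conv.history


if __name__ == "__main__":
    conv = Conversation(personas=["Ana", "Ben", "Cy"], max_turns=5)
    for speaker, message in run(conv, lambda s, h: f"{s} shares a view"):
        print(speaker, "->", message)
```

The turn cap is checked first, so the loop ends even if no speaker ever emits the stop marker.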

Pattern 3: RAG + Structured AI (Enterprise Nexus)

Document upload → Firecrawl crawl
    ↓
AI content structuring (pre-indexing)
    ↓
Qdrant vector DB (10-page batch merge)
    ↓
Query → vector search → LLM synthesis
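
The final query step can be illustrated with a toy in-memory index. In Enterprise Nexus, Qdrant does the vector search; the sketch below only shows the underlying idea of ranking stored passages by cosine similarity and placing the best matches in the synthesis prompt. The functions `cosine`, `search`, and `build_prompt`, the three-dimensional vectors, and the prompt wording are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: list[float], index: list[tuple[list[float], str]], top_k: int = 2) -> list[str]:
    """Return the top_k passages most similar to the query vector."""
    scored = [(cosine(query_vec, vec), text) for vec, text in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble the synthesis prompt from the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n{context}\nQuestion: {question}"


if __name__ == "__main__":
    # Toy index: (embedding, passage). Real embeddings come from an embedding model.
    index = [
        ([1.0, 0.0, 0.0], "refund policy"),
        ([0.0, 1.0, 0.0], "brand colours"),
        ([0.9, 0.1, 0.0], "returns process"),
    ]
    hits = search([1.0, 0.0, 0.0], index)
    print(build_prompt("How do refunds work?", hits))
```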

Projects Using This Pattern

01 Projects/modcomms/Mod Comms — 4 parallel specialist agents + lead synthesizer (Gemini Pro/Flash)
01 Projects/semblance/Semblance — Autonomous focus group with N personas (Gemini 3 Pro, GPT-4.1, GPT-5.2)
01 Projects/enterprise-ai-hub-nexus/Enterprise AI Hub Nexus — RAG with structured AI pre-processing (Qdrant)

Gotchas & Lessons

GCP's 30s load balancer (LB) timeout kills streaming delivery; always use HTTP polling for agent results (see wiki/architecture/gcp-deployment-lb-timeout)
Semblance: mixing naive and timezone-aware datetimes caused a crash; always use timezone-aware datetimes in async contexts
"Next speaker" logic in autonomous mode must have termination conditions to prevent infinite loops
Cross-loop WebSocket emit in Semblance was unreliable — polling fallback was more stable
Orphaned vectors in Qdrant need periodic cleanup (Enterprise Nexus has a "Purge orphaned vectors" button)
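
The datetime gotcha is easy to reproduce: Python raises `TypeError` when a naive datetime is ordered against an aware one. A minimal sketch of the usual guard follows. The helper names `utc_now` and `ensure_aware` are hypothetical, and treating naive values as UTC is an assumption that only holds if that is how they were originally recorded.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Always produce an aware timestamp (datetime.now() alone is naive)."""
    return datetime.now(timezone.utc)


def ensure_aware(dt: datetime) -> datetime:
    """Attach UTC to naive datetimes (assumption: naive values were stored as UTC)."""
    if dt.tzinfo is None or dt.tzinfo.utcoffset(dt) is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt


def naive_comparison_fails() -> bool:
    """Show the failure mode: ordering naive against aware raises TypeError."""
    naive = datetime(2024, 1, 1, 12, 0)
    try:
        naive < utc_now()
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print("naive vs aware raises:", naive_comparison_fails())
    stored = datetime(2024, 1, 1, 12, 0)  # e.g. a naive value loaded from storage
    print("safe comparison:", ensure_aware(stored) < utc_now())
```

Normalizing at the boundary, where timestamps enter the system, keeps later comparisons and subtractions from failing.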

Related

wiki/architecture/gcp-deployment-lb-timeout — critical deployment constraint
wiki/architecture/rag-architecture — retrieval-augmented generation
wiki/tech-patterns/python-ai-agents — model selection + structured output
