obsidian/wiki/tech-patterns/python-ai-agents.md
2026-04-15 10:48:47 +01:00


title: Python AI Integration Patterns
aliases: ai-agents, claude-api, gemini-api, openai, llm-integration
tags: ai, claude, gemini, openai, llamaindex, python, agents
sources:
  01 Projects/gmal-scope-builder
  01 Projects/modcomms
  01 Projects/semblance
  01 Projects/enterprise-ai-hub-nexus
  01 Projects/olivas
  01 Projects/pimco-charts
  01 Projects/video-accessibility
  01 Projects/pdf-accessibility
  01 Projects/build-a-squad
  01 Projects/ac-helper
created: 2026-04-15
updated: 2026-04-15

Python AI Integration Patterns

How Oliver projects integrate LLMs — model selection, structured output, multi-model fallback.

Key Takeaways

  • Claude (Anthropic): tool_use for structured JSON output — used in GMAL, OliVAS, PIMCO, PDF Accessibility
  • Gemini (Google): dominant model for real-time/streaming analysis — Mod Comms, Video Accessibility, AC Helper, Build A Squad
  • OpenAI: used in Semblance (GPT-4.1, GPT-5.2) and Solventum
  • LlamaIndex: RAG orchestration in Sandbox NotebookLM (multi-model via llm_factory.py)
  • DeepGaze: specialized vision model for OliVAS (saliency/attention prediction)
  • Always implement a fallback model: Gemini Pro → Flash (Mod Comms pattern)

When to Use Which Model

| Task | Model | Why |
| --- | --- | --- |
| Structured JSON output | Claude (tool_use) | Reliable schema adherence |
| Document analysis / image | Gemini 2.5 Pro | Multimodal, fast |
| Real-time persona/focus groups | Gemini 3 Pro Preview / GPT-4.1 | Quality personas |
| Metadata generation | OpenAI (Solventum) | Client preference |
| RAG retrieval | LlamaIndex + any model | Framework abstraction |
| Chart/SVG generation | Claude API | Precise code output |
| Prompt optimization | Claude Sonnet 4.6 | Design analysis |
| Video captions | Gemini 2.5 Pro | Native video understanding |

Structured Output — Claude tool_use

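# GMAL pattern: Claude tool_use for schema-bound output
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8192,  # required by the Messages API
    # BriefSchema must be a JSON Schema dict describing the expected fields
    tools=[{"name": "parse_brief", "input_schema": BriefSchema}],
    tool_choice={"type": "tool", "name": "parse_brief"},  # force this tool call
    messages=[{"role": "user", "content": brief_text}],
)
# Select the tool_use block explicitly rather than assuming it is first
result = next(b.input for b in response.content if b.type == "tool_use")  # plain dict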
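The call above depends on two things the snippet leaves implicit: the shape of the schema passed as `input_schema`, and how the tool input is pulled out of the response. The following is a minimal sketch of both, runnable offline. The field names in `BRIEF_SCHEMA`, the helper names, and the stand-in response object are all hypothetical and are not taken from GMAL.

```python
from types import SimpleNamespace
from typing import Any

# Hypothetical schema: field names are illustrative, not GMAL's actual brief format.
BRIEF_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "client_name": {"type": "string"},
        "deliverables": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["client_name", "deliverables"],
}


def build_tool(name: str, description: str, schema: dict[str, Any]) -> dict[str, Any]:
    """Build a tool definition dict with name, description and input_schema."""
    return {"name": name, "description": description, "input_schema": schema}


def extract_tool_input(response: Any, tool_name: str) -> dict[str, Any]:
    """Return the input of the first tool_use block whose name matches tool_name."""
    for block in response.content:
        if getattr(block, "type", None) == "tool_use" and block.name == tool_name:
            return block.input
    raise ValueError(f"No tool_use block for {tool_name!r} in response")


def missing_required(data: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """List required keys absent from data (a shallow check, not full validation)."""
    return [key for key in schema.get("required", []) if key not in data]


# Stand-in for an API response so the helpers can be exercised without a network call.
fake_response = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="Parsing the brief."),
    SimpleNamespace(type="tool_use", name="parse_brief",
                    input={"client_name": "Acme", "deliverables": ["banner"]}),
])
parsed = extract_tool_input(fake_response, "parse_brief")
```

Checking `missing_required(parsed, BRIEF_SCHEMA)` before using the result is a cheap guard; a full JSON Schema validator would be the stricter option.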

Multi-Model Fallback (Mod Comms Pattern)

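# Primary: Gemini Pro, fallback: Flash
import logging

try:
    result = await gemini_pro.generate(prompt)
except Exception:
    # Log the primary failure so fallbacks stay visible, then retry on Flash
    logging.exception("Gemini Pro failed; falling back to Flash")
    result = await gemini_flash.generate(prompt)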
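The same idea generalizes to an ordered chain of models with a per-call timeout. This is a sketch under assumptions, not the Mod Comms implementation: `generate_with_fallback`, the `Generate` type, and the two stub models are hypothetical names standing in for real Gemini clients.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional, Sequence

logger = logging.getLogger(__name__)

# A "model" here is any async callable that takes a prompt and returns text.
Generate = Callable[[str], Awaitable[str]]


async def generate_with_fallback(
    prompt: str,
    models: Sequence[tuple[str, Generate]],
    timeout_s: float = 60.0,
) -> tuple[str, str]:
    """Try each (name, generate) pair in order; return (name, text) from the first success."""
    last_error: Optional[Exception] = None
    for name, generate in models:
        try:
            text = await asyncio.wait_for(generate(prompt), timeout=timeout_s)
            return name, text
        except Exception as exc:  # also covers the timeout raised by wait_for
            logger.warning("Model %s failed: %r", name, exc)
            last_error = exc
    raise RuntimeError("All models failed") from last_error


async def flaky_pro(prompt: str) -> str:
    """Stand-in for the primary model; always fails to simulate an outage."""
    raise ConnectionError("simulated outage")


async def stable_flash(prompt: str) -> str:
    """Stand-in for the fallback model."""
    return f"flash: {prompt}"


used, text = asyncio.run(
    generate_with_fallback("hi", [("pro", flaky_pro), ("flash", stable_flash)])
)
```

Returning the model name alongside the text makes it possible to record how often the fallback is actually serving traffic.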

LLM Factory Pattern (Sandbox NotebookLM)

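# llm_factory.py: abstract model selection
def get_llm_by_type(llm_type: str) -> BaseLLM:
    # Map names to constructors so only the requested model is instantiated
    mapping = {
        "gemini-pro": GeminiPro,
        "gpt4": lambda: OpenAI("gpt-4.1"),
        # ... further entries elided in the source
    }
    if llm_type not in mapping:
        raise ValueError(f"Unknown llm_type: {llm_type!r}")
    return mapping[llm_type]()

def get_structured_llm(schema: type) -> BaseLLM:
    ...  # wraps output in Pydantic model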
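The body of `get_structured_llm` is elided in the source, so the following only illustrates the general idea of wrapping model output in a typed schema. It swaps Pydantic for a standard-library dataclass so the example runs without dependencies; `make_structured`, `Citation`, and `fake_llm` are hypothetical names.

```python
import json
from dataclasses import dataclass, fields
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def make_structured(generate: Callable[[str], str], schema: Type[T]) -> Callable[[str], T]:
    """Wrap a text-generating callable so it returns an instance of a dataclass schema."""
    allowed = {f.name for f in fields(schema)}

    def structured(prompt: str) -> T:
        raw = generate(prompt)
        data = json.loads(raw)  # raises json.JSONDecodeError on non-JSON output
        unknown = set(data) - allowed
        if unknown:
            raise ValueError(f"Unexpected fields: {sorted(unknown)}")
        return schema(**data)  # raises TypeError if a required field is missing

    return structured


@dataclass
class Citation:
    """Hypothetical output schema for a retrieval answer."""
    title: str
    page: int


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call that returns JSON text."""
    return '{"title": "Q3 report", "page": 4}'


ask = make_structured(fake_llm, Citation)
citation = ask("Where is revenue discussed?")
```

Unlike Pydantic, a dataclass does not check field types at runtime, so this sketch only guards against malformed JSON and unexpected or missing keys.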

AI Agents (Mod Comms Multi-Agent)

4 specialist agents (Legal, Brand, Tone, Channel)
    → run in parallel
    → Lead agent synthesizes verdict

See wiki/architecture/multi-agent-ai-systems for full pattern.
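The fan-out and synthesis flow above can be sketched with `asyncio.gather`. The specialists here are stubs rather than LLM calls, and the verdict rule (approve only if every specialist approves) is an assumption made for illustration; the source does not say how the lead agent weighs findings.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    agent: str
    approved: bool
    note: str


async def run_specialist(name: str, text: str) -> Finding:
    """Stand-in for one specialist agent; a real one would call an LLM here."""
    await asyncio.sleep(0)  # yield control, as a network call would
    flagged = name == "Legal" and "guarantee" in text.lower()
    return Finding(name, not flagged, "flagged 'guarantee'" if flagged else "ok")


def synthesize(findings: list[Finding]) -> dict:
    """Stand-in for the lead agent: approve only if every specialist approves (assumed rule)."""
    blockers = [f for f in findings if not f.approved]
    return {"approved": not blockers, "blockers": [f.agent for f in blockers]}


async def review(text: str) -> dict:
    specialists = ["Legal", "Brand", "Tone", "Channel"]
    # gather runs the four specialists concurrently and preserves input order
    findings = await asyncio.gather(*(run_specialist(s, text) for s in specialists))
    return synthesize(list(findings))


verdict = asyncio.run(review("We guarantee results in 7 days"))
```

Because the specialists run concurrently, total latency is roughly that of the slowest agent plus the synthesis step, rather than the sum of all four.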

Projects Using This Pattern

Gotchas & Lessons

  • Long AI calls (>30s) will time out on the GCP load balancer: use HTTP polling, not streaming (see wiki/architecture/gcp-deployment-lb-timeout)
  • Increase Vite proxy timeout to 5 min for AI-heavy endpoints
  • Claude tool_use is more reliable than prompt-based JSON for structured output
  • The default max_tokens (4096) is too low for document generation: set it to 8192 to 16000
  • Gemini models change frequently — update model IDs in llm_factory.py regularly (Sandbox 2026-03-31)
  • GCP Vision + Claude hybrid (PDF Accessibility) — GCV handles image extraction, Claude does semantic analysis
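The polling approach recommended above for long AI calls can be sketched as a job registry: one request submits the work and returns an id at once, and the client then polls a status lookup until the job finishes. This is a framework-free illustration using a hypothetical in-memory `JobStore`; it is not the projects' actual implementation, and a multi-instance deployment would need shared storage instead of a dict.

```python
import threading
import time
import uuid
from typing import Any, Callable


class JobStore:
    """In-memory job registry (single process only)."""

    def __init__(self) -> None:
        self._jobs: dict[str, dict[str, Any]] = {}
        self._lock = threading.Lock()

    def submit(self, work: Callable[[], Any]) -> str:
        """Start work in a background thread and return a job id immediately."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "running", "result": None, "error": None}
        threading.Thread(target=self._run, args=(job_id, work), daemon=True).start()
        return job_id

    def _run(self, job_id: str, work: Callable[[], Any]) -> None:
        try:
            update = {"status": "done", "result": work(), "error": None}
        except Exception as exc:
            update = {"status": "failed", "result": None, "error": str(exc)}
        with self._lock:
            self._jobs[job_id] = update

    def status(self, job_id: str) -> dict[str, Any]:
        """What a status endpoint would return; each poll is a short request."""
        with self._lock:
            return dict(self._jobs[job_id])


def slow_ai_call() -> str:
    time.sleep(0.2)  # stands in for a model call that would exceed the 30s limit
    return "generated document"


store = JobStore()
job_id = store.submit(slow_ai_call)
while store.status(job_id)["status"] == "running":
    time.sleep(0.05)  # the client polls at an interval instead of holding one long connection
final = store.status(job_id)
```

Each poll completes quickly, so no single HTTP request stays open long enough to hit the load balancer timeout.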