---
title: "Python AI Integration Patterns"
aliases: [ai-agents, claude-api, gemini-api, openai, llm-integration]
tags: [ai, claude, gemini, openai, llamaindex, python, agents]
sources: [01 Projects/gmal-scope-builder, 01 Projects/modcomms, 01 Projects/semblance, 01 Projects/enterprise-ai-hub-nexus, 01 Projects/olivas, 01 Projects/pimco-charts, 01 Projects/video-accessibility, 01 Projects/pdf-accessibility, 01 Projects/build-a-squad, 01 Projects/ac-helper]
created: 2026-04-15
updated: 2026-04-15
---

# Python AI Integration Patterns

How Oliver projects integrate LLMs: model selection, structured output, multi-model fallback.

## Key Takeaways
- **Claude** (Anthropic): `tool_use` for structured JSON output; used in GMAL, OliVAS, PIMCO, PDF Accessibility
- **Gemini** (Google): dominant model for real-time/streaming analysis; used in Mod Comms, Video Accessibility, AC Helper, Build A Squad
- **OpenAI**: used in Semblance (GPT-4.1, GPT-5.2) and Solventum
- **LlamaIndex**: RAG orchestration in Sandbox NotebookLM (multi-model via `llm_factory.py`)
- **DeepGaze**: specialized vision model for OliVAS (saliency/attention prediction)
- Always implement a **fallback model**: Gemini Pro → Flash (Mod Comms pattern)

## When to Use Which Model

| Task | Model | Why |
|------|-------|-----|
| Structured JSON output | Claude (`tool_use`) | Reliable schema adherence |
| Document analysis / image | Gemini 2.5 Pro | Multimodal, fast |
| Real-time persona/focus groups | Gemini 3 Pro Preview / GPT-4.1 | Quality personas |
| Metadata generation | OpenAI (Solventum) | Client preference |
| RAG retrieval | LlamaIndex + any model | Framework abstraction |
| Chart/SVG generation | Claude API | Precise code output |
| Prompt optimization | Claude Sonnet 4.6 | Design analysis |
| Video captions | Gemini 2.5 Pro | Native video understanding |
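
## Structured Output: Claude `tool_use`
```python
# GMAL pattern: Claude tool_use for schema-bound output
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8192,  # required by the Messages API; see Gotchas for sizing
    # input_schema must be a JSON Schema dict (for a Pydantic model, pass its model_json_schema())
    tools=[{"name": "parse_brief", "input_schema": BriefSchema}],
    tool_choice={"type": "tool", "name": "parse_brief"},  # force this tool
    messages=[{"role": "user", "content": brief_text}],
)
# Select the tool_use block by type rather than by position
result = next(b.input for b in response.content if b.type == "tool_use")  # plain dict
```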
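
The forced tool call returns a dict, but nothing in the snippet above checks it before use. The sketch below shows one way to pull the `tool_use` input out of a response and confirm the required keys are present. `BRIEF_SCHEMA`, its field names, `extract_tool_input`, and the `fake_content` stand-in are all hypothetical, introduced only so the example runs without an API call; they are not taken from GMAL.

```python
from types import SimpleNamespace

# Hypothetical JSON Schema for the parse_brief tool; field names are illustrative only.
BRIEF_SCHEMA = {
    "type": "object",
    "properties": {
        "client": {"type": "string"},
        "deliverables": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["client", "deliverables"],
}


def extract_tool_input(content_blocks, tool_name, schema):
    """Return the input dict of the first tool_use block named tool_name.

    Raises ValueError if the block is absent or a required key is missing.
    """
    for block in content_blocks:
        if getattr(block, "type", None) == "tool_use" and block.name == tool_name:
            missing = [key for key in schema.get("required", []) if key not in block.input]
            if missing:
                raise ValueError(f"tool input missing required keys: {missing}")
            return block.input
    raise ValueError(f"no tool_use block named {tool_name!r} in response")


# Stand-in for response.content, shaped like the SDK's text and tool_use blocks.
fake_content = [
    SimpleNamespace(type="text", text="Parsing the brief."),
    SimpleNamespace(
        type="tool_use",
        name="parse_brief",
        input={"client": "Acme", "deliverables": ["banner", "email"]},
    ),
]

brief = extract_tool_input(fake_content, "parse_brief", BRIEF_SCHEMA)
```

This only checks that required keys exist. A full JSON Schema or Pydantic validation would also catch wrong types.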

## Multi-Model Fallback (Mod Comms Pattern)
```python
# Primary: Gemini Pro, fallback: Flash (runs inside an async function)
try:
    result = await gemini_pro.generate(prompt)
except Exception:  # any primary failure triggers the fallback
    result = await gemini_flash.generate(prompt)
```
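
The same idea can be wrapped in a reusable helper. This is a minimal sketch, not the Mod Comms implementation: `generate_with_fallback`, the `timeout_s` parameter, and the two stand-in coroutines are assumptions made for illustration. It adds a timeout on the primary call and logs the failure, so a fallback never happens silently.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def generate_with_fallback(prompt, primary, fallback, timeout_s=30.0):
    """Try primary(prompt); on error or timeout, log it and use fallback(prompt)."""
    try:
        return await asyncio.wait_for(primary(prompt), timeout=timeout_s)
    except Exception as exc:  # timeouts land here too, since they subclass Exception
        logger.warning("primary model failed (%r); using fallback", exc)
        return await fallback(prompt)


# Hypothetical stand-ins for the Gemini Pro and Flash clients.
async def flaky_pro(prompt):
    raise RuntimeError("quota exceeded")


async def flash(prompt):
    return f"flash: {prompt}"


result = asyncio.run(generate_with_fallback("summarise this", flaky_pro, flash))
```

Passing the two models in as callables keeps the helper independent of any particular SDK.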

## LLM Factory Pattern (Sandbox NotebookLM)
```python
# llm_factory.py: abstract model selection
def get_llm_by_type(llm_type: str) -> BaseLLM:
    mapping = {
        "gemini-pro": GeminiPro(),
        "gpt4": OpenAI("gpt-4.1"),
        # ...
    }
    return mapping[llm_type]


def get_structured_llm(schema: type) -> BaseLLM:
    ...  # wraps output in Pydantic model
```
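
As written above, the mapping constructs every client each time the function is called. One possible variant, sketched here under assumptions, stores zero-argument constructors so that only the requested client is built, and reports the valid keys on a bad lookup. `EchoLLM`, `_REGISTRY`, and the placeholder model ID are hypothetical and stand in for the real client classes, which are not shown in the source.

```python
from typing import Callable, Dict


class EchoLLM:
    """Hypothetical stand-in for a real client class, used only for illustration."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id


# Map each llm_type to a zero-argument constructor so nothing is built until requested.
# Model IDs live in this one place, which makes periodic updates easier.
_REGISTRY: Dict[str, Callable[[], EchoLLM]] = {
    "gemini-pro": lambda: EchoLLM("gemini-pro-placeholder-id"),  # placeholder, not a real ID
    "gpt4": lambda: EchoLLM("gpt-4.1"),
}


def get_llm_by_type(llm_type: str) -> EchoLLM:
    """Build the client for llm_type, or fail with the list of valid keys."""
    try:
        factory = _REGISTRY[llm_type]
    except KeyError:
        valid = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"unknown llm_type {llm_type!r}; expected one of: {valid}") from None
    return factory()


llm = get_llm_by_type("gpt4")
```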

## AI Agents (Mod Comms Multi-Agent)
```
4 specialist agents (Legal, Brand, Tone, Channel)
→ run in parallel
→ Lead agent synthesizes verdict
```
See [[wiki/architecture/multi-agent-ai-systems|multi-agent-ai-systems]] for full pattern.
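
The fan-out and synthesis flow can be sketched with `asyncio.gather`. This is an illustration of the shape only: `run_specialist`, `synthesize`, and `review` are hypothetical names, and the rule-based checks stand in for what would be model calls in a real system.

```python
import asyncio

SPECIALISTS = ("Legal", "Brand", "Tone", "Channel")


async def run_specialist(name: str, text: str) -> dict:
    """Hypothetical specialist: a real one would call a model with a role-specific prompt."""
    await asyncio.sleep(0)  # placeholder for the network call
    flagged = name == "Legal" and "guarantee" in text.lower()
    return {"agent": name, "ok": not flagged}


def synthesize(findings: list) -> dict:
    """Hypothetical lead step: approve only if every specialist approved."""
    failed = [f["agent"] for f in findings if not f["ok"]]
    return {"approved": not failed, "failed": failed}


async def review(text: str) -> dict:
    # gather() runs the four specialists concurrently and preserves argument order.
    findings = await asyncio.gather(*(run_specialist(n, text) for n in SPECIALISTS))
    return synthesize(findings)


verdict = asyncio.run(review("We guarantee results in 7 days."))
```

Because `gather` preserves order, the lead step can rely on findings arriving in the same order as `SPECIALISTS`.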

## Projects Using This Pattern
- [[01 Projects/gmal-scope-builder/GMAL Scope Builder|GMAL Scope Builder]]: Claude Opus 4.6, `tool_use`, structured ratecard output
- [[01 Projects/modcomms/Mod Comms|Mod Comms]]: Gemini Pro + Flash fallback, 4 parallel agents
- [[01 Projects/semblance/Semblance|Semblance]]: Gemini 3 Pro Preview, GPT-4.1, GPT-5.2, multi-persona
- [[01 Projects/enterprise-ai-hub-nexus/Enterprise AI Hub Nexus|Enterprise Nexus]]: multi-model RAG via LlamaIndex + Qdrant
- [[01 Projects/sandbox-notebookllamalm-nextjs/Sandbox NotebookLM|Sandbox NotebookLM]]: LlamaIndex, `llm_factory`, 7 studio generators
- [[01 Projects/olivas/OliVAS|OliVAS]]: Claude Sonnet 4.6 design analysis + DeepGaze vision
- [[01 Projects/pimco-charts/PIMCO Charts|PIMCO Charts]]: Claude API, SVG code generation
- [[01 Projects/video-accessibility/Video Accessibility Platform|Video Accessibility]]: Gemini 2.5 Pro, VTT captions, audio description
- [[01 Projects/pdf-accessibility/PDF Accessibility Checker|PDF Accessibility]]: Claude + Google Cloud Vision
- [[01 Projects/build-a-squad/Build A Squad|Build A Squad]]: Gemini (client-side, no backend)
- [[01 Projects/ac-helper/AC Helper|AC Helper]]: Gemini, natural language commands

## Gotchas & Lessons
- Long AI calls (>30s) will time out on the GCP LB; use HTTP polling, not streaming (see [[wiki/architecture/gcp-deployment-lb-timeout|gcp-deployment-lb-timeout]])
- Increase the Vite proxy timeout to 5 min for AI-heavy endpoints
- Claude `tool_use` is more reliable than prompt-based JSON for structured output
- A `max_tokens` default of 4096 is too low for document generation; set 8192–16000
- Gemini models change frequently; update model IDs in `llm_factory.py` regularly (Sandbox 2026-03-31)
- GCP Vision + Claude hybrid (PDF Accessibility): GCV handles image extraction, Claude does semantic analysis
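
The polling approach from the first gotcha can be sketched as a submit-then-poll job store. Everything here is an assumption for illustration: `submit`, `poll`, and `slow_model_call` are hypothetical names, the store is an in-memory dict that only works within a single process, and a real service would expose the two functions behind HTTP endpoints.

```python
import threading
import time
import uuid

_jobs = {}  # job_id -> {"status": ..., "result": ...}
_lock = threading.Lock()


def submit(task, *args):
    """Start task in a background thread and return a job id immediately."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "pending", "result": None}

    def runner():
        try:
            outcome = {"status": "done", "result": task(*args)}
        except Exception as exc:
            outcome = {"status": "error", "result": str(exc)}
        with _lock:
            _jobs[job_id] = outcome

    threading.Thread(target=runner, daemon=True).start()
    return job_id


def poll(job_id):
    """Return a snapshot of the job state; each call is a short request."""
    with _lock:
        return dict(_jobs[job_id])


def slow_model_call(prompt):
    time.sleep(0.2)  # stands in for a long generation
    return f"report for {prompt}"


job_id = submit(slow_model_call, "Q3 brief")
while (state := poll(job_id))["status"] == "pending":
    time.sleep(0.05)  # a client would wait between polls
```

Each poll returns quickly, so no single request stays open long enough to reach the load balancer timeout, however long the model call takes.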

## Related
- [[wiki/architecture/multi-agent-ai-systems|multi-agent-ai-systems]]
- [[wiki/architecture/rag-architecture|rag-architecture]]
- [[wiki/architecture/gcp-deployment-lb-timeout|gcp-deployment-lb-timeout]]