| title | aliases | tags | sources | created | updated |
| --- | --- | --- | --- | --- | --- |
| Python AI Integration Patterns | ai-agents, claude-api, gemini-api, openai, llm-integration | ai, claude, gemini, openai, llamaindex, python, agents | 01 Projects/gmal-scope-builder, 01 Projects/modcomms, 01 Projects/semblance, 01 Projects/enterprise-ai-hub-nexus, 01 Projects/olivas, 01 Projects/pimco-charts, 01 Projects/video-accessibility, 01 Projects/pdf-accessibility, 01 Projects/build-a-squad, 01 Projects/ac-helper | 2026-04-15 | 2026-04-15 |
Python AI Integration Patterns
How Oliver projects integrate LLMs — model selection, structured output, multi-model fallback.
Key Takeaways
- Claude (Anthropic): tool_use for structured JSON output, used in GMAL, OliVAS, PIMCO, PDF Accessibility
- Gemini (Google): dominant model for real-time/streaming analysis — Mod Comms, Video Accessibility, AC Helper, Build A Squad
- OpenAI: used in Semblance (GPT-4.1, GPT-5.2) and Solventum
- LlamaIndex: RAG orchestration in Sandbox NotebookLM (multi-model via llm_factory.py)
- DeepGaze: specialized vision model for OliVAS (saliency/attention prediction)
- Always implement a fallback model: Gemini Pro → Flash (Mod Comms pattern)
When to Use Which Model
| Task | Model | Why |
| --- | --- | --- |
| Structured JSON output | Claude (tool_use) | Reliable schema adherence |
| Document analysis / image | Gemini 2.5 Pro | Multimodal, fast |
| Real-time persona/focus groups | Gemini 3 Pro Preview / GPT-4.1 | Quality personas |
| Metadata generation | OpenAI (Solventum) | Client preference |
| RAG retrieval | LlamaIndex + any model | Framework abstraction |
| Chart/SVG generation | Claude API | Precise code output |
| Prompt optimization | Claude Sonnet 4.6 | Design analysis |
| Video captions | Gemini 2.5 Pro | Native video understanding |
Structured Output — Claude tool_use
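# GMAL pattern: Claude tool_use for schema-bound output
import anthropic
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8192,  # required by the Messages API
    # input_schema must be a JSON Schema dict; this assumes BriefSchema is a
    # Pydantic v2 model (pass it directly if it is already a dict)
    tools=[{"name": "parse_brief", "input_schema": BriefSchema.model_json_schema()}],
    tool_choice={"type": "tool", "name": "parse_brief"},
    messages=[{"role": "user", "content": brief_text}],
)
# Select the tool_use block by type rather than assuming it is first
result = next(b.input for b in response.content if b.type == "tool_use")  # plain dict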
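Even with a forced tool_choice, it can help to locate the tool_use block by type and name and to check required keys before trusting the result. The following is a minimal sketch, not the GMAL implementation: it assumes response blocks expose type, name, and input attributes as in the Anthropic SDK, and extract_tool_input, BRIEF_SCHEMA, and its field names are hypothetical.

```python
from types import SimpleNamespace

# Hypothetical JSON Schema for the parse_brief tool (field names are illustrative).
BRIEF_SCHEMA = {
    "type": "object",
    "properties": {
        "client": {"type": "string"},
        "deliverables": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["client", "deliverables"],
}


def extract_tool_input(content, tool_name, schema):
    """Return the input dict of the first matching tool_use block.

    Raises ValueError if the block is absent or a required key is missing.
    """
    for block in content:
        if getattr(block, "type", None) == "tool_use" and block.name == tool_name:
            missing = [k for k in schema.get("required", []) if k not in block.input]
            if missing:
                raise ValueError(f"tool input missing keys: {missing}")
            return block.input
    raise ValueError(f"no tool_use block named {tool_name!r}")


# Stand-in for response.content; a real response comes from client.messages.create.
fake_content = [
    SimpleNamespace(type="text", text="Parsing the brief."),
    SimpleNamespace(
        type="tool_use",
        name="parse_brief",
        input={"client": "Acme", "deliverables": ["banner", "email"]},
    ),
]
result = extract_tool_input(fake_content, "parse_brief", BRIEF_SCHEMA)
```

This only checks that required keys are present; full validation of types and nested fields would need a schema validator or a Pydantic model.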
Multi-Model Fallback (Mod Comms Pattern)
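# Primary: Gemini Pro, fallback: Flash (runs inside an async function)
import logging

logger = logging.getLogger(__name__)

try:
    result = await gemini_pro.generate(prompt)
except Exception as exc:  # narrow to the SDK's quota/timeout errors where possible
    logger.warning("Gemini Pro failed (%r); falling back to Flash", exc)
    result = await gemini_flash.generate(prompt)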
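The same idea generalizes to an ordered chain of models with a per-call timeout, so a hung primary call also triggers the fallback. This is a sketch under assumptions, not the Mod Comms code: generate_with_fallback and the stub coroutines are hypothetical, and the real clients would replace the stubs.

```python
import asyncio


async def generate_with_fallback(prompt, models, timeout_s=30.0):
    """Try each (name, async generate) pair in order; return (name, result) on first success."""
    errors = []
    for name, generate in models:
        try:
            result = await asyncio.wait_for(generate(prompt), timeout=timeout_s)
            return name, result
        except Exception as exc:  # also catches the timeout raised by wait_for
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Stub coroutines standing in for the real Gemini Pro / Flash clients.
async def flaky_pro(prompt):
    raise ConnectionError("quota exceeded")


async def stub_flash(prompt):
    return f"flash:{prompt}"


used, text = asyncio.run(
    generate_with_fallback("hello", [("pro", flaky_pro), ("flash", stub_flash)])
)
```

Returning the model name alongside the result makes it possible to log or surface which model actually answered.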
LLM Factory Pattern (Sandbox NotebookLM)
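# llm_factory.py: abstract model selection
def get_llm_by_type(llm_type: str) -> BaseLLM:
    # Lazy constructors, so only the requested model is instantiated
    mapping = {"gemini-pro": lambda: GeminiPro(), "gpt4": lambda: OpenAI("gpt-4.1")}  # ...
    if llm_type not in mapping:
        raise ValueError(f"Unknown llm_type: {llm_type!r}")
    return mapping[llm_type]()

def get_structured_llm(schema: type) -> BaseLLM:
    ...  # wraps output in Pydantic model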
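One way to keep model IDs in a single place is a small registry that maps names to constructors. The sketch below is illustrative only and is not the contents of llm_factory.py: register_llm, StubLLM, and the model IDs shown are assumptions, and StubLLM stands in for real LlamaIndex LLM classes.

```python
from typing import Callable, Dict

_REGISTRY: Dict[str, Callable[[], object]] = {}


def register_llm(name: str):
    """Decorator that registers a zero-argument constructor under a model name."""
    def decorator(factory):
        _REGISTRY[name] = factory
        return factory
    return decorator


class StubLLM:
    """Placeholder for a real LLM wrapper class."""
    def __init__(self, model_id: str):
        self.model_id = model_id


@register_llm("gemini-pro")
def _gemini_pro():
    return StubLLM("gemini-2.5-pro")  # assumed ID; update here when IDs change


@register_llm("gpt4")
def _gpt4():
    return StubLLM("gpt-4.1")


def get_llm_by_type(llm_type: str):
    """Instantiate only the requested model; fail loudly on unknown names."""
    if llm_type not in _REGISTRY:
        raise ValueError(f"unknown llm_type {llm_type!r}; known: {sorted(_REGISTRY)}")
    return _REGISTRY[llm_type]()


llm = get_llm_by_type("gpt4")
```

Because each entry is a constructor rather than an instance, importing the module does not create clients or require API keys for models that are never requested.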
AI Agents (Mod Comms Multi-Agent)
4 specialist agents (Legal, Brand, Tone, Channel)
→ run in parallel
→ Lead agent synthesizes verdict
See wiki/architecture/multi-agent-ai-systems for full pattern.
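The fan-out/fan-in shape described above can be sketched with asyncio.gather. This is a simplified illustration, not the Mod Comms implementation: run_specialist and synthesize are hypothetical stubs, and the approval rule is invented for the example where a real agent would call an LLM.

```python
import asyncio

SPECIALISTS = ("Legal", "Brand", "Tone", "Channel")


async def run_specialist(role, content):
    """Stub specialist: a real agent would call an LLM with a role-specific prompt."""
    await asyncio.sleep(0)  # yield control, as a network call would
    flagged = role == "Legal" and "guarantee" in content.lower()
    return {"agent": role, "approved": not flagged}


def synthesize(reports):
    """Stub lead agent: approve only if every specialist approves."""
    blockers = [r["agent"] for r in reports if not r["approved"]]
    return {"approved": not blockers, "blockers": blockers}


async def review(content):
    # gather() runs the four specialists concurrently and preserves input order.
    reports = await asyncio.gather(*(run_specialist(r, content) for r in SPECIALISTS))
    return synthesize(reports)


verdict = asyncio.run(review("We guarantee results in 30 days."))
```

With concurrent specialists, total latency is roughly that of the slowest agent plus the lead agent's synthesis call, rather than the sum of all five.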
Projects Using This Pattern
- 01 Projects/gmal-scope-builder/GMAL Scope Builder — Claude Opus 4.6, tool_use, structured ratecard output
- 01 Projects/modcomms/Mod Comms — Gemini Pro + Flash fallback, 4 parallel agents
- 01 Projects/semblance/Semblance — Gemini 3 Pro Preview, GPT-4.1, GPT-5.2, multi-persona
- 01 Projects/enterprise-ai-hub-nexus/Enterprise AI Hub Nexus — multi-model RAG via LlamaIndex + Qdrant
- 01 Projects/sandbox-notebookllamalm-nextjs/Sandbox NotebookLM — LlamaIndex, llm_factory, 7 studio generators
- 01 Projects/olivas/OliVAS — Claude Sonnet 4.6 design analysis + DeepGaze vision
- 01 Projects/pimco-charts/PIMCO Charts — Claude API, SVG code generation
- 01 Projects/video-accessibility/Video Accessibility Platform — Gemini 2.5 Pro, VTT captions, audio description
- 01 Projects/pdf-accessibility/PDF Accessibility Checker — Claude + Google Cloud Vision
- 01 Projects/build-a-squad/Build A Squad — Gemini (client-side, no backend)
- 01 Projects/ac-helper/AC Helper — Gemini, natural language commands
Gotchas & Lessons
- Long AI calls (>30s) will time out on the GCP load balancer; use HTTP polling, not streaming (see wiki/architecture/gcp-deployment-lb-timeout)
- Increase Vite proxy timeout to 5 min for AI-heavy endpoints
- Claude tool_use is more reliable than prompt-based JSON for structured output
- max_tokens defaults (4096) are too low for document generation; set 8192–16000
- Gemini models change frequently; update model IDs in llm_factory.py regularly (Sandbox 2026-03-31)
- GCP Vision + Claude hybrid (PDF Accessibility) — GCV handles image extraction, Claude does semantic analysis
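The polling approach from the first gotcha can be sketched as a submit/poll pair: the submit request returns a job ID immediately and the client polls a status endpoint, so no single request stays open past the load balancer timeout. This is a framework-free sketch with hypothetical names (submit_job, get_job, slow_llm_call); it uses an in-memory dict, which only works for a single process, so a multi-instance deployment would need a shared store.

```python
import threading
import time
import uuid

_jobs = {}
_lock = threading.Lock()


def submit_job(work, *args):
    """Start work(*args) in a background thread and return a job ID immediately."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "pending", "result": None, "error": None}

    def runner():
        try:
            update = {"status": "done", "result": work(*args), "error": None}
        except Exception as exc:
            update = {"status": "failed", "result": None, "error": str(exc)}
        with _lock:
            _jobs[job_id] = update

    threading.Thread(target=runner, daemon=True).start()
    return job_id


def get_job(job_id):
    """What a status endpoint handler would return; each call finishes quickly."""
    with _lock:
        return dict(_jobs[job_id])


def slow_llm_call(prompt):
    time.sleep(0.2)  # stands in for a multi-minute model call
    return f"report for {prompt}"


job_id = submit_job(slow_llm_call, "Q3 brief")
while get_job(job_id)["status"] == "pending":
    time.sleep(0.05)  # a real client would poll over HTTP on an interval
final = get_job(job_id)
```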
Related