| title | aliases | tags | sources | created | updated |
| --- | --- | --- | --- | --- | --- |
| Gemini Model Catalog | google-gemini-models, gemini-api-models | llm, google, gemini, api, models | raw/Gemini API Google AI for Developers.md | 2026-05-08 | 2026-05-08 |

Overview
Google's Gemini API model lineup spans text, audio, image, video, music, embeddings, and robotics. Models are grouped by generation (2.5, 3.x) and by tier (Pro > Flash > Flash-Lite, from most capable to fastest and cheapest). All are available through Google AI for Developers.
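
To make the catalog concrete, the sketch below shows how a model ID from this note could plug into a request. It assumes the public REST endpoint `https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent` and the `x-goog-api-key` header, neither of which is stated in this note. `build_generate_request` is a hypothetical helper, and the request is only constructed, never sent.

```python
import json
import urllib.request

# Assumed base URL for the Gemini API REST surface (not stated in this note).
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a given model ID."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # assumed auth header
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_generate_request("gemini-2.5-flash", "Say hello.", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    # Sending it would be urllib.request.urlopen(req), which needs a real key.
```

Switching models is then a matter of changing the `model` argument, which is why the channel and tier of each ID matter when choosing one.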
Gemini 2.5 Family (Current Stable)

| Model | Tier | Status | Best For |
| --- | --- | --- | --- |
| `gemini-2.5-pro` | Pro | Stable | Complex tasks, deep reasoning, coding |
| `gemini-2.5-flash` | Flash | Stable | Price-performance, low-latency, high-volume workloads |
| `gemini-2.5-flash-lite` | Flash-Lite | Stable | Fastest and cheapest multimodal model in the 2.5 family |
| Gemini 2.5 Flash Live | Flash | Preview | Real-time conversational agents, sub-second audio |
| Gemini 2.5 Flash TTS | Flash | Preview | Controllable text-to-speech, fine style/pacing control |
| Gemini 2.5 Pro TTS | Pro | Preview | High-fidelity TTS for podcasts, audiobooks |
| Nano Banana (Gemini 2.5 Flash Image) | Flash | Stable | Native image gen/editing, fast creative workflows |

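
The "Best For" column can be read as a simple selection rule across the three stable 2.5 text models. The sketch below is one hypothetical way to encode it; `pick_model` and its two boolean flags are illustrative assumptions, not part of the Gemini API.

```python
# Stable 2.5 text models from the table above, keyed by tier.
MODELS_2_5 = {
    "pro": "gemini-2.5-pro",
    "flash": "gemini-2.5-flash",
    "flash-lite": "gemini-2.5-flash-lite",
}


def pick_model(needs_deep_reasoning: bool = False, cost_critical: bool = False) -> str:
    """Pick a 2.5 model ID from two coarse requirements (illustrative rule only)."""
    if needs_deep_reasoning:
        return MODELS_2_5["pro"]         # complex tasks, deep reasoning, coding
    if cost_critical:
        return MODELS_2_5["flash-lite"]  # fastest and cheapest tier
    return MODELS_2_5["flash"]           # default price-performance balance


if __name__ == "__main__":
    print(pick_model())
    print(pick_model(needs_deep_reasoning=True))
    print(pick_model(cost_critical=True))
```

Reasoning needs take priority over cost in this sketch; a real policy would likely also weigh latency and rate limits.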
Gemini 3 Family (Preview / Upcoming)

| Model | Status | Best For |
| --- | --- | --- |
| `gemini-3.1-pro-preview` | Preview | Advanced reasoning, agentic workflows, vibe coding |
| `gemini-3-flash-preview` | Preview | Frontier-class performance at lower cost |
| `gemini-3.1-flash-lite` | Stable | Frontier-class, budget-friendly |
| `gemini-3.1-flash-live-preview` | Preview | Real-time dialogue, voice-first AI |
| `gemini-3.1-flash-tts-preview` | Preview | Low-latency speech generation |
| Nano Banana 2 (image) | Preview | High-efficiency production image gen + editing |
| Nano Banana Pro (image) | Preview | Studio-quality 4K, complex layouts, text rendering |

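Note: "Nano Banana" is Google's display name for Gemini's native image generation and editing models (the 2.5 release is Gemini 2.5 Flash Image). It is distinct from the Imagen series: Imagen 4 is a separate text-to-image model, listed on its own row under Generative Media Models.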
Audio Models

| Model | Latency | Use Case |
| --- | --- | --- |
| Gemini 3.1 Flash Live | Low | A2A real-time dialogue, voice-first apps |
| Gemini 3.1 Flash TTS | Low | TTS with expressive audio tags |
| Gemini 2.5 Flash Live | Low | Bidirectional voice + video agents, native audio reasoning |
| Gemini 2.5 Flash TTS | Low | Cost-efficient real-time TTS |
| Gemini 2.5 Pro TTS | High fidelity | Structured workflows (podcasts, audiobooks) |

Generative Media Models

| Model | Type | Status |
| --- | --- | --- |
| Nano Banana 2 | Image gen/edit | Preview |
| Nano Banana Pro | Image gen/edit | Preview |
| Nano Banana (2.5) | Image gen/edit | Stable |
| Imagen 4 | Text-to-image (up to 2K) | Stable |
| Veo 3.1 | Video + synced audio | Preview |
| Veo 3.1 Lite | Low-cost video gen/edit | Preview |

Music Generation Models

| Model | Use Case |
| --- | --- |
| Lyria 3 Pro | Full-length songs, structural coherence |
| Lyria 3 Clip | Short clips, loops, ≤30 sec previews |
| Lyria RealTime Experimental | Granular control, real-time streaming |

Tool & Agent Models

| Model | Capability |
| --- | --- |
| Computer Use Preview | Screen vision + UI actions (click, type, navigate) for browser automation |
| Gemini Deep Research Preview | Agentic multi-step research across hundreds of sources, with cited reports |
| Gemini Deep Research Max Preview | Maximum-comprehensiveness version of Deep Research |

Specialized Task Models

| Model | Type | Notes |
| --- | --- | --- |
| Gemini Embedding 2 | Multimodal embedding | Text + image + video + audio + PDF → unified vector space |
| Gemini Embedding | Text embedding | High-dimensional vectors for semantic search, RAG |
| Gemini Robotics-ER 1.6 | Embodied reasoning | Physical space understanding, multi-step robotic tasks |

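
Semantic search over embeddings usually comes down to ranking documents by vector similarity. The sketch below assumes the embeddings have already been obtained; the tiny 3-dimensional vectors are made-up placeholders, not real Gemini Embedding output, and `cosine` and `rank` are hypothetical helper names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, docs):
    """Return document names sorted by similarity to the query, most similar first."""
    return sorted(docs, key=lambda name: cosine(query_vec, docs[name]), reverse=True)


# Placeholder vectors standing in for embeddings of three documents.
docs = {
    "pricing": [1.0, 0.0, 0.0],
    "latency": [0.0, 1.0, 0.0],
    "billing": [0.9, 0.1, 0.0],
}
query = [1.0, 0.0, 0.0]

if __name__ == "__main__":
    print(rank(query, docs))
```

With a unified vector space, as described for Gemini Embedding 2, the same ranking step would in principle apply whether the stored vectors came from text, images, or audio.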
Model Version Naming Conventions

| Channel | Example ID | Behavior |
| --- | --- | --- |
| Stable | `gemini-2.5-flash` | Fixed version that rarely changes; recommended for production |
| Preview | `gemini-2.5-flash-preview-09-2025` | Production-eligible, billing enabled, ≥2 weeks deprecation notice |
| Latest | `gemini-flash-latest` | Auto-updates to the newest release; 2-week email notice before the swap |
| Experimental | `gemini-*-exp-*` | Not for production; restricted rate limits; may disappear |

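
The naming patterns above can be checked mechanically, for example to keep auto-updating or experimental IDs out of a production config. The sketch below is a name-based heuristic inferred only from the example IDs in the table, not an official mapping. `release_channel` and `safe_for_production` are hypothetical helpers, and `gemini-example-exp-0000` is a made-up ID used purely for illustration.

```python
import re


def release_channel(model_id: str) -> str:
    """Infer the release channel from a model ID using the table's naming patterns."""
    if re.search(r"(^|-)exp(-|$)", model_id):
        return "experimental"
    if model_id.endswith("-latest"):
        return "latest"
    if "-preview" in model_id:
        return "preview"
    return "stable"


def safe_for_production(model_id: str) -> bool:
    """Conservative policy (an assumption): allow only pinned stable or preview IDs."""
    return release_channel(model_id) in {"stable", "preview"}


if __name__ == "__main__":
    for model_id in (
        "gemini-2.5-flash",
        "gemini-2.5-flash-preview-09-2025",
        "gemini-flash-latest",
        "gemini-example-exp-0000",  # hypothetical ID, for illustration only
    ):
        print(model_id, release_channel(model_id), safe_for_production(model_id))
```

Rejecting the `latest` alias is a design choice here: the table does not forbid it in production, but a pinned ID avoids a behavior change when the alias is swapped.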
Deprecated / Shut Down

| Model | Status | Notes |
| --- | --- | --- |
| Gemini 2.0 Flash | Deprecated | 1M context, native tool use |
| Gemini 2.0 Flash-Lite | Deprecated | Fastest gen-2 |
| Gemini 3 Pro Preview | Shut down | |

Key Takeaways
- 2.5 Flash is the current go-to for cost/performance balance; 2.5 Pro for complex reasoning
- Gemini 3.x is mostly in preview: 3.1 Flash-Lite is already stable, while the rest require preview access
- Gemini has dedicated audio Live models for real-time voice agents (A2A streaming)
- "Nano Banana" is Google's display name for Gemini's native image generation models; Imagen 4 is a separate text-to-image model
- The Computer Use model can automate browser UIs natively, similar to Anthropic's computer use
- Gemini Embedding 2 is multimodal: it handles text, images, video, audio, and PDF in one embedding space
- Use the `gemini-flash-latest` alias with caution: it hot-swaps to new releases with only a 2-week notice
- Experimental models have no stability guarantees and restrictive rate limits
Related Articles
Sources