Gemini Embedding Models Split by API Channel

Google's embedding models are split across two API channels, and actual availability does not match what the documentation implies. gemini-embedding-001 works with a standard AI Studio API key (via the v1beta endpoint). text-embedding-004 and text-multilingual-embedding-002, which are often presented as the "standard" embedding models, require Vertex AI credentials and are not accessible with a Gemini API key alone.

Key Points

gemini-embedding-001 → AI Studio key (generativelanguage.googleapis.com/v1beta) ✓
text-embedding-004 → Vertex AI credentials required (service account / ADC) ✗ on standard API key
text-multilingual-embedding-002 → Vertex AI only ✗ on standard API key
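Error when a model is called through the wrong channel: 400 INVALID_ARGUMENT: Model ... is not supported
For RAG pipelines using AI Studio keys, use gemini-embedding-001 (1536-dim output when output_dimensionality=1536 is requested; the model's default output is 3072-dim)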

Details

Model Availability Matrix

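| Model | AI Studio Key | Vertex AI Creds | Dimensions | Notes |
|---|---|---|---|---|
| `gemini-embedding-001` | ✅ | ✅ | 1536 (via output_dimensionality; default is 3072) | Default for API key users |
| `text-embedding-004` | ❌ | ✅ | 768 | Vertex-only |
| `text-multilingual-embedding-002` | ❌ | ✅ | 768 | Vertex-only, multilingual |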
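
The matrix can also be encoded as a small preflight check, so that a misconfigured pipeline fails at startup rather than on the first embedding call. This is a minimal sketch: the `AVAILABILITY` table only restates the matrix above (it is not an official Google listing), and `check_channel` and the channel labels `"ai_studio"` / `"vertex_ai"` are hypothetical names chosen for illustration.

```python
# Availability as recorded in the matrix above; not an official listing.
AVAILABILITY = {
    "gemini-embedding-001": {"ai_studio", "vertex_ai"},
    "text-embedding-004": {"vertex_ai"},
    "text-multilingual-embedding-002": {"vertex_ai"},
}


def check_channel(model: str, channel: str) -> None:
    """Raise ValueError if `model` is not reachable through `channel`.

    `channel` is a hypothetical label: "ai_studio" or "vertex_ai".
    """
    name = model.removeprefix("models/")  # accept "models/<name>" as well
    channels = AVAILABILITY.get(name)
    if channels is None:
        raise ValueError(f"unknown embedding model: {name}")
    if channel not in channels:
        raise ValueError(
            f"{name} is not available via {channel}; use one of {sorted(channels)}"
        )
```

Calling `check_channel("text-embedding-004", "ai_studio")` raises before any request is sent, which is easier to diagnose than the 400 returned by the API.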

Using gemini-embedding-001 with AI Studio Key

import google.generativeai as genai

genai.configure(api_key="AIza...")

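def embed_text(text: str) -> list[float]:
    """Embed one string with gemini-embedding-001 using the AI Studio key."""
    result = genai.embed_content(
        model="models/gemini-embedding-001",
        content=text,
        task_type="RETRIEVAL_DOCUMENT",  # or RETRIEVAL_QUERY, SEMANTIC_SIMILARITY
        output_dimensionality=1536,  # request 1536 dims explicitly; the model default is 3072
    )
    return result["embedding"]  # 1536-dimensional vector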
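
For environments without the SDK, the same call can be made against the v1beta REST endpoint directly. The sketch below only builds the request and does not send it; `build_embed_request` is a hypothetical helper, and the 1536 default simply mirrors the dimension used in this note.

```python
import json
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_embed_request(text: str, api_key: str, dims: int = 1536) -> urllib.request.Request:
    """Build (but do not send) an embedContent request for gemini-embedding-001."""
    model = "models/gemini-embedding-001"
    body = {
        "model": model,
        "content": {"parts": [{"text": text}]},
        "taskType": "RETRIEVAL_DOCUMENT",
        "outputDimensionality": dims,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/{model}:embedContent",
        data=json.dumps(body).encode("utf-8"),
        # The AI Studio key travels in a header rather than in the URL.
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` should return JSON with the vector under `embedding.values`; treat that shape as something to confirm against the current API reference.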

Using text-embedding-004 with Vertex AI

from vertexai.language_models import TextEmbeddingModel
import vertexai

# Requires: GOOGLE_APPLICATION_CREDENTIALS env var pointing to service account JSON
# or: gcloud auth application-default login
vertexai.init(project="my-project", location="us-central1")

model = TextEmbeddingModel.from_pretrained("text-embedding-004")
embeddings = model.get_embeddings(["Hello world"])
vector = embeddings[0].values  # 768-dimensional

LiteLLM Abstraction (Recommended for Portability)

# LiteLLM routes by model prefix: "gemini/" → AI Studio, "vertex_ai/" → Vertex AI
from litellm import embedding

# AI Studio path
response = embedding(
    model="gemini/gemini-embedding-001",
    input=["Hello world"],
    api_key="AIza..."
)

# Vertex AI path
response = embedding(
    model="vertex_ai/text-embedding-004",
    input=["Hello world"],
    vertex_project="my-project",
    vertex_location="us-central1"
)
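
One way to keep calling code channel-agnostic is to derive the LiteLLM arguments from whichever credentials are present. This is a sketch under assumptions: `pick_embedding_route` is a hypothetical helper, and the `VERTEX_PROJECT` / `VERTEX_LOCATION` variable names are placeholders for illustration, not names LiteLLM is known to read. It also only detects a service-account file; credentials set up with `gcloud auth application-default login` would not be seen by this check.

```python
import os
from typing import Mapping


def pick_embedding_route(env: Mapping[str, str] = os.environ) -> dict:
    """Return keyword arguments for litellm.embedding based on available credentials."""
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return {
            "model": "vertex_ai/text-embedding-004",
            # Hypothetical variable names; defaults match the example above.
            "vertex_project": env.get("VERTEX_PROJECT", "my-project"),
            "vertex_location": env.get("VERTEX_LOCATION", "us-central1"),
        }
    if env.get("GEMINI_API_KEY"):
        return {"model": "gemini/gemini-embedding-001", "api_key": env["GEMINI_API_KEY"]}
    raise RuntimeError("no Vertex AI credentials or AI Studio key found")
```

Usage would look like `embedding(input=["Hello world"], **pick_embedding_route())`. Because the two routes return vectors of different sizes, the choice should be fixed per deployment rather than re-evaluated per request.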

Qdrant Collection Dimension Mismatch Warning

When switching between models, the Qdrant collection's vector size must match the output dimension of the model in use:

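from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient(":memory:")  # placeholder; point this at your own Qdrant instance

# gemini-embedding-001 → 1536 dimensions
client.create_collection("docs", vectors_config=VectorParams(size=1536, distance=Distance.COSINE))

# text-embedding-004 → 768 dimensions (use instead of the call above, not in addition to it)
# client.create_collection("docs", vectors_config=VectorParams(size=768, distance=Distance.COSINE))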

Changing embedding models requires recreating the collection and re-indexing all documents.
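
A cheap guard before every upsert catches a mismatch early instead of leaving it to the vector store. The sketch below is plain Python and does not touch Qdrant; `EXPECTED_DIMS` restates the dimensions used in this note, and `validate_vectors` is a hypothetical helper.

```python
# Dimensions as used in this note (1536 for gemini-embedding-001 is the
# size requested here, not necessarily the model's default).
EXPECTED_DIMS = {
    "gemini-embedding-001": 1536,
    "text-embedding-004": 768,
    "text-multilingual-embedding-002": 768,
}


def validate_vectors(model: str, vectors: list[list[float]], collection_size: int) -> None:
    """Raise ValueError if the model or any vector disagrees with the collection size."""
    expected = EXPECTED_DIMS[model]
    if expected != collection_size:
        raise ValueError(
            f"{model} produces {expected}-dim vectors but the collection expects {collection_size}"
        )
    for i, vec in enumerate(vectors):
        if len(vec) != collection_size:
            raise ValueError(f"vector {i} has {len(vec)} dims, expected {collection_size}")
```

For a collection with a single unnamed vector, `collection_size` can probably be read from `client.get_collection("docs").config.params.vectors.size`; verify that attribute path against the installed qdrant-client version.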

wiki/architecture/rag-architecture — RAG pipeline where embedding model choice matters
wiki/concepts/gemini-conversation-cost-scaling — related Gemini API cost gotcha

Sources

daily/2026-04-29.md — Hit when trying to use text-embedding-004 with an AI Studio API key in a RAG pipeline
