Priced to help you bring your app to the world
Available now
Our fastest multimodal model, with great performance for diverse, repetitive tasks and a 1 million token context window. Now generally available for production use.
Free of charge*
Rate Limits**
15 RPM (requests per minute)
1 million TPM (tokens per minute)
1,500 RPD (requests per day)
Price (input)
Free of charge
Context caching
Free of charge, up to 1 million tokens of storage per hour
Price (output)
Free of charge
Prompts/responses used to improve our products
Yes
Pay-as-you-go (prices in USD)***
Rate Limits**
1,000 RPM (requests per minute)
4 million TPM (tokens per minute)
Price (input)
$0.35 / 1 million tokens (for prompts up to 128K tokens)
$0.70 / 1 million tokens (for prompts longer than 128K)
Context caching
$0.0875 / 1 million tokens (for prompts up to 128K tokens)
$0.175 / 1 million tokens (for prompts longer than 128K)
$1.00 / 1 million tokens per hour (storage)
Price (output)
$1.05 / 1 million tokens (for prompts up to 128K tokens)
$2.10 / 1 million tokens (for prompts longer than 128K)
Prompts/responses used to improve our products
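As a rough illustration of how the tiered pay-as-you-go rates above combine, the following sketch estimates the cost of a single request. It uses the listed input rates ($0.35 / $0.70) and output rates ($1.05 / $2.10). The function name `estimate_cost` is hypothetical, and the sketch assumes that "128K" means 128,000 tokens and that the prompt length alone selects the tier for both input and output. Check the Billing FAQs before relying on either assumption.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# Assumption: "128K" is read as 128,000 tokens.
TIER_THRESHOLD = 128_000

# USD per 1 million tokens, copied from the price list above.
INPUT_RATE = {"short": Decimal("0.35"), "long": Decimal("0.70")}
OUTPUT_RATE = {"short": Decimal("1.05"), "long": Decimal("2.10")}


def estimate_cost(prompt_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: the prompt length selects the tier for both input and output.
    """
    tier = "short" if prompt_tokens <= TIER_THRESHOLD else "long"
    input_cost = INPUT_RATE[tier] * prompt_tokens / MILLION
    output_cost = OUTPUT_RATE[tier] * output_tokens / MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # A 100,000-token prompt with a 1,000-token response.
    print(estimate_cost(100_000, 1_000))
    # A 200,000-token prompt falls into the "longer than 128K" tier.
    print(estimate_cost(200_000, 1_000))
```

Decimal arithmetic is used instead of floats so that small per-request amounts do not pick up rounding noise when summed over many requests.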
Our next-generation model with a breakthrough 2 million token context window. Now generally available for production use.
Free of charge*
Rate Limits**
2 RPM (requests per minute)
32,000 TPM (tokens per minute)
50 RPD (requests per day)
Price (input)
Free of charge
Context caching
Not applicable
Price (output)
Free of charge
Prompts/responses used to improve our products
Yes
Pay-as-you-go (prices in USD)***
Rate Limits**
360 RPM (requests per minute)
4 million TPM (tokens per minute)
10,000 RPD (requests per day)
Price (input)
$3.50 / 1 million tokens (for prompts up to 128K tokens)
$7.00 / 1 million tokens (for prompts longer than 128K)
Context caching
$0.875 / 1 million tokens (for prompts up to 128K tokens)
$1.75 / 1 million tokens (for prompts longer than 128K)
$4.50 / 1 million tokens per hour (storage)
Price (output)
$10.50 / 1 million tokens (for prompts up to 128K tokens)
$21.00 / 1 million tokens (for prompts longer than 128K)
Prompts/responses used to improve our products
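The context caching prices above have two parts: a per-token rate for cached input ($0.875 / $1.75) and an hourly storage rate ($4.50 per 1 million tokens). The sketch below adds them up for a cache that is reused across several requests. The function `caching_cost` and its parameters are hypothetical. It assumes that cached tokens are billed at the cached rate on every request that reads them, and that storage scales linearly with token count and hours. It leaves out the cost of non-cached input and of output.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# USD per 1 million cached tokens, copied from the price list above.
CACHED_INPUT_RATE = {"short": Decimal("0.875"), "long": Decimal("1.75")}
# USD per 1 million tokens per hour of storage.
STORAGE_RATE_PER_HOUR = Decimal("4.50")


def caching_cost(
    cached_tokens: int,
    requests: int,
    hours: Decimal,
    long_prompt: bool = False,
) -> Decimal:
    """Estimate the USD cost of context caching (hypothetical helper).

    Assumptions: every request reads the whole cache at the cached rate,
    and storage is billed in proportion to tokens and hours.
    Set long_prompt=True when prompts are longer than 128K tokens.
    """
    tier = "long" if long_prompt else "short"
    read_cost = CACHED_INPUT_RATE[tier] * cached_tokens * requests / MILLION
    storage_cost = STORAGE_RATE_PER_HOUR * cached_tokens * hours / MILLION
    return read_cost + storage_cost


if __name__ == "__main__":
    # 100,000 cached tokens, read by 10 requests, stored for 2 hours.
    print(caching_cost(100_000, 10, Decimal(2)))
```

For comparison, sending the same 100,000 tokens as regular input on 10 requests at $3.50 per 1 million tokens would cost $3.50, so under these assumptions caching pays off once the cache is read often enough to offset the storage charge.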
Our first-generation model offering only text and image reasoning. Generally available for production use.
Free of charge*
Rate Limits**
15 RPM (requests per minute)
32,000 TPM (tokens per minute)
1,500 RPD (requests per day)
Price (input)
Free of charge
Context caching
Not applicable
Price (output)
Free of charge
Prompts/responses used to improve our products
Yes
Pay-as-you-go (prices in USD)***
Rate Limits**
360 RPM (requests per minute)
120,000 TPM (tokens per minute)
30,000 RPD (requests per day)
Price (input)
$0.50 / 1 million tokens
Context caching
Not available
Price (output)
$1.50 / 1 million tokens
Prompts/responses used to improve our products
Our state-of-the-art text embedding model.
Free of charge*
Rate Limits**
1,500 RPM (requests per minute)
Price (input)
Free of charge
Context caching
Not applicable
Price (output)
Free of charge
Prompts/responses used to improve our products
Yes
*Gemini API free tier usage restrictions apply to the EEA (including the EU), the UK, and Switzerland (CH). See Billing FAQs for details.
**Specified rate limits are not guaranteed and actual capacity may vary. Apply for an increased maximum rate limit (for paid tier only).
***Tuned model inference costs are billed at the same price as the base models. To get help with billing, see Cloud Billing support.
****Prices may differ from the prices listed here and the prices offered on Vertex AI. For Vertex prices, see the Vertex documentation.
Build with Vertex AI on Google Cloud