
Batch Mode in the Gemini API: Process more for less - Google Developers Blog
Discover how Batch Mode in the Gemini API lets developers submit large jobs, offload processing, and get results within 24 hours, offering simplicity, convenience, and cost savings.
Gemini models are now available in Batch Mode
Today, we’re excited to introduce Batch Mode in the Gemini API, a new asynchronous endpoint designed specifically for high-throughput, non-latency-critical workloads. Batch Mode allows you to submit large jobs, offload the scheduling and processing, and retrieve your results within 24 hours, all at a 50% discount compared to our synchronous APIs.
Process more for less
Batch Mode is the perfect tool for any task where you have your data ready upfront and don’t need an immediate response. By separating these large jobs from your real-time traffic, you unlock three key benefits:
- Cost savings: Batch jobs are priced at 50% less than the standard rate for a given model
- Higher throughput: Batch Mode has even higher rate limits
- Easy API calls: No need to manage complex client-side queuing or retry logic. Available results are returned within a 24-hour window.
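
To make the cost-savings point concrete, here is a minimal sketch of the arithmetic behind the 50% discount. The token volume and the per-million-token price are hypothetical placeholders, not real Gemini pricing, and `batch_cost` is an illustrative helper rather than part of any SDK.

```python
def batch_cost(sync_cost: float, discount: float = 0.5) -> float:
    """Return the Batch Mode cost for a job, given its synchronous cost.

    `discount` is the fractional reduction; 0.5 matches the 50% figure
    quoted in this post.
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return sync_cost * (1.0 - discount)


# Hypothetical workload: both numbers below are placeholders for illustration.
tokens = 10_000_000              # total tokens in the job (assumed)
sync_price_per_million = 2.00    # assumed synchronous price in USD, not a real rate

sync_total = tokens / 1_000_000 * sync_price_per_million
batch_total = batch_cost(sync_total)

print(f"Synchronous: ${sync_total:.2f}")
print(f"Batch Mode:  ${batch_total:.2f}")
```

Substitute the published rate for your model to estimate the saving for a real workload; the trade-off is that results arrive within the 24-hour window rather than immediately.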