Google has just rolled out Gemini 3 Flash inside Gemini CLI, and it’s a move that feels like a nod to developers who live in the terminal and want speed without sacrificing quality. The new model is designed for high-frequency workflows—think rapid prototyping, code edits, stress testing—where waiting on a slower, more expensive model can break your flow. What makes Gemini 3 Flash stand out is its balance: it’s faster and cheaper than Gemini 3 Pro, yet still capable of handling reasoning-heavy tasks that used to demand the Pro tier. In fact, Google says it hits a SWE-bench Verified score of 78% for agentic coding, outperforming not only the older 2.5 series but even Gemini 3 Pro in certain scenarios.
The rollout is fairly broad. Paid-tier Gemini CLI customers now have access to both Gemini 3 Pro and Flash, while free-tier users who joined the waitlist are being onboarded gradually. Once you update Gemini CLI to version 0.21.1, you can toggle preview features and start experimenting with Gemini 3 Flash directly in your terminal. The CLI's auto-routing system helps decide when Pro is needed for complex reasoning and when Flash is enough, but you can also manually select which model to run, as in the example below.
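If you'd rather pin a model yourself than rely on auto-routing, the update and model selection look roughly like this. Note that the Gemini 3 Flash model identifier shown is an assumption, not something confirmed in Google's announcement; check `gemini --help` or the CLI's built-in model picker for the exact name your version exposes.

```bash
# Update the globally installed Gemini CLI to the latest release (0.21.1 or newer)
npm install -g @google/gemini-cli@latest
gemini --version   # confirm you're on 0.21.1 or later

# Launch the CLI with an explicit model instead of letting auto-routing decide.
# "gemini-3-flash-preview" is a placeholder ID for illustration only.
gemini -m gemini-3-flash-preview
```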

What’s striking is how Google positions Flash as raising the “performance floor.” In demos, it’s shown to handle tasks like generating a 3D voxel simulation of the Golden Gate Bridge—something that previously required Pro models—without breaking logic. It can process massive context windows, like a pull request thread with 1,000 comments, and zero in on the single actionable change. It can even generate and debug Python scripts for realistic load testing, simulating concurrent user traffic across multiple scenarios. These aren’t just flashy demos; they’re the kinds of everyday developer headaches that Flash seems built to smooth out.
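To make that last load-testing example a bit more concrete, here's a minimal sketch of the kind of script being described: a few simulated users firing requests in parallel and reporting latencies. The target URL, user count, and request count are placeholder values for illustration, not details from Google's demo.

```python
# Illustrative concurrent load test: simulate a few users hitting an endpoint
# in parallel and report how many requests succeeded and their average latency.
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen
from urllib.error import URLError

TARGET_URL = "https://example.com/"   # placeholder endpoint
CONCURRENT_USERS = 20                 # simulated simultaneous users
REQUESTS_PER_USER = 5

def simulate_user(user_id: int) -> list[float]:
    """Send a handful of sequential requests and record each latency."""
    latencies = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        try:
            urlopen(TARGET_URL, timeout=10).read()
            latencies.append(time.perf_counter() - start)
        except URLError:
            latencies.append(float("nan"))  # record failures as NaN
    return latencies

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
        results = list(pool.map(simulate_user, range(CONCURRENT_USERS)))
    ok = [t for user in results for t in user if t == t]  # drop NaN failures
    print(f"{len(ok)} successful requests, "
          f"avg latency {sum(ok) / max(len(ok), 1):.3f}s")
```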
The bigger picture here is cost and accessibility. Gemini 3 Flash costs less than a quarter as much as Gemini 3 Pro while running three times faster than Gemini 2.5 Pro, according to Google's benchmarks. For developers, that means you can stay in the flow longer, iterate faster, and still trust the model to deliver coherent, functional code. It's not just about speed; it's about making high-quality AI assistance viable for the kind of repetitive, high-volume tasks that dominate real-world coding.
This release also signals Google’s intent to make Gemini CLI more than just a playground for AI models. By combining Pro and Flash, the CLI now feels like a full-stack assistant for developers, capable of scaling from quick fixes to complex reasoning. It’s a reminder that AI in the terminal isn’t just about novelty—it’s about reshaping the way developers work day to day, with models that adapt to the pace and complexity of the task at hand.