Google introduces Flow and Veo 3 to simplify AI video creation

Google’s splashy I/O 2025 keynote was packed with AI headlines: a next-generation assistant (Gemini), an asynchronous coding agent (Jules), and more. But for filmmakers, animators, and anyone who’s ever thought, “I wish making that short clip were as easy as writing a sentence,” the real star of the show might be Flow, Google’s new AI video “editing” suite built on its latest generation of generative models.

Imagine sketching out a shot list in plain English—“a close-up of a bubbling brook, then pan left to reveal a deer grazing at dawn”—and having an eight-second clip appear, complete with sound effects. That’s the idea behind Flow, which stitches together short AI-generated video “takes” into mini sequences using simple drag-and-drop tools and an integrated “scene builder.” You can kick things off with a text-to-video prompt or feed Flow a handful of reference images (“ingredients”), helping the model nail the look and feel you want—whether that’s a photorealistic cityscape or a stylized cartoon vignette.
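
To make the shot-list idea concrete, here is a minimal Python sketch of how a creator might plan a sequence before opening Flow. It is purely illustrative: the `Shot` class, its fields, `total_runtime`, and `clips_needed` are hypothetical names, not part of any Google API, and the only figure taken from the announcement is the eight-second clip length.

```python
import math
from dataclasses import dataclass, field

CLIP_SECONDS = 8  # clip length cited in the article; everything else is assumed


@dataclass
class Shot:
    """One planned take: a text prompt plus optional reference images."""
    prompt: str
    ingredients: list[str] = field(default_factory=list)  # hypothetical image paths
    seconds: int = CLIP_SECONDS


def total_runtime(shots: list[Shot]) -> int:
    """Sum the planned length of every shot, in seconds."""
    return sum(shot.seconds for shot in shots)


def clips_needed(target_seconds: int, clip_seconds: int = CLIP_SECONDS) -> int:
    """Number of fixed-length clips required to cover a target runtime."""
    return math.ceil(target_seconds / clip_seconds)


shot_list = [
    Shot("a close-up of a bubbling brook"),
    Shot("pan left to reveal a deer grazing at dawn", ingredients=["deer_ref.png"]),
]

print(total_runtime(shot_list))  # 16
print(clips_needed(120))         # 15
```

By this arithmetic, the two shots described above run 16 seconds back to back, and a two-minute piece would take 15 eight-second clips.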

In demos shared at I/O, Google showed a clip that starts as an animated scene, pulls back to reveal the animation playing on an old-school TV set, then zooms out again to show a cozy living room—and finally, the “camera” flies through a window to follow a delivery truck rumbling down the street. It really does feel like a lightweight NLE (non-linear editor), except every frame was dreamed up by AI instead of filmed with a camera.

According to the official Google blog, Flow was co-designed with filmmakers, YouTubers, and visual artists, and it weaves together three of Google’s core models—Veo for video, Imagen for stills, and Gemini for text understanding—into a single creative sandbox.

Flow wouldn’t exist without its video engines, and Google used I/O to unveil two major updates:

Veo 3 is a top-to-bottom overhaul. It boosts resolution (up to 4K output), injects real-world physics (shadows, reflections, fluid dynamics), and—crucially—generates native audio tracks alongside video. In one demo, Veo 3 produced a moody street scene complete with distant car horns, chirping birds, and two characters exchanging dialogue, all in a single eight-second clip. It also “gets” longer, more complex prompts, so you can chain together a mini-narrative (“elderly librarian finds a glowing book, opens it, and steps into a magical forest”), and Veo 3 will attempt to honor each beat.
Veo 2, meanwhile, picked up some practical polish:
- Reference-powered video lets you drop in images of characters, props, or preferred cinematography styles so that every clip stays on-brand.
- Camera controls such as dollies, zooms, and rotations give you more cinematic finesse.
- Outpainting extends your frame beyond the original aspect ratio—great when you need to shift between portrait and landscape for social-media repurposing.
- Object add/remove lets you seamlessly insert or erase elements (spaceships, stray pedestrians, your cat), with AI handling scale, lighting, and shadow to keep things believable.

These Veo 2 features are rolling out in Flow today, with Vertex AI API support for enterprises coming in the weeks ahead.

Google’s Imagen model has always turned heads for photorealism, and Imagen 4 refines that craft with crisper details—think individual strands of hair, water droplets on a leaf, or the weave of a tweed jacket. It also finally conquers AI’s Achilles’ heel: text. Whether you’re designing a fake movie poster or drafting a whimsical greeting card, any letters in your scene now look like actual handwriting or typeset fonts, not gibberish blobs.

On top of finer fidelity, Imagen 4 can export images up to 2K in multiple aspect ratios, making it more than just a web-demo novelty. If you need to drop a generated asset into a slide deck or print it on canvas, you won’t be left squinting at pixelation.

Flow, Veo 3, and Imagen 4 aren’t free. Google is bundling them into two subscription tiers under the “Google AI” umbrella:

AI Pro ($19.99/month) includes Flow with Veo 2 and Imagen 4 access—enough to experiment, storyboard, and crank out a few dozen micro-clips each month.
AI Ultra ($249.99/month) ups the ante with Veo 3’s native audio generation, higher usage caps across all models (video, image, and beyond), and early access to bleeding-edge features like Veo 3’s dialogue engine.

At launch, both Pro and Ultra are limited to U.S. subscribers, but Google says it plans to expand to more countries “soon.”
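
For anyone budgeting a project, the gap between the two tiers is easier to see on an annual basis. The sketch below is a rough calculation only: it assumes twelve months at the list prices quoted above and ignores taxes, trials, and any promotional discounts, and the helper name `annual_cost` is made up for illustration.

```python
# Monthly list prices quoted in the article (USD).
MONTHLY_PRICE = {"AI Pro": 19.99, "AI Ultra": 249.99}


def annual_cost(monthly: float, months: int = 12) -> float:
    """Total cost over `months` at a flat monthly price, rounded to cents."""
    return round(monthly * months, 2)


for tier, price in MONTHLY_PRICE.items():
    print(f"{tier}: ${annual_cost(price):,.2f} per year")

# How many times more expensive Ultra is than Pro at list price.
ratio = MONTHLY_PRICE["AI Ultra"] / MONTHLY_PRICE["AI Pro"]
print(f"Ultra costs about {ratio:.1f}x as much as Pro")
```

At list price that comes to roughly $240 a year for Pro against roughly $3,000 for Ultra, a difference of about 12.5x.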

Researchers, startups, and open-source enthusiasts have been tinkering with AI-generated video for years, but it has never felt as turnkey as this. By wrapping sophisticated video-and-audio synthesis in an approachable editor, Google is effectively saying, “We want anyone with a story in their head to try this, no film school required.”

Of course, there are open questions. Will studios trust eight-second clips as proof-of-concept for feature-length projects? How will copyright and fair use evolve when “ingredients” can be traced back to reference images? And can Flow handle the subtlety of human performance, or will every digital actor feel like a CGI mannequin?

But for now, Flow represents a big step toward democratizing visual storytelling. Instead of wrangling camera rigs, lighting grids, and sound mixers, creators can focus on the spark of an idea—and let Google’s AI models handle the shot list. As with any new creative tool, the best work often comes from unexpected tinkerers: indie musicians making music videos, educators illustrating concepts, or hobbyists crafting animated short stories.

If you’re in the U.S. and itching to turn that idea into motion, it might be time to pick up a Google AI Pro trial (or dive deep with AI Ultra) and give Flow a spin. After all, if a two-minute pitch can be distilled into a few eight-second clips, who knows what stories you’ll tell next?