
GadgetBond


Google unveils Gemini 2.5 Computer Use AI that can browse the web like a human

Google’s Gemini 2.5 Computer Use AI can navigate web pages, fill forms, and interact with browsers visually, offering developers new tools for automation and UI testing.

By Shubham Sawarkar, Editor-in-Chief
Oct 9, 2025, 7:12 AM EDT
Split screen showing code on the left side and 'Gemini 2.5 Computer Use' text overlaid on a blue gradient background with coding symbols and a cursor icon on the right side.
Image: Google

On October 7, 2025, Google unveiled Gemini 2.5 Computer Use, a specialized version of its Gemini family that doesn’t just answer questions — it acts inside a browser. Send it a goal, give it a screenshot and a short action history, and it will reason about the interface, then click, type, scroll, drag and drop to try to finish the job. The pitch is simple: some tasks can’t be solved with an API or a backend call, so teach the model to use the same graphical interfaces humans do.

Gemini 2.5 Computer Use works in a loop. Your app sends the model a user request, a screenshot of the page, and a recent action history. The model replies with a function-like instruction — “click here,” “type that,” “drag this” — which your client executes in the browser. After the action runs, the client sends a new screenshot and URL back, and the model plans the next step. Google says the tool currently exposes 13 UI actions and is optimized for browsers (it shows promise on mobile), but it’s not built to control desktop OS features.
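The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Google's client code: `run_agent_loop`, `Observation`, `Action`, and the action names `click_at` and `type_text` are hypothetical names chosen for this example, and the model and browser are replaced by scripted stand-ins so the sketch runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Observation:
    """What the client sends on each turn: a screenshot plus the current URL."""
    screenshot: bytes
    url: str


@dataclass
class Action:
    """A function-like instruction from the model (names here are hypothetical)."""
    name: str
    args: dict = field(default_factory=dict)


def run_agent_loop(
    goal: str,
    propose_action: Callable[[str, Observation, list], Optional[Action]],
    execute: Callable[[Action], Observation],
    first_observation: Observation,
    max_steps: int = 20,
) -> list:
    """Alternate between asking the model for an action and executing it.

    `propose_action` stands in for the model call and returns None when it
    considers the goal finished. `execute` stands in for the browser client.
    """
    history: list = []
    observation = first_observation
    for _ in range(max_steps):
        action = propose_action(goal, observation, history)
        if action is None:  # the model signals that the task is done
            break
        observation = execute(action)  # fresh screenshot and URL after the action
        history.append(action)
    return history


# Scripted stand-ins so the sketch runs without a model or a browser.
def scripted_model(goal, observation, history):
    plan = [
        Action("click_at", {"x": 120, "y": 300}),
        Action("type_text", {"text": "hello"}),
    ]
    return plan[len(history)] if len(history) < len(plan) else None


def fake_browser(action):
    return Observation(screenshot=b"", url=f"https://example.test/after-{action.name}")


steps = run_agent_loop(
    "fill in the form",
    scripted_model,
    fake_browser,
    Observation(screenshot=b"", url="https://example.test/"),
)
print([step.name for step in steps])
```

The `max_steps` cap is a design choice of this sketch: it keeps a confused model from looping forever on a page it cannot make progress on.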

Google published a set of demos, sped up 3× in the videos, showing the model signing up users, reorganizing a sticky-note board and even playing a game of 2048. The demos run in a hosted environment on Browserbase; developers can also access the preview through Google AI Studio and Vertex AI.

Google says Gemini 2.5 Computer Use “outperforms leading alternatives on multiple web and mobile control benchmarks,” showing a mix of high accuracy and low latency on evaluation suites such as Online-Mind2Web, WebVoyager and AndroidWorld. Browserbase — an independent testbed that runs models through browser automation harnesses — published its own evaluations and found Google’s preview model to be a strong performer. As always with vendor benchmarks, take the absolute numbers with a grain of salt; the trend, however, is clear: major labs are now prioritizing the speed and robustness of browser-action loops.

Where it fits in the contest for “agents”

This move sits squarely in a broader industry push to give AI agents the ability to complete multi-step, real-world tasks. OpenAI, at its latest DevDay, leaned into apps and deeper ChatGPT integrations that let the assistant operate inside partner apps and services; Anthropic shipped “computer use” capabilities for Claude last year that control a virtual keyboard and cursor across a full desktop rather than only a browser. Google’s angle is narrower and explicit: browser-native control, with an API and safety controls designed for that environment. That both narrows the attack surface and makes integration more practical for web-based automation tasks.

Safety, guardrails and the obvious worries

Google says it trained safety features into the model and provides developer controls — for example, a per-step safety service that vets each proposed action and system-level instructions that can require user confirmation before high-risk actions (purchases, system-level changes, anything that might compromise a device). The company explicitly calls out risks such as prompt injection, scams, and the possibility of misusing automation to bypass protections like CAPTCHAs. Those safeguards are important, but they’re not a silver bullet; building safe, reliable automations still requires careful systems design and testing.
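On the client side, a confirmation requirement of this kind might look like the sketch below. It is an illustration under stated assumptions, not Google's safety service: `vet_action`, `HIGH_RISK_ACTIONS`, and the action names are hypothetical, and the human prompt is modeled as a plain callback.

```python
from typing import Callable

# Hypothetical set of action names that this client treats as high-risk.
HIGH_RISK_ACTIONS = {"submit_payment", "change_system_setting"}


def vet_action(action_name: str, ask_user: Callable[[str], bool]) -> bool:
    """Return True if the action may run.

    Low-risk actions pass straight through. High-risk actions run only when
    `ask_user`, a callback that prompts a human, returns True.
    """
    if action_name not in HIGH_RISK_ACTIONS:
        return True
    return ask_user(f"Allow the agent to perform '{action_name}'?")


def always_decline(prompt: str) -> bool:
    """Stand-in for a user who refuses every request."""
    return False


print(vet_action("scroll_document", always_decline))
print(vet_action("submit_payment", always_decline))
```

In a real client the check would run before every step of the loop, so a declined action never reaches the browser.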

Two realities to keep in mind. First, browser-based agents make it easy to automate useful workflows such as filling forms, scraping data from sites that lack APIs, and end-to-end UI testing. Second, the same capability can be abused to automate fraud, data exfiltration or large-scale scraping unless operators lock scopes, throttle actions and add robust monitoring. Google provides tools; developers must enforce them.
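Locking scopes, throttling and monitoring can all live in a small guard that sits between the model and the browser. The sketch below is one possible shape, assuming a host allowlist and a minimum interval between actions; `ActionGuard` and its parameters are hypothetical names for this example, and the clock is injectable so the behavior can be checked without waiting.

```python
import time
from urllib.parse import urlparse


class ActionGuard:
    """Allow an action only on in-scope hosts and at a limited rate."""

    def __init__(self, allowed_hosts, min_interval_s, clock=time.monotonic):
        self.allowed_hosts = set(allowed_hosts)
        self.min_interval_s = min_interval_s
        self.clock = clock
        self._last_allowed = None  # time of the most recent permitted action
        self.log = []  # (url, verdict) pairs for later monitoring

    def permit(self, url: str) -> bool:
        host = urlparse(url).hostname or ""
        now = self.clock()
        if host not in self.allowed_hosts:
            verdict = "blocked: host out of scope"
        elif (
            self._last_allowed is not None
            and now - self._last_allowed < self.min_interval_s
        ):
            verdict = "blocked: throttled"
        else:
            verdict = "allowed"
            self._last_allowed = now
        self.log.append((url, verdict))
        return verdict == "allowed"


guard = ActionGuard({"app.example.test"}, min_interval_s=1.0)
print(guard.permit("https://app.example.test/form"))
print(guard.permit("https://other.example.test/"))
```

Every decision lands in `guard.log`, which gives operators a simple audit trail of what the agent tried to do and why it was stopped.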

Why developers should care (and what they’ll actually build)

For teams that build or test web apps, this is a practical tool. Google mentions internal use cases like UI testing (automatically finding and recovering from test breakage), automating repetitive data entry behind authenticated sessions, and prototyping personal assistants that can shop or schedule by interacting with web pages. For startups and automation shops, the promise is clear: replace brittle DOM scripts or ad-hoc RPA hacks with an LLM-driven “screen-aware” loop that understands visual context. Browserbase and other integration projects already show people pairing Gemini’s output with Playwright or similar runtimes to run the action loop.
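Pairing model output with Playwright mostly comes down to a dispatcher that turns each instruction into a mouse or keyboard call. The sketch below assumes hypothetical action names (`click_at`, `type_text`, `scroll`) and assumes the model reports coordinates on a 0-999 grid that the client scales to the viewport; both are assumptions of this example rather than confirmed details. The calls it makes (`page.mouse.click`, `page.keyboard.type`, `page.mouse.wheel`) match Playwright's Python `Page` object, but a recording fake is used here so the sketch runs without a browser.

```python
from types import SimpleNamespace


def to_pixels(value: int, size: int) -> int:
    """Scale a 0-999 grid coordinate to pixels (the grid is an assumption)."""
    return int(value / 1000 * size)


def dispatch(page, name: str, args: dict, width: int, height: int) -> None:
    """Translate one model instruction into a call on a Playwright-like page."""
    if name == "click_at":
        page.mouse.click(to_pixels(args["x"], width), to_pixels(args["y"], height))
    elif name == "type_text":
        page.keyboard.type(args["text"])
    elif name == "scroll":
        page.mouse.wheel(0, args["delta_y"])
    else:
        raise ValueError(f"unsupported action: {name}")


# Recording fake that mimics the parts of a Playwright page used above.
calls = []
fake_page = SimpleNamespace(
    mouse=SimpleNamespace(
        click=lambda x, y: calls.append(("click", x, y)),
        wheel=lambda dx, dy: calls.append(("wheel", dx, dy)),
    ),
    keyboard=SimpleNamespace(type=lambda text: calls.append(("type", text))),
)

dispatch(fake_page, "click_at", {"x": 500, "y": 250}, width=1440, height=900)
dispatch(fake_page, "type_text", {"text": "hello"}, width=1440, height=900)
print(calls)
```

Rejecting unknown action names outright, rather than ignoring them, makes it obvious when the model asks for something the client was never built to do.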

The practical limits today

Even with the hype, Gemini 2.5 Computer Use has boundaries. It’s a preview, optimized for browsers; Google says it’s not yet tuned for desktop OS control. The model’s action set is finite (13 actions at launch), and real-world pages are messy — dynamic elements, multi-frame UIs, rate limits, and anti-bot protections still complicate automation. In other words: it’s a big step, but not an all-powerful desktop puppet.

Final read: a cautious excitement

Gemini 2.5 Computer Use is an important milestone in the “agent” era. It reframes what “using a computer” means for an AI: not issuing API calls from a server, but seeing an interface and reasoning about next steps the way a person would. That unlocks concrete productivity gains (faster UI testing, better automation for apps without APIs, new kinds of personal assistants) while also raising classic AI governance questions about misuse, privacy and reliability.


