GadgetBond

AI · Google · Tech

Google unveils Gemini 2.5 Computer Use AI that can browse the web like a human

Google’s Gemini 2.5 Computer Use AI can navigate web pages, fill forms, and interact with browsers visually, offering developers new tools for automation and UI testing.

By Shubham Sawarkar, Editor-in-Chief
Oct 9, 2025, 7:12 AM EDT

Image: Google

On October 7, 2025, Google unveiled Gemini 2.5 Computer Use, a specialized version of its Gemini family that doesn’t just answer questions — it acts inside a browser. Send it a goal, give it a screenshot and a short action history, and it will reason about the interface, then click, type, scroll, drag and drop to try to finish the job. The pitch is simple: some tasks can’t be solved with an API or a backend call, so teach the model to use the same graphical interfaces humans do.

Gemini 2.5 Computer Use works in a loop. Your app sends the model a user request, a screenshot of the page, and a recent action history. The model replies with a function-like instruction — “click here,” “type that,” “drag this” — which your client executes in the browser. After the action runs, the client sends a new screenshot and URL back, and the model plans the next step. Google says the tool currently exposes 13 UI actions and is optimized for browsers (it shows promise on mobile), but it’s not built to control desktop OS features.
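The request/act/observe loop described above can be sketched roughly as follows. This is a minimal illustration, not Google's actual API: the function names, the `AgentState` type, and the action schema are all invented stand-ins for the model call and the client-side executor.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    goal: str          # the user's request
    screenshot: bytes  # current view of the page
    url: str
    history: list = field(default_factory=list)  # recent actions

def plan_next_action(state):
    """Stand-in for the model call: returns a function-like
    instruction (click/type/scroll/...) or None when done."""
    if not state.history:
        return {"action": "click", "x": 120, "y": 340}
    return None  # stub reports the goal satisfied after one step

def execute_in_browser(action):
    """Stand-in for the client-side executor (e.g. a Playwright
    runtime): performs the action, returns a fresh observation."""
    return b"<new screenshot>", "https://example.com/next"

def run_loop(state, max_steps=10):
    # Observe -> plan -> act, feeding each new screenshot back in.
    for _ in range(max_steps):
        action = plan_next_action(state)
        if action is None:
            break
        state.screenshot, state.url = execute_in_browser(action)
        state.history.append(action)
    return state
```

The key design point is that the model never touches the browser directly; it only proposes the next action, and the client decides whether and how to execute it.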

Google published a set of demos (sped up 3× in the videos) showing the model signing up users, reorganizing a sticky-note board, and even playing a game of 2048. The demos run in a hosted environment on Browserbase; developers can also access the preview through Google AI Studio and Vertex AI.

Google says Gemini 2.5 Computer Use “outperforms leading alternatives on multiple web and mobile control benchmarks,” showing a mix of high accuracy and low latency on evaluation suites such as Online-Mind2Web, WebVoyager and AndroidWorld. Browserbase — an independent testbed that runs models through browser automation harnesses — published its own evaluations and found Google’s preview model to be a strong performer. As always with vendor benchmarks, take the absolute numbers with a grain of salt; the trend, however, is clear: major labs are now prioritizing the speed and robustness of browser-action loops.

Where it fits in the contest for “agents”

This move sits squarely in a broader industry push to give AI agents the ability to complete multi-step, real-world tasks. OpenAI, at its latest Dev Day, leaned into apps and deeper ChatGPT integrations that let the assistant operate inside partner apps and services; Anthropic shipped “computer use” capabilities for Claude last year that can control a virtual keyboard and cursor in richer ways. Google’s angle is narrower and explicit: browser-native control, with an API and safety controls designed for that environment. That both narrows the attack surface and makes integration more practical for web-based automation tasks.

Safety, guardrails and the obvious worries

Google says it trained safety features into the model and provides developer controls — for example, a per-step safety service that vets each proposed action and system-level instructions that can require user confirmation before high-risk actions (purchases, system-level changes, anything that might compromise a device). The company explicitly calls out risks such as prompt injection, scams, and the possibility of misusing automation to bypass protections like CAPTCHAs. Those safeguards are important, but they’re not a silver bullet; building safe, reliable automations still requires careful systems design and testing.
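As a minimal sketch of that per-step vetting idea: the keyword-based risk rules below are invented for illustration and are far cruder than Google's actual safety service, but they show the shape of a gate that defers high-risk actions to a human.

```python
# Hypothetical risk policy: actions whose target mentions payments or
# destructive operations require explicit user confirmation first.
HIGH_RISK_KEYWORDS = ("purchase", "checkout", "payment", "delete")

def is_high_risk(action: dict) -> bool:
    target = action.get("target", "").lower()
    return any(k in target for k in HIGH_RISK_KEYWORDS)

def vet_action(action: dict, confirm) -> bool:
    """Return True if the action may run. `confirm` is a callback
    (e.g. a UI prompt) consulted only for high-risk actions."""
    if is_high_risk(action):
        return confirm(action)
    return True
```

In practice such a gate would sit between the model's proposed action and the browser executor, so nothing risky runs on the model's say-so alone.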

Two realities to keep in mind: first, browser-based agents make it easy to automate useful workflows such as filling forms, scraping data from sites that lack APIs, and end-to-end UI testing. Second, the same capability can be abused to automate fraud, data exfiltration, or large-scale scraping unless operators lock scopes, throttle actions, and add robust monitoring. Google provides the tools; developers must enforce them.

Why developers should care (and what they’ll actually build)

For teams that build or test web apps, this is a practical tool. Google cites internal use cases such as UI testing (automatically finding and recovering from test breakage), automating repetitive data entry behind authenticated sessions, and prototype personal assistants that shop or schedule by interacting with web pages. For startups and automation shops, the promise is clear: replace brittle DOM scripts or ad-hoc RPA hacks with an LLM-driven, screen-aware loop that understands visual context. Browserbase and other integration projects already show developers pairing Gemini's output with Playwright or similar runtimes to run the action loop.
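A client-side dispatcher for such a pairing might look like the sketch below. The action schema is hypothetical; the `page` argument can be a real Playwright `Page` (whose `mouse.click`, `keyboard.type`, and `mouse.wheel` methods this mirrors) or any stand-in exposing the same surface, which lets the sketch run without a browser.

```python
def dispatch(page, action: dict) -> None:
    """Map a model-proposed action dict onto Playwright-style
    mouse/keyboard calls on `page`."""
    kind = action["action"]
    if kind == "click":
        page.mouse.click(action["x"], action["y"])  # left-click at coordinates
    elif kind == "type":
        page.keyboard.type(action["text"])          # type literal text
    elif kind == "scroll":
        page.mouse.wheel(0, action["dy"])           # vertical scroll
    else:
        raise ValueError(f"unsupported action: {kind!r}")
```

Keeping the dispatcher this thin makes it easy to insert logging, throttling, or a safety check between the model's proposal and the actual browser call.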

The practical limits today

Even with the hype, Gemini 2.5 Computer Use has boundaries. It’s a preview, optimized for browsers; Google says it’s not yet tuned for desktop OS control. The model’s action set is finite (13 actions at launch), and real-world pages are messy — dynamic elements, multi-frame UIs, rate limits, and anti-bot protections still complicate automation. In other words: it’s a big step, but not an all-powerful desktop puppet.

Final read: a cautious excitement

Gemini 2.5 Computer Use is an important milestone in the “agent” era. It reframes what “using a computer” means for an AI: not issuing API calls from a server, but seeing an interface and reasoning about next steps the way a person would. That unlocks a lot of real productivity gains — faster UI testing, better automation for apps without APIs, new kinds of personal assistants — while also raising classic AI governance questions about misuse, privacy and reliability.

