
GadgetBond


DeepMind’s Gemini Robotics-ER 1.6 pushes embodied AI into the real world

DeepMind’s Gemini Robotics‑ER 1.6 is a new “robot brain” built to help machines actually understand messy real‑world spaces instead of just lab‑perfect scenes.

By Shubham Sawarkar, Editor-in-Chief
Apr 14, 2026, 12:59 PM EDT
Gemini Robotics-ER 1.6 automotive diagnostic gauge system with labeled pressure gauges and components in a professional mechanic's garage
Image: Google

DeepMind is rolling out a new kind of robot brain today, and it’s aimed squarely at the messiness of the real world rather than a clean lab demo. Gemini Robotics-ER 1.6 is the latest “embodied reasoning” model from Google DeepMind, designed to help robots not just see their surroundings, but actually understand what they’re looking at and decide what to do next.

In practical terms, this is the model that sits on top of a robot’s cameras and sensors and acts like a high-level planner. It can parse multiple video feeds, figure out where things are, plan a sequence of actions, call tools like Google Search or other vision-language-action systems, and then tell the robot what to try next. DeepMind describes it as a “reasoning-first” model for the physical world, with a focus on three pillars: visual and spatial understanding, task planning, and knowing when a task has actually succeeded.

The headline upgrades over the previous Robotics-ER 1.5 and the general-purpose Gemini 3.0 Flash models are all about that spatial and physical reasoning layer. According to DeepMind’s internal benchmarks, 1.6 is noticeably better at things like pointing precisely to objects, counting items, and determining whether a job is finished based on what the cameras see. It’s also unlocking a new capability that sounds niche but is surprisingly important in industry: reading instruments like analog gauges and sight glasses with high accuracy, even when the camera view is imperfect.

A bar chart titled "Success rate (%)" comparing three models across four tasks. Gemini Robotics-ER 1.6 (dark blue) consistently outperforms Gemini 3.0 Flash (medium blue) and Gemini Robotics-ER 1.5 (light blue).
Image: Google

Pointing might sound trivial, but for robots it’s the foundation for almost everything else. If a model can’t reliably say “this is the blue cup” or “these are all the pliers,” any downstream motion planning will be shaky. Robotics-ER 1.6 uses points as intermediate reasoning steps: it can mark the location of objects, use those points to count, or identify “salient points” on an image to help with metric estimates like distances or proportions. DeepMind shows a simple, very human example: a cluttered tool bench with hammers, scissors, paintbrushes, pliers and garden tools. 1.6 manages to correctly count and point to each requested category, and—crucially—does not invent items that aren’t there, like a wheelbarrow or a specific drill brand that was mentioned in the prompt. Earlier models either miscounted or hallucinated objects.
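To make the "points as intermediate reasoning steps" idea concrete, here is a minimal sketch of how a client might consume that kind of output. It assumes the model returns a JSON list of labeled points with coordinates normalized to a 0-1000 range (the convention DeepMind's earlier ER models documented); the exact response shape for 1.6 may differ, and the sample response below is illustrative.

```python
import json
from collections import Counter

def count_by_label(model_output: str) -> Counter:
    """Tally detected objects per label from a point-style response."""
    points = json.loads(model_output)
    return Counter(p["label"] for p in points)

def to_pixels(point, width, height):
    """Convert a normalized [y, x] point (0-1000 range) to pixel coordinates."""
    y, x = point
    return (round(x / 1000 * width), round(y / 1000 * height))

# Hypothetical response for "point to every hammer and wheelbarrow":
# the absent wheelbarrow simply yields no entries, so its count is zero.
response = '[{"point": [400, 250], "label": "hammer"}, {"point": [620, 700], "label": "hammer"}]'
counts = count_by_label(response)
print(counts["hammer"])                  # 2
print(counts["wheelbarrow"])             # 0
print(to_pixels([400, 250], 1280, 960))  # (320, 384)
```

Counting by first pointing, then tallying the points, is what lets the model refuse to report objects it cannot locate: an absent category produces no points rather than a hallucinated count.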

Gemini Robotics-ER 1.6 correctly identifies the number of hammers (2), scissors (1), paintbrushes (1), and pliers (6), plus a collection of garden tools that can be interpreted as a single group or as multiple points. It does not point to requested items that are not present in the image (a wheelbarrow and a Ryobi drill). By comparison, Gemini Robotics-ER 1.5 fails to identify the correct number of hammers or paintbrushes, misses the scissors altogether, hallucinates a wheelbarrow, and lacks precision when pointing to the pliers. Gemini 3.0 Flash is close to Gemini Robotics-ER 1.6 but does not match it.
Image: Google

That ability not to hallucinate visually is a quiet but big deal. A lot of modern vision-language models will confidently label or count things that simply don’t exist in the image if you nudge them in that direction. For a chatbot, that’s annoying. For a robot working around humans or heavy equipment, that’s a safety risk. Gemini Robotics-ER 1.6 appears much stricter here: if the requested object is absent, it just doesn’t point.

The second major piece is “success detection” — essentially teaching a robot to know when it can stop. In real environments, tasks rarely play out exactly like a textbook example. Objects move, lighting changes, camera views are partially blocked, and the robot itself may be juggling multiple camera angles, like an overhead view plus a wrist‑mounted camera on its arm. With 1.6, DeepMind has pushed multi-view reasoning forward, so the model can fuse multiple camera streams over time and decide whether a task like “put the blue pen into the black pen holder” has actually been completed. That’s the difference between a robot endlessly fiddling with a pen it already placed correctly, and one that can confidently move on to the next step in a multi-stage plan.
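That "know when to stop" behavior is easiest to see as a control loop. The sketch below is a hypothetical harness, not DeepMind's actual planner: `execute` and `check_success` stand in for a robot controller and an ER-model success query over the current camera frames.

```python
def run_plan(steps, execute, check_success, max_attempts=3):
    """Advance through a multi-step plan, retrying each step only until the
    success detector confirms completion, then move on instead of fiddling."""
    for step in steps:
        for _ in range(max_attempts):
            execute(step)
            if check_success(step):
                break  # step confirmed done; proceed to the next one
        else:
            raise RuntimeError(f"step failed after {max_attempts} attempts: {step}")

# Toy harness: the pen lands in the holder on the second attempt.
state = {"tries": 0}
def execute(step): state["tries"] += 1
def check_success(step): return state["tries"] >= 2
run_plan(["place blue pen in holder"], execute, check_success)
print(state["tries"])  # 2
```

The point of the structure is the `break`: a reliable success signal is what stops the robot from re-executing a step it has already completed and lets long multi-stage plans terminate.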


Where things really start to look like a bridge to industrial deployments is the new instrument-reading capability. DeepMind developed this in close collaboration with Boston Dynamics, which has been using its Spot robot dog for industrial inspections—think factory floors, power plants, and construction sites where a human would otherwise walk around with a handheld camera, clipboard, or tablet. Spot can already do autonomous inspection runs and capture photos and data from all over a facility; the missing piece has been turning those images into reliable measurements without a human looking at every frame.

Gemini Robotics-ER 1.6 is meant to sit on top of that pipeline and interpret everything from circular pressure gauges to vertical level indicators and digital displays. Reading an analog gauge sounds simple—until you consider lens distortion, odd angles, small tick marks, labels with different units, and occasionally multiple needles that map to different decimal places. Sight glasses add another headache: you have to estimate fill levels from a camera perspective that might distort the perceived liquid boundary. DeepMind says 1.6 uses a combination of zooming, precise pointing and code execution to handle this, a technique they call “agentic vision” that first appeared with Gemini 3.

With agentic vision enabled, the model can autonomously crop into a gauge, zoom in to read fine details, then use simple code to estimate proportions and intervals, essentially turning the image into a more structured measurement problem. DeepMind’s internal numbers show a jump in instrument-reading success from 23% for Robotics-ER 1.5 to 86% for 1.6, and up to 93% when agentic vision is switched on. That’s the kind of accuracy that starts to be genuinely useful for routine industrial inspection, especially when combined with Spot’s growing role as a standard inspection platform in sectors like energy, manufacturing, and mining.
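The "simple code to estimate proportions" step can be pictured as plain interpolation: once the crop-and-zoom pass has extracted the needle angle and the angles of the min/max tick marks, the reading is a linear map between them. The calibration numbers below are illustrative, not from any specific gauge.

```python
def gauge_value(needle_deg, min_deg, max_deg, min_val, max_val):
    """Linearly interpolate a dial reading from the needle angle, with all
    angles measured clockwise from the same reference as the tick marks."""
    frac = (needle_deg - min_deg) / (max_deg - min_deg)
    frac = min(max(frac, 0.0), 1.0)  # clamp to the dial's physical range
    return min_val + frac * (max_val - min_val)

# A 0-10 bar gauge whose scale sweeps from -135° to +135°:
print(gauge_value(0, -135, 135, 0, 10))     # 5.0  (needle straight up)
print(gauge_value(-135, -135, 135, 0, 10))  # 0.0  (needle on the min tick)
print(gauge_value(200, -135, 135, 0, 10))   # 10.0 (past max, clamped)
```

The hard part in practice is everything before this arithmetic: undoing lens distortion and off-axis viewing angles so that the extracted angles are trustworthy, which is exactly where the model's cropping and precise pointing come in.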

A bar chart titled "Instrument Reading" showing success rates for four configurations, increasing from left to right: Gemini Robotics-ER 1.5 at 23%, Gemini 3.0 Flash at 67%, Gemini Robotics-ER 1.6 at 86%, and Gemini Robotics-ER 1.6 with agentic vision at 93% (the highest, shown with a striped pattern).
Image: Google

Boston Dynamics is clearly leaning into this. The company has spent the last few years positioning Spot as an “agile mobile sensor platform” that can roam high-risk or hard-to-reach areas, capture data, and feed it into monitoring systems like Orbit, its management layer for robot fleets and inspection routes. With something like Gemini Robotics-ER 1.6 reading gauges and spotting anomalies, you can imagine a near-future workflow where a human engineer spends far less time walking the plant and far more time responding to data-driven alerts and trends.

All of this power comes with an obvious question: how safe is it to hand more autonomy to AI-driven robots? DeepMind’s answer is that 1.6 is its “safest robotics model yet,” and they back that up with tests against the ASIMOV safety benchmark, which was designed specifically to probe how foundation models behave as robot brains in risky situations. Earlier work on Robotics-ER 1.5 already focused heavily on two things: refusing harmful plans (semantic safety), and respecting physical constraints like payload limits or “don’t handle liquids” instructions. With 1.6, those safety behaviors improve further, especially when it comes to physical constraint awareness via spatial outputs like pointing.

A bar chart titled "ASIMOV - Safety Instruction Following" comparing three models (Gemini Robotics-ER 1.5, Gemini 3.0 Flash, and Gemini Robotics-ER 1.6) across three categories: Text Accuracy, Point Accuracy, and BBox Accuracy. The y-axis represents "Violation rate (%)", where lower is better.
Image: Google

In practice, this means the model is better at responses like "don't pick up that object; it looks too heavy for this gripper" or "avoid interacting with that container; it appears to hold liquid," rather than blindly following a user instruction. DeepMind also evaluated 1.6 on text and video scenarios derived from real-world injury reports, and reports roughly a 6% improvement in text-based risk perception and a 10% improvement in video over a baseline Gemini 3.0 Flash setup. It's not a formal guarantee of safety, but the direction is clear: the models that power physical agents are being tuned specifically to spot trouble before it happens.

As with most modern AI launches, developers don’t have to wait long to play with this. Gemini Robotics-ER 1.6 is available starting today through the Gemini API and Google AI Studio, with a dedicated robotics overview and a Colab notebook that walks through configuration and prompting for embodied reasoning tasks. That makes it accessible not just to big robotics labs, but to smaller teams experimenting with robot arms, mobile bases, and custom hardware that need a smarter perception and planning layer on top.
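For teams curious what calling this through the Gemini API might involve, here is a sketch of a `generateContent`-style request body pairing one camera frame with a pointing query. The prompt wording, the 0-1000 coordinate convention, and the model identifier in the comment are assumptions for illustration; check the robotics overview and Colab for the actual names and formats.

```python
import base64

def build_pointing_request(image_bytes: bytes, query: str) -> dict:
    """Assemble a generateContent-style JSON body that pairs one camera
    frame with a pointing query (prompt wording is illustrative)."""
    return {
        "contents": [{
            "parts": [
                {"inline_data": {
                    "mime_type": "image/jpeg",
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
                {"text": f"Point to every {query} in the image. "
                         'Reply as JSON: [{"point": [y, x], "label": "..."}], '
                         "with coordinates normalized to 0-1000."},
            ],
        }],
    }

body = build_pointing_request(b"<jpeg bytes here>", "blue pen")
# POST this body to .../models/<robotics-er model>:generateContent
print(sorted(body["contents"][0]["parts"][0]["inline_data"]))  # ['data', 'mime_type']
```

The same request shape works from Google AI Studio or any HTTP client, which is part of why smaller teams can prototype a perception layer without a dedicated robotics stack.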

DeepMind also seems keen to make this a two-way street with the robotics community. If the model falls short on a particular specialized use case, the company is inviting partners to submit 10–50 labeled images that highlight specific failure modes, which can then be used to harden the model’s reasoning for future releases. It’s a fairly lightweight feedback loop, but in a space where edge cases are endless—every facility or warehouse looks different—that kind of targeted data could matter.

Zooming out, Gemini Robotics-ER 1.6 fits into a broader trend: turning large multimodal models into “generalist” robot brains that can transfer knowledge across embodiments, tools, and environments. The previous Robotics-ER 1.5 already demonstrated state-of-the-art performance on a wide range of embodied reasoning benchmarks and agentic capabilities like breaking down long-horizon tasks and orchestrating tool use. The 1.6 upgrade isn’t about splashy new tricks so much as tightening the screws on the pieces that matter in the field: precise spatial reasoning, multi‑view understanding, instrument reading, and safety.

If you’re in robotics or industrial automation, the significance is straightforward: we’re inching closer to robots that can not only fetch and carry, but also patrol, inspect, and make first-line judgments about the health and safety of complex facilities without constant human supervision. For everyone else, you might not notice Gemini Robotics-ER 1.6 directly—but the next time a robot dog is quietly walking a refinery at night, reading gauges and listening for anomalies so a human doesn’t have to, there’s a good chance something like this model is doing the thinking behind the scenes.

