
GadgetBond


DeepMind’s Gemini Robotics-ER 1.6 pushes embodied AI into the real world

DeepMind’s Gemini Robotics‑ER 1.6 is a new “robot brain” built to help machines actually understand messy real‑world spaces instead of just lab‑perfect scenes.

By Shubham Sawarkar, Editor-in-Chief
Apr 14, 2026, 12:59 PM EDT
[Image: Gemini Robotics-ER 1.6 reading labeled pressure gauges on an automotive diagnostic setup in a mechanic's garage]
Image: Google

DeepMind is rolling out a new kind of robot brain today, and it’s aimed squarely at the messiness of the real world rather than a clean lab demo. Gemini Robotics-ER 1.6 is the latest “embodied reasoning” model from Google DeepMind, designed to help robots not just see their surroundings, but actually understand what they’re looking at and decide what to do next.

In practical terms, this is the model that sits on top of a robot’s cameras and sensors and acts like a high-level planner. It can parse multiple video feeds, figure out where things are, plan a sequence of actions, call tools like Google Search or other vision-language-action systems, and then tell the robot what to try next. DeepMind describes it as a “reasoning-first” model for the physical world, with a focus on three pillars: visual and spatial understanding, task planning, and knowing when a task has actually succeeded.

The headline upgrades over the previous Robotics-ER 1.5 and the general-purpose Gemini 3.0 Flash models are all about that spatial and physical reasoning layer. According to DeepMind’s internal benchmarks, 1.6 is noticeably better at things like pointing precisely to objects, counting items, and determining whether a job is finished based on what the cameras see. It’s also unlocking a new capability that sounds niche but is surprisingly important in industry: reading instruments like analog gauges and sight glasses with high accuracy, even when the camera view is imperfect.

[Chart: Success rate (%) across four tasks; Gemini Robotics-ER 1.6 consistently outperforms Gemini 3.0 Flash and Gemini Robotics-ER 1.5]
Image: Google

Pointing might sound trivial, but for robots it’s the foundation for almost everything else. If a model can’t reliably say “this is the blue cup” or “these are all the pliers,” any downstream motion planning will be shaky. Robotics-ER 1.6 uses points as intermediate reasoning steps: it can mark the location of objects, use those points to count, or identify “salient points” on an image to help with metric estimates like distances or proportions. DeepMind shows a simple, very human example: a cluttered tool bench with hammers, scissors, paintbrushes, pliers and garden tools. 1.6 manages to correctly count and point to each requested category, and—crucially—does not invent items that aren’t there, like a wheelbarrow or a specific drill brand that was mentioned in the prompt. Earlier models either miscounted or hallucinated objects.
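DeepMind's published Robotics-ER examples return points as JSON with [y, x] coordinates normalized to a 0-1000 range, leaving it to the client to map them back onto the actual frame. Here is a minimal sketch of that downstream step; the response shape mirrors the documented Robotics-ER 1.5 format, and whether 1.6 keeps it exactly is an assumption:

```python
import json
from collections import Counter

def parse_points(model_json: str, width: int, height: int):
    """Convert Robotics-ER style points (normalized [y, x] in 0-1000)
    into pixel coordinates, keeping each point's label."""
    pixels = []
    for p in json.loads(model_json):
        y, x = p["point"]  # note: y comes first in the documented format
        pixels.append({"label": p["label"],
                       "px": (round(x / 1000 * width), round(y / 1000 * height))})
    return pixels

def count_by_label(pixels):
    """Counting falls out of pointing: one point per found instance."""
    return Counter(p["label"] for p in pixels)

# Hypothetical model reply for a tool-bench scene like the one above
reply = ('[{"point": [500, 250], "label": "hammer"},'
         ' {"point": [510, 300], "label": "hammer"},'
         ' {"point": [400, 700], "label": "scissors"}]')
pts = parse_points(reply, width=1280, height=720)
print(count_by_label(pts))  # Counter({'hammer': 2, 'scissors': 1})
```

Because an absent object simply produces no point, "don't hallucinate" shows up downstream as a count of zero rather than a fabricated location.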

[Image: Gemini Robotics-ER 1.6 correctly counts and points to the hammers (2), scissors (1), paintbrushes (1), and pliers (6), treats the garden tools as either one group or multiple points, and does not point to the absent wheelbarrow or Ryobi drill. Gemini Robotics-ER 1.5 miscounts the hammers and paintbrushes, misses the scissors, hallucinates a wheelbarrow, and points imprecisely at the pliers; Gemini 3.0 Flash comes close to 1.6 but still falls short]
Image: Google

That ability not to hallucinate visually is a quiet but big deal. A lot of modern vision-language models will confidently label or count things that simply don’t exist in the image if you nudge them in that direction. For a chatbot, that’s annoying. For a robot working around humans or heavy equipment, that’s a safety risk. Gemini Robotics-ER 1.6 appears much stricter here: if the requested object is absent, it just doesn’t point.

The second major piece is “success detection” — essentially teaching a robot to know when it can stop. In real environments, tasks rarely play out exactly like a textbook example. Objects move, lighting changes, camera views are partially blocked, and the robot itself may be juggling multiple camera angles, like an overhead view plus a wrist‑mounted camera on its arm. With 1.6, DeepMind has pushed multi-view reasoning forward, so the model can fuse multiple camera streams over time and decide whether a task like “put the blue pen into the black pen holder” has actually been completed. That’s the difference between a robot endlessly fiddling with a pen it already placed correctly, and one that can confidently move on to the next step in a multi-stage plan.


Where things really start to look like a bridge to industrial deployments is the new instrument-reading capability. DeepMind developed this in close collaboration with Boston Dynamics, which has been using its Spot robot dog for industrial inspections—think factory floors, power plants, and construction sites where a human would otherwise walk around with a handheld camera, clipboard, or tablet. Spot can already do autonomous inspection runs and capture photos and data from all over a facility; the missing piece has been turning those images into reliable measurements without a human looking at every frame.

Gemini Robotics-ER 1.6 is meant to sit on top of that pipeline and interpret everything from circular pressure gauges to vertical level indicators and digital displays. Reading an analog gauge sounds simple—until you consider lens distortion, odd angles, small tick marks, labels with different units, and occasionally multiple needles that map to different decimal places. Sight glasses add another headache: you have to estimate fill levels from a camera perspective that might distort the perceived liquid boundary. DeepMind says 1.6 uses a combination of zooming, precise pointing and code execution to handle this, a technique they call “agentic vision” that first appeared with Gemini 3.

With agentic vision enabled, the model can autonomously crop into a gauge, zoom in to read fine details, then use simple code to estimate proportions and intervals, essentially turning the image into a more structured measurement problem. DeepMind’s internal numbers show a jump in instrument-reading success from 23% for Robotics-ER 1.5 to 86% for 1.6, and up to 93% when agentic vision is switched on. That’s the kind of accuracy that starts to be genuinely useful for routine industrial inspection, especially when combined with Spot’s growing role as a standard inspection platform in sectors like energy, manufacturing, and mining.
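That "simple code" step is mostly geometry: once the model has located the needle tip and the scale's endpoint ticks, a dial reading is a linear interpolation over the gauge's angular sweep. A hypothetical sketch of what the executed code might compute (the angles and scale values here are illustrative, not from DeepMind's examples):

```python
def gauge_value(needle_deg: float, min_deg: float, max_deg: float,
                min_val: float, max_val: float) -> float:
    """Linearly interpolate a dial reading from the needle angle.

    Angles are measured clockwise from the scale's minimum tick, so the
    fraction of the sweep the needle has covered maps directly onto the
    fraction of the value range.
    """
    frac = (needle_deg - min_deg) / (max_deg - min_deg)
    return min_val + frac * (max_val - min_val)

# A 0-100 psi gauge whose scale sweeps 270 degrees:
# a needle at 135 degrees sits exactly halfway, i.e. 50 psi.
print(gauge_value(135, 0, 270, 0, 100))  # 50.0
```

The hard part, of course, is everything before this function: undistorting the lens, finding the needle under glare, and reading the unit label, which is where the crop-and-zoom loop earns its keep.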

[Chart: Instrument-reading success rates: Gemini Robotics-ER 1.5 at 23%, Gemini 3.0 Flash at 67%, Gemini Robotics-ER 1.6 at 86%, and Gemini Robotics-ER 1.6 with agentic vision at 93%]
Image: Google

Boston Dynamics is clearly leaning into this. The company has spent the last few years positioning Spot as an “agile mobile sensor platform” that can roam high-risk or hard-to-reach areas, capture data, and feed it into monitoring systems like Orbit, its management layer for robot fleets and inspection routes. With something like Gemini Robotics-ER 1.6 reading gauges and spotting anomalies, you can imagine a near-future workflow where a human engineer spends far less time walking the plant and far more time responding to data-driven alerts and trends.

All of this power comes with an obvious question: how safe is it to hand more autonomy to AI-driven robots? DeepMind's answer is that 1.6 is its "safest robotics model yet," and it backs that up with tests against the ASIMOV safety benchmark, which was designed specifically to probe how foundation models behave as robot brains in risky situations. Earlier work on Robotics-ER 1.5 already focused heavily on two things: refusing harmful plans (semantic safety) and respecting physical constraints like payload limits or "don't handle liquids" instructions. With 1.6, those safety behaviors improve further, especially physical-constraint awareness expressed through spatial outputs like pointing.

[Chart: ASIMOV safety instruction following; violation rates for Gemini Robotics-ER 1.5, Gemini 3.0 Flash, and Gemini Robotics-ER 1.6 across text, point, and bounding-box outputs]
Image: Google

In practice, this means the model is better at responses like "don't pick up that object, it looks too heavy for this gripper" or "avoid interacting with that container, it appears to hold liquid," rather than blindly following a user instruction. DeepMind also evaluated 1.6 on text and video scenarios derived from real-world injury reports, and reports roughly a 6% improvement in text-based risk perception and a 10% improvement in video over a baseline Gemini 3.0 Flash setup. That's not a formal guarantee of safety, but the direction is clear: the models that power physical agents are being tuned specifically to spot trouble before it happens.

As with most modern AI launches, developers don’t have to wait long to play with this. Gemini Robotics-ER 1.6 is available starting today through the Gemini API and Google AI Studio, with a dedicated robotics overview and a Colab notebook that walks through configuration and prompting for embodied reasoning tasks. That makes it accessible not just to big robotics labs, but to smaller teams experimenting with robot arms, mobile bases, and custom hardware that need a smarter perception and planning layer on top.
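For developers, a request to the model looks like any other Gemini generateContent call: one or more camera frames plus a text prompt asking for structured output. The sketch below only assembles a REST-style payload locally and sends nothing; the point-format prompt follows the convention in DeepMind's Robotics-ER docs, and the exact wording is an assumption:

```python
import base64
import json

def build_request(prompt: str, image_bytes: bytes) -> dict:
    """Assemble a Gemini-style generateContent payload pairing one
    camera frame (inline base64 JPEG) with an embodied-reasoning prompt."""
    return {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {
                    "mime_type": "image/jpeg",
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
            ],
        }],
    }

payload = build_request(
    "Point to every pair of pliers. Reply as JSON: "
    '[{"point": [y, x], "label": <name>}] with coordinates in 0-1000.',
    image_bytes=b"\xff\xd8placeholder",  # a real JPEG frame in practice
)
print(json.dumps(payload)[:80])
```

From there it's the usual loop: send the payload to the model endpoint, parse the returned points, and hand them to the motion planner or inspection log.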

DeepMind also seems keen to make this a two-way street with the robotics community. If the model falls short on a particular specialized use case, the company is inviting partners to submit 10–50 labeled images that highlight specific failure modes, which can then be used to harden the model’s reasoning for future releases. It’s a fairly lightweight feedback loop, but in a space where edge cases are endless—every facility or warehouse looks different—that kind of targeted data could matter.

Zooming out, Gemini Robotics-ER 1.6 fits into a broader trend: turning large multimodal models into “generalist” robot brains that can transfer knowledge across embodiments, tools, and environments. The previous Robotics-ER 1.5 already demonstrated state-of-the-art performance on a wide range of embodied reasoning benchmarks and agentic capabilities like breaking down long-horizon tasks and orchestrating tool use. The 1.6 upgrade isn’t about splashy new tricks so much as tightening the screws on the pieces that matter in the field: precise spatial reasoning, multi‑view understanding, instrument reading, and safety.

If you’re in robotics or industrial automation, the significance is straightforward: we’re inching closer to robots that can not only fetch and carry, but also patrol, inspect, and make first-line judgments about the health and safety of complex facilities without constant human supervision. For everyone else, you might not notice Gemini Robotics-ER 1.6 directly—but the next time a robot dog is quietly walking a refinery at night, reading gauges and listening for anomalies so a human doesn’t have to, there’s a good chance something like this model is doing the thinking behind the scenes.


Topics: Gemini AI (formerly Bard), Google DeepMind
Copyright © 2026 GadgetBond. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | Do Not Sell/Share My Personal Information.
