GadgetBond


Google DeepMind maps a new way to score AI systems on the road to AGI

Google DeepMind is done with vague AGI talk and is rolling out a cognitive scorecard that measures what today’s models can actually do across ten human‑style abilities.

By Shubham Sawarkar, Editor-in-Chief
Mar 18, 2026, 7:43 AM EDT
Minimal diagram showing ten labeled cognitive abilities arranged in a circle around the words “Cognitive Abilities,” including perception, generation, attention, learning, memory, reasoning, metacognition, executive functions, problem solving, and social cognition, each with a small blue icon.
Image: Google

Google DeepMind is trying to answer one of AI’s most uncomfortable questions: how do you actually tell if you’re getting closer to “real” general intelligence, and not just building a better autocomplete? Instead of chasing a single magic number or leaderboard, the team is now proposing a cognitive-style blueprint for measuring progress toward AGI – and they’re turning it into a public Kaggle challenge with serious prize money behind it.

At the heart of this move is a simple but quietly radical shift: treat AI systems less like black boxes that happen to get high scores on benchmarks, and more like students sitting for a very broad cognitive exam. DeepMind’s new paper, “Measuring Progress Toward AGI: A Cognitive Taxonomy,” leans heavily on decades of psychology and neuroscience research to carve human intelligence into 10 core abilities – things like perception, learning, memory, reasoning, metacognition, and social cognition. The claim is not that AGI is “10 checkboxes away,” but that if you want an honest read on how general an AI system really is, you need to see how it behaves across this whole landscape, not just in a couple of cherry‑picked tasks.

Some of the abilities in the framework are pretty familiar from today’s models. Perception is basically how well a system can take in information from the world – text, audio, images, video – and parse it into something usable. Generation is what LLMs are famous for: producing coherent text, code, speech, or actions in an environment. Attention, in this context, is less about transformer architectures and more about whether a system can selectively focus on what matters in a task instead of being distracted by noise.

The more interesting – and frankly more uncomfortable – parts of the taxonomy live in the higher‑order stuff. Learning is framed as the ability to pick up new skills or concepts from experience or instruction, not just regurgitate what was baked in during pretraining. Memory is about storing and retrieving information over time, especially long‑term, which recent work suggests is still a glaring weakness in many large models despite their apparent knowledge. Reasoning is explicitly defined as drawing valid conclusions via logic, including deductive, inductive, analogical and mathematical reasoning – and the paper makes a point that pattern‑matching shortcuts don’t count.

Then there are metacognition and executive functions – essentially “thinking about thinking” and the ability to plan, inhibit bad impulses, and switch strategies when needed. DeepMind’s taxonomy treats these as separate from raw problem solving, which is described as a composite ability that pulls together perception, learning, reasoning, and planning to actually crack domain‑specific challenges. Finally, social cognition covers everything from reading other agents’ intentions to cooperation, negotiation, and even persuasion or deception – which the authors explicitly flag as double‑edged in terms of safety risk.

A framework is only as good as the tests built on top of it, so DeepMind is pairing the theory with a concrete evaluation pipeline. The protocol they describe has three stages: first, build a broad suite of cognitive tasks for each of the 10 abilities; second, collect human baselines on those tasks from a demographically representative sample of adults; and third, map each AI system’s performance to the corresponding human distribution. In other words, the goal is not “GPT-X got 82% on benchmark Y,” but “this model is roughly median‑human on attention, above average on knowledge‑heavy reasoning, and far below human on metacognition and long‑term memory.”
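The third stage, placing a model's score inside the human distribution, can be sketched in a few lines of Python. This is a minimal illustration and not DeepMind's actual pipeline: the function names `human_percentile` and `cognitive_profile` are hypothetical, the mid-rank percentile is just one reasonable way to do the mapping, and every number below is made up for demonstration rather than taken from the paper.

```python
from bisect import bisect_left, bisect_right


def human_percentile(model_score, human_scores):
    """Percentile rank of a model score within a human baseline sample.

    Uses the mid-rank convention: humans who tie with the model
    count as half below it. (Assumed convention, not from the paper.)
    """
    if not human_scores:
        raise ValueError("human_scores must not be empty")
    ranked = sorted(human_scores)
    below = bisect_left(ranked, model_score)
    ties = bisect_right(ranked, model_score) - below
    return 100.0 * (below + 0.5 * ties) / len(ranked)


def cognitive_profile(model_scores, human_baselines):
    """Map each ability's model score to a human percentile."""
    return {
        ability: human_percentile(score, human_baselines[ability])
        for ability, score in model_scores.items()
    }


# Hypothetical task scores (fraction correct); NOT real results.
human_baselines = {
    "attention": [0.55, 0.60, 0.70, 0.80, 0.85],
    "metacognition": [0.50, 0.60, 0.65, 0.75, 0.90],
}
model_scores = {"attention": 0.70, "metacognition": 0.40}

profile = cognitive_profile(model_scores, human_baselines)
for ability, pct in profile.items():
    print(f"{ability}: {pct:.0f}th percentile of the human sample")
```

With these invented numbers the model lands at the human median on attention and below every human in the sample on metacognition, which is the kind of jagged per-ability profile the protocol is meant to surface instead of a single headline score.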

If this sounds a bit like psychometrics for machines, that’s intentional. The framework borrows from long‑standing human intelligence models, such as the Cattell-Horn-Carroll theory that splits cognition into broad factors and narrower skills. Other AGI-evaluation efforts are already heading in this direction as well; for example, recent “AGI scoring” proposals talk about decomposing general intelligence into multiple cognitive axes and then aggregating them, sometimes even assigning notional “AGI percentages” to today’s frontier systems, while emphasizing how jagged and uneven those cognitive profiles still are. Across surveys and safety reports, there is a growing consensus that classic narrow benchmarks miss huge parts of the picture and that cognitively grounded batteries are a more honest way to track where these systems are actually strong or brittle.

What makes DeepMind’s announcement more than just another paper is the decision to crowdsource a big chunk of the actual tests via Kaggle. The “Measuring progress toward AGI: Cognitive abilities” competition asks participants to design benchmarks that isolate five of the hardest‑to‑measure abilities: learning, metacognition, attention, executive functions and social cognition. Kaggle’s new Community Benchmarks platform will then host these evaluations, run them against a line‑up of leading models, and keep them alive as reusable public tests rather than one‑off leaderboard stunts.

There is real money on the table. DeepMind and Kaggle are putting up a $200,000 prize pool, with $10,000 each for the top two submissions in each of the five tracks and four grand prizes of $25,000 for the strongest overall entries. Submissions run from March 17 through April 16, with results expected June 1, giving researchers and hackers just under a month to turn cognitive theory into concrete, model‑breaking tasks. Kaggle is positioning this as a way to move beyond “does the model remember this fact?” toward questions like “can it actually reason, adapt, and self‑monitor under pressure?”
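As a quick sanity check, the prize breakdown and submission window can be tallied in Python. Two assumptions are baked in: the $10,000 is read as a per-submission award, and the year 2026 is inferred from this article's dateline rather than stated in the competition details quoted above.

```python
from datetime import date

# Figures quoted in the article.
TRACKS = 5
WINNERS_PER_TRACK = 2
TRACK_PRIZE = 10_000   # assumed to be per winning submission
GRAND_PRIZES = 4
GRAND_PRIZE = 25_000

track_total = TRACKS * WINNERS_PER_TRACK * TRACK_PRIZE
grand_total = GRAND_PRIZES * GRAND_PRIZE
prize_pool = track_total + grand_total

# Year 2026 is an assumption taken from the article's dateline.
window_days = (date(2026, 4, 16) - date(2026, 3, 17)).days

print(f"Track prizes: ${track_total:,}")
print(f"Grand prizes: ${grand_total:,}")
print(f"Total pool:   ${prize_pool:,}")
print(f"Submission window: {window_days} days")
```

Under that reading, the track prizes and the grand prizes each account for half of the pool, and the two halves add up to the advertised $200,000.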

Zooming out, this push slots into a larger trend: the slow pivot from pure benchmark‑chasing to more robust, cognitively informed evaluations of AI systems. Benchmarks like ARC-AGI, for instance, already try to probe fluid intelligence – the ability to solve completely novel puzzles from a handful of examples – and have become de facto progress meters for abstract reasoning because they resist brute‑force memorization and simple scaling tricks. The emerging picture from those efforts is that modern models can score impressively on many static tests while still faltering badly when pushed into tasks that demand on‑the‑fly rule discovery, long‑horizon planning, or human‑like common sense.

DeepMind’s cognitive framework does not magically settle the AGI definition debate, and the authors are fairly explicit that general intelligence is continuous and multidimensional rather than a single on/off threshold. But it does give labs, regulators, and the broader research community a shared vocabulary: instead of arguing in the abstract about whether “AGI is near,” they can start talking about concrete trajectories – how quickly systems are closing the gap on metacognition, how far below human they remain on social cognition, and which abilities plateau even as models scale.

For everyday users and policymakers, the implications are straightforward but important. If general‑purpose AI continues to drive real‑world decisions in science, healthcare, finance, and public services, then knowing what kind of intelligence these systems actually have – and what kinds they clearly do not – becomes a safety and governance requirement, not an academic curiosity. DeepMind’s move essentially says: if the industry is going to talk seriously about AGI, it needs equally serious, cognitively grounded ways to measure progress and expose blind spots. DeepMind, for its part, is willing to open that measurement problem up to the wider community rather than solving it behind closed doors.


Copyright © 2026 GadgetBond. All Rights Reserved.