
GadgetBond


Google DeepMind maps a new way to score AI systems on the road to AGI

Google DeepMind is done with vague AGI talk and is rolling out a cognitive scorecard that measures what today’s models can actually do across ten human‑style abilities.

By Shubham Sawarkar, Editor-in-Chief
Mar 18, 2026, 7:43 AM EDT
Image: Google. Diagram of ten cognitive abilities arranged around the label “Cognitive Abilities”: perception, generation, attention, learning, memory, reasoning, metacognition, executive functions, problem solving, and social cognition.

Google DeepMind is trying to answer one of AI’s most uncomfortable questions: how do you actually tell if you’re getting closer to “real” general intelligence, and not just building a better autocomplete? Instead of chasing a single magic number or leaderboard, the team is now proposing a cognitive-style blueprint for measuring progress toward AGI – and they’re turning it into a public Kaggle challenge with serious prize money behind it.

At the heart of this move is a simple but quietly radical shift: treat AI systems less like black boxes that happen to get high scores on benchmarks, and more like students sitting for a very broad cognitive exam. DeepMind’s new paper, “Measuring Progress Toward AGI: A Cognitive Taxonomy,” leans heavily on decades of psychology and neuroscience research to carve human intelligence into 10 core abilities — things like perception, learning, memory, reasoning, metacognition, and social cognition. The claim is not that AGI is “10 checkboxes away,” but that if you want an honest read on how general an AI system really is, you need to see how it behaves across this whole landscape, not just in a couple of cherry‑picked tasks.
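As a rough sketch, the taxonomy can be written down as a simple lookup table. The ability names come from the paper’s figure; the one-line glosses below are paraphrased from this article’s descriptions, not DeepMind’s formal definitions:

```python
# The ten abilities in DeepMind's cognitive taxonomy, with informal
# glosses paraphrased from the article (not the paper's own wording).
COGNITIVE_ABILITIES = {
    "perception": "take in text, audio, images, or video and parse it into something usable",
    "generation": "produce coherent text, code, speech, or actions in an environment",
    "attention": "selectively focus on what matters in a task instead of noise",
    "learning": "pick up new skills or concepts from experience or instruction",
    "memory": "store and retrieve information over time, especially long-term",
    "reasoning": "draw valid conclusions via deductive, inductive, analogical, or mathematical logic",
    "metacognition": "think about one's own thinking and self-monitor",
    "executive functions": "plan, inhibit bad impulses, and switch strategies when needed",
    "problem solving": "combine perception, learning, reasoning, and planning on domain tasks",
    "social cognition": "read other agents' intentions; cooperate, negotiate, persuade",
}
```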

Some of the abilities in the framework are pretty familiar from today’s models. Perception is basically how well a system can take in information from the world – text, audio, images, video – and parse it into something usable. Generation is what LLMs are famous for: producing coherent text, code, speech, or actions in an environment. Attention, in this context, is less about transformer architectures and more about whether a system can selectively focus on what matters in a task instead of being distracted by noise.

The more interesting – and frankly more uncomfortable – parts of the taxonomy live in the higher‑order stuff. Learning is framed as the ability to pick up new skills or concepts from experience or instruction, not just regurgitate what was baked in during pretraining. Memory is about storing and retrieving information over time, especially long‑term, which recent work suggests is still a glaring weakness in many large models despite their apparent knowledge. Reasoning is explicitly defined as drawing valid conclusions via logic, including deductive, inductive, analogical, and mathematical reasoning – and the paper makes a point that pattern‑matching shortcuts don’t count.

Then there’s metacognition and executive function – essentially “thinking about thinking” and the ability to plan, inhibit bad impulses, and switch strategies when needed. DeepMind’s taxonomy treats these as separate from raw problem solving, which is described as a composite ability that pulls together perception, learning, reasoning, and planning to actually crack domain‑specific challenges. Finally, social cognition covers everything from reading other agents’ intentions to cooperation, negotiation, and even persuasion or deception – which the authors explicitly flag as double‑edged in terms of safety risk.

A framework is only as good as the tests built on top of it, so DeepMind is pairing the theory with a concrete evaluation pipeline. The protocol they describe has three stages: first, build a broad suite of cognitive tasks for each of the 10 abilities; second, collect human baselines on those tasks from a demographically representative sample of adults; and third, map each AI system’s performance to the corresponding human distribution. In other words, the goal is not “GPT-X got 82% on benchmark Y,” but “this model is roughly median‑human on attention, above average on knowledge‑heavy reasoning, and far below human on metacognition and long‑term memory.”
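The third stage, mapping a model’s score onto the human distribution, is essentially an empirical percentile lookup. A minimal sketch of the idea (function and variable names are my own, not from the paper):

```python
def human_percentile(model_score: float, human_scores: list[float]) -> float:
    """Percent of the human baseline sample scoring at or below the model."""
    at_or_below = sum(1 for s in human_scores if s <= model_score)
    return 100.0 * at_or_below / len(human_scores)

def describe(pct: float) -> str:
    """Informal verbal label for a human-relative percentile."""
    if pct < 25:
        return "far below human"
    if pct < 60:
        return "roughly median-human"
    return "above average"
```

Run per ability, this yields exactly the kind of profile the paper has in mind – e.g. “roughly median‑human on attention, above average on reasoning, far below human on metacognition” – rather than a single benchmark percentage.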

If this sounds a bit like psychometrics for machines, that’s intentional. The framework borrows from long‑standing human intelligence models, such as the Cattell-Horn-Carroll theory that splits cognition into broad factors and narrower skills. Other AGI-evaluation efforts are already heading in this direction as well; for example, recent “AGI scoring” proposals talk about decomposing general intelligence into multiple cognitive axes and then aggregating them, sometimes even assigning notional “AGI percentages” to today’s frontier systems, while emphasizing how jagged and uneven those cognitive profiles still are. Across surveys and safety reports, there is a growing consensus that classic narrow benchmarks miss huge parts of the picture and that cognitively grounded batteries are a more honest way to track where these systems are actually strong or brittle.
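Collapsing such a profile into a single notional “AGI percentage,” as some of those proposals do, can be as crude as averaging the axes – though the spread between the strongest and weakest axis is arguably the more informative number. A toy illustration with invented scores (not from any actual evaluation):

```python
def aggregate(profile: dict[str, float]) -> tuple[float, float]:
    """Mean of per-axis human-relative scores, plus best-worst spread."""
    vals = list(profile.values())
    return sum(vals) / len(vals), max(vals) - min(vals)

# A deliberately "jagged" made-up profile: strong on reasoning, weak on memory.
profile = {"reasoning": 90.0, "attention": 60.0, "metacognition": 20.0, "memory": 10.0}
overall, spread = aggregate(profile)  # overall 45.0, spread 80.0
```

The point of the second number: a system averaging 45% of human level with an 80-point spread is a very different object from one scoring a uniform 45% everywhere, which is exactly the jaggedness these surveys keep emphasizing.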

What makes DeepMind’s announcement more than just another paper is the decision to crowdsource a big chunk of the actual tests via Kaggle. The “Measuring progress toward AGI: Cognitive abilities” competition asks participants to design benchmarks that isolate five of the hardest‑to‑measure abilities: learning, metacognition, attention, executive functions and social cognition. Kaggle’s new Community Benchmarks platform will then host these evaluations, run them against a line‑up of leading models, and keep them alive as reusable public tests rather than one‑off leaderboard stunts.

There is real money on the table. DeepMind and Kaggle are putting up a $200,000 prize pool, with $10,000 for the top two submissions in each of the five tracks and four grand prizes of $25,000 for the strongest overall entries. Submissions run from March 17 through April 16, with results expected June 1, giving researchers and hackers just under a month to turn cognitive theory into concrete, model‑breaking tasks. Kaggle is positioning this as a way to move beyond “does the model remember this fact?” toward questions like “can it actually reason, adapt, and self‑monitor under pressure?”

Zooming out, this push slots into a larger trend: the slow pivot from pure benchmark‑chasing to more robust, cognitively informed evaluations of AI systems. Benchmarks like ARC-AGI, for instance, already try to probe fluid intelligence – the ability to solve completely novel puzzles from a handful of examples – and have become de facto progress meters for abstract reasoning because they resist brute‑force memorization and simple scaling tricks. The emerging picture from those efforts is that modern models can score impressively on many static tests while still faltering badly when pushed into tasks that demand on‑the‑fly rule discovery, long‑horizon planning, or human‑like common sense.

DeepMind’s cognitive framework does not magically settle the AGI definition debate, and the authors are fairly explicit that general intelligence is continuous and multidimensional rather than a single on/off threshold. But it does give labs, regulators, and the broader research community a shared vocabulary: instead of arguing in the abstract about whether “AGI is near,” they can start talking about concrete trajectories – how quickly are systems closing the gap on metacognition, how far below human they remain on social cognition, and which abilities plateau even as models scale.

For everyday users and policymakers, the implications are straightforward but important. If general‑purpose AI continues to drive real‑world decisions in science, healthcare, finance, and public services, then knowing what kind of intelligence these systems actually have – and what kinds they clearly do not – becomes a safety and governance requirement, not an academic curiosity. DeepMind’s move essentially says: if the industry is going to talk seriously about AGI, it needs equally serious, cognitively grounded ways to measure progress and expose blind spots, and it is willing to open that measurement problem up to the wider community rather than solving it behind closed doors.



Topic: Google DeepMind
Disclosure: We love the products we feature and hope you’ll love them too. If you purchase through a link on our site, we may receive compensation at no additional cost to you. Read our ethics statement. Please note that pricing and availability are subject to change.

Copyright © 2026 GadgetBond. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | Do Not Sell/Share My Personal Information.