GadgetBond


Google DeepMind maps a new way to score AI systems on the road to AGI

Google DeepMind is done with vague AGI talk and is rolling out a cognitive scorecard that measures what today’s models can actually do across ten human‑style abilities.

By Shubham Sawarkar, Editor-in-Chief
Mar 18, 2026, 7:43 AM EDT
Minimal diagram showing ten labeled cognitive abilities arranged in a circle around the words “Cognitive Abilities,” including perception, generation, attention, learning, memory, reasoning, metacognition, executive functions, problem solving, and social cognition, each with a small blue icon.
Image: Google

Google DeepMind is trying to answer one of AI’s most uncomfortable questions: how do you actually tell if you’re getting closer to “real” general intelligence, and not just building a better autocomplete? Instead of chasing a single magic number or leaderboard, the team is now proposing a cognitive-style blueprint for measuring progress toward AGI – and they’re turning it into a public Kaggle challenge with serious prize money behind it.

At the heart of this move is a simple but quietly radical shift: treat AI systems less like black boxes that happen to get high scores on benchmarks, and more like students sitting for a very broad cognitive exam. DeepMind’s new paper, “Measuring Progress Toward AGI: A Cognitive Taxonomy,” leans heavily on decades of psychology and neuroscience research to carve human intelligence into 10 core abilities – things like perception, learning, memory, reasoning, metacognition, and social cognition. The claim is not that AGI is “10 checkboxes away,” but that if you want an honest read on how general an AI system really is, you need to see how it behaves across this whole landscape, not just in a couple of cherry‑picked tasks.

Some of the abilities in the framework are pretty familiar from today’s models. Perception is basically how well a system can take in information from the world – text, audio, images, video – and parse it into something usable. Generation is what LLMs are famous for: producing coherent text, code, speech, or actions in an environment. Attention, in this context, is less about transformer architectures and more about whether a system can selectively focus on what matters in a task instead of being distracted by noise.

The more interesting – and frankly more uncomfortable – parts of the taxonomy live in the higher‑order stuff. Learning is framed as the ability to pick up new skills or concepts from experience or instruction, not just regurgitate what was baked in during pretraining. Memory is about storing and retrieving information over time, especially long‑term, which recent work suggests is still a glaring weakness in many large models despite their apparent knowledge. Reasoning is explicitly defined as drawing valid conclusions via logic, including deductive, inductive, analogical and mathematical reasoning – and the paper makes a point that pattern‑matching shortcuts don’t count.

Then there are metacognition and executive functions: essentially “thinking about thinking” on the one hand, and the ability to plan, inhibit bad impulses, and switch strategies when needed on the other. DeepMind’s taxonomy treats these as separate from raw problem solving, which is described as a composite ability that pulls together perception, learning, reasoning, and planning to actually crack domain‑specific challenges. Finally, social cognition covers everything from reading other agents’ intentions to cooperation, negotiation, and even persuasion or deception – which the authors explicitly flag as double‑edged in terms of safety risk.

A framework is only as good as the tests built on top of it, so DeepMind is pairing the theory with a concrete evaluation pipeline. The protocol they describe has three stages: first, build a broad suite of cognitive tasks for each of the 10 abilities; second, collect human baselines on those tasks from a demographically representative sample of adults; and third, map each AI system’s performance to the corresponding human distribution. In other words, the goal is not “GPT-X got 82% on benchmark Y,” but “this model is roughly median‑human on attention, above average on knowledge‑heavy reasoning, and far below human on metacognition and long‑term memory.”
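
The third stage, placing a model's score inside the human distribution, can be sketched in a few lines of Python. This is a minimal illustration rather than DeepMind's actual pipeline: the function names, the tie-handling rule, and every number below are assumptions made up for the example.

```python
from bisect import bisect_left, bisect_right


def human_percentile(model_score: float, human_scores: list[float]) -> float:
    """Percentile rank of a model score within a sample of human scores.

    Ties count as half, a common convention for percentile ranks.
    """
    ordered = sorted(human_scores)
    below = bisect_left(ordered, model_score)                 # humans scoring strictly lower
    ties = bisect_right(ordered, model_score) - below         # humans with the same score
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def cognitive_profile(
    model_scores: dict[str, float],
    human_baselines: dict[str, list[float]],
) -> dict[str, float]:
    """Map each ability's raw model score to a human percentile."""
    return {
        ability: human_percentile(score, human_baselines[ability])
        for ability, score in model_scores.items()
    }


# Hypothetical task scores (fraction correct); not real data.
human_baselines = {
    "attention": [0.55, 0.60, 0.65, 0.70, 0.75],
    "metacognition": [0.50, 0.60, 0.70, 0.80, 0.90],
}
model_scores = {"attention": 0.65, "metacognition": 0.40}

profile = cognitive_profile(model_scores, human_baselines)
for ability, pct in profile.items():
    print(f"{ability}: {pct:.0f}th percentile of the human sample")
```

With these invented numbers the model lands at the human median on attention and below every sampled human on metacognition, which is the kind of per-ability readout the protocol is after.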

If this sounds a bit like psychometrics for machines, that’s intentional. The framework borrows from long‑standing human intelligence models, such as the Cattell-Horn-Carroll theory that splits cognition into broad factors and narrower skills. Other AGI-evaluation efforts are already heading in this direction as well; for example, recent “AGI scoring” proposals talk about decomposing general intelligence into multiple cognitive axes and then aggregating them, sometimes even assigning notional “AGI percentages” to today’s frontier systems, while emphasizing how jagged and uneven those cognitive profiles still are. Across surveys and safety reports, there is a growing consensus that classic narrow benchmarks miss huge parts of the picture and that cognitively grounded batteries are a more honest way to track where these systems are actually strong or brittle.
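
To see why a single aggregated figure can hide a jagged profile, consider the small Python sketch below. The equal-weight average and the max-minus-min spread are assumptions chosen for illustration, not the aggregation rule of any particular proposal, and the percentile values are invented.

```python
from statistics import mean


def aggregate(profile: dict[str, float]) -> float:
    """Equal-weight average across cognitive axes (an assumed rule)."""
    return mean(profile.values())


def jaggedness(profile: dict[str, float]) -> float:
    """Gap between the strongest and weakest axis, in percentile points."""
    return max(profile.values()) - min(profile.values())


# Hypothetical human-percentile profiles for two imaginary systems.
even = {"reasoning": 60.0, "memory": 60.0, "metacognition": 60.0}
jagged = {"reasoning": 95.0, "memory": 70.0, "metacognition": 15.0}

for name, profile in (("even", even), ("jagged", jagged)):
    print(name, aggregate(profile), jaggedness(profile))
```

Both imaginary systems average out to the same number, yet one is uniformly mediocre and the other is strong on reasoning and very weak on metacognition. Reporting the full profile keeps that difference visible.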

What makes DeepMind’s announcement more than just another paper is the decision to crowdsource a big chunk of the actual tests via Kaggle. The “Measuring progress toward AGI: Cognitive abilities” competition asks participants to design benchmarks that isolate five of the hardest‑to‑measure abilities: learning, metacognition, attention, executive functions and social cognition. Kaggle’s new Community Benchmarks platform will then host these evaluations, run them against a line‑up of leading models, and keep them alive as reusable public tests rather than one‑off leaderboard stunts.

There is real money on the table. DeepMind and Kaggle are putting up a $200,000 prize pool, with $10,000 each for the top two submissions in each of the five tracks and four grand prizes of $25,000 for the strongest overall entries. Submissions run from March 17 through April 16, giving researchers and hackers about a month to turn cognitive theory into concrete, model‑breaking tasks; results are expected June 1. Kaggle is positioning this as a way to move beyond “does the model remember this fact?” toward questions like “can it actually reason, adapt, and self‑monitor under pressure?”
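
As a quick sanity check, the prize and schedule figures quoted above can be worked through in Python. Two assumptions are built in: each of the two winners per track receives the full $10,000 (the reading under which the stated total adds up), and the dates fall in 2026, the article's publication year.

```python
from datetime import date

TRACKS = 5
WINNERS_PER_TRACK = 2
TRACK_PRIZE = 10_000      # assumed to be paid to each track winner
GRAND_PRIZES = 4
GRAND_PRIZE = 25_000

track_total = TRACKS * WINNERS_PER_TRACK * TRACK_PRIZE
grand_total = GRAND_PRIZES * GRAND_PRIZE
pool = track_total + grand_total

# Year assumed from the article's dateline.
window_days = (date(2026, 4, 16) - date(2026, 3, 17)).days

print(f"Track prizes: ${track_total:,}")
print(f"Grand prizes: ${grand_total:,}")
print(f"Total pool:   ${pool:,}")
print(f"Submission window: {window_days} days")
```

Under those assumptions the track and grand prizes contribute $100,000 each, matching the $200,000 pool, and the submission window spans 30 days.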

Zooming out, this push slots into a larger trend: the slow pivot from pure benchmark‑chasing to more robust, cognitively informed evaluations of AI systems. Benchmarks like ARC-AGI, for instance, already try to probe fluid intelligence – the ability to solve completely novel puzzles from a handful of examples – and have become de facto progress meters for abstract reasoning because they resist brute‑force memorization and simple scaling tricks. The emerging picture from those efforts is that modern models can score impressively on many static tests while still faltering badly when pushed into tasks that demand on‑the‑fly rule discovery, long‑horizon planning, or human‑like common sense.

DeepMind’s cognitive framework does not magically settle the AGI definition debate, and the authors are fairly explicit that general intelligence is continuous and multidimensional rather than a single on/off threshold. But it does give labs, regulators, and the broader research community a shared vocabulary: instead of arguing in the abstract about whether “AGI is near,” they can start talking about concrete trajectories – how quickly systems are closing the gap on metacognition, how far below human they remain on social cognition, and which abilities plateau even as models scale.

For everyday users and policymakers, the implications are straightforward but important. If general‑purpose AI continues to drive real‑world decisions in science, healthcare, finance, and public services, then knowing what kind of intelligence these systems actually have – and what kinds they clearly do not – becomes a safety and governance requirement, not an academic curiosity. DeepMind’s move essentially says: if the industry is going to talk seriously about AGI, it needs equally serious, cognitively grounded ways to measure progress and expose blind spots. DeepMind, for its part, is willing to open that measurement problem up to the wider community rather than solving it behind closed doors.

