GadgetBond


OpenAI tackles AI language gaps with new India-focused IndQA benchmark

OpenAI's IndQA benchmark was built with 261 experts from across India.

By Shubham Sawarkar, Editor-in-Chief
Nov 4, 2025, 12:31 PM EST
A 3x4 grid of rounded square buttons, each containing a character from a different Indian script or the Latin alphabet. The characters include Bengali (অ), English (En), Hindi (ह), Kannada (Hi), and others representing various Indian languages, set against a light grey background. The image suggests multilingual support or language selection.
Image: OpenAI

In the global race to build smarter artificial intelligence, a critical question has emerged: Can an AI truly be “intelligent” if it only understands the world from one perspective?

OpenAI, the San Francisco-based research firm that catapulted generative AI into the mainstream with ChatGPT, has confronted this problem head-on. On Tuesday, the company announced the launch of IndQA, a new and highly detailed benchmark designed to evaluate how well AI systems grasp the vast, nuanced, and complex tapestry of Indian languages and culture.

This isn’t just another test of an AI’s ability to translate. It’s an attempt to measure something far more elusive: its understanding of context, history, and the everyday realities that matter to people where they live.

The initiative stems from a glaring gap in the world of AI development. As OpenAI points out, while 80% of the world’s population does not speak English as their primary language, the tools used to measure AI progress have been overwhelmingly Anglo-centric.

This has led to a significant problem. Popular multilingual benchmarks, like the widely used MMMLU (Multilingual Massive Multitask Language Understanding), are now “saturated.” In simple terms, the most powerful AI models are acing these tests, making them less and less useful for measuring real, meaningful progress.

More importantly, these existing tests often focus on multiple-choice questions or direct translations. They might be able to tell you the Hindi word for “computer,” but they can’t capture the cultural nuance of why a certain dish is central to a festival, the historical context of a local monument, or the subtle, code-switching humor of “Hinglish” spoken in a city.

That’s precisely the gap IndQA is built to fill.

“Today we are rolling out IndQA,” announced Srinivas Narayanan, OpenAI’s CTO for B2B Applications, at a media conference. “Built in collaboration with 261 experts across 12 languages, IndQA fills a key gap by enabling fair and rigorous evaluation that reflects India’s cultural and linguistic diversity.”

This is a benchmark built by humans, for AIs. The 261 domain experts, all native-level speakers from across India, were tasked with drafting difficult, reasoning-focused prompts tied directly to their regions and specialties.

The result is a massive evaluation system spanning 2,278 questions. These aren’t just in Hindi or English, but are natively written in 12 languages: Bengali, English, Hindi, Hinglish, Kannada, Marathi, Odia, Telugu, Gujarati, Malayalam, Punjabi, and Tamil.

The prompts cover 10 broad cultural domains:

  • Architecture & Design
  • Arts & Culture
  • Everyday Life
  • Food & Cuisine
  • History
  • Law & Ethics
  • Literature & Linguistics
  • Media & Entertainment
  • Religion & Spirituality
  • Sports & Recreation

So, how does it work? Instead of a simple “right” or “wrong” answer, IndQA uses a sophisticated “rubric-based approach.” For each culturally grounded prompt, the human expert also provides a detailed set of criteria for what a good answer looks like, along with an “ideal answer” that reflects expert expectations. This allows for a far more nuanced score than a simple pass/fail.
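To make the rubric idea concrete, here is a minimal Python sketch of how a rubric-based score could be computed. It is an illustration only, not OpenAI's implementation: the `Criterion` class, the `rubric_score` function, the point weights, and the sample criteria are all hypothetical, and the sketch assumes that some grader has already judged whether each criterion was met.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    description: str  # what a good answer should contain
    points: float     # hypothetical weight; the article does not describe weighting
    met: bool         # judgment supplied by a grader (assumed to be given)


def rubric_score(criteria: list[Criterion]) -> float:
    """Return the fraction of available rubric points that the answer earned."""
    total = sum(c.points for c in criteria)
    if total <= 0:
        raise ValueError("rubric must carry a positive number of points")
    earned = sum(c.points for c in criteria if c.met)
    return earned / total


# Hypothetical rubric for a Food & Cuisine prompt (illustrative, not from IndQA).
example = [
    Criterion("Names the dish correctly", 2.0, True),
    Criterion("Explains its role in the festival", 3.0, True),
    Criterion("Mentions regional variations", 1.0, False),
]
score = rubric_score(example)
print(f"{score:.2f}")  # prints 0.83
```

An answer that satisfies some criteria but not others earns partial credit, which is what separates this kind of scoring from a single pass/fail judgment.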

To ensure its robustness, OpenAI tested the benchmark against its most powerful models at the time of creation, including GPT-4o, GPT-4.5, and even the newly launched GPT-5.

Narayanan emphasized that this tool is designed to help all AI models, not just OpenAI’s, to “perform better in languages and contexts that are currently underrepresented in global datasets.”

With nearly a billion people who don’t speak English as their primary language and 22 official languages, India was described by the company as the “obvious starting point” for this global-first initiative. Company officials framed the work as part of an ongoing commitment to make AI technology more accessible and useful for a wide range of Indian users, from students and farmers to educators and developers.

Narayanan, speaking passionately about the potential, positioned India as a leader in this new era. “India can be a beacon of how AI can be used for social good,” he said, “including education, health and farming etc.”

However, the company was careful to add a few important caveats. Because the questions are unique and deeply tied to each specific language and culture, IndQA is not a “language leaderboard.” You cannot, for example, use its scores to definitively claim a model is “better” at Tamil than it is at Bengali.

Instead, its true value lies in measuring improvement over time within a single model family. It gives developers a clear, culturally rich target to aim for, pushing them beyond simple translation and toward genuine understanding.
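In practice, that caveat suggests comparing each language only against itself across model versions. The sketch below shows one way to do that. The function name `improvement_by_language` and every score in it are made up for illustration and are not IndQA results.

```python
def improvement_by_language(old: dict[str, float], new: dict[str, float]) -> dict[str, float]:
    """Per-language score change between two versions of one model family.

    Only languages present in both runs are compared, and no ranking
    across languages is produced, since question sets differ per language.
    """
    return {lang: round(new[lang] - old[lang], 4) for lang in old if lang in new}


# Hypothetical scores for two versions of the same model family.
older_version = {"Tamil": 0.35, "Bengali": 0.40}
newer_version = {"Tamil": 0.41, "Bengali": 0.42, "Odia": 0.30}

deltas = improvement_by_language(older_version, newer_version)
for lang, delta in deltas.items():
    print(f"{lang}: {delta:+.2f}")
```

Each delta is meaningful on its own, but a larger gain in one language than in another says nothing about which language the model handles better.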

Ultimately, the launch of IndQA signals a major shift in how AI capabilities are measured. As OpenAI continues to expand its global developer ecosystem, which Narayanan noted already includes 4-5 million people, the focus is clearly moving beyond English-centric tests. The true test of Artificial General Intelligence (AGI) won’t be its ability to pass an American high school exam, but its capacity to understand and respectfully engage with the countless cultures that make up humanity. And that road, it seems, runs directly through the rich, diverse, and complex linguistic landscapes of India.


