GadgetBond


OpenAI chooses Cerebras for ultra-fast AI inference

Cerebras’ wafer-scale chips give OpenAI a new way to run AI at real-time speed.

By Shubham Sawarkar, Editor-in-Chief
Jan 14, 2026, 10:00 PM EST
OpenAI and Cerebras logos displayed side by side, separated by a vertical line, on a blue-green gradient background.
Image: OpenAI

OpenAI’s partnership with Cerebras is essentially a bet that the future of AI will be real-time, always-on, and limited less by GPUs and more by electricity and cooling. The goal is to take the models people already use every day and make them feel as responsive as a live conversation or a local app, even at massive global scale.

At the heart of the deal is a huge number: 750 megawatts of ultra-low-latency AI compute that Cerebras will dedicate to running OpenAI’s models. That capacity will be rolled out in multiple phases through 2028, making this one of the largest high-speed AI inference deployments announced so far. Unlike a typical GPU cluster stitched together from thousands of cards, Cerebras builds its systems around a “wafer-scale engine”: a single, giant chip the size of an entire silicon wafer, with compute, memory, and bandwidth living side by side. By keeping everything on one enormous piece of silicon instead of hopping across a network of discrete accelerators, Cerebras cuts out many of the latency bottlenecks that slow traditional AI inference.

This is exactly the pain point OpenAI wants to address. Today, when a user asks a complicated question, generates code, or kicks off an AI agent, there is a multi-step dance behind the scenes: the request travels to a data center, the model runs across multiple machines, results are stitched together, and then streamed back. That process works, but it is not always instant, especially at peak demand or with dense workloads like code generation and long-form reasoning. OpenAI describes its overall compute strategy as building a “resilient portfolio” that matches different workloads to the hardware that makes the most sense for them, and Cerebras is being slotted in as a dedicated low-latency inference tier. In practical terms, that means certain classes of prompts – the ones where every millisecond matters for user experience – can be routed to this faster Cerebras-backed layer.
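To make the idea of matching workloads to hardware tiers concrete, here is a minimal sketch of a rule-based router. It is purely illustrative: the tier names, the `Request` fields, and the `LATENCY_SENSITIVE` set are all assumptions invented for this example, and nothing here reflects how OpenAI actually routes traffic.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    # Hypothetical tier names; stand-ins for the kinds of capacity described above.
    LOW_LATENCY = "low_latency"  # e.g. a dedicated fast-inference tier
    GENERAL = "general"          # e.g. general-purpose GPU capacity


@dataclass
class Request:
    workload: str      # hypothetical label such as "chat" or "batch_summarize"
    interactive: bool  # True if a user is waiting on the streamed output


# Assumed set of workloads where every millisecond matters for user experience.
LATENCY_SENSITIVE = {"chat", "code_completion", "voice", "agent_step"}


def route(request: Request) -> Tier:
    """Pick a tier with one simple rule: interactive and latency-sensitive goes fast."""
    if request.interactive and request.workload in LATENCY_SENSITIVE:
        return Tier.LOW_LATENCY
    return Tier.GENERAL


if __name__ == "__main__":
    print(route(Request("code_completion", interactive=True)))
    print(route(Request("batch_summarize", interactive=False)))
```

A real system would likely weigh far more than two fields (model size, current load, cost, region), but the core decision has the same shape: classify the request, then send it to the hardware that suits it.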

The companies have been circling each other for years. Cerebras has pitched itself as an alternative to GPU-bound AI infrastructure, claiming that its wafer-scale systems can run large language models at speeds up to an order of magnitude faster than conventional GPU setups for some workloads. Early benchmarks on Cerebras hardware, including models from the Llama family, show token generation rates that significantly outpace many GPU-based deployments, which is exactly the sort of improvement OpenAI needs for “always-on” assistants, live coding copilots, and real-time agents. For Cerebras, this deal is a validation moment: its CEO, Andrew Feldman, framed it as a decade-long journey culminating in a multi-year agreement that could push wafer-scale technology into the hands of hundreds of millions, and eventually billions, of users.

There is also a bigger context here: OpenAI is quietly building out a vast physical footprint to feed its models’ hunger for power and cooling. The Cerebras announcement lands alongside a separate partnership with SB Energy, backed by SoftBank, that involves a $1 billion investment to build and operate a 1.2 gigawatt AI data center campus in Milam County, Texas, powered by new solar and battery storage. A gigawatt is enough electricity to power roughly three-quarters of a million US homes at any given moment, which gives a sense of the scale of the facilities needed to run next-generation AI systems. When you start to pair that kind of renewable-heavy power infrastructure with 750 MW of specialized inference hardware, you begin to see how seriously OpenAI is treating AI as critical infrastructure rather than just cloud software.
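For a rough sense of those numbers, the article's own rule of thumb (one gigawatt for about 750,000 US homes) can be applied to the two figures mentioned. This is back-of-the-envelope arithmetic only: the ratio is the article's approximation, the function names are made up for this sketch, and the 750 MW comparison treats compute capacity as if it were household demand purely for scale.

```python
# Article's rule of thumb: 1 GW (1,000 MW) ~ 750,000 US homes at any given moment,
# which works out to 750 homes per megawatt.
HOMES_PER_MW = 750


def homes_equivalent(megawatts: int) -> int:
    """Number of homes that a given capacity would cover under the rule of thumb."""
    return megawatts * HOMES_PER_MW


def implied_kw_per_home() -> float:
    """Average draw per home implied by the rule of thumb (1 MW = 1,000 kW)."""
    return 1_000 / HOMES_PER_MW


if __name__ == "__main__":
    print(homes_equivalent(1_200))           # 1.2 GW Texas campus -> 900000
    print(homes_equivalent(750))             # 750 MW Cerebras capacity -> 562500
    print(round(implied_kw_per_home(), 2))   # -> 1.33 (kW per home)
```

Under that approximation, the 1.2 GW campus corresponds to about 900,000 homes and the 750 MW of inference capacity to about 562,500, with an implied average draw of roughly 1.33 kW per home.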

For users, many of these moves will show up in subtle ways before they’re obvious headline features. Interfaces that used to stutter or lag may start to feel “local” even when they are calling massive models over the network. Latency-sensitive use cases – think real-time customer support, multiplayer gaming assistants, trading and risk agents, or live language translation – stand to benefit the most from a dedicated low-latency tier. In that sense, Cerebras plays a similar role to the early broadband providers of the web era: most people will never see the wafer-scale chips themselves, but they will notice when AI stops feeling like a slow remote service and starts behaving like a native part of everything they do online.



Copyright © 2026 GadgetBond. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | Do Not Sell/Share My Personal Information.