Watching foreign-language content is a commitment. You’re either stuck with subtitles, forcing you to read the movie instead of watching it, or you opt for the dub, which often sounds like a monotone robot trying to express profound human emotion. It’s the classic “subtitle two-step,” and it’s always felt like a clunky workaround.
A new partnership, however, suggests a future where this choice is obsolete.
On November 11, 2025, Broadcom, the quiet giant whose chips power millions of set-top boxes and streaming devices, announced a collaboration with CAMB.AI, an AI startup specializing in localization. Their goal: to embed real-time audio translation and dubbing capabilities directly onto Broadcom’s system-on-chip (SoC) platforms.
This isn’t just another software update. This is a fundamental shift toward “on-device” AI, pulling one of the cloud’s most demanding jobs down from a distant server and placing it right inside your TV or cable box.
For years, any “smart” task—be it asking Alexa for the weather, translating a sentence on your phone, or streaming a 4K movie—has relied on the cloud. Your device captures the data (your voice, a click) and sends it to a massive data center hundreds of miles away. The server does the heavy lifting and sends the answer back.
This partnership aims to sever that digital tether.
The companies demonstrated text-to-speech functionality running on the neural processing unit (NPU)—the dedicated “AI brain”—integrated into Broadcom’s SoC platforms. CAMB.AI’s MARS voice model is now operational on the BCM7116 chipset, a piece of silicon destined for the next generation of home entertainment hardware.
According to the official announcement, this on-device approach is a trifecta of benefits:
- Ultra-low latency: The translation is nearly instantaneous. There’s no “round trip” to the cloud, eliminating that awkward lag.
- Enhanced privacy: This is a big one. The processing happens locally. Your conversations, or the audio from the content you’re watching, never leave your living room.
- Reduced costs: For providers, streaming data to and from the cloud costs money and bandwidth. On-device processing bypasses this entirely.
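The latency point in the list above is simple arithmetic: a cloud pipeline pays for the network round trip and server-side queueing on top of inference, while an on-device NPU pays only the local inference cost. Here is a rough sketch of that budget in Python—all the numbers are hypothetical assumptions for illustration, since neither Broadcom nor CAMB.AI has published latency figures:

```python
# Illustrative latency budget: cloud round trip vs. on-device inference.
# Every number below is a made-up example value, not a measured figure.

def cloud_latency_ms(network_rtt_ms: float,
                     server_inference_ms: float,
                     queueing_ms: float = 0.0) -> float:
    """Audio travels to a data center, waits its turn, is processed, and returns."""
    return network_rtt_ms + queueing_ms + server_inference_ms

def on_device_latency_ms(npu_inference_ms: float) -> float:
    """On-device processing pays only the local NPU inference cost."""
    return npu_inference_ms

if __name__ == "__main__":
    cloud = cloud_latency_ms(network_rtt_ms=80, server_inference_ms=120, queueing_ms=30)
    local = on_device_latency_ms(npu_inference_ms=150)
    print(f"Cloud round trip: {cloud:.0f} ms")
    print(f"On-device:        {local:.0f} ms")
```

Even with these generous example numbers, the on-device path wins—and unlike the cloud path, its latency doesn’t degrade when your Wi-Fi does.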
This collaboration is squarely aimed at Broadcom’s bread and butter: the broadband and home entertainment segments. You may not recognize the Broadcom name, but you definitely know its customers—think Comcast, Sky, Charter, and the dozens of other providers that supply the gateways and boxes that connect us to the internet.
“This collaboration showcases the power of bringing together CAMB.AI’s expertise in real-time multilingual AI and Broadcom’s advanced SoC technology,” said Rich Nelson, senior vice president and general manager of Broadcom’s Broadband Video Group.
This isn’t your standard Google Translate. CAMB.AI has been making waves for a different reason. Their MARS model doesn’t just swap words; it clones the voice and emotion of the original speaker.
The startup, which has raised $18.5 million to date, requires just 2-3 seconds of audio to clone a voice and make it “speak” over 140 languages, all while preserving the original tone, cadence, and emotion.
This is the technology that has already powered live, real-time dubbing for Major League Soccer, the Australian Open, and Comcast NBCUniversal events. Fans could listen to a game’s commentary in their native language, not from a monotone translator, but in a voice that sounded just like the original, excited commentator.
That is the technology now being ported to a chip.
“Our mission at CAMB.AI has always been to break down language barriers for 8 billion people around the world and make communication truly universal,” said Akshat Prakash, co-founder and CTO of CAMB.AI.
The next phase of the partnership will study porting CAMB.AI’s full real-time translation model (not just text-to-speech) to the chip, enabling on-the-fly translation in over 150 languages.
To show off the system’s potential, the companies demonstrated its audio description capabilities using a clip from the film Ratatouille. As the scene played, the AI narrated the visual action in multiple languages, with on-screen text translations appearing simultaneously.
The implications here are broader than just movies.
- Global sports: Imagine watching a Formula 1 race and switching the live driver audio from French to English to Japanese, hearing it in the driver’s own voice.
- Accessibility: For the visually impaired, this technology could provide real-time audio descriptions for any piece of content, from any country, in their native language.
- Gaming & VR: Real-time, in-voice translation could revolutionize online multiplayer gaming, allowing players from around the world to communicate seamlessly.
Of course, this is all still in the testing phase. There is no confirmed timeline for when these chipsets will actually appear in a consumer-ready television or smart device.
But the writing is on the wall. The foundation is being laid for a future where language is no longer a barrier to entertainment, but a simple preference setting. The days of the “subtitle two-step” may be numbered.