GadgetBond

Nemotron 3 Ultra rolls out to Perplexity Pro, Max, and Computer

As Nemotron 3 Ultra comes online in Perplexity and Computer, users gain a more robust foundation for long-running AI tasks that span multiple steps, tools, and evolving sources.

By Shubham Sawarkar, Editor-in-Chief
Jun 6, 2026, 1:49 AM EDT
Close-up screenshot of an AI model selection menu displaying several large language models, including Claude Sonnet 4.6, Claude Opus 4.8, and Nemotron 3 Ultra. The Nemotron 3 Ultra option is highlighted and selected, marked with a checkmark and a “Max” badge, while a large cursor points toward the model name. The interface emphasizes choosing an advanced AI model within a chatbot or AI platform.
Image: Perplexity

Nemotron 3 Ultra landing on Perplexity’s Pro and Max tiers – and inside Computer – is one of those upgrades that quietly changes what you can actually do with an AI assistant, even if the interface looks almost the same at first glance. It is not just “a new model option”; it is NVIDIA’s flagship open frontier reasoning system, built specifically to power long-running agents that think for longer, juggle more context, and still respond fast enough to feel interactive.

If you’ve ever hit the limits of current models while running deep research sessions, complex coding tasks, or sprawling “plan this entire project” prompts, this is the sort of backend change that matters more than any shiny new chat UI.

When NVIDIA says “open frontier model,” it is not just using marketing language. Nemotron 3 Ultra is a 550 billion parameter Mixture-of-Experts model with around 55 billion parameters active at any given time, using a hybrid Mamba‑Transformer architecture tuned for throughput and long-context reasoning. That architecture, combined with NVIDIA’s use of NVFP4 and BF16 precision on Blackwell-class GPUs, lets the model deliver up to roughly six times the inference throughput of comparable open LLMs at similar accuracy levels, which is a very polite way of saying “it runs frontier-sized brains without feeling like dial‑up.”
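To put the 550 billion and 55 billion figures in perspective, here is a rough back-of-the-envelope sketch. The parameter counts are the ones quoted above; the "about 2 FLOPs per active parameter per token" rule is a common approximation for transformer-style forward passes, not an NVIDIA figure, and it ignores the Mamba layers, attention over long contexts, and memory bandwidth.

```python
# Back-of-the-envelope numbers for a sparse Mixture-of-Experts model.
# Parameter counts are taken from the article. The "2 FLOPs per active
# parameter per token" rule is a generic approximation (an assumption here),
# not a number published for Nemotron 3 Ultra.

TOTAL_PARAMS = 550e9   # total parameters quoted for Nemotron 3 Ultra
ACTIVE_PARAMS = 55e9   # approximate parameters active for any one token


def active_fraction(total: float, active: float) -> float:
    """Share of the model's weights that take part in processing one token."""
    return active / total


def forward_flops_per_token(active: float) -> float:
    """Rough forward-pass cost: about 2 FLOPs per active parameter."""
    return 2.0 * active


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    sparse = forward_flops_per_token(ACTIVE_PARAMS)
    dense = forward_flops_per_token(TOTAL_PARAMS)
    print(f"active fraction: {frac:.0%}")
    print(f"sparse forward cost: {sparse:.2e} FLOPs/token")
    print(f"dense equivalent:    {dense:.2e} FLOPs/token")
    print(f"compute ratio dense/sparse: {dense / sparse:.1f}x")
```

The ratio this prints is a statement about arithmetic per token, not about measured speed. The roughly sixfold throughput figure in the article comes from NVIDIA's own benchmarks against other open models and depends on hardware, precision, and workload.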

The other big number that matters is the context window: Nemotron 3 Ultra has been stretched to handle up to 1 million tokens of context, after being pre-trained on about 20 trillion tokens and then post-trained with supervised fine-tuning, reinforcement learning, and multi-teacher distillation. Long context is not new as a concept, but tying a 1M-token window to a model explicitly optimized for long-running agents means it is designed to keep absorbing intermediate steps, tool calls, and retrieved documents over time instead of falling apart halfway through a session.

Under the hood, NVIDIA’s technical report reads like a checklist of every modern LLM optimization you would expect in a 2026-era frontier model: Latent MoE routing, multi-token prediction, NVFP4 pretraining, multi-environment RL, and “reasoning budget control” to keep the model from overspending compute on trivial turns. The result, according to NVIDIA’s own benchmarking, is up to around 5.9x higher throughput than some of the largest open competitors on challenging long-output workloads, while keeping accuracy on reasoning and agentic benchmarks in the same ballpark.

That’s the model itself. The more interesting story, especially for power users in the US who already live in Perplexity all day, is what happens when you drop something like Nemotron 3 Ultra into a real product with actual users and noisy, messy, open-ended tasks.

Perplexity has been leaning hard into “agentic search” for a while now, layering open models like Nemotron 3 Super into its stack to orchestrate browsing, retrieval, and synthesis rather than just generating text from a static prompt. Nemotron 3 Ultra is essentially the bigger, more obsessive sibling in the same family – one that is built to run deeper and longer chains of reasoning, coordinate tools, and keep more of the evolving conversation in its head while it works.

By making Nemotron 3 Ultra available specifically to Pro and Max subscribers, Perplexity is doing two things at once. First, it is turning those paid tiers into a kind of “frontier open-model lab” where the best open weights from NVIDIA are wired directly into a consumer-facing agent stack instead of staying locked away in enterprise demos or research labs. Second, it is quietly normalizing the idea that open models are not just the cheap or privacy-friendly option – they can be the fast, capable default for demanding, long-running workflows.

If you are on Perplexity Pro or Max, the practical implication is simple: when you spin up longer Computer runs, ask the assistant to manage complex multi-step tasks, or rely on it as a “do this in the background while I keep working” companion, a lot of that orchestration can now ride on Nemotron 3 Ultra. The more your usage looks “agentic” rather than like quick one-off Q&A, the more you benefit from a model that is tuned for extended reasoning and throughput at scale.

Computer is where this gets especially interesting. Perplexity’s Computer feature is essentially an agent shell: it opens tabs, navigates pages, runs tools, and stitches it all together into something coherent for you. Long-running agents in that environment need two things: they have to be able to keep context over many steps, and they have to be efficient enough that you are not staring at a spinner for minutes every time they think.

Nemotron 3 Ultra was built for exactly that: long-running “agentic” workflows where context grows continuously as the agent calls tools, reads more data, and updates its internal plan. A 1M-token context window lets an agent keep stacking up intermediate results, logs, and partial drafts without having to constantly prune away earlier context, which is precisely what kills coherence in a lot of existing long sessions.
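A quick way to see why the window size matters for agents is to count how many steps fit before something has to be pruned. The sketch below is purely illustrative: the 1M-token figure is from the article, while the 128K comparison window, the 4,000-token starting prompt, and the 6,000 tokens added per step are assumptions picked for the example, not measurements from Perplexity or NVIDIA.

```python
def steps_until_full(window_tokens: int, start_tokens: int, tokens_per_step: int) -> int:
    """Number of whole agent steps that fit before the context window overflows.

    window_tokens   -- maximum context length of the model
    start_tokens    -- tokens already used by the system prompt and task
    tokens_per_step -- tokens each step appends (tool output, notes, drafts)
    """
    if tokens_per_step <= 0:
        raise ValueError("tokens_per_step must be positive")
    free = window_tokens - start_tokens
    return max(free, 0) // tokens_per_step


if __name__ == "__main__":
    START = 4_000      # hypothetical system prompt plus task description
    PER_STEP = 6_000   # hypothetical tokens appended per tool call

    for label, window in [("128K window", 128_000), ("1M window", 1_000_000)]:
        steps = steps_until_full(window, START, PER_STEP)
        print(f"{label}: {steps} steps before pruning is needed")
```

Under these assumed numbers the smaller window fills up after 20 steps, while the 1M-token window leaves room for 166. Real sessions vary a lot in how much each step adds, so treat this as a way to reason about the trade-off rather than a prediction.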

The Mixture-of-Experts design is the other half of the story here. Because only a subset of experts is active per token, Nemotron 3 Ultra can deliver what is essentially frontier-scale capacity while still offering significantly higher throughput and lower effective cost than a dense model with similar total parameters. For a user, that translates into Computer sessions that can think deeply over long sequences of actions and still respond often enough that you feel comfortable iterating rather than setting something up and walking away.
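For readers who want a feel for what "only a subset of experts is active per token" means, here is a toy sketch of generic top-k expert routing. It is not NVIDIA's Latent MoE implementation, which the article does not describe in detail; the eight experts, the router scores, and k=2 are hypothetical values chosen to keep the example small.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalise their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, experts, router_scores, k):
    """Run only the selected experts on x and blend their outputs."""
    chosen = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in chosen)
    return output, [i for i, _ in chosen]


if __name__ == "__main__":
    # Eight toy "experts": expert i simply multiplies its input by (i + 1).
    experts = [(lambda x, scale=i + 1: scale * x) for i in range(8)]
    # Hypothetical router scores for a single token.
    scores = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]

    out, used = moe_layer(1.0, experts, scores, k=2)
    print(f"experts used: {used} of {len(experts)}")
    print(f"blended output: {out:.4f}")
```

Only two of the eight experts do any work for this token, which is the basic reason a sparse model can hold far more parameters than it spends compute on at each step.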

If you are using Computer for extended coding sessions, refactors, or end-to-end research projects – especially with US-focused workflows like building reports, market analysis, or legal-style document reviews – that combination of long context and high throughput is a big deal. You can keep asking follow-ups, layering in new sources, or pivoting the task without having to manually reset or rewrite massive prompts just to keep the model from forgetting what you said ten steps ago.

There is also a bigger strategic angle here: the open-model ecosystem is evolving from “here are some weights on GitHub” to “here is a full frontier-class stack shipping in real products on day one.” NVIDIA’s Nemotron 3 family was always framed as a three-tiered system – Nano, Super, and Ultra – where Nano handles lightweight, high-frequency jobs, Super powers collaborative and high-volume agents, and Ultra is the heavyweight reasoning and orchestration engine. Nemotron 3 Nano has already landed on platforms like Hugging Face and multiple inference providers; Ultra now arriving inside Perplexity closes the loop between open research, cloud deployment, and end-user applications.

For Perplexity, this deepens its role in what NVIDIA has called the “Nemotron coalition,” where different partners integrate these open models into their own products rather than treating them as side-grade options. For users, especially in markets like the US where AI tooling is quickly becoming part of everyday professional workflows, it means that the open-versus-proprietary debate is less about raw capability and more about which ecosystem fits your use case and values.

NVIDIA’s choice to release Nemotron 3 Ultra with open weights, data, and recipes under an open license gives developers and platforms a lot of room to customize, fine-tune, or self-host variants of the model for domain-specific workflows. That openness is part of why you are seeing day-zero integrations not just on Perplexity but also across orchestration platforms and inference providers that are building their own agent stacks on top.

So what does all of this mean if you are just a person with a Perplexity Pro or Max subscription, logging in from a laptop or phone somewhere in the US and wondering if this actually changes your day-to-day?

In the near term, the shift is mostly experiential rather than flashy. Long Computer sessions feel less fragile and more “confident,” especially when they involve writing, refactoring, or analyzing large bodies of text and code. Multi-step research tasks – think: “analyze these reports, cross-check with current news, and draft something that ties it all together” – become more viable as a single continuous session instead of a series of disconnected prompts.

Over time, as Perplexity leans harder into agent-based features, Nemotron 3 Ultra gives the platform more headroom to experiment with richer, more autonomous behaviors without wrecking latency or cost. And because the model itself is open, it nudges the ecosystem toward a world where the frontier capabilities you get inside a polished consumer product are not fundamentally different from the tools independent developers and researchers can access and modify.

In other words, this is one of those platform updates that doesn’t demand a big marketing explainer in your inbox but quietly shifts the ceiling on what your AI assistant can handle. If you are on Pro or Max, you do not have to do anything fancy: just run the kinds of long, complex, multi-step tasks you already wish your AI could handle better, especially inside Computer, and see how far you can push it now that Nemotron 3 Ultra is doing a lot of the heavy lifting behind the scenes.



Copyright © 2026 GadgetBond. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | Do Not Sell/Share My Personal Information.