Perplexity is turning its Deep Research feature into something much closer to a serious research companion than a fancy answer box. Max subscribers are the first to feel that shift, with Pro users next in line over the coming days.
At the core of this upgrade is a fairly bold claim: Deep Research now delivers state-of-the-art performance on leading external benchmarks, beating rival “research agents” on both accuracy and reliability. Internally, Perplexity is especially confident about the hardest categories to get right at scale—Law, Medicine, and Academic work—where it reports higher scores than competing tools when judged on detailed, expert rubrics rather than vibes. For anyone who has been hesitant to trust AI on high‑stakes queries, that distinction matters more than raw model brand names.
Under the hood, Deep Research is now consistently running on Anthropic’s Opus 4.5 for Max and Pro tiers, wrapped in Perplexity’s own search, browsing, and sandbox infrastructure. That means when you kick off a Deep Research session, you’re not just prompting a single model—you’re dispatching an agent that can search the web, open pages, execute code where needed, and then stitch all of that into a long‑form answer that is grounded in actual sources instead of hallucinated detail. The company also says it will keep swapping in “top reasoning models” as they appear, treating the model layer as a replaceable component inside a larger research engine rather than the product itself.
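That search-browse-execute-synthesize pipeline can be sketched as a simple agent loop. Everything below is an illustrative stub, not Perplexity's actual internals: the class, method names, and URLs are all hypothetical, and the tool calls are placeholders for a live search index and page fetcher.

```python
from dataclasses import dataclass, field

@dataclass
class Source:
    url: str
    excerpt: str

@dataclass
class ResearchAgent:
    # Sources collected during browsing, used to ground the final answer.
    sources: list = field(default_factory=list)

    def search(self, query: str) -> list:
        # Stub: a real agent would call a live search index here.
        slug = query.replace(" ", "-")
        return [Source(url=f"https://example.com/{slug}",
                       excerpt=f"Findings about {query}")]

    def browse(self, source: Source) -> str:
        # Stub: a real agent would fetch and parse the page.
        self.sources.append(source)
        return source.excerpt

    def synthesize(self, question: str, notes: list) -> str:
        # Stitch notes into a long-form answer, citing collected sources.
        body = " ".join(notes)
        cites = "; ".join(s.url for s in self.sources)
        return f"{question}\n{body}\nSources: {cites}"

agent = ResearchAgent()
notes = [agent.browse(s) for s in agent.search("benchmark methodology")]
report = agent.synthesize("How are research agents evaluated?", notes)
print(report)
```

The point of the structure, per the article, is that the model itself is just one swappable component: only `synthesize` would need a frontier reasoning model, while the tool layer around it stays put.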
What really makes this release feel different from the usual “we’re better on benchmarks” announcement is DRACO, a new open-source benchmark Perplexity is publishing to stress‑test deep research agents in conditions that look a lot more like messy reality. DRACO is built from 100 tasks across 10 domains—Academic, Finance, Law, Medicine, Technology, General Knowledge, UX Design, Personal Assistant, Shopping, and tricky “Needle in a Haystack” scenarios—and each task is graded against roughly 40 criteria covering accuracy, completeness, and objectivity. Rather than relying on synthetic trivia, the tasks are adapted from real user questions, stripped of personal data, and then refined by subject‑matter experts, with an LLM‑as‑judge setup that checks specific claims against the underlying sources.
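The grading scheme described above—per-task criteria scored by a judge—can be sketched in a few lines. This is a toy sketch, not DRACO itself: the criterion names and phrases are invented, and a simple keyword check stands in for the LLM-as-judge that would verify claims against sources.

```python
from dataclasses import dataclass

@dataclass
class Criterion:
    name: str             # e.g. accuracy, completeness, objectivity
    required_phrase: str  # stub proxy for a claim the judge verifies

def judge(report: str, criterion: Criterion) -> bool:
    # Stub judge: a real setup would prompt an LLM to check the claim
    # against the underlying sources rather than match a phrase.
    return criterion.required_phrase.lower() in report.lower()

def grade(report: str, rubric: list) -> float:
    # Fraction of rubric criteria satisfied, a per-task score in [0, 1].
    passed = sum(judge(report, c) for c in rubric)
    return passed / len(rubric)

# Hypothetical three-criterion rubric; DRACO uses roughly 40 per task.
rubric = [
    Criterion("accuracy", "cites the 2023 ruling"),
    Criterion("completeness", "covers both appeals"),
    Criterion("objectivity", "notes dissenting views"),
]
report = "The memo cites the 2023 ruling and notes dissenting views."
score = grade(report, rubric)
print(f"score: {score:.2f}")  # 2 of 3 criteria satisfied
```

Averaging such per-task scores across a domain is what would let one agent "come out on top" in, say, Law versus Academic tasks, as the next paragraph reports.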
On DRACO, Perplexity’s upgraded Deep Research comes out on top across every domain that was tested, with particularly strong numbers in Law and Academic use cases, and a notable lead in complex, multi‑step reasoning tasks like planning or personalized assistance. The company also points out that it isn’t just scoring higher—it’s doing so faster, recording lower latency than competing research agents while still running multi‑step searches and analysis under the hood. In practice, that’s the difference between an AI “report” that feels like waiting on a slow consultant and something you can actually integrate into your daily workflow without breaking your rhythm.
For Max subscribers, this upgrade is live now, with each Advanced Deep Research query defaulting to the same Opus 4.5‑based harness and toolset, plus higher usage limits that encourage you to lean on it for more than the occasional “big project.” Pro users are next; Perplexity says the rollout is happening “over the coming days,” which suggests the company wants to watch load and behavior on Max before it turns up the dial for the much larger Pro base. If you’re on the free tier, this is also a clear bit of product segmentation: Deep Research is increasingly framed as a premium, power‑user feature, not just another button everyone has by default.
Stepping back, this move fits into a broader race among AI companies to move beyond chat and into tools that can actually carry a research workflow from question to synthesis. OpenAI, Google, and others are experimenting with their own agentic systems, but Perplexity is leaning heavily into the “answer engine” identity—live search, citations, and now a research mode that tries to mirror how a diligent analyst would work through a problem. With DRACO open‑sourced and hosted on Hugging Face, Perplexity is also inviting competitors and skeptics to run their own evaluations, which is a confident stance in a space where benchmark cherry‑picking is the norm.
For working researchers, journalists, analysts, and students, the interesting question isn’t just “Is this more accurate?” but “What can I safely offload?” With stronger performance in domains like law, medicine, and finance, the upgraded Deep Research starts to look like something you might use to assemble the first draft of a memo, literature review, or market brief, then layer your own judgment on top. If the rollout to Pro users goes smoothly and the DRACO results hold up under outside scrutiny, this release positions Perplexity less as a generic chatbot and more as a specialized research layer you keep pinned next to your browser tabs—especially if your day is already one long deep‑dive.