GadgetBond


Anthropic launches Claude Opus 4.6 with massive context and agent power

Anthropic’s newest model is built to plan, execute, and revise work over long stretches of time.

By Shubham Sawarkar, Editor-in-Chief
Feb 6, 2026, 1:05 AM EST
A collage-style banner image with the text “Claude Opus 4.6” overlaid, combining visuals of a retro computer screen displaying system text, a cloudy sky, a Mars rover on a rocky landscape, tomatoes on a vine, and a grid of circular buttons, symbolizing advanced AI, technology, and exploration.
Image: Anthropic

Anthropic’s newest flagship AI, Claude Opus 4.6, isn’t just another incremental model bump – it’s Anthropic openly declaring that “agentic” AI is ready to handle real work, not just brainstorms and drafts. The company is pitching it as a system that can plan, execute and revise over long horizons, across big codebases and dense document sets, without needing the kind of constant hand‑holding people have learned to expect from today’s chatbots.

On paper, Opus 4.6 is an upgrade to Anthropic’s previous top‑tier model in all the ways you’d expect: better coding, stronger reasoning, more context, and a fatter stream of output. Under the hood, though, there’s a more interesting story: Anthropic is leaning hard into long‑running “agent” workflows and enterprise‑grade knowledge work, and it’s comfortable enough with the safety profile to let this thing operate closer to where business value actually lives – source code, spreadsheets, contracts, and production systems.

The headline spec is the context window. Opus 4.6 is Anthropic’s first Opus‑class model with a 1‑million‑token context window in beta, meaning it can keep track of the equivalent of thousands of pages of code and documentation at once. In practice, developers and analysts won’t work at that ceiling most of the time, but they’ll feel the benefits well below it: fewer “what were we talking about again?” moments, less need for manual chunking, and fewer hallucinated details when the model is asked to reason over a large body of material. Anthropic’s own tests on MRCR v2, a needle‑in‑a‑haystack benchmark, show Opus 4.6 hitting 76% on an 8‑needle, 1M‑token setup, compared with just 18.5% for its Sonnet 4.5 sibling – a huge jump in the model’s ability to actually use long context instead of simply accepting it.

They’ve also bumped output size to a frankly wild 128,000 tokens, which is enough to generate entire subsystems of documentation, large code diffs or full‑length reports in one go. That pairs with a set of new “effort” and “thinking” controls clearly aimed at builders of agents and dev tools. Anthropic now exposes four effort levels – low, medium, high (default), and max – and a mode it calls adaptive thinking, where the model decides for itself when to spin up deeper chains of thought and when to stay lightweight. The idea is that, instead of always forcing the model into heavy, expensive reasoning, you can give it a budget and let it dynamically spend that budget only when a task looks tricky or ambiguous.
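As a rough illustration of how a builder might use those four effort levels, here is a minimal sketch. It does not call any real SDK: the task categories, the `pick_effort` and `build_settings` helpers, and the dictionary keys are all hypothetical names chosen for this example, and only the level names, the "high" default, and the `claude-opus-4-6` model ID come from Anthropic's announcement.

```python
from enum import Enum


class Effort(str, Enum):
    """The four effort levels named in the announcement."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"  # described as the default
    MAX = "max"


DEFAULT_EFFORT = Effort.HIGH

# Hypothetical task categories, invented for illustration only.
TASK_EFFORT = {
    "quick_chat": Effort.LOW,
    "email_draft": Effort.LOW,
    "code_review": Effort.HIGH,
    "monorepo_refactor": Effort.MAX,
}


def pick_effort(task_kind: str) -> Effort:
    """Return the effort level for a task, falling back to the default."""
    return TASK_EFFORT.get(task_kind, DEFAULT_EFFORT)


def build_settings(task_kind: str, adaptive_thinking: bool = True) -> dict:
    """Assemble a plain settings dict; the key names are assumptions,
    not the real request schema."""
    return {
        "model": "claude-opus-4-6",
        "effort": pick_effort(task_kind).value,
        "adaptive_thinking": adaptive_thinking,
    }


print(build_settings("quick_chat"))
print(build_settings("something_unlisted"))
```

The point of the design is that the expensive setting is opt-in per task rather than global, so cheap requests stay cheap while hard ones get room to think.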

That “agent‑first” design shows up everywhere around Opus 4.6. In Claude Code, Anthropic is rolling out agent teams – multiple sub‑agents that can work in parallel on a codebase, coordinate, and hand off tasks. Think of one agent doing static analysis, another implementing changes, another running tests, with a coordinator agent deciding who does what and when. On the API side, there’s a feature called context compaction: the model can keep summarizing and compressing older parts of a long conversation or task so it can keep going without hitting the context wall, which is exactly what you want if you’re trying to build an autonomous system that can run for hours instead of minutes.
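The general idea behind context compaction can be sketched in a few lines. This is not Anthropic's implementation: the `Message` type, the four-characters-per-token estimate, and the `naive_summary` stand-in (a real system would presumably ask a model to write the summary) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    """Crude estimate: assume roughly 4 characters per token."""
    return max(1, len(text) // 4)


def total_tokens(messages: List[Message]) -> int:
    return sum(estimate_tokens(m.text) for m in messages)


def compact(
    messages: List[Message],
    budget: int,
    keep_recent: int,
    summarize: Callable[[List[Message]], str],
) -> List[Message]:
    """If the history exceeds the budget, replace everything except the
    last `keep_recent` messages with a single summary message."""
    split = len(messages) - keep_recent
    if total_tokens(messages) <= budget or split <= 0:
        return list(messages)
    older, recent = messages[:split], messages[split:]
    return [Message("system", summarize(older))] + recent


def naive_summary(older: List[Message]) -> str:
    """Placeholder summarizer: keeps the first 20 characters of each turn."""
    return "Summary of earlier turns: " + " | ".join(m.text[:20] for m in older)


history = [Message("user", f"step {i} " + "x" * 393) for i in range(10)]
compacted = compact(history, budget=500, keep_recent=3, summarize=naive_summary)
print(len(history), total_tokens(history))
print(len(compacted), total_tokens(compacted))
```

Running the compaction step whenever the history crosses the budget is what lets a long task keep going: recent turns stay verbatim, while older ones survive only as a summary.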

The early‑access quotes Anthropic is willing to publish read like a targeted pitch to serious software teams. GitHub talks about Opus 4.6 “unlocking long‑horizon tasks at the frontier” because it can plan, call tools, and survive complex, multi‑step workflows without derailing. Replit calls it “a huge leap for agentic planning,” emphasizing its ability to break down complex tasks into subtasks, run tools and subagents in parallel, and spot blockers with precision. Cursor, which builds an AI‑first IDE, says Opus 4.6 is “the new frontier on long‑running tasks” and is highly effective at code review. SentinelOne, a cybersecurity company, describes it bluntly: in their tests, Opus 4.6 handled a multi‑million‑line codebase migration “like a senior engineer,” planning ahead, adjusting strategy, and finishing in half the time.

Zoom out from the quotes, and the benchmark story is pretty aggressive. On Terminal‑Bench 2.0, an agentic coding benchmark, Opus 4.6 takes the top spot across all tested models. On Humanity’s Last Exam, a tough multi‑disciplinary reasoning test, it again leads. On GDPval‑AA, which measures performance on economically valuable knowledge work across finance, legal and other domains, Anthropic says Opus 4.6 beats OpenAI’s GPT-5.2 by about 144 Elo points and its own Opus 4.5 by 190 points. A 144‑point gap is big enough that, in head‑to‑head trials on that benchmark, Opus 4.6 would be expected to win around 70% of the time against GPT-5.2; the 190‑point gap implies roughly 75% against Opus 4.5. It also tops BrowseComp, a benchmark that simulates the “find the one obscure thing on the internet” problem that real research agents face.
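For readers who want to check how an Elo gap translates into a win rate, the conversion is a one-line formula. The sketch below assumes GDPval‑AA uses the conventional logistic Elo model with a 400‑point scale, which the article does not confirm; the function name is ours.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected score for the higher-rated side under the standard Elo
    model (a draw counts as half a win)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


for gap in (144, 190):
    print(f"{gap}-point gap -> expected score {elo_expected_score(gap):.3f}")
```

Under that assumption, a 144‑point lead works out to roughly a 70% expected score and a 190‑point lead to roughly 75%.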

Outside pure scores, partners are reporting the kinds of outcomes enterprises actually care about. Norway’s sovereign wealth fund manager NBIM says that in 40 cybersecurity investigations, Opus 4.6 produced the best result in 38 cases when run in an agentic harness with up to nine subagents and more than a hundred tool calls. Rakuten describes the model autonomously closing 13 issues and assigning another 12 across a ~50‑person engineering org in a single day, making both product and organizational decisions and knowing when to escalate to humans. Box reports a 10‑percentage‑point lift on its own multi‑source analysis evals, hitting 68% versus a 58% baseline and nearly perfect scores in technical tasks. In legal, Harvey says Opus 4.6 hit a 90.2% score on its BigLaw Bench, with 40% of tests perfect and 84% above 0.8, which is the kind of performance that starts to look like a true specialist assistant rather than a generic text model.

Where this lands for everyday knowledge workers is in the integrations. Anthropic has been quietly turning Claude into a kind of AI layer for the Microsoft Office universe, and Opus 4.6 deepens that. Claude in Excel now plans before acting, handles messier, unstructured data, and can execute multi‑step transformations and analyses without constant prompting – more like giving a junior analyst a brief than telling a macro exactly what to do. Claude in PowerPoint, now in research preview, sits directly inside PowerPoint as a side panel and can generate or refine entire decks, while respecting slide masters, fonts, and layouts so the output doesn’t feel like a generic AI template slapped on top.

Pricing is notable for what didn’t change. Opus 4.6 keeps the same base API pricing as its predecessor – $5 per million input tokens and $25 per million output tokens – with a premium tier that kicks in for prompts that go beyond 200,000 tokens, where the 1M‑token context beta lives. There’s also a US‑only inference option at a 1.1× price multiplier for workloads that need to stay on US soil, clearly aimed at regulated industries. For developers, the model is already available via the Claude API as claude-opus-4-6, and for everyone else, it’s simply the new “smartest Claude” tier in the claude.ai interface and across major cloud platforms, including Microsoft’s Foundry catalog and other partners.
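To make those numbers concrete, here is a small back-of-the-envelope cost estimator. It is only a sketch: the function name and structure are ours, it uses just the base rates and the 1.1× US‑only multiplier quoted above, and it deliberately refuses prompts over 200,000 tokens because the premium-tier rates are not given in this article.

```python
BASE_INPUT_PER_MTOK = 5.00    # USD per million input tokens
BASE_OUTPUT_PER_MTOK = 25.00  # USD per million output tokens
US_ONLY_MULTIPLIER = 1.1
PREMIUM_THRESHOLD = 200_000   # prompts above this fall under premium pricing


def estimate_cost(input_tokens: int, output_tokens: int, us_only: bool = False) -> float:
    """Estimate the USD cost of one request at base pricing."""
    if input_tokens > PREMIUM_THRESHOLD:
        # Premium-tier rates are not stated in the article, so do not guess.
        raise ValueError("prompt exceeds 200,000 tokens; premium rates apply")
    cost = (
        input_tokens / 1_000_000 * BASE_INPUT_PER_MTOK
        + output_tokens / 1_000_000 * BASE_OUTPUT_PER_MTOK
    )
    if us_only:
        cost *= US_ONLY_MULTIPLIER
    return round(cost, 4)


print(estimate_cost(50_000, 4_000))                # 50k-token prompt, 4k-token reply
print(estimate_cost(50_000, 4_000, us_only=True))  # same request, US-only inference
```

In this example, a 50,000‑token prompt with a 4,000‑token reply costs about $0.35 at base rates, or about $0.385 with US‑only inference.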

One of the more subtle shifts with Opus 4.6 is Anthropic’s tone around how much autonomy they think is acceptable. The company says its own engineers “build Claude with Claude,” and that Opus 4.6 often chooses to think more deeply and revisit its reasoning before finalizing an answer, which is great for hard problems but can feel like overkill on simple ones. That’s part of why they’re pushing the effort controls: if your use case is lightweight chat or quick email drafts, you dial the effort down; if you’re asking the model to refactor a monorepo or synthesize 10 research reports into a risk memo, you let it go full throttle.

None of this comes without safety baggage, and Anthropic is clearly trying to get ahead of that. The Opus 4.6 system card is billed as one of the most extensive they’ve done, with automated behavioral audits, tests for deception and sycophancy, wellbeing‑oriented evals, and updated probes for misuse in sensitive domains like cybersecurity. Anthropic says Opus 4.6 has a misalignment rate as low as, or lower than, any frontier model it’s tested, with fewer “over‑refusals” – cases where the model unnecessarily refuses benign queries – compared with previous Claude versions. Given that the model is demonstrably stronger at security‑relevant tasks, the company has added six new cybersecurity probes to detect harmful responses and is leaning into using the same model defensively: scanning open‑source projects for vulnerabilities, helping defenders patch faster, and, if needed, moving toward real‑time intervention mechanisms when abuse is detected.

Stepping back, Opus 4.6 drops into a moment where the big labs are all converging on a similar picture of what “next‑gen AI” looks like: not just chat, but agents that can operate computers, schedule work, call tools, and keep going over long time spans. OpenAI has been pushing in that direction with its own agents and GPT-5‑series models; Google is doing the same under the Gemini 3 Pro umbrella. Anthropic’s claim with Opus 4.6 is that, at least on today’s benchmarks and in early partner deployments, it now has the edge where it matters for enterprises: long‑horizon coding, deep research, and high‑stakes knowledge work.

For users, the experience may feel less like talking to a chatbot and more like working with a highly competent, occasionally verbose colleague who can take a vague, messy request and quietly turn it into a plan, a batch of tool calls, and a finished product. And as the industry leans further into this “vibe working” era – where AI doesn’t just answer but collaborates over days or weeks – Opus 4.6 is Anthropic’s argument that Claude is ready to sit at that table, not as a demo, but as part of the production stack.

