Anthropic just made a significant move that’s been quietly anticipated in the developer community for a while — the company officially announced on March 13, 2026, that its 1 million token context window is now generally available for both Claude Opus 4.6 and Claude Sonnet 4.6, effective immediately across the Claude Platform, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Azure Foundry.
To understand why this is a big deal, it helps to know what a “context window” actually means. Think of it as the AI’s working memory — everything it can read and consider at once before giving you an answer. One million tokens is roughly 750,000 words, or the equivalent of about ten full-length novels. Until now, even the smartest AI models would start forgetting what you told them earlier in a conversation once things got too long. That problem — sometimes called “context rot” — has been a real limitation for engineers, lawyers, researchers, and really anyone trying to use AI for complex, sprawling projects.
What’s changed with today’s announcement isn’t just the raw number. It’s the price. When Anthropic launched Opus 4.6 back in February, the 1M context window was available in beta — but for prompts exceeding 200K tokens, developers were billed at a premium rate of $10 per million input tokens and $37.50 per million output tokens, a surcharge steep enough that many developers simply couldn’t justify it at scale. Starting now, those premium rates are gone entirely. The standard pricing — $5 per million input tokens and $25 per million output tokens for Opus 4.6, and $3/$15 for Sonnet 4.6 — applies whether you’re sending a 9,000-token message or a 900,000-token one. No multiplier, no fine print.
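To make the difference concrete, here’s a quick back-of-the-envelope comparison using the per-million-token rates quoted above. The request sizes are made-up examples, and the model labels are just dictionary keys for illustration:

```python
# Old beta pricing: prompts over 200K tokens were billed at a premium rate.
OLD_PREMIUM = {"input": 10.00, "output": 37.50}   # $ per million tokens

# New flat pricing, applied regardless of prompt length.
NEW_FLAT = {
    "opus-4.6":   {"input": 5.00, "output": 25.00},
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
}

def cost(rates: dict, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the given per-million-token rates."""
    return (input_tokens * rates["input"]
            + output_tokens * rates["output"]) / 1_000_000

# Example: a 900K-token prompt with a 10K-token reply on Opus 4.6.
before = cost(OLD_PREMIUM, 900_000, 10_000)           # old beta premium
after = cost(NEW_FLAT["opus-4.6"], 900_000, 10_000)   # new flat rate

print(f"900K in / 10K out: ${before:.2f} -> ${after:.2f}")
# prints "900K in / 10K out: $9.38 -> $4.75"
```

On this hypothetical request the same call drops from $9.38 to $4.75 — roughly half — and the same prompt on Sonnet 4.6 would come to $2.85.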
Beyond pricing, Anthropic has also lifted a few other practical limitations. The media limit per request has jumped from 100 images or PDF pages all the way to 600 — a six-fold increase that makes a meaningful difference for anyone doing document-heavy work. Full rate limits now apply across the entire context window, which means developers aren’t penalized or throttled just because their requests are longer. And for those who were using the beta header in their API calls to unlock long-context access, Anthropic says it’s no longer needed — requests over 200K tokens just work automatically without any code changes.
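In practice that means a long-context call is now just an ordinary Messages API request. The sketch below only assembles the request payload (no network call); the model ID, document placeholders, and helper function are illustrative assumptions, not copied from Anthropic’s docs:

```python
# Sketch: building a long-context Messages API payload.
# Per the announcement, requests over 200K tokens no longer need an
# "anthropic-beta" opt-in header -- the plain payload below is enough.

def build_request(model: str, documents: list[str], question: str) -> dict:
    """Assemble a Messages API payload with all documents in one prompt."""
    corpus = "\n\n---\n\n".join(documents)   # concatenate everything up front
    return {
        "model": model,
        "max_tokens": 4096,
        "messages": [
            {"role": "user",
             "content": f"{corpus}\n\nQuestion: {question}"},
        ],
        # Previously, prompts over 200K tokens also required an
        # extra_headers={"anthropic-beta": ...} opt-in; that step is gone.
    }

payload = build_request(
    model="claude-opus-4-6",   # illustrative model ID
    documents=["<contract v1>", "<contract v2>", "<redline notes>"],
    question="Summarize every change across versions.",
)
# An SDK client would then send this with client.messages.create(**payload).
```

The point of the sketch: nothing in the payload signals “long context.” You stuff the documents in and send it, and the platform handles the rest.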
The other question worth asking is: does the model actually use all that context effectively, or is it just window dressing? This is where Anthropic has put serious effort. On MRCR v2 — an industry benchmark that tests long-context retrieval by hiding multiple pieces of information deep inside a million-token document and asking the model to find them all — Claude Opus 4.6 scores 78.3% at the 1M token length, the highest among frontier models at that context length. For comparison, Sonnet 4.5, the previous default model, managed just 18.5% on the same test. That’s not a minor improvement. That’s a qualitative leap, the kind of difference that changes whether a feature is actually useful in production or just a marketing claim.
The real-world implications are starting to surface in interesting ways. Anthropic shared a number of testimonials from companies already using the expanded context. One AI research lab says it can now synthesize hundreds of scientific papers, proofs, and codebases in a single pass, dramatically accelerating fundamental physics research. A legal tech company notes that lawyers can finally bring multiple rounds of a 100-page contract negotiation into one session without losing track of changes across versions. An incident response platform says it can keep every signal, entity, and working theory in view from the first alert all the way through remediation — without compaction or context clearing.
One particularly telling data point comes from a company that raised its Opus context window from 200K to 500K and found the agent actually used fewer tokens overall — because with more context available, the model spent less time re-reading and re-processing earlier information. That counterintuitive result speaks to something deeper about how context efficiency works: more isn’t always wasteful; sometimes it’s actually leaner.
For Claude Code users — Anthropic’s AI-powered coding assistant — this update is especially meaningful. Max, Team, and Enterprise users on Opus 4.6 will now default to 1M context automatically, which means fewer “compaction” events where the model is forced to summarize and discard earlier parts of a long coding session. Developers who have worked with Claude Code at scale know exactly how painful those compaction moments are — you lose details, cross-file dependencies get murky, and you end up re-explaining things you’ve already said. With 1M context running by default, that friction is largely eliminated.
Sonnet 4.6, which Anthropic made the default model for Free and Pro claude.ai users when it launched in February, also benefits from today’s announcement. The model was already praised for approaching Opus-level intelligence at Sonnet-level pricing, and now it carries the same long-context access without surcharge. For developers building on a budget or teams that need high throughput at reasonable cost, Sonnet 4.6 at $3/$15 per million tokens with a full 1M window is a compelling combination.
In the broader AI landscape, this move puts Anthropic squarely in competition with Google’s Gemini 1.5 Pro and Gemini 2.0, both of which have long offered 1M token contexts at competitive prices. What Anthropic is now arguing is that having the context window isn’t enough — what matters is how well the model retrieves and reasons across that context. With Opus 4.6’s benchmark scores and Anthropic’s claim of being the highest-performing frontier model at 1M tokens, the company is making a quality-over-quantity argument.
For anyone building enterprise software, doing large-scale document analysis, or simply tired of their AI assistant losing the thread halfway through a long conversation — this is the kind of infrastructure update that quietly makes a lot of things better. The 1M context window is available right now across all major cloud platforms, with no extra steps required.