Anthropic built Mythos as the kind of AI you never want loose on the open internet – and yet, according to a Bloomberg report, that is now effectively what has happened, at least for a small but determined group of users. Behind the headline about “unauthorized access” is a story about how frontier AI, messy supply chains, and very human curiosity have collided in a way that should worry pretty much anyone who relies on software – which is to say, all of us.
Mythos is not just another chatbot with a friendly name and a quirky personality; it is Anthropic’s internal, unreleased security super-tool, a frontier model trained specifically to hunt down and exploit vulnerabilities across the digital stack. In Anthropic’s own framing, the Claude Mythos Preview can identify and exploit flaws “in every major operating system and every major web browser” when pointed at a target – putting it in a different league from the consumer-facing assistants we’re used to poking for recipe tips and vacation ideas. That capability is exactly why Mythos was being rolled out in a tightly controlled way, through an effort called Project Glasswing that brings in heavyweight partners like AWS, Apple, Google, Microsoft, NVIDIA, JPMorgan Chase, and major security vendors to use the model for defensive cybersecurity, not chaos.
The idea behind Glasswing is simple on paper: if you’ve built an AI that is better than almost any human at finding bugs, you point it at critical infrastructure and let it stress-test the software before attackers do. Anthropic says Mythos has already uncovered thousands of high-severity vulnerabilities across widely used operating systems, browsers, and other software, with those issues being quietly disclosed so vendors can patch them. In theory, this is a huge win for users: the same intelligence that could crack open banks or hospitals can instead help lock them down, reducing the risk of the kind of catastrophic breaches that dominate headlines every few weeks. But that bargain only holds if the model stays where it’s supposed to be.
According to documentation and sources cited by Bloomberg, that assumption broke on April 7th – the very day Anthropic publicly announced it was giving limited Mythos access to a small set of companies under the Project Glasswing banner. A “small group of unauthorized users” in a private online forum reportedly managed to get their hands on the model through a mix of inside access and good old-fashioned internet sleuthing. The Verge, which also reviewed details of the incident, reports that the group congregates on Discord and focuses specifically on tracking and poking at unreleased AI systems.
The key link appears to have been a third-party contractor working with Anthropic. One member of the forum, described as a “third-party contractor for Anthropic,” allegedly used their access to help the group reach Mythos, aided by clues from a separate security disaster: the recent Mercor breach. Mercor is a $10 billion AI training startup that supplies data and services to labs like OpenAI, Anthropic, and Meta, and it suffered a major supply-chain attack that exposed around 4 terabytes of sensitive data, including internal materials tied to frontier models. Among that haul, according to reporting, were details about Anthropic’s model formats and deployment patterns – enough for motivated sleuths to make an “educated guess” about where Mythos might be reachable online.
Put those pieces together – a contractor with access, leaked information about how Anthropic deploys models, and a Discord full of people obsessed with unreleased AI – and you have the rough recipe for what happened next. The group reportedly located an environment hosting Claude Mythos Preview, then began using it regularly over the following two weeks, even providing screenshots and a live demo to Bloomberg as proof. Anthropic, for its part, has acknowledged that it is investigating a report of unauthorized access to Mythos through “one of our third-party vendor environments” and says it has no evidence, so far, that its core systems were compromised or that the incident extended beyond that vendor’s environment.
The most unsettling detail might be what the group decided not to do. According to Bloomberg’s sources and follow-on coverage, the forum members deliberately avoided using Mythos for cybersecurity tasks – the very thing it was built for – in order to avoid tripping any alarms Anthropic might be running on that environment. Instead, they appear to have treated it more like a rare piece of tech memorabilia: something to play with, show off on a private server, and keep under the radar rather than turn into an obvious weapon. That restraint is cold comfort, though, because the incident reveals just how porous the AI ecosystem around these models really is.
Inside Anthropic, Mythos has been framed as “too dangerous for the wild,” a phrase that sounds melodramatic until you understand what the company’s own red-teamers reportedly found. According to feature reporting, internal teams concluded that Mythos could reliably compromise the underlying systems of modern computing – not just individual apps, but the operating systems and core services that everything else runs on. Government agencies and banks have been racing to secure access to the model under Project Glasswing precisely because they see it as both a powerful ally and a potential national security risk. When a tool can out-hack almost anyone, you don’t just worry about script kiddies; you worry about states, criminal syndicates, and anyone with enough money and patience to chase it.
That’s why the Mythos leak lands differently than, say, a misconfigured chatbot endpoint or a leaked model checkpoint for a generic text generator. It suggests that no matter how carefully a frontier lab tries to wrap a powerful system in rules and access controls, the weakest link might be a contractor’s workstation, a compromised vendor, or a misjudged integration. The Mercor breach is a particularly stark reminder that the AI industry relies on a dense supply chain of vendors, contractors, open source components, and cloud services – each of them a potential attack surface. When one of those links snaps, it’s not just personal data on the line; it can be the blueprints and connective tissue around the most advanced AI systems on earth.
Security researchers have been warning for years that “AI safety” can’t just mean making sure chatbots don’t say offensive things or refuse to write malware on request. Incidents like the Anthropic leak and the Mercor attack are now being cited as early examples of a different class of problem: AI security failures that expose internal models, training data, or deployment details in ways that could shape the global balance of power in AI. In one case described by enterprise security analysts, a release-packaging error at Anthropic exposed source code for its Claude Code assistant and other internal files, while in another, a supply-chain attack led to mass exfiltration of proprietary data from a vendor’s environment. None of these are science-fiction scenarios – they look a lot like the messy, human bugs and oversights that drive ordinary data breaches, just with much higher stakes.
For everyday users, it’s tempting to treat Mythos as something distant and abstract, locked inside corporate networks and government labs. But the whole point of tools like Project Glasswing is to defend the software you actually rely on: the banking apps on your phone, the electronic health record (EHR) systems at your hospital, the cloud services your employer runs on, the routers and switches keeping your home internet alive. If a model like Mythos can make defenders an order of magnitude more effective at finding and fixing vulnerabilities in those systems, that’s a big net positive – as long as the model doesn’t leak, get cloned, or end up operating at the direction of people who don’t care about collateral damage. The moment unauthorized access becomes normal, the argument for deploying such capabilities at all starts to look a lot shakier.
This also adds fuel to a brewing policy debate about how frontier AI models should be governed. Anthropic is already flagged as a “supply-chain risk” by the Pentagon, reflecting concerns that dependencies on a single private AI vendor could introduce systemic vulnerabilities for the US government and military. At the same time, the company has been trying to rebuild trust in Washington by pitching Mythos and Project Glasswing as proof that high-end AI can serve national security and critical infrastructure, not just consumer products. Now policymakers have to confront the uncomfortable reality that even those security-focused models can slip the leash, not because a lab suddenly decided to open-source its weights, but because a contractor clicked the wrong link or a vendor failed to lock down a dependency.
We’re also seeing the cultural side of this: a growing underground scene of AI hobbyists and professionals who treat unreleased systems the way console modders once treated dev hardware, as forbidden toys that confer status if you can get your hands on them. A private Discord channel trading tips on hidden endpoints or leaked model formats might sound niche, but those communities can quickly become the connective tissue between security lapses and real-world exploitation. The Mythos incident suggests that this subculture doesn’t always see itself as “the bad guys” – they may genuinely be curious, careful, even proud of not going full black hat – but they still normalize a world where wandering into a restricted AI environment is a fun challenge, not a red line.
For the big AI labs, the lesson is brutally straightforward: your threat model has to include your own ecosystem, not just external attackers trying to brute-force the front door. Anthropic’s experience with Mythos, together with its earlier leaks and the Mercor fallout, points to a future where model governance, red-teaming, and safety benchmarks are only one part of the job. The rest looks a lot like old-school, unglamorous enterprise security work: doing vendor due diligence, controlling contractor access, monitoring strange usage patterns, and building incident response plans that assume the worst has already happened somewhere in your stack.
And for the rest of us – users, customers, voters – the Mythos story is a reminder that AI risk isn’t just about whether a model says something harmful on social media or automates away a job. It’s also about whether powerful new tools that can reshape cybersecurity actually stay in the hands of people trying to make the internet safer. When one of those tools quietly slips into a Discord channel, even for a couple of weeks, it shows just how thin the line is between “too dangerous for release” and “accessible if you know the right people and the right tricks.” That thin line is exactly where the next phase of the AI security conversation is going to be fought.
