Anthropic’s Most Dangerous AI Leaked Itself (And It’s Called Claude Mythos)

Kawaii cat hacker looking at leaked AI documents on a computer screen

Anthropic, the AI safety company that keeps telling us it’s being very careful about AI, accidentally left documents describing its most powerful AI model sitting in a publicly searchable data store on the open internet. The model is called Claude Mythos. It is, by Anthropic’s own admission, “by far the most powerful AI model we’ve ever developed.” It also, according to internal documents, poses “unprecedented cybersecurity risks.”

You really cannot make this stuff up.

How the Leak Happened

On March 27, 2026, Fortune reporter Bea Nolan discovered that Anthropic had misconfigured its content management system, leaving nearly 3,000 unpublished assets in an unencrypted, publicly searchable data cache. Among them: a draft blog post announcing Claude Mythos, details of a private invite-only CEO summit in Europe, and internal documents describing the model’s capabilities.

Two cybersecurity researchers independently verified the finding: Roy Paz, a senior AI security researcher at LayerX Security, and Alexandre Pauwels from the University of Cambridge. After Fortune contacted Anthropic, the company removed public access to the data store.

Anthropic attributed the exposure to “a human error in the configuration of its content management system” and described the leaked materials as “early drafts of content considered for publication.” Which is a very polite way of saying “we accidentally told the whole internet about our secret superpowered AI.”
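For the curious: we have no idea what Anthropic’s content management system actually runs on, so treat this as a purely hypothetical sketch of the general class of mistake — a cloud storage bucket that was supposed to be private gets left readable by anyone. The Python snippet below uses boto3 to check whether an S3-style bucket’s ACL grants access to anonymous users; the bucket name is made up for illustration.

```python
# Hypothetical illustration only: the storage behind Anthropic's CMS is not public knowledge.
# This checks whether an S3 bucket's ACL grants read access to all users (anonymous reads),
# the generic "someone forgot to lock a folder" misconfiguration described above.
import boto3
from botocore.exceptions import ClientError

ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"

def bucket_looks_public(bucket_name: str) -> bool:
    """Return True if the bucket's ACL includes a grant to the AllUsers group."""
    s3 = boto3.client("s3")
    try:
        acl = s3.get_bucket_acl(Bucket=bucket_name)
    except ClientError as err:
        print(f"Could not read ACL for {bucket_name}: {err}")
        return False
    return any(
        grant.get("Grantee", {}).get("URI") == ALL_USERS_URI
        for grant in acl.get("Grants", [])
    )

if __name__ == "__main__":
    # "example-cms-assets" is a made-up bucket name for demonstration.
    print(bucket_looks_public("example-cms-assets"))
```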

What Is Claude Mythos, Exactly?

Claude Mythos sits in a new tier above Anthropic’s existing Opus models — a tier the company is internally calling “Capybara.” The leaked documents describe it this way: “Capybara is a new name for a new tier of model: larger and more intelligent than our Opus models — which were, until now, our most powerful.”

In plain English: if you thought the AI coding war was heating up, Claude Mythos just added rocket fuel. The model reportedly scores “dramatically higher” than Claude Opus 4.6 on tests of software coding, academic reasoning, and — here’s where it gets interesting — cybersecurity.

Anthropic confirmed the model’s existence in a statement: “We’re developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity. Given the strength of its capabilities, we’re being deliberate about how we release it. We consider this model a step change and the most capable we’ve built to date.”

A small group of early-access customers is already testing it. The rest of us are apparently not ready yet.

The Cybersecurity Problem

Here’s the part that should make you put down your coffee: the leaked documents describe Claude Mythos as “currently far ahead of any other AI model in cyber capabilities,” and explicitly raise concerns about its ability to rapidly identify and exploit software vulnerabilities.

This is not a minor footnote. Anthropic was apparently so concerned about the dual-use implications of releasing a model this good at cybersecurity that it was taking an unusually slow rollout approach even before the leak. The documents suggest the company believes Mythos could “significantly heighten cybersecurity risks” and potentially “accelerate a cyber arms race.”

Think about what that means for a moment. The people who built this thing — the same people who wrote a whole manifesto about responsible AI development — are themselves worried about what happens when it gets out. And now, in a twist of delicious irony, the company that preaches AI safety had its biggest safety secret leaked because someone forgot to lock a folder.

For context: we’ve previously covered mystery models like Hunter Alpha that appeared to have extraordinary capabilities. Claude Mythos is different because Anthropic is confirming it, not denying it.

What This Means for the AI Landscape

The timing is notable. Just days ago, AI coding benchmarks had converged — every major lab within a few percentage points of each other, leading some to declare the AI arms race essentially a draw. Claude Mythos, if it performs as described, would blow that narrative apart entirely.

It also changes the competitive picture. Google, OpenAI, Meta — they all now know a Capybara-tier model exists and is in limited testing. The race to release an equivalent will accelerate. Meanwhile, Anthropic is in the awkward position of having had its product roadmap exposed, its safety concerns publicly documented, and its next major announcement spoiled by a misconfigured database.

The model’s name is also worth a moment of attention. “Mythos” — from the Greek for narrative, legend, or story. Whether that’s a reference to the model’s conversational abilities, its mythological level of capability, or just a naming committee that liked the sound of it, we don’t know. But it lands differently when the “legend” part of the story involves an accidental leak.

Should You Be Excited or Terrified?

Probably both, which is the standard emotional state for following AI in 2026.

On one hand: a model that genuinely represents a “step change” in reasoning and coding would be transformative. We’ve been in benchmark convergence territory for months. The AGI debate keeps circling the same arguments. A legitimately more capable model is exactly the kind of thing that moves the conversation forward in ways that matter for actual users.

On the other hand: a company that builds powerful tools and then worries, in writing, about its own tools being used for cyberattacks is telling you something important. Anthropic isn’t being dramatic when it flags cybersecurity risks — it’s being specific. And “unprecedented” is a word that the usually cautious people at AI safety labs don’t use lightly.

The leak itself is almost too on-brand for 2026. An AI safety company with a data safety problem. A model designed to be careful about its release, released (sort of) accidentally. Nearly 3,000 unpublished assets sitting in the open, waiting for a security researcher to stumble across them.

Claude Mythos hasn’t launched yet. But it already has a story. And apparently, it was always going to.

What Happens Next

Anthropic hasn’t confirmed a release date. Given that it was already taking a deliberately cautious approach, the leak probably doesn’t speed things up. If anything, having your safety concerns publicly documented before launch creates more pressure to demonstrate that you’ve addressed them.

The early-access customers who are already testing Mythos are under NDA. We’ll hear more from them in time. For now, the internet has a draft blog post, a company statement, and a lot of questions about what “unprecedented” cybersecurity capability actually looks like in practice.

We’ll be watching. And apparently, so is everyone else — whether Anthropic intended it or not.

Sources: Fortune (March 27, 2026), Fortune exclusive (March 26, 2026), CoinDesk


🐾 Visit the Pudgy Cat Shop for prints and cat-approved goodies, or find our illustrated books on Amazon.
