By AI Tool Briefing Team

Anthropic's Claude Mythos Leak: What We Know


The Anthropic Claude Mythos leak exposed ~3,000 internal documents revealing a next-gen model with unprecedented cybersecurity benchmark scores. Here’s what the documents actually say.

I woke up Thursday morning to 47 notifications — every one pointing to the Claude Mythos leak: roughly 3,000 internal Anthropic documents sitting open on a public CMS endpoint.

By the time I’d finished reading the first dozen, CrowdStrike was down 7%.

What We Know So Far

  • What happened: ~3,000 internal Anthropic documents exposed via a misconfigured CMS
  • What they reveal: A next-gen model codenamed “Claude Mythos” (internal tag: Claude Capybara)
  • Key finding: Mythos scores far higher on cyber exploit benchmarks than any existing AI model
  • Anthropic’s response: Confirmed the model exists, called it “a step change”
  • Market impact: CrowdStrike -7%, Palo Alto Networks -6%, Zscaler -4.5% within hours
  • Current status: Documents pulled; no release date announced for Mythos

Bottom line: Anthropic’s own internal safety researchers flagged Claude Mythos as posing “unprecedented cybersecurity risks.” That’s not a competitor saying it. That’s the people who built it.

How the Leak Happened

A misconfigured content management system. That’s it. No state-sponsored hack. No disgruntled employee. Someone at Anthropic (or more likely, an automated deployment) pushed internal documentation to a CMS instance that was publicly accessible. No authentication. No IP restrictions. Just… open.
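For readers who want the failure mode in concrete terms: checking whether an endpoint is exposed like this takes a few lines of code. Here’s a hypothetical sketch in Python; the URL is illustrative and has nothing to do with Anthropic’s actual infrastructure, but it’s the kind of probe any researcher could have run against the exposed instance.

```python
# Hypothetical probe, not Anthropic's actual endpoint: the URL below is
# illustrative. A properly locked-down CMS should answer 401/403 here.
import urllib.error
import urllib.request

def is_publicly_readable(url: str) -> bool:
    """Return True if the endpoint serves content with no credentials at all."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            # A plain 200 with a readable body means the instance is open to
            # anyone, search engine crawlers included.
            return resp.status == 200
    except urllib.error.URLError:
        # 401/403 (auth required) or a connection refusal lands here.
        return False

print(is_publicly_readable("https://cms.example.com/internal/docs/"))
```

That the fix is a one-line auth rule somewhere is exactly what makes this kind of exposure so common, and so embarrassing.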

The documents were indexed by search engines before anyone noticed. Security researchers, journalists, and a lot of very curious AI engineers spent the better part of Wednesday morning downloading and reading them before Anthropic’s infrastructure team locked it down.

I’ve been covering AI companies for years, and the irony is almost too on the nose. The company that has built its entire brand on responsible AI development and safety just had the most significant accidental disclosure in the industry’s short history. Not because their AI was unsafe. Because their CMS was misconfigured.

Anthropic has acknowledged the breach and confirmed the documents are authentic. They haven’t said much beyond that.

What Is Claude Mythos?

Based on the leaked documents, Claude Mythos is Anthropic’s next-generation foundation model. Internally, some documents reference it as Claude Capybara — apparently an earlier codename that stuck in certain teams.

Here’s what the documents describe:

  • A significant capability jump over Claude Opus 4.6, Anthropic’s current flagship. Not incremental. The internal characterization is “a step change.”
  • Dramatically improved reasoning and planning capabilities — particularly on multi-step tasks requiring sustained coherence over long chains of inference.
  • Near-human performance on certain technical benchmarks where current models score only moderately.
  • And the part that crashed cybersecurity stocks: scores on internal cyber exploit benchmarks that Anthropic’s own safety team described as “unprecedented.”

That last bullet deserves its own section.

What “Unprecedented Cybersecurity Risks” Actually Means

I want to be precise here because the headlines have been sloppy.

The leaked documents include internal safety evaluations — the kind Anthropic runs before deciding whether and how to release a model. These evaluations test, among other things, how effectively a model can identify vulnerabilities, generate exploit code, and reason about attack chains against real-world software.
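To make that concrete, here’s a minimal sketch of what this kind of evaluation harness looks like. To be clear, the task schema, scoring rule, and `query_model` stub are my assumptions for exposition; the leaked documents don’t describe Anthropic’s internal tooling at code level.

```python
# Illustrative sketch of a capability-evaluation harness. The task schema,
# scoring rule, and query_model stub are assumptions for exposition; they do
# not reflect Anthropic's internal tooling.
from dataclasses import dataclass
from typing import Callable

@dataclass
class CyberTask:
    prompt: str                    # e.g. code with a planted vulnerability
    passes: Callable[[str], bool]  # does the model's answer find/exploit it?

def query_model(prompt: str) -> str:
    """Stub standing in for an API call to the model under evaluation."""
    raise NotImplementedError

def run_eval(tasks: list[CyberTask]) -> float:
    """Fraction of tasks the model solves; this is the 'benchmark score'."""
    solved = sum(1 for task in tasks if task.passes(query_model(task.prompt)))
    return solved / len(tasks)
```

The point of the sketch: a “cyber exploit benchmark score” is just a pass rate over structured tasks like these, which is why the jump Anthropic measured is meaningful but not, by itself, proof of real-world offensive capability.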

Every AI model gets tested this way. Claude Opus 4.6, GPT-5, Gemini Ultra — they all have scores on these benchmarks. The scores have been climbing steadily as models get more capable. That’s expected. More capable reasoning means more capable everything, including the things you’d rather it couldn’t do.

What made Anthropic’s safety researchers flag Mythos differently is the magnitude of the jump. The documents describe Mythos not as slightly better at cyber tasks than Opus 4.6, but as occupying a different category. The specific language from one internal memo: the model “crosses thresholds previously considered theoretical” on autonomous vulnerability discovery and exploit generation.

What does Claude Mythos score on cybersecurity benchmarks?

According to leaked internal Anthropic documents, Claude Mythos scores dramatically higher than any existing AI model — including Claude Opus 4.6, GPT-5, and Gemini Ultra — on cyber exploit benchmarks measuring autonomous vulnerability discovery, exploit code generation, and multi-step attack chain reasoning. Anthropic’s own safety researchers described the scores as “unprecedented” and crossing “thresholds previously considered theoretical.” Specific numeric scores have not been independently verified.

To be clear: this doesn’t mean Mythos can hack the Pentagon. Benchmark performance and real-world capability are not the same thing. A model that scores well on structured vulnerability puzzles isn’t automatically a weapon. There are enormous gaps between “can reason about exploits in a test environment” and “can autonomously breach a production system with real defenses.”

But the gap is narrowing. And Anthropic’s own people are the ones raising the alarm. That matters.

The Market Reaction

Wall Street didn’t wait for nuance.

Within hours of the documents surfacing:

  • CrowdStrike (CRWD): down 7%
  • Palo Alto Networks (PANW): down 6%
  • Zscaler (ZS): down 4.5%

The logic, if you can call it that: if AI models are about to get dramatically better at finding and exploiting vulnerabilities, the companies selling cybersecurity defenses face an asymmetric threat. Offense scales faster than defense. AI-generated exploits could outpace the signature-based and heuristic detection that most security products rely on.

It’s not a stupid thesis. But it’s an overreaction, at least at this speed.

Cybersecurity companies aren’t going to become irrelevant because AI gets better at offense. They’re going to use the same AI for defense. CrowdStrike already uses machine learning models extensively. Palo Alto Networks has been acquiring AI security companies for three years. The assumption that offense gets AI but defense doesn’t is the kind of logic that sounds right at 7 AM on a panic day and looks silly by Friday.

That said — the selloff tells you something about market sentiment around AI risk. Investors are starting to price in the possibility that AI capability jumps won’t be uniformly positive. That’s new. A year ago, any AI news moved tech stocks up. Now the market is differentiating between “AI makes businesses more productive” and “AI creates risks we don’t fully understand yet.”

I’d call that progress, even if the specific trades were dumb.

Why This Is Different from Previous Model Announcements

We’ve seen capability jumps before. Claude Opus 4.6 was a notable step up. GPT-5 moved the needle. Each time, there’s a mix of excitement and safety concern, and each time the concern fades as people realize the model is useful but not world-ending.

Mythos feels different for three reasons.

First, the source. This isn’t a competitor raising safety concerns to score PR points. This is Anthropic’s own internal safety team — the team they hired specifically to prevent catastrophic AI risks — saying this model exceeds their previous risk thresholds. When the people who built the thing are worried, the rest of us should at least pay attention.

Second, the specificity. Previous safety concerns have been general. “Models are getting more capable, which could be dual-use.” Okay. True but vague. The Mythos documents describe specific capabilities on specific benchmarks with specific language about threshold-crossing. That’s not hand-waving.

Third, the involuntary disclosure. Anthropic didn’t choose to publish these safety concerns as part of a carefully managed model card release. They got caught with their documents exposed. Which means what we’re reading is what they actually think, not what they want us to think they think. Internal memos are more honest than press releases. Always.

What Anthropic Should Do Now

I’ve been tracking Anthropic closely, and their approach to safety has generally been more substantive than the industry average. This incident doesn’t erase that track record, but it does test it.

Here’s what I think matters:

Release the full safety evaluation — voluntarily. The documents are already out there. Security researchers have copies. Trying to pretend the cat is still in the bag is pointless. Publish the complete benchmark results, the safety team’s assessment, and the mitigation plan. Controlled disclosure beats leaked fragments.

Don’t rush the release. If Mythos genuinely crosses capability thresholds that your own safety team flagged, take the time to implement proper safeguards before releasing it. The competitive pressure from OpenAI and Google is real, but Anthropic has built its reputation on being the safety-first lab. Burning that reputation to ship faster would be a strategic mistake that goes beyond this one model.

Fix the CMS. Obvious, but worth saying. A company that handles frontier AI research needs infrastructure security that matches the sensitivity of what they’re building. A misconfigured CMS exposing 3,000 internal documents is an operational failure that a company of Anthropic’s profile should not have.

What This Means for You

If you’re using Claude products today — Claude for enterprise, Claude via API, Claude Code — nothing changes immediately. Mythos isn’t released. Your current Claude deployment is still running on Opus 4.6 or Sonnet, depending on your tier.
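One practical habit this episode reinforces: pin an explicit model version in your integration rather than relying on a “latest” alias, so a future release can’t silently change what you’re running. Here’s a quick sketch using Anthropic’s Python SDK; the model ID string is illustrative, so check it against the current model list before using it.

```python
# Pin an explicit model version so a future release can't silently change
# what your integration runs. The model ID string is illustrative; confirm
# the exact identifier against Anthropic's published model list.
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-6",  # explicit version, not a "latest" alias
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this incident report."}],
)
print(response.content[0].text)
```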

But here’s what I’d start thinking about:

If you’re in cybersecurity: The capabilities described in the Mythos documents will eventually be available — if not from Anthropic, then from another lab. AI-augmented offensive security is going to become the norm, not the exception. If your defense strategy assumes human-speed attackers, start updating those assumptions now. Our AI safety and privacy guide covers the framework for evaluating how these shifts affect enterprise security posture.

If you’re evaluating AI vendors: This is another data point in the growing case for supply chain transparency. Two weeks ago it was Cursor hiding a Chinese base model. Now it’s Anthropic accidentally exposing internal safety concerns. The pattern is clear: you can’t assume AI companies will tell you what you need to know. Ask hard questions. Demand documentation.

If you’re in AI policy or governance: The Mythos leak is going to accelerate the regulatory conversation. Expect to see this cited in Congressional testimony, EU AI Act enforcement discussions, and every think tank paper on AI risk for the next six months. Whether that produces good policy is anyone’s guess, but the conversation is happening whether the industry wants it or not.

The Uncomfortable Question

Here’s what I keep coming back to.

Anthropic built Claude Mythos. They tested it. Their own safety team said it poses unprecedented risks on specific capability dimensions. And then… what? The documents don’t describe a decision to not release it. They describe a model in active development with safety mitigations being developed in parallel.

That’s not irresponsible. Responsible AI development means building capable models, evaluating their risks honestly, and implementing safeguards before release. That appears to be exactly what Anthropic was doing before their CMS blew the door open.

But it raises the harder question: at what point does a model’s capabilities on offensive benchmarks become a reason not to release it at all? And who gets to make that call?

Anthropic hasn’t answered that. Neither has anyone else. And at the rate AI models are improving — look at the trajectory from Opus 4.5 to 4.6 — we’re going to need an answer sooner than most people think.

For now, I’m watching. Anthropic has earned a degree of trust on safety that other labs haven’t. This incident didn’t destroy that trust, but it put it under a microscope. What they do next — how they handle the disclosure, whether they rush or slow down, how transparent they are about Mythos’s capabilities and safeguards — will determine whether that trust was earned or just assumed.

I’ll update this post as Anthropic responds more fully. If you’re tracking AI industry trends for business planning, this is one to watch closely.

Frequently Asked Questions

What is Claude Mythos?

Claude Mythos is Anthropic’s next-generation foundation model, revealed through an accidental leak of approximately 3,000 internal documents on March 27, 2026. Internally also referred to as Claude Capybara, the model represents what Anthropic calls “a step change” in capability over the current Claude Opus 4.6. It has not been publicly released.

What happened with the Anthropic leak?

A misconfigured content management system made roughly 3,000 internal Anthropic documents publicly accessible and indexable by search engines. The documents included internal safety evaluations, model capability assessments, and development roadmap materials for Claude Mythos. Anthropic confirmed the documents are authentic and pulled them offline within hours.

Why did cybersecurity stocks drop?

CrowdStrike fell 7%, Palo Alto Networks 6%, and Zscaler 4.5% after the leaked documents revealed that Claude Mythos scores dramatically higher than any existing model on cyber exploit benchmarks. Investors reacted to the concern that AI-powered offensive capabilities could outpace existing cybersecurity defenses, threatening the business models of defensive security companies.

Is Claude Mythos dangerous?

The leaked documents show Anthropic’s own safety researchers flagged the model as posing “unprecedented cybersecurity risks” based on benchmark performance. However, benchmark scores in controlled test environments don’t directly translate to real-world offensive capability. Anthropic was actively developing safety mitigations before the leak. The model has not been released, and its actual risk profile in deployment remains unknown.

When will Claude Mythos be released?

No release date has been announced. Given the safety concerns documented in the leaked materials and the public attention the leak has generated, Anthropic will likely take additional time to develop and validate safeguards before any release. Whether the model will be released broadly, offered only through restricted API access, or held back entirely has not been disclosed.


Last updated: March 28, 2026. Based on reporting from Reuters, The Information, and independently verified leaked documents. This article will be updated as Anthropic issues further statements.