The Leak

The AI company most publicly committed to safety, transparency, and careful disclosure of its work accidentally left nearly 3,000 unpublished documents in a publicly searchable data store — including a draft blog post announcing a model it hadn't told anyone existed.

That's not an allegation. Anthropic confirmed it. They called it a "human error" in the configuration of their content management system. The default setting made uploads public. Nobody caught it until a security researcher and a Fortune reporter did.

What the documents revealed is called Claude Mythos.

What Got Out

According to materials reviewed by Fortune and corroborated by independent security researchers at LayerX Security and the University of Cambridge, Anthropic had been quietly building a model that sits above Opus in their hierarchy — larger, more capable, and described in their own draft language as "by far the most powerful AI model we've ever developed."

The leaked draft says Mythos achieves "dramatically higher scores" on tests of software coding, academic reasoning, and cybersecurity compared to Claude Opus 4.6. Anthropic confirmed to Fortune that the model is real, already in early-access testing with select customers, and represents "a step change" in capability.

That's the core of the story: a company that has built its public reputation around careful, considered disclosure of capability advances had a model this significant in active testing — and the public found out because of a misconfiguration, not a press release.

Two draft versions of the announcement were in the leaked cache. One called the model Mythos; the other called it Capybara. In both versions, even after the name swap, the subtitle still read: "We have finished training a new AI model: Claude Mythos." Both drafts gave the same justification for the name: it was chosen to evoke "the deep connective tissue that links together knowledge and ideas."

Anthropic says the documents were "early drafts of content being considered for publication." They removed public access to the data store after Fortune contacted them Thursday evening.

The Cybersecurity Problem They Wrote Down

Here's where the story stops being just about a misconfiguration.

The leaked draft doesn't only describe a more capable model. It describes a model with cybersecurity capabilities that Anthropic's own language characterizes as dangerous. According to the documents, Mythos is "currently far ahead of any other AI model in cyber capabilities" and — this is in their own draft — "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."

Read that again. Anthropic wrote, for a blog post they were planning to publish, that their new model signals the arrival of a class of AI that can attack infrastructure faster than defenders can respond.

Their plan for that: a deliberately slow, security-focused rollout. Small initial groups. Controlled expansion. They're not denying the risk. They're describing it explicitly and arguing their rollout approach is the answer.

That's an interesting position. It means Anthropic has made the judgment that they can manage the pace of this capability's spread. It means they believe the security protocols around early access will hold. And it means that every version of "we're being careful" from this point forward is being made by an organization that was simultaneously running a misconfigured data store with the playbook for their most dangerous model sitting in it, publicly searchable.

The Judge and the "Orwellian Notion"

The other story in this episode is happening in federal court.

The Trump administration has been using national security review processes to scrutinize AI companies, including Anthropic. A federal judge — Judge Rita Lin — temporarily blocked the administration from labeling Anthropic a "supply chain risk," using language that's worth quoting directly. She called the designation an "Orwellian notion."

The legal fight here is about the scope of national security authority and whether AI companies fall under the same framework used for foreign-connected hardware vendors. Anthropic is arguing they don't. The administration is arguing for a broad interpretation of what constitutes a supply chain risk in critical infrastructure.

Judge Lin's intervention doesn't resolve that argument — it's a temporary block, not a ruling on the merits. But "Orwellian notion" is not neutral judicial language. It's a signal about how at least one federal judge reads the government's theory of the case.

The two stories — the leak and the legal fight — aren't obviously connected, but they're both about the same underlying question: who decides what Anthropic gets to do, when, and under what scrutiny?

The administration's answer is: we do, and national security gives us that authority. Anthropic's answer is: our safety process is the oversight. Judge Lin's answer, at least for now, is: not so fast.

What the Misconfiguration Actually Reveals

Anthropic's CMS had a default setting that made uploads public. That's a routine infrastructure error. IT teams get notified about these all the time. Most of the time, what's sitting in the misconfigured bucket is not a draft blog post describing your most powerful and potentially dangerous model.
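Anthropic hasn't said what its content management system runs on, so the specifics below are assumptions, not a description of their setup. But the failure mode is a familiar one from cloud object storage: a bucket whose defaults leave new uploads world-readable. As a minimal sketch of what checking for that looks like, assuming an S3-style bucket via the boto3 library, with the bucket name "example-cms-drafts" being purely hypothetical:

```python
# Minimal sketch: does an S3-style bucket leave uploads publicly readable?
# "example-cms-drafts" is a hypothetical name; this is not Anthropic's actual system.
import boto3
from botocore.exceptions import ClientError

BUCKET = "example-cms-drafts"  # hypothetical
PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}

s3 = boto3.client("s3")

# 1. Is "block public access" actually configured and fully enabled?
try:
    block = s3.get_public_access_block(Bucket=BUCKET)["PublicAccessBlockConfiguration"]
    fully_blocked = all(block.values())
except ClientError:
    fully_blocked = False  # no public-access block configured at all

# 2. Does the bucket ACL grant anything to the all-users groups?
acl = s3.get_bucket_acl(Bucket=BUCKET)
public_grants = [
    grant for grant in acl["Grants"]
    if grant["Grantee"].get("URI") in PUBLIC_GROUPS
]

if not fully_blocked or public_grants:
    print(f"{BUCKET}: uploads may be publicly readable; review ACLs and defaults")
else:
    print(f"{BUCKET}: public access blocked")
```

Whatever Anthropic's actual stack, the check for this class of mistake is a few lines of audit code, which is what makes "routine infrastructure error" the right description and also the uncomfortable one.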

The gap between "we're committed to safe, deliberate disclosure" and "our own draft describes a wave of AI that outpaces defenders" is not closed by calling it a human error in configuration. It raises a question about whether the disclosure norms Anthropic advocates in public are keeping pace with what they're actually building.

That question was always there. The misconfiguration just made it visible to everyone.

---

EP014 — "The Leak" — is out now. Listen at the link above or wherever you get podcasts.

Have a lead or a story Sam should know about? Email: [email protected]