Claude Fable 5's system prompt leaked. Here's what we could verify.

Days after Anthropic shipped Claude Fable 5 and Mythos 5, a file claiming to be Fable 5's full system prompt appeared in a public GitHub repository — the CL4R1T4S collection run by the jailbreaker known as "Pliny the Liberator." The leak rode a wave of breathless headlines: jailbroken in 24 hours, safety layer dead in two days, "every hidden instruction exposed."

We pulled the file down and checked it against what's actually inside, rather than what the threads claimed. Here is the honest version — what we can stand behind, and what we can't.

Disclaimer. We did not create, obtain, or solicit this file. It was already publicly available on GitHub, and we downloaded a copy for verification and archival. We host it as-is, make no claim that it is genuine Anthropic material, and accept no responsibility for its contents, accuracy, or any use you make of it. If you are the rights holder and want it removed, contact us.

What we verified

We downloaded the raw file and measured it ourselves:

Claim circulating	What the file actually is
"~120,000 characters"	122,750 characters — accurate
"~1,585 lines"	1,597 lines — accurate
"27,000+ tokens"	~17,500 words — consistent with that token range

The widely-quoted figures are real, not invented. (One AI-summary tool guessed "200,000+ characters / 7,500 lines" — that was wrong; the counts above come from a direct character count.)

Several specific claims about the contents also check out against the text:

Knowledge cutoff. The file states the model's reliable cutoff is the "end of Jan 2026," answering as an informed person from that date would, talking to someone in June 2026.
Fable 5 and Mythos 5 are the same model. The opening product section confirms the two share one underlying model, split only by safety measures — matching what Anthropic announced publicly.
The strict quoting rule. The file repeatedly flags pulling 15-plus words from any single source as a "severe violation," and caps it at one quote per source before that source is "closed." This rule is stated several times over.
The model lineup. It lists the current model strings — claude-fable-5, claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5-20251001 — consistent with Anthropic's released family.

The most level-headed coverage landed on a quieter read than the jailbreak hype: the document is less a personality script and more an operating manual for long-running agents — tool schemas, web-search procedure, memory and artifact handling, citation limits, and refusal policy.

What we could not verify

Authenticity itself. We can confirm the file exists at that URL, that its contents are internally consistent, and that its headline numbers are accurate. We cannot prove it is genuinely Anthropic's production prompt rather than a convincing fabrication. Nobody outside Anthropic can, without the company confirming it — and it has not.
The "silent handoff to Opus 4.8" framing. Anthropic did document, at launch, that Fable 5 falls back to Opus 4.8 on high-risk domains. But the dramatic "the prompt secretly routes you to a weaker model" angle is not something we found written in the leaked file. What the file actually contains is a list of classifier reminders — labels like cyber_warning, ethics_reminder, and ip_reminder — that get injected when a classifier fires. The routing-and-degradation narrative looks like commentary layered on top of the leak, not a quote from it.
The jailbreak claims. Screenshots allegedly show Fable 5 producing exploit code, and outlets reported timelines ranging from 24 to 72 hours. Against that, Anthropic's own launch materials reported no universal jailbreak in extensive red-teaming — only narrow, task-specific bypasses. We have not reproduced any jailbreak and are not endorsing one; the strongest claims here remain contested.
The shutdown/"resurrected with one line of code" stories. These come from crypto-news aggregators and viral posts with no primary sourcing. We treat them as rumor.

The file

If you want to inspect it yourself, our archived copy is here:

CLAUDE-FABLE-5.md — the full leaked file, exactly as downloaded from GitHub.

Read it as an unverified document of unknown provenance. It may be authentic, partially authentic, or fabricated; we make no guarantee either way.

Why it matters

The interesting thing about this leak isn't scandal — it's shape. If the file is genuine, it shows how a frontier model is actually steered in production: not with a short, clever persona, but with a long, dry specification of tools, search behavior, citation limits, and safety reminders. That's a useful window for anyone building on these systems, and a reminder that "the model" you interact with is a model plus a large, deliberate scaffold of instructions.

It's also a case study in reading AI news critically. The verifiable core here is real — the file, its size, several of its rules. The viral layer on top — secret routing, instant total jailbreaks, government-mandated shutdowns — is mostly unverified or contradicted by primary sources. Both traveled under the same headline.

Sources: CL4R1T4S repo (GitHub) · AlphaSignal · Anthropic — Fable 5 / Mythos 5 · Nowrap verification (direct character count)