On June 9, 2026, Anthropic released its strongest model yet in two forms: Fable 5, which anyone can use, and Mythos 5, which is open only to vetted partners. Drawing on the official announcement as the primary source, we lay out what has changed and how the numbers compare with the previous version and with rival models.
On June 9, 2026, Anthropic unveiled its next-generation top-tier model. What makes it interesting is that the company shipped it as two products. Claude Fable 5 is the new flagship open to the public; for prompts that brush against cybersecurity, bio/chemical topics, or model distillation, a safety classifier intercepts them and hands them off to Opus 4.8. Claude Mythos 5 is the same core with some of those bolts removed, available only to a limited set of vetted partners working on cyber defense and infrastructure (Project Glasswing) and to a handful of biology researchers. The API model string is claude-fable-5.
Using Anthropic's official announcement as the primary source, this piece lays out, fact-first, how Fable 5 differs from the previous flagship Opus 4.8 (2026.05.28) and how its benchmarks compare with rival frontier models. The official benchmark tables are published as images, so the figures do not appear in the body text; we cross-checked the numbers against reputable outlets that transcribed those tables, and we flag that in the notes beneath each table. We left out items the company did not disclose, such as the context window.
The crux is a launch strategy that separates "capability" from "risk." Fable 5 and Mythos 5 share the same core. The difference lies in the safeguards. Fable 5 behaves conservatively in risk domains, substituting Opus 4.8's more restrained responses for requests that could be dangerous. Anthropic says this safety routing triggers in fewer than 5% of sessions on average. For the rest of everyday work, you get the core model's full performance.
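The routing idea described above can be pictured with a toy sketch. This is purely illustrative: the keyword lists, the `classify` and `route` helpers, and the fallback label are hypothetical stand-ins, not Anthropic's classifier or its actual model identifiers (only `claude-fable-5` comes from the announcement).

```python
from typing import Optional

# Hypothetical keyword lists standing in for a real safety classifier.
SENSITIVE_KEYWORDS = {
    "cyber": ("exploit", "malware", "zero-day"),
    "bio_chem": ("pathogen", "nerve agent", "toxin synthesis"),
    "distillation": ("distill", "clone this model"),
}

PRIMARY_MODEL = "claude-fable-5"          # model string given in the announcement
FALLBACK_MODEL = "opus-4.8-placeholder"   # hypothetical label; real string not stated


def classify(prompt: str) -> Optional[str]:
    """Return the first sensitive domain whose keyword appears, else None."""
    text = prompt.lower()
    for domain, keywords in SENSITIVE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return domain
    return None


def route(prompt: str) -> str:
    """Send flagged prompts to the fallback model, everything else to the primary."""
    return FALLBACK_MODEL if classify(prompt) is not None else PRIMARY_MODEL


if __name__ == "__main__":
    for prompt in ("Refactor this Ruby module", "Write malware for this router"):
        print(f"{route(prompt)} <- {prompt!r}")
```

A production classifier would be a trained model rather than a keyword match; the sketch only shows the control flow of "classify first, then pick which model answers."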
Mythos 5, by contrast, is the "unsealed" version with some of those bolts removed. For domains like cyber defense, where being faster than the attacker is the whole point, it is released only to vetted partners. Anthropic introduced it as "the strongest cybersecurity model in the world," and limited the initial rollout to Project Glasswing (cyber defense and infrastructure) and a handful of biology researchers. Its pricing matches Fable 5.
Real-world performance anecdotes were shared as well. Stripe reported that on a migration of a 50-million-line Ruby codebase, Fable 5 compressed months of engineering into days. Fable 5 also cleared Pokémon FireRed using a vision-only harness. Mythos 5 matched or outperformed skilled human operators on protein-design tasks, and roughly 80% of its molecular-biology hypotheses were preferred over those from Opus-class models.
The figures below follow Anthropic's official benchmark table (cross-checked against secondary sources that transcribed the image table). Because it is a comparison against the previous flagship within the same company, it carries the highest confidence. The Fable 5 values in the table are the actual Fable 5 scores with the safety layer on; the higher unsealed scores in sensitive areas like cyber and biology are listed separately as Mythos 5 in the sidebar further down.
| Benchmark | Fable 5 | Opus 4.8 | Change |
|---|---|---|---|
| SWE-bench Pro (agentic coding) | 80.3 | 69.2 | +11.1 |
| FrontierCode (hardest coding) | 29.3 | 13.4 | +15.9 |
| OSWorld-Verified (computer use) | 85.0 | 83.4 | +1.6 |
| Blueprint-Bench 2 (spatial reasoning) | 38.6 | 14.5 | +24.1 |
| GDP.pdf (vision documents) | 29.8 | 22.5 | +7.3 |
| AutomationBench (tool use) | 17.4 | 15.5 | +1.9 |
| Legal Agent Benchmark (legal) | 13.3 | 10.4 | +2.9 |
| GDPval-AA (practical ELO) | 1932 | 1890 | +42 |
Units are % (GDPval-AA is an ELO score). The widest gains came in coding and spatial reasoning. FrontierCode, the hardest-difficulty benchmark built by Cognition, more than doubled, from 13.4 to 29.3. Opus 4.8's SWE-bench Pro (69.2) and GDPval-AA (1890) match the values from its own launch table, so they cross-check.
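The deltas in the table can be re-derived with a few lines of Python. This is a minimal sketch using the transcribed values; the `SCORES` dictionary and the `changes` helper are names introduced here for illustration. It also computes the new/old ratio, which makes the "more than doubled" claim easy to check.

```python
# (Fable 5, Opus 4.8) pairs transcribed from the table above.
SCORES = {
    "SWE-bench Pro": (80.3, 69.2),
    "FrontierCode": (29.3, 13.4),
    "OSWorld-Verified": (85.0, 83.4),
    "Blueprint-Bench 2": (38.6, 14.5),
    "GDP.pdf": (29.8, 22.5),
    "AutomationBench": (17.4, 15.5),
    "Legal Agent Benchmark": (13.3, 10.4),
    "GDPval-AA": (1932, 1890),  # ELO score, not a percentage
}


def changes(scores):
    """Map each benchmark to (absolute change, new/old ratio)."""
    result = {}
    for name, (new, old) in scores.items():
        result[name] = (round(new - old, 1), round(new / old, 2))
    return result


if __name__ == "__main__":
    table = changes(SCORES)
    # Rank by ratio so the ELO row and the percentage rows stay comparable.
    for name, (delta, ratio) in sorted(table.items(), key=lambda kv: -kv[1][1]):
        print(f"{name:<24} {delta:>+7} x{ratio}")
```

Ranking by ratio rather than by raw difference avoids mixing percentage points with ELO points in one ordering.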
As of June 2026, the competing frontier models are OpenAI GPT-5.5 (released 2026.04) and Google Gemini 3.1 Pro (released 2026.02). Below is a transcription of the comparison values included alongside them in Anthropic's official table. Keep in mind that each company uses different measurement harnesses and conditions, so scores reported under the same benchmark name are not strictly comparable one-to-one. Undisclosed or non-comparable items are marked N/A.
| Benchmark | Fable 5 | Opus 4.8 | GPT-5.5 | Gemini 3.1 Pro |
|---|---|---|---|---|
| SWE-bench Pro | **80.3** | 69.2 | 58.6 | 54.2 |
| FrontierCode | **29.3** | 13.4 | 5.7 | N/A |
| OSWorld-Verified | **85.0** | 83.4 | 78.7 | 76.2 |
| Blueprint-Bench 2 (spatial) | **38.6** | 14.5 | 36.2 | 26.5 |
| GDP.pdf (vision) | **29.8** | 22.5 | 24.9 | 16.7 |
| AutomationBench (tools) | **17.4** | 15.5 | 12.9 | 9.6 |
| Legal Agent Benchmark | **13.3** | 10.4 | 2.1 | 0.0 |
| GDPval-AA (ELO) | **1932** | 1890 | 1769 | 1314 |
Bold = highest value in that row. Fable 5 leads in every item above: coding, computer use, tools, legal, and practical ELO. Against rival models, the gap is narrowest on spatial reasoning (Blueprint-Bench 2), where GPT-5.5 scores 36.2, just 2.4 points behind. Pure knowledge reasoning (GPQA Diamond) is not included in Anthropic's official Fable 5 table, so there is no direct comparison value. As a benchmark already saturated in the low 90s (GPT-5.5 93.6, Gemini 3.1 Pro 94.3, Opus 4.7 94.2), many argue it has lost its discriminating power.
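To see where the lead over rival models is widest and narrowest, the table can be reduced to "Fable 5 minus the best rival score" per benchmark. The sketch below uses the transcribed values; `TABLE`, `RIVALS`, and `lead_over_best_rival` are illustrative names, and N/A entries are represented as `None` and skipped.

```python
# Values transcribed from the comparison table above; None marks N/A.
TABLE = {
    "SWE-bench Pro": {"Fable 5": 80.3, "GPT-5.5": 58.6, "Gemini 3.1 Pro": 54.2},
    "FrontierCode": {"Fable 5": 29.3, "GPT-5.5": 5.7, "Gemini 3.1 Pro": None},
    "OSWorld-Verified": {"Fable 5": 85.0, "GPT-5.5": 78.7, "Gemini 3.1 Pro": 76.2},
    "Blueprint-Bench 2": {"Fable 5": 38.6, "GPT-5.5": 36.2, "Gemini 3.1 Pro": 26.5},
    "GDP.pdf": {"Fable 5": 29.8, "GPT-5.5": 24.9, "Gemini 3.1 Pro": 16.7},
    "AutomationBench": {"Fable 5": 17.4, "GPT-5.5": 12.9, "Gemini 3.1 Pro": 9.6},
    "Legal Agent Benchmark": {"Fable 5": 13.3, "GPT-5.5": 2.1, "Gemini 3.1 Pro": 0.0},
}
# GDPval-AA is left out here because it is an ELO score, not a percentage.

RIVALS = ("GPT-5.5", "Gemini 3.1 Pro")


def lead_over_best_rival(row):
    """Return (best rival name, Fable 5 lead over it in points)."""
    rival_scores = {m: row[m] for m in RIVALS if row[m] is not None}
    best = max(rival_scores, key=rival_scores.get)
    return best, round(row["Fable 5"] - rival_scores[best], 1)


if __name__ == "__main__":
    leads = {name: lead_over_best_rival(row) for name, row in TABLE.items()}
    for name, (rival, gap) in sorted(leads.items(), key=lambda kv: kv[1][1]):
        print(f"{name:<24} +{gap:<5} over {rival}")
```

Given the caveat about differing harnesses, these gaps are best read as rough indicators rather than precise margins.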
Benchmark summary: in coding, agentic tasks, computer use, and practical work, Fable 5 leads the comparison set by a wide margin. The biggest leaps came in hard coding (FrontierCode) and spatial reasoning.
The Opus line has tightened its cadence, moving from 4.5 (2025.11) through 4.6 and 4.7 to 4.8 (2026.05), and the Fable 5 / Mythos 5 pair is the new generation built on top of it. The release cycle keeps shortening, and pricing keeps rising in step with capability. "Stronger models at higher prices, with risk split into two products" is the one-line summary of this announcement.