Using Claude Fable 5: Access, Pricing, and What Developers Should Know

How to actually use Claude Fable 5 — where it's available after the July 2026 restoration, what it costs, how always-on thinking and the effort setting work, how to handle refusals and fallbacks in the API, and when to use it over Opus or Sonnet.

The story so far: what Fable 5 is, what it can do, how its safety system works, and how it got blocked and restored. This final chapter is the practical one: getting access, paying for it, and integrating it without surprises.

Where you can use it

Since the July 1, 2026 restoration, Fable 5 is available on:

Surface	Notes
claude.ai	Pro, Max, Team, and select Enterprise plans
Claude Code	For long-horizon coding work — its natural habitat
Claude Cowork	Agentic/collaborative surface
Claude API	Model ID `claude-fable-5`, generally available
AWS / Google Cloud / Microsoft Foundry	Redeployment following the restoration "as quickly as possible"

Claude Mythos 5 (claude-mythos-5) remains limited to organisations approved for Project Glasswing; everyone else gets the same capabilities through Fable 5.

On subscriptions, access is phased: through July 7, 2026, paid plans include Fable 5 up to 50% of weekly usage limits; after that it draws usage credits. Anthropic has said it aims to restore Fable 5 as a standard part of subscription plans.

What it costs, and when it's worth it

$10 per million input tokens, $50 per million output — double Claude Opus 4.8, and less than half what the old Mythos Preview cost.

The honest decision rule from the capabilities chapter applies: Fable 5's gains are concentrated above what Opus can do, not spread across everyday tasks.

Routine chat & code → Sonnet / Haiku

Hard single tasks → Opus 4.8

Multi-hour agentic & 1M-token jobs → Fable 5

A simple routing rule of thumb.

One more gate before you integrate: your organisation must allow 30-day data retention. Fable 5 is a Covered Model and isn't available under zero-data-retention configurations — the API rejects those requests outright. If your org runs ZDR for compliance reasons, that's a conversation to have before writing any code.

The API differences that matter

If you're coming from Opus or Sonnet, four things change:

1. Thinking is always on. Adaptive thinking is the only mode — there is no way to disable reasoning, and sending an explicit thinking configuration (a budget, or "disabled") is rejected. You simply omit the parameter.

2. Effort replaces budgets. Depth is tuned with the effort setting (low through max). Lower effort means faster, cheaper, more direct work; higher effort buys deeper reasoning and more thorough verification. Notably, even low effort on Fable 5 often exceeds what previous models managed flat-out.

3. The raw chain of thought is never returned. You can request a readable summary of the reasoning (display: "summarized"); by default, thinking blocks come back with empty text. Either way, the reasoning happens and is billed the same — the setting only controls visibility.

4. Plan for minutes-long responses. A hard task at high effort can legitimately run many minutes in a single request. Streaming, sensible timeouts, and progress UX stop that from looking like a hang.

Handling refusals like a grown-up

The one integration requirement that's genuinely new: Fable 5's safety classifiers can decline a request, and the API reports that as a successful response, not an error:

HTTP status: 200
stop_reason: "refusal"
A stop_details object says which category of classifier declined

So the cardinal rule is: check stop_reason before reading the content. Code that grabs the first content block unconditionally will break on the (rare — under 5% of sessions) refused request.

The good news is you rarely need to handle the dead-end yourself, because the platform offers three fallback routes:

Route	How it works	Best for
Server-side fallback	Pass a `fallbacks` list; the API retries on Opus 4.8 inside the same call	Simplest — Claude API and Claude Platform on AWS (beta)
Client-side middleware	Official SDK middleware retries refusals automatically	Any platform, incl. cloud providers
Manual retry + fallback credit	You resend the conversation to another model; a credit token refunds the prompt-cache cost of switching	Full control, any stack

Given that false positives on benign adjacent work (security tooling, life-sciences tasks) are a known cost of the deliberately cautious tuning, shipping Fable 5 code with a fallback configured from day one is the sensible default.

Getting the most out of it

Guidance that's specific to this model, distilled from Anthropic's prompting docs:

Specify the whole task up front. Fable 5 does its best long-horizon work from one well-specified goal it can run with, rather than instructions drip-fed across turns.
De-prescribe old prompts. Step-by-step scaffolding written for weaker models often reduces Fable 5's output quality — state the goal and the constraints, and let it plan.
Say why, not just what. The model performs measurably better when it understands the intent behind a request.
Start at the top of your difficulty range. The teams with the best early results gave it their hardest unsolved problems, not their routine backlog.

Recap

Fable 5 is live on claude.ai, Claude Code, Claude Cowork, and the API (claude-fable-5) since July 1, 2026, with cloud-platform redeployment following; subscription access is included up to 50% of weekly limits through July 7, then draws usage credits.
$10/$50 per million tokens — route routine work to cheaper tiers and save Fable 5 for long-horizon, high-stakes jobs; 30-day retention is a hard requirement.
API changes: always-on thinking, depth via effort, no raw chain of thought (summaries only), and potentially minutes-long turns.
Check stop_reason for "refusal" before reading content, and configure a fallback to Opus 4.8 from day one — declines are rare, unbilled pre-output, and recoverable automatically.
Prompt it like a senior colleague: whole task up front, intent included, minimal scaffolding, hardest problems first.

That completes the guide. For the foundations underneath everything here, see How Large Language Models Work — and for the skills to steer any model well, Prompt Engineering from Beginner to Advanced.