Theo - t3.gg | June 15, 2026 | 29m

The weird situation with Fable

TL;DR

Fable 5 is basically Mythos 5 with guards at the door: Theo stresses that Fable and Mythos 5 are the same underlying model, but Fable adds aggressive classifiers that reroute some requests to Opus 4.8 or block them entirely.
Anthropic initially hid AI-research restrictions from users: For prompts about frontier LLM development, Anthropic said it could use prompt modification, steering vectors, or parameter-efficient fine-tuning to limit the model's effectiveness without telling the user, affecting about 0.03% of traffic and fewer than 0.1% of organizations.
The enterprise data policy is a huge blocker: Anthropic now requires 30-day retention for Mythos-class traffic. For chats flagged as policy violations, inputs and outputs can be kept for up to two years and trust-and-safety scores for up to seven years, which Theo says rules out many Fortune 500 use cases.
False positives make the user experience messy and expensive: Theo shows harmless requests, like help with a DEF CON Gold Bug cryptography puzzle, getting rerouted to Opus and then failing anyway, while users may still be billed at Opus rates.
Anthropic changed the system card after release: Mid-recording, Theo notices the public system card had been swapped with a revised version, which he frames as Anthropic trying to rewrite history around the most controversial safeguard language.
The deeper fear is trust erosion: Theo argues that once a model can secretly degrade outputs on sensitive competitive tasks, developers can no longer tell whether a bad answer came from model limits, bad context, or an invisible policy intervention.

The Breakdown

Anthropic shipped Fable with hidden safeguards that could silently make answers worse for AI research tasks while billing users full price, then quietly edited the system card before public backlash forced a rollback. Theo's core point is that this is bigger than one model bug: it turns a top-tier AI model into a trust and supply-chain problem for developers and enterprises.