Back to Podcast Digest
Theo - t3.gg32m

Fable is Mythos, and it is really good.

TL;DR

  • Fable 5 feels like Mythos 5 with a muzzle on: Theo says the public model is clearly powerful but loses benchmark points and real-world utility because safety layers trigger refusals, fallback routing, and silent performance limits on some topics.

  • The coding jump is big enough that Theo calls it the best model yet: He compares GPT-5.5 to a rebuilt GPT-5.4, while Mythos feels like "more Opus turned up to 12 or 13," with better code quality, stronger UI work, and more initiative on vague tasks.

  • The economics are wild, both good and bad: Fable costs $10 per million input tokens and $50 per million output tokens, spent $100 in about 8 minutes on usage-based billing, yet still often beats Opus on total cost because it uses fewer tokens and gets more done per run.

  • Benchmarks are strong, but Theo trusts some more than others: He highlights an 80% on SWEBench Pro and a 30% on Frontier Code's hardest tier, but calls SWEBench Pro weak and says Frontier Code has suspicious variance, while DeepSWE lines up better with his hands-on experience and shows Fable near GPT-5.5.

  • Theo's team stress-tested the model with real software, not toy demos: In one day they made a terminal-based 2.5D adventure in Rust, a full multiplayer 3D racing game with spectator mode, a Minecraft clone, and a Rust port of T3 Chat-style tooling.

  • The tradeoffs are serious for companies: Fable requires 30-day retention for all traffic, cannot use Anthropic's no-retention setup, and may silently get weaker on frontier model development tasks via prompt modification, steering vectors, or parameter-efficient fine-tuning.

The Breakdown

Theo says Fable 5, the safety-wrapped version of Anthropic's Mythos 5, is "the best coding model ever released by quite a bit" after burning through about $2,000 of inference in 24 hours and watching his team build multiplayer games, a Minecraft clone, and terminal apps in hours. The catch is brutal pricing, hard session caps, hidden capability throttling, and mandatory 30-day data retention.

Was This Useful?

Share