
Playbook
Tasteful Skills
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.
Mythos looks like a real cyber capability jump, not just hype — The speaker says Anthropic’s Claude Mythos preview outperformed GPT-5.5 on multiple cyber evals, with UK AISI reporting it solved both cyber ranges and Expo rating it the strongest model for source-code audits and native vulnerability discovery.
The scary part is not just benchmark scores but operational scale — Logan Graham said Mythos helped partners find “many thousands” of high and critical vulnerabilities in weeks, while Firefox reportedly jumped from 17–31 security fixes per month in 2025 to 423 in April 2026 after using it.
AI cyber progress is now running into a measurement problem as much as a model problem — The transcript highlights METR-style time-horizon results placing Mythos above 16 hours at 50% success, while critics like Gary Marcus argue those numbers blur the difference between “can sometimes do it” and “can do it reliably.”
Self-replication and autonomous hacking are no longer theoretical lab curiosities — On Palisade Research’s evals, Opus 4.6 reportedly rose from roughly 5% self-replication success in May 2025 to about 80% by early 2026 when pointed at intentionally weak targets.
Washington has quietly moved from dismissing frontier-model oversight to fighting over who controls it — The White House, Commerce, and intelligence agencies are now in what one report called a “knife fight” over pre-deployment testing, with C-AISI/NIST already having completed 40+ evaluations including unreleased models.
The core governance point is that voluntary testing may already be functioning like soft prior restraint — Even after backing away from explicit FDA-style approval, the administration appears to be building a procurement-driven regime where labs accept classified pre-release evaluations because it buys goodwill, safe harbor, and helps avoid harsher regulation.
The opening frame is that the real AI story right now isn’t just GPT-5.5 or “the Mythos moment” as spectacle — it’s the sudden scramble to patch the internet and decide who governs frontier models. The speaker says the Trump administration has been dragged into admitting catastrophic risks are real, and that someone in government now needs to supervise the most powerful releases.
A big chunk of the episode lives in the weeds of METR-style task horizons and what they do or don’t prove. The headline is that UK AISI tested an earlier Mythos preview than the one eventually shipped, and the final checkpoint appears meaningfully stronger — a reminder that capability jumps can happen invisibly without a brand-new public release. Critics like Gary Marcus push back that 50% success rates flatter models too much, while the host’s view is more nuanced: reliability matters, but it’s also true that the set of tasks models can do at all keeps expanding.
The transcript then turns to Palisade Research’s self-replication-style benchmark, where models were explicitly instructed to act autonomously, target exploitable systems, and replicate. The host is careful to note this wasn’t accidental behavior, but also blunt that people in the real world will absolutely issue those instructions if given the chance. The numbers are the gut punch: Opus 4 rose from around 5% success in May 2025 to roughly 80% for Opus 4.6 by early 2026.
Two outside reports anchor the next section. Expo rates Mythos preview as excellent on web benchmarks, the only successful model on native V8 sandbox detection, and the best odds of finding a vulnerability at about 10.7-to-1 versus missing it, ahead of GPT-5.5 at 7.5. UK AISI says the newer Mythos checkpoint solved both of its cyber ranges — including “cooling tower,” which no prior model had ever cleared — making the host’s basic takeaway simple: GPT-5.5 is a big jump, Mythos is a very big jump.
This is where the abstract benchmark talk turns visceral. Anthropic’s Logan Graham says Glasswing partners found many thousands of estimated high and critical vulnerabilities in just weeks, and Palo Alto says Mythos found a year’s worth of penetration methods in three weeks. The Firefox anecdote is the stickiest one: after months of treating AI bug reports as slop, the team built a better harness and then fixes exploded to 423 in April 2026.
The host says current real-world threats are still relatively pedestrian — exfiltration, ransomware, standard criminal tradecraft — but warns we’re in the “calm before the storm.” If AI can attack fresh code immediately, defenders may be structurally slower than attackers unless they scan every deployment at the same level adversaries will. The line that lands hardest is that 90-day disclosure windows may soon be “at least 89 days too long.”
Then the politics gets messy and very personal. Emil Michael publicly says the Pentagon will “never” use Anthropic again, which the host treats with open sarcasm, suggesting the practical reality may be different from the posturing. At the same time, the White House has gone from treating serious regulation as unthinkable to floating FDA-style pre-release review, then partially walking it back once industry pushed back.
The ending is basically a governance knife fight. Reporting says Commerce wants C-AISI/NIST to remain the industry-facing testing hub, while national security officials want a larger intelligence-centered evaluation apparatus under ODNI. The host’s mood is grimly amused: yes, it’s good that the government finally recognizes AI risk, but none of the players — Commerce, intelligence, or a purely private-sector regime — feels especially safe, and the deeper problem is that the Mythos cyber shock still hasn’t convinced most people there will be many more “moments” after this one.
Share
Keep Reading
The Weekly Echo. The inbox-shaped summary of what mattered.
New editorials announced here.

Playbook
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.

Playbook
Learn how tasteful prompting helps you move beyond generic AI output by shaping context, style, and judgment from the start.

Playbook
OpenAI shipped /goal for the Codex CLI. It turns a prompt into a persisted, self-continuing contract.