AskwhoCasts AI · May 13, 2026 · 33m

Cyber Lack of Security and AI Governance

TL;DR

Mythos looks like a real cyber capability jump, not just hype: The speaker says Anthropic’s Claude Mythos preview outperformed GPT-5.5 on multiple cyber evals, with UK AISI reporting it solved both cyber ranges and Expo rating it the strongest model for source-code audits and native vulnerability discovery.
The scary part is not just benchmark scores but operational scale: Logan Graham said Mythos helped partners find “many thousands” of high and critical vulnerabilities in weeks, while Firefox reportedly jumped from 17–31 security fixes per month in 2025 to 423 in April 2026 after using it.
AI cyber progress is now running into a measurement problem as much as a model problem: The transcript highlights METR-style time-horizon results placing Mythos above 16 hours at 50% success, while critics like Gary Marcus argue those numbers blur the difference between “can sometimes do it” and “can do it reliably.”
Self-replication and autonomous hacking are no longer theoretical lab curiosities: On Palisade Research’s evals, self-replication success reportedly rose from roughly 5% for Opus 4 in May 2025 to about 80% for Opus 4.6 by early 2026 when the models were pointed at intentionally weak targets.
Washington has quietly moved from dismissing frontier-model oversight to fighting over who controls it: The White House, Commerce, and intelligence agencies are now in what one report called a “knife fight” over pre-deployment testing, with C-AISI/NIST already having completed 40+ evaluations, including some of unreleased models.
The core governance point is that voluntary testing may already be functioning like soft prior restraint: Even after backing away from explicit FDA-style approval, the administration appears to be building a procurement-driven regime where labs accept classified pre-release evaluations because doing so buys goodwill and safe harbor and helps avoid harsher regulation.

Summary

Mythos lands as a cyber wake-up call

The opening frame is that the real AI story right now isn’t just GPT-5.5 or “the Mythos moment” as spectacle — it’s the sudden scramble to patch the internet and decide who governs frontier models. The speaker says the Trump administration has been dragged into admitting catastrophic risks are real, and that someone in government now needs to supervise the most powerful releases.

The eval fight: how strong is Mythos, really?

A big chunk of the episode lives in the weeds of METR-style task horizons and what they do or don’t prove. The headline is that UK AISI tested an earlier Mythos preview than the one eventually shipped, and the final checkpoint appears meaningfully stronger — a reminder that capability jumps can happen invisibly without a brand-new public release. Critics like Gary Marcus push back that 50% success rates flatter models too much, while the host’s view is more nuanced: reliability matters, but it’s also true that the set of tasks models can do at all keeps expanding.
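To make the reliability dispute concrete, here is a minimal sketch of how a time horizon shrinks when the required success rate rises. It assumes success probability follows a logistic curve in log task length, which is the general shape used in METR-style horizon fits. The 16-hour value is the lower bound quoted in this summary; `slope` and both function names are illustrative assumptions, not figures from the episode or from METR.

```python
import math


def success_prob(task_hours: float, h50_hours: float, slope: float) -> float:
    """Assumed logistic model: success falls as log2(task length) grows.

    h50_hours is the task length at which success is 50%.
    slope is a hypothetical steepness per doubling of task length.
    """
    x = slope * (math.log2(task_hours) - math.log2(h50_hours))
    return 1.0 / (1.0 + math.exp(x))


def horizon_at(reliability: float, h50_hours: float, slope: float) -> float:
    """Task length at which the assumed model succeeds with the given reliability."""
    logit = math.log(reliability / (1.0 - reliability))
    return h50_hours * 2.0 ** (-logit / slope)


if __name__ == "__main__":
    H50 = 16.0   # hours; the 50% horizon quoted in the summary (a lower bound)
    SLOPE = 1.0  # hypothetical; a real fit estimates this from task data
    for r in (0.5, 0.8, 0.95):
        print(f"{int(r * 100)}% horizon: {horizon_at(r, H50, SLOPE):.2f} h")
```

Under these assumptions the horizon at any reliability above 50% is shorter than the 50% horizon, which is the gap the critics point to. How much shorter depends entirely on the fitted slope, so the sketch illustrates the shape of the argument rather than any measured result.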

Autonomous hacking and self-replication stop sounding sci-fi

The transcript then turns to Palisade Research’s self-replication-style benchmark, where models were explicitly instructed to act autonomously, target exploitable systems, and replicate. The host is careful to note this wasn’t accidental behavior, but also blunt that people in the real world will absolutely issue those instructions if given the chance. The numbers are the gut punch: Opus 4 rose from around 5% success in May 2025 to roughly 80% for Opus 4.6 by early 2026.

Expo and UK AISI both say Mythos is a step change

Two outside reports anchor the next section. Expo rates Mythos preview as excellent on web benchmarks, as the only successful model on native V8 sandbox detection, and as having the best odds of finding a vulnerability, at about 10.7-to-1 versus missing it, ahead of GPT-5.5 at 7.5-to-1. UK AISI says the newer Mythos checkpoint solved both of its cyber ranges, including “cooling tower,” which no prior model had ever cleared. The host’s basic takeaway is simple: GPT-5.5 is a big jump, Mythos is a very big jump.
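Odds figures like these are easier to compare as probabilities. The sketch below converts them, assuming the quoted numbers are plain found-versus-missed odds, as the summary describes them. The function name is an illustrative choice and does not come from the Expo report.

```python
def odds_to_probability(odds_for: float) -> float:
    """Convert 'odds_for-to-1' in favor of an event into a probability."""
    if odds_for < 0:
        raise ValueError("odds must be non-negative")
    return odds_for / (odds_for + 1.0)


if __name__ == "__main__":
    # Figures as quoted in the summary; the labels are just dictionary keys.
    for model, odds in {"Mythos preview": 10.7, "GPT-5.5": 7.5}.items():
        print(f"{model}: {odds}-to-1 is about {odds_to_probability(odds):.1%}")
```

Read this way, 10.7-to-1 corresponds to roughly a 91% chance of finding the vulnerability and 7.5-to-1 to roughly 88%. The gap looks smaller as probabilities than as odds.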

The human evidence: bug bounties, Firefox, and defenders scrambling

This is where the abstract benchmark talk turns visceral. Anthropic’s Logan Graham says Glasswing partners found many thousands of vulnerabilities estimated to be of high or critical severity in just weeks, and Palo Alto says Mythos found a year’s worth of penetration methods in three weeks. The Firefox anecdote is the stickiest one: after months of treating AI bug reports as slop, the team built a better harness, and monthly security fixes then jumped to 423 in April 2026.
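For a sense of scale, the Firefox figure can be set against the 2025 range of 17–31 security fixes per month given in the TL;DR. This is a small arithmetic sketch using only those quoted numbers, and the helper name is an illustrative choice.

```python
def fold_increase(new_count: int, baseline: int) -> float:
    """How many times larger new_count is than a positive baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return new_count / baseline


if __name__ == "__main__":
    APRIL_2026_FIXES = 423             # as quoted in the summary
    LOW_2025, HIGH_2025 = 17, 31       # monthly range quoted for 2025
    print(f"vs. busiest 2025 month: {fold_increase(APRIL_2026_FIXES, HIGH_2025):.1f}x")
    print(f"vs. quietest 2025 month: {fold_increase(APRIL_2026_FIXES, LOW_2025):.1f}x")
```

On those figures, April 2026 saw roughly 14 to 25 times as many fixes as a typical 2025 month.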

The calm before the storm — and why patching may be too slow

The host says current real-world threats are still relatively pedestrian (exfiltration, ransomware, standard criminal tradecraft) but warns we’re in the “calm before the storm.” If AI can attack fresh code immediately, defenders may be structurally slower than attackers unless they scan every deployment as thoroughly as adversaries will. The line that lands hardest is that 90-day disclosure windows may soon be “at least 89 days too long.”

Pentagon drama and the White House’s sudden governance turn

Then the politics gets messy and very personal. Emil Michael publicly says the Pentagon will “never” use Anthropic again, which the host treats with open sarcasm, suggesting the practical reality may be different from the posturing. At the same time, the White House has gone from treating serious regulation as unthinkable to floating FDA-style pre-release review, then partially walking it back once industry pushed back.

Commerce vs. intelligence in a full-on turf war

The ending is basically a governance knife fight. Reporting says Commerce wants C-AISI/NIST to remain the industry-facing testing hub, while national security officials want a larger intelligence-centered evaluation apparatus under ODNI. The host’s mood is grimly amused. Yes, it’s good that the government finally recognizes AI risk, but none of the options (Commerce, intelligence, or a purely private-sector regime) feels especially safe. The deeper problem is that the Mythos cyber shock still hasn’t convinced most people there will be many more “moments” after this one.
