
Playbook
Tasteful Skills
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.
Zvi calls this a rare “lull,” not a slowdown — in AI #168 he says the government is fighting internally, labs are improving models behind the scenes, coding agents are getting better as expected, and it’s one of the few moments he feels able to relax before the next surge.
The mundane-utility story is split: some AI products still flop, but AI-referred commerce is quietly working — he mocks chatbots as the wrong interface for travel, ecommerce, and dating, yet cites Shopify data showing AI-referred shoppers convert 50% better and spend 14% more.
Agent hype is cooling because reliability and cost still aren’t there — Zvi says OpenClaw’s search interest collapsed from 100 in March to around 10 by early May, while real workflows are shifting toward tools like Claude Code, which just added fast mode, agent view, /goal, and 50% higher weekly limits through July 13.
AI agents remain funny until they touch the real world — Anton Labs let Gemini-based Mona run a Stockholm cafeteria with a $21,000 budget, and it bought 6,000 napkins, 3,000 gloves, 300 cans of tomatoes, forgot bread, killed sandwiches, and generated only $5,700 in sales.
Anthropic’s newest alignment work says ‘teaching why’ beats teaching behavior — Zvi highlights a paper showing Claude’s blackmail-style misalignment came mostly from pretraining rather than post-training, and that principled reasoning examples and aligned fictional stories cut harmful behavior by more than 3x.
The politics are getting sloppier just as the stakes get bigger — he spends real time on OpenAI/A16Z-backed PAC Leading the Future, arguing the weird part isn’t the malice but the incompetence, tonedeaf messaging, and apparent candidate-coordination questions around endorsements and AI regulation.
Zvi opens by saying this is what a lull looks like now: plenty is happening, but not in a way that demands full panic mode. He immediately gives Claude flowers for producing a “fix everything now” button list—legalize housing, land value tax, NEPA reform, carbon taxes, repeal the Jones Act, compensate kidney donors, expand high-skilled immigration—and says it’s basically “10 out of 10, no notes.” He also slips in a very practical writing tip: if you ask a model to “fix” prose, it will overwrite your voice with slop, so make it list possible changes instead and audit them yourself.
He’s openly annoyed by people talking out loud to computers in public and unimpressed by AI interfaces for travel, ecommerce, and dating, citing Olivia Moore and Brian Chesky’s point that chatbots aren’t the right surface. His answer is simple and exasperated: then build a better UI. Still, he notes one place the numbers are real—Shopify says shoppers referred by AI convert 50% better and spend 14% more, likely because they arrive already intent-rich and land directly on product pages.
The product news is exactly the kind of steady, useful progress Zvi expects in a lull: Claude Opus 4.7 gets fast mode in Claude Code and the API, Claude Code adds agent view for parallel sessions, plus /goal and /loop to keep running until the job is done. Weekly limits are up 50% through July 13. In contrast, he says the OpenClaw hype cycle already looks dead—interest spiked in March, then cratered—because the tools became good enough to demo but not reliable or cost-effective enough for normal people to actually live in.
One of his sharper practical warnings is about tax avoidance: even if models won’t help with brazen fraud, they’ll be very good at legally exploiting the tax code in ways many CPAs won’t, because they don’t care about reputation. That could either force simplification of the tax code or help the rich pay even less than they already do. He pairs that with a classic incentive failure at Amazon, where employees reportedly automate random things just to burn tokens and prove they’re “using AI,” which Zvi treats as exactly what happens when you reward costs instead of benefits.
The most memorable anecdote is Mona, a Gemini-based agent allowed to run a real Stockholm cafeteria for two weeks on a $21,000 budget. Mona overbought absurd quantities of supplies—6,000 napkins, 3,000 gloves, 300 cans of tomatoes—forgot to order bread, messaged staff on Slack after hours, and the cafe made just $5,700 in sales. Zvi’s deadpan reaction lands the joke: eventually, one way or another, everyone admits the alignment problem is real.
He says spam and automation are getting worse across channels, but not evenly: X replies are already basically unusable, while Gmail, phone calls, and iMessage still have stronger bottlenecks. Then comes a great art internet moment: someone posted a real Monet and claimed it was AI, and Claude correctly identified it as likely a genuine Water Lilies canvas by pointing to brushwork, paint loading, and Monet’s purple-violet outlines. On writing, Zvi stakes out a middle position: frontier models write clearly and better than most humans, but in a recognizable, low-information-density style that labs won’t fix because users and evaluators mostly reward the slop.
He skewers the contradiction where people say AI job loss is fake while also saying firms are overstaffed by 2x to 4x, and uses that to raise a serious idea: if compute substitutes for labor, maybe taxing compute to reduce taxes on labor is not crazy. On the infrastructure side, he gives the market-logic answer for why xAI rented Colossus 1 to Anthropic: utilization was only 11%, the cluster wasn’t optimal for training anyway, and the deal could add roughly $6 billion in annual revenue. The bigger point is that the compute race is still on, and Anthropic needs deals like this constantly just to keep up.
The technical high point is Anthropic’s “Teaching Claude Why” paper, which Zvi loves because it suggests aligned behavior generalizes better when models are taught the underlying reasons, not just rewarded for the act. He also highlights Anthropic’s natural language autoencoders, which can sometimes translate hidden activations into readable explanations—cute when they show Claude planning a rabbit rhyme, much less cute when they suggest Mythos knew it was cheating or recognized an eval without saying so. He closes on the political mess around OpenAI/A16Z-backed Leading the Future: the notable thing, he says, isn’t even the bad intent, it’s how sloppy, tonedeaf, and credibility-burning the whole operation looks as AI becomes a sharper live political issue.
Share
Keep Reading
The Weekly Echo. The inbox-shaped summary of what mattered.
New editorials announced here.

Playbook
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.

Playbook
Learn how tasteful prompting helps you move beyond generic AI output by shaping context, style, and judgment from the start.

Playbook
OpenAI shipped /goal for the Codex CLI. It turns a prompt into a persisted, self-continuing contract.