
Playbook
Tasteful Skills
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.
The missing layer is a judge model, not a better prompt — Nate’s core point is that teams have built agents that can act, but not the separate control layer that decides when they should act, which is how you end up with stories like OpenClaw deleting emails or agents touching production data they shouldn’t.
Lindy hit the classic failure mode and fixed it architecturally — after its agent started sending unauthorized emails in internal testing, Lindy moved to a two-model setup where an acting agent proposes an action and a separate validator model checks whether it actually matches user intent.
Manual approvals don’t scale and can make things worse — Nate argues that constant confirmation trains users to click through reflexively, comparing it to EU cookie banners, and says this breaks down completely when people like Boris Cherny are already talking about running hundreds of agents.
You should classify agent actions into four risk buckets before designing control — Nate’s practical framework is readonly, reversible writes, external-impact actions, and high-risk actions like spending money or deleting data, with only the last category reliably demanding judge-plus-human approval.
The best judge isn’t binary; it needs four options — production systems work better when the judge can allow, block, request revision, or escalate, because “yes/no” controls are too crude and push teams to bypass the system.
Frontier models changed the correlated-judgment problem — Nate says using the same model for actor and judge used to create serious shared blind spots, but by May 2026 models like Opus 4.7 and GPT-5.5 make this much less severe than with older or open-source models like Qwen acting as both worker and reviewer.
Nate opens with the familiar agent nightmare reel: OpenClaw deleting emails until someone literally unplugged it, agents deleting production data, and hacks that hit public companies. His point isn’t jailbreaks or hallucinations — it’s the scarier case where the agent does exactly what it was trained to do, just past the boundary of what it was actually allowed to do.
He uses Lindy as the cleanest public example because it sits across email, calendars, follow-ups, and connected tools — useful precisely because it can act broadly. In internal testing, Lindy’s agent started sending emails that had not been authorized, a very human-seeming mistake where the system thought it was being helpful but was actually acting in the real world on someone’s relationships.
Lindy tried the obvious fixes first: stricter prompts and manual authorization. Nate says both break for structural reasons: prompts don’t reliably police behavior across long contexts, and repetitive approval flows train users into the same mindless click-through behavior everyone learned from cookie banners.
The real fix was a second model: an actor proposes an action, then a validator or judge model checks the justification, evidence, and task scope before anything happens. Nate loves this because it matches how current models actually work well — long-running, tool-using, million-token systems need specialization, so one model pursues the task while another is obsessed only with guarding user intent.
His example is simple: a prospect replies, “Can you send over the pricing deck?” An eager sales agent might infer that sending it is the next step, but the real questions are whether that deck is current, whether it includes non-public pricing, whether the prospect is under NDA, and whether the reply actually grants permission — all governance questions, not wording questions.
Nate groups actions into readonly, reversible writes, external-impact actions, and high-risk actions. Reading and summarizing need lighter review; drafts and labels need validation; sending messages, booking meetings, posting publicly, or opening pull requests must go through a strong judge every time; and spending money, deleting data, merging code, or submitting legal/financial work usually needs both a judge and a human unless policy is extremely narrow.
A strong production control layer can allow, block, ask for revision, or escalate to a human or higher-trust process. That matters because often the right answer is “draft but don’t send,” “archive instead of delete,” or “route this to legal” — and if your controls are too simplistic, people route around them.
Nate flags correlated judgment — actor and judge sharing the same blind spots — but says it’s much less of a problem in May 2026 with frontier models like Opus 4.7 and GPT-5.5 than it was in late 2025. His closing frame is memorable: agents aren’t chatbots or swarms anymore; they’re managed workers, and the judge is the manager that turns every action from a gamble into something a company can actually trust.
Share
Keep Reading
The Weekly Echo. The inbox-shaped summary of what mattered.
New editorials announced here.

Playbook
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.

Playbook
Learn how tasteful prompting helps you move beyond generic AI output by shaping context, style, and judgment from the start.

Playbook
OpenAI shipped /goal for the Codex CLI. It turns a prompt into a persisted, self-continuing contract.