
Playbook
Tasteful Skills
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.
Every fully moved from Claude Code to Codex for day-to-day knowledge work. Dan Shipper says Codex went from “trash” 3-6 months ago to his “daily driver,” and Austin now spends roughly 80% of his workday inside the Codex desktop app.
The real battle isn’t chatbot vs. chatbot; it’s agent operating systems for work. Dan frames Codex, Claude Code/Co-work, and similar tools as a new desktop-based “agent management interface” that becomes the primary surface for Gmail, Slack, Notion, Stripe, files, and the browser.
Codex won for Every because the app is faster and better organized, not just because the model improved. Austin says GPT-5.5 reached rough parity with Opus for his work, but Codex’s desktop experience, subagents, folders, persistent chats, and automation flow made the decisive difference.
Their highest-leverage use case is having Codex assemble work they’ve already thought through. Instead of asking AI to invent strategy, Austin uses it to read meeting transcripts, Slack threads, calendars, and templates, then draft documents like a Plus One go-to-market plan that came out 80-90% done in minutes.
Automation is most useful when it starts dumb and reliable. Austin’s examples include end-of-day reply drafts, event run-of-show generation, email triage, and recruiting pipelines, with a final human review happening in Gmail, Slack, or Notion before anything goes out.
The ceiling is high, but trust still requires human judgment on metrics and outputs. When rebuilding Every’s KPI tracker in Notion, Austin found Codex could get 90-95% of the way there, but core business metrics like MRR still had to be checked column by column because being even 3-5% off is unacceptable.
Dan opens with a blunt reversal: 3-6 months ago, Codex was “trash,” built for senior engineers, emotionally tone-deaf, and weirdly combative. His big claim now is that OpenAI pivoted hard after seeing what Anthropic unlocked with Claude Code: if you have a coding agent on your computer, you don’t just have a programming tool — you have a general-purpose knowledge-work machine.
Dan’s broader thesis is that work is moving into an “agent management interface,” a desktop surface where the model becomes your way into software, files, and the internet. He frames this as a race: Anthropic has Claude Code/Co-work, OpenAI has Codex, xAI has effectively moved via Cursor, and Google will likely join with something more serious than “Antigravity.”
Austin says his “agent pill moment” came in December/January, when he spent a week deep in Claude Code through Warp, wiring it into his work and life. He initially resisted Codex because, two months earlier, it had made him “feel more stupid than anything”: it would ask him architecture questions and then basically reply “why?” when he asked for clarification. The latest GPT model changed that, and the app experience sealed it.
For Austin, the biggest differentiator isn’t just model quality; it’s that the Codex desktop app is fast, organized, and actually pleasant to live in. He contrasts it with Claude’s desktop experience by saying Codex can handle parallel tasks — like shipping a PR while drafting a go-to-market plan — without getting clunky, and now it’s the first app he opens every morning.
Austin demos a folder-based setup called “Every Growth OS,” connected to Gmail, Slack, Notion, Stripe, and other company systems, with local files, project instructions, and custom reviewer agents. His favorite onboarding trick is simple: ask Codex to inspect your tools and suggest automations, then let it build things like follow-up radars, event command centers, or recruiting trackers that mostly just work.
When asked how he stays safe, Austin explains that Codex drafts inside the agent, but the final approval lives in the destination app: Slack drafts get checked in Slack, Gmail drafts get checked in Gmail, and strategy docs land in Notion or Proof for a final pass. He likes the cognitive reset of stepping out of the agentic workspace before something reaches another human.
Austin’s favorite example is building Every’s go-to-market plan for Plus One. Instead of asking the model to invent strategy, he asks it to gather what already exists across recorded meetings, Slack debates, templates, and launch calendars, then produce a draft; one version came out 80-90% complete in the gaps between meetings, which he says would previously have required blocking off a full day or staying up late.
The final big example is rebuilding Every’s KPI sheet in Notion so both humans and agents can act on a single source of truth. Codex can wire together Notion, Stripe, social data, scripts, and six-hour refreshes, but Austin says the last mile still matters: metrics like MRR are philosophical as much as technical, so he’s validating the system column by column because a business can’t run on numbers that are even slightly wrong.
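The column-by-column check Austin describes could be sketched roughly as follows. This is an illustrative assumption, not Every's actual tooling: the function name `validate_columns`, the `Mismatch` record, the column names, and all numbers are hypothetical. The idea is to compare agent-generated KPI rows against a hand-verified reference and flag any cell whose relative error exceeds a tolerance.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical shape: {period: {column_name: value}}
Table = Dict[str, Dict[str, float]]


@dataclass
class Mismatch:
    period: str
    column: str
    expected: float
    actual: Optional[float]  # None when the agent-built table lacks the cell
    relative_error: float


def validate_columns(reference: Table, candidate: Table,
                     tolerance: float = 0.0) -> List[Mismatch]:
    """Compare every reference cell with the candidate, column by column."""
    mismatches: List[Mismatch] = []
    for period, expected_row in reference.items():
        actual_row = candidate.get(period, {})
        for column, expected in expected_row.items():
            actual = actual_row.get(column)
            if actual is None:
                # A missing cell is always reported, whatever the tolerance.
                mismatches.append(
                    Mismatch(period, column, expected, None, float("inf")))
                continue
            # Avoid dividing by zero when the trusted value is 0.
            denominator = abs(expected) if expected != 0 else 1.0
            relative_error = abs(actual - expected) / denominator
            if relative_error > tolerance:
                mismatches.append(
                    Mismatch(period, column, expected, actual, relative_error))
    return mismatches


# Made-up numbers for illustration only.
reference = {"2025-01": {"mrr": 100000.0, "paid_subscribers": 5000.0}}
candidate = {"2025-01": {"mrr": 96000.0, "paid_subscribers": 5000.0}}

for m in validate_columns(reference, candidate, tolerance=0.01):
    print(f"{m.period} {m.column}: expected {m.expected}, got {m.actual} "
          f"({m.relative_error:.1%} off)")
```

A check like this only reports where the numbers disagree. Deciding what a metric such as MRR should include remains the human judgment call described above.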