How I AI · May 27, 2026 · 30m

I let Codex run for 6 hours. Here’s what happened.

TL;DR

Goals turn AI from turn-taking into self-management — Claravo frames /goal in Codex as a loop of work, verify, and choose-the-next-step, instead of the usual "okay, what's next?" prompting pattern.
The best goals look like OKRs with guardrails — strong goals specify an outcome, verification method, constraints, boundaries, iteration policy, and stopping condition, like reducing P95 checkout latency while keeping the correctness suite green.
A real production bug burn-down took nearly 6 hours and ended at zero errors — in ChatPRD, Claravo gave Codex access to Sentry, had it categorize every invalid edit operation, fix root causes, replay historical failures, and systematically eliminate the whole class of issues.
This isn't just for engineers: inbox cleanup was the breakout demo. With Gmail access, Codex spent 3 hours and 52 minutes and about 6 million tokens reading, labeling, and unsubscribing, cutting roughly 3,900 emails down to just 68 that needed human review.
Project management cleanup is another strong non-code use case — Claravo points /goal at a messy Linear backlog and has it cancel stale podcast tasks from already-released episodes so only future work stays open.
The limitation is clarity, not ambition — goals are a bad fit for one-line edits or vague asks like "make customers happy"; they work best when the objective is durable, the finish line is measurable, and the path requires multiple rounds of investigation.
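
To make the "OKRs with guardrails" idea concrete, here is a minimal Python sketch of a goal as a data structure plus the work-then-verify loop. It is an illustration only, not how Codex's /goal is implemented or configured. The `Goal` fields, `run_goal`, the 800 ms target, the boundary text, and the toy `optimize` step are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Goal:
    """Hypothetical goal spec; fields mirror the elements a strong goal names."""
    outcome: str                                          # what "done" means
    verify: Callable[[], bool]                            # verification method
    constraints: list[str] = field(default_factory=list)  # must stay true throughout
    boundaries: list[str] = field(default_factory=list)   # what must not be touched
    max_iterations: int = 10                              # iteration policy (budget)


def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def run_goal(goal: Goal, work: Callable[[int], None]) -> Optional[int]:
    """Work, verify, repeat. Stopping condition: verified, or budget exhausted."""
    for attempt in range(1, goal.max_iterations + 1):
        work(attempt)
        if goal.verify():
            return attempt  # number of rounds it took
    return None  # ran out of budget without evidence of success


# Toy stand-in for a real system: 100 latency samples and a test-suite flag.
state = {
    "latencies_ms": [float(ms) for ms in range(100, 1100, 10)],
    "tests_green": True,
}


def optimize(attempt: int) -> None:
    """Pretend each round of work shaves 10% off every request."""
    state["latencies_ms"] = [ms * 0.9 for ms in state["latencies_ms"]]


goal = Goal(
    outcome="P95 checkout latency at or below 800 ms (assumed target)",
    verify=lambda: state["tests_green"] and p95(state["latencies_ms"]) <= 800,
    constraints=["correctness suite stays green"],
    boundaries=["no changes outside the checkout service (assumed)"],
    max_iterations=10,
)

attempts = run_goal(goal, optimize)
```

In this toy run the loop stops after three rounds, because the P95 goes from 1040 ms to about 758 ms. The point of the design is that `verify` is a check on evidence rather than the agent's own opinion, and the iteration budget keeps a goal that can't be met from running forever.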

The Breakdown

A single Codex goal ran for 5 hours and 45 minutes and wiped out an entire class of production edit errors — then Claravo showed the same workflow cleaning up 3,900 emails down to 68. The big idea: stop babysitting AI turn by turn and start giving it measurable outcomes with evidence-based finish lines.
