Alex Finn · 3h 39m

LIVE: Is Hermes better than OpenClaw? FINALE!!!

TL;DR

  • Hermes on Claude Opus wins Alex Finn’s “Agent Olympics” finale — after five live-streamed tests, Hermes+Opus finished first overall, with OpenClaw+Opus in second, OpenClaw+ChatGPT in a distant third, and Hermes+ChatGPT dead last.

  • OpenClaw’s biggest problem isn’t capability, it’s reliability — Alex says it would be a “runaway win” over Hermes if the team stopped shipping daily updates that break installs, citing viewers who spend 30 minutes a day fixing it.

  • Claude Opus looked dramatically more consistent than ChatGPT across agent harnesses — both Hermes and OpenClaw performed better on Opus, while ChatGPT-powered versions produced worse UI, weaker communication, and more “slop,” especially in the infographic challenge.

  • Alex draws a hard line between open agent frameworks and closed coding tools — he argues Claude Code and Codex are not real competitors to OpenClaw or Hermes because they’re closed, server-bound coding products, while OpenClaw/Hermes are customizable open-source agent harnesses with shared memory and local model support.

  • The stream doubles as a manifesto for practical AI workflows — from using a $200 M1 Mac Mini as a dedicated agent box to meta-prompting /goal tasks in Codex and Hermes, Alex keeps pushing the same idea: use what works, don’t build your own for ego, and optimize for shipping.

  • Alex also goes off on AI creator incentives and tribalism — he trashes “undisclosed shills” for tools like Higgsfield, says selective sponsorships like Nvidia are fine, and mocks the reflexive internet move of accusing anyone with an opinion of being “paid by Big Kiwi.”

The Breakdown

The finale starts with a rant about broken OpenClaw updates

Alex opens the stream like it’s the gold-medal match of the “Agent Olympics,” but immediately detours into his real grievance: OpenClaw keeps shipping updates that break everything. He says the tool is fantastic, but the team needs to stop releasing broken builds every day because users are losing “half an hour a day” just repairing their setup.

Finn’s work philosophy: deep work means no treadmill, no lyrics, no distractions

Before the tests even begin, he goes on a very Alex Finn tangent about focus: standing desks maybe, under-desk treadmills no, and music with lyrics absolutely not. His theory is that every extra stimulus eats brainpower — treadmill might cost you 15%, lyrics another 10-15% — and if you’re trying to do real work, you should stop pretending multitasking is free.

The scoreboard heading in: Hermes+Opus already has the lead

When he finally pulls up the standings, Hermes on Opus is in first, OpenClaw on Opus is second, OpenClaw on ChatGPT is third, and Hermes on ChatGPT is “dead last by a country mile.” He also uses the pre-test banter to complain about side-tab browsers, praise the Apple Magic Keyboard over every Keychron/NuPhy he’s tried, and casually mention he got invited to OpenAI’s May 5 event — then immediately realizes basically everyone on Twitter got invited too.

A useful aside: how to actually get busy people to answer your DM

One of the better non-agent sections is Alex’s mini-lecture on outreach. His advice: if you want a reply from someone with an audience, make your request answerable with a yes or no, and bring receipts up front — examples, pricing, scope, even a contract — instead of forcing 20 rounds of back-and-forth.

Smart glasses, rings, and the dream of open hardware

He lights up talking about his Even Realities glasses and a ring that lets him control Claude Code while looking someone in the eye. That turns into a broader complaint that China keeps shipping open-ish hardware while American companies lock everything down; his billion-dollar idea is simple: customizable smart glasses with displays, no lockdown, and full developer freedom.

Why /goal matters, and why AI tools are all converging on it

Alex explains the new /goal feature in Codex and Hermes as a major shift from prompts to missions. Instead of “do this action,” you give an agent an overarching objective, such as building a multiplayer looter-shooter, and a second agent can periodically check whether the first one is still on track. His key tip is to meta-prompt first: have another LLM write the /goal prompt for you.
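To make the pattern concrete, here is a minimal Python sketch of the two ideas in this section: meta-prompting, and a worker agent that a second agent checks periodically. It is not the Codex or Hermes API. `build_meta_prompt`, `run_goal`, `worker`, `checker`, and `check_every` are hypothetical names, and the agents are plain callables you would swap for real LLM calls.

```python
from typing import Callable, List


def build_meta_prompt(rough_idea: str) -> str:
    """Wrap a rough idea in a request for another LLM to write the /goal prompt.

    The wording is an illustrative assumption, not a template from either tool.
    """
    return (
        "Write a detailed /goal prompt for an autonomous coding agent.\n"
        f"Objective: {rough_idea}\n"
        "Include success criteria, constraints, and checkpoints."
    )


def run_goal(
    goal: str,
    worker: Callable[[str, List[str]], str],
    checker: Callable[[str, List[str]], bool],
    max_steps: int = 10,
    check_every: int = 3,
) -> List[str]:
    """Run a worker toward a goal, with a second agent checking at intervals.

    worker(goal, history) returns a description of the next piece of work.
    checker(goal, history) returns True if the work is still on track.
    """
    history: List[str] = []
    for step in range(1, max_steps + 1):
        history.append(worker(goal, history))
        # Every `check_every` steps, the second agent reviews the progress
        # and, if the worker has drifted, a redirect note is recorded.
        if step % check_every == 0 and not checker(goal, history):
            history.append(f"REDIRECT: return to goal: {goal}")
    return history
```

In use, you would send the output of `build_meta_prompt("multiplayer looter-shooter")` to one model, pass its answer in as `goal`, and back `worker` and `checker` with two separate agent sessions. The redirect note stands in for whatever correction the overseeing agent would actually issue.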

Test four: make a dense infographic on the AI coding agent landscape

For the fourth challenge, he asks all four setups to research the AI coding agent market from 2022 to today and produce a polished infographic with funding rounds, milestones, open-vs-closed splits, and citations. Hermes+Opus produces the cleanest result and wins the round, OpenClaw+Opus comes in just behind, OpenClaw+ChatGPT looks cluttered and hard to read, and Hermes+ChatGPT is so bad he calls it “September 2024 AI” slop — including a market-leaders section that barely mentions Claude Code.

Test five: ASCII music videos, a surprise twist, and Hermes’ memory problem

The final test is more chaotic and fun: generate a 60-second ASCII-art animated music video synced to a lo-fi track. OpenClaw+ChatGPT ends up making the crowd favorite, but Hermes+Opus gets derailed by a compaction/memory failure and suddenly forgets the task entirely, which Alex says is his number-one issue with Hermes because it happens repeatedly and makes the agent feel like it woke up in Memento.

Final verdict: Hermes+Opus wins, but the stream’s bigger thesis is about trust

When the scores are totaled, Hermes on Opus wins the whole competition, with OpenClaw on Opus close behind. Alex’s bigger point isn’t just who won — it’s that model choice matters a lot, reliability matters even more, and the AI ecosystem is drowning in weird tribalism, reply-guy marketing, and creators cashing short-term sponsorship bags instead of building long-term trust.
