Why Opus 4.8 Pulled Me Back to Claude
TL;DR
Opus 4.8 feels like an Opus 5-level release: the reviewer says Anthropic is "back" after a disappointing Opus 4.7, calling 4.8 a legitimately great model and an S-tier, paradigm-shifting upgrade.
It scored 63 on Every’s senior engineer benchmark: about 30 points above Opus 4.7 and one point ahead of GPT-5.5, putting it near the top of the pack for senior-engineer-style coding tasks.
Writing is where it really stands out: Opus 4.8 scored 79.6 on Every’s writing benchmark versus GPT-5.5’s 73, with stronger voice imitation, fewer AI tells, and notably better results on high or extra-high reasoning.
Reasoning settings matter a lot: the team found extra-high reasoning materially improved both coding and writing, while medium and even high could feel noticeably worse on difficult work.
The model beats the app right now: the reviewer still lives in Codex because its desktop app is cleaner and faster, arguing that "the harness matters as much as the model does" in this phase of AI products.
It’s unusually strong at knowledge work and interpersonal thinking: beyond code and prose, it made a surprisingly deep slide deck on compound engineering and was praised for emotionally intelligent, frame-expanding responses on management and relationship questions.
The Breakdown
Opus 4.8 jumped roughly 30 points on Every’s senior engineer benchmark, beat GPT-5.5 by a hair, and was good enough to pull a lapsed Claude power user back into the app. The catch: the model feels like a major leap, but Anthropic’s clunky Claude desktop experience still keeps Codex in the daily-driver seat.