Matthew Berman · 31m

Cursor just beat EVERYONE.

TL;DR

  • Composer 2.5 is nearly frontier-level coding at a tiny fraction of the cost — Berman highlights Cursor’s benchmark showing Composer 2.5 at roughly 64% on Cursor Bench versus about 65% for top models, while costing around $0.55 per task instead of roughly $11 for Opus 4.7.

  • Price-per-intelligence is becoming the real battleground — he argues most enterprises cannot spend $30 per million output tokens or "token max" indefinitely, so models that are fast, good enough, and cheap will win actual deployment.

  • Cursor’s moat is data plus distribution — because Cursor became an AI-first coding IDE early, Berman says it likely has one of the best coding datasets in the world, and Composer 2.5 improves on Moonshot’s Kimi K2.5 with more RL and 25x more synthetic tasks.

  • Google is strategically right about workhorse models, even if Gemini 3.5 Flash disappointed here — Berman contrasts Composer 2.5 with Gemini 3.5 Flash and repeats Sundar Pichai’s point that Google has to optimize for cheap, efficient inference because AI must serve billions of search users.

  • Enterprise AI teams are already obsessing over routing and spend controls — citing Aaron Levie and Fortune 500 CIO conversations, he says companies are juggling model routing, spend caps, and workload prioritization because no one can afford to throw frontier models at everything.

  • Elon’s Cursor move is really a compute-plus-data play — Berman frames the SpaceX/xAI-Cursor deal as an almost-certain acquisition designed to avoid IPO delays, pairing Cursor’s coding data and team with Colossus-scale compute while xAI also sells capacity to Anthropic.

The Breakdown

Cursor’s new Composer 2.5 lands about 1.5 points off the coding frontier while costing roughly 55 cents per task instead of $11 — a price/performance gap Matthew Berman calls the most important shift in AI right now. His bigger argument: workhorse models, not absolute frontier models, are what most companies can actually afford to run at scale.
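To make the price/performance gap concrete, here is a minimal arithmetic sketch in Python using the approximate figures quoted above. Several things in it are assumptions made only for illustration: pairing the roughly 65% frontier score with the $11-per-task price, the 1,000-task workload, and all of the function and variable names.

```python
# Approximate figures as quoted in the digest; treat them as rough.
# Assumption: the ~65% "top model" score is paired with the ~$11/task price.
composer = {"name": "Composer 2.5", "score_pct": 64.0, "cost_per_task_usd": 0.55}
frontier = {"name": "frontier reference", "score_pct": 65.0, "cost_per_task_usd": 11.00}


def cost_ratio(expensive, cheap):
    """How many times more the expensive model costs per task."""
    return expensive["cost_per_task_usd"] / cheap["cost_per_task_usd"]


def workload_cost(model, num_tasks):
    """Total spend in USD for a workload of num_tasks tasks."""
    return model["cost_per_task_usd"] * num_tasks


def points_per_dollar(model):
    """Benchmark points obtained per dollar spent on one task."""
    return model["score_pct"] / model["cost_per_task_usd"]


if __name__ == "__main__":
    tasks = 1_000  # hypothetical workload size, not from the episode
    print(f"Cost ratio: about {cost_ratio(frontier, composer):.0f}x")
    for m in (composer, frontier):
        print(
            f"{m['name']}: ${workload_cost(m, tasks):,.0f} for {tasks} tasks, "
            f"{points_per_dollar(m):.1f} points per dollar"
        )
```

On these numbers the cheaper model costs about one twentieth as much per task while giving up roughly a point of benchmark score, which is the trade-off behind Berman's workhorse-model argument.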
