AI Engineer · June 9, 2026 · 11m

RAG is dead, right?? — Kuba Rogut, Turbopuffer

TL;DR

RAG is not just vector search: Rogut argues retrieval in retrieval-augmented generation includes embeddings, BM25, grep, regex, globbing, and filters, not just a single semantic lookup.
Agentic search usually means iterative tool use: Instead of one retrieval call, agents repeatedly search, read, assess, and search again until they have enough context to act.
Cursor gets measurable gains from semantic search: Citing Cursor's 2026 posts, Rogut says semantic search improved answer accuracy by roughly 12.5 to 13.5 percent on Cursor's internal benchmark, with one Composer model showing nearly a 24 percent lift.
Small product metrics can still matter: Cursor's online A/B test showed about 2.6 percent better retention on large codebases and a 2.2 percent drop in dissatisfied requests, even though semantic search helps on only a subset of queries.
Claude Code made a different tradeoff: Rogut references Boris Cherny saying early Claude Code tried local vector DB-based RAG, then moved away from it, favoring per-session grep-style discovery instead.
The real bottleneck is narrowing context, not expanding it: Using Jeff Dean's line, Rogut says even trillion-token windows do not remove the need for staged retrieval. "You don't need a trillion at once, you need the right million."
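
The search, read, assess, search-again loop described above can be sketched in a few lines. This is a minimal illustration, not how Cursor or Claude Code actually work: the `FILES` corpus, the `grep` helper, and `agentic_search` are all hypothetical names, and the "assess" step is reduced to following function calls found in each file that was read.

```python
import re

# Hypothetical in-memory "codebase": path -> file contents (an assumption for this sketch).
FILES = {
    "auth/login.py": "def login(user):\n    token = issue_token(user)\n    return token\n",
    "auth/tokens.py": "def issue_token(user):\n    return sign(user.id)\n",
    "crypto/sign.py": "def sign(payload):\n    return 'sig:' + str(payload)\n",
    "docs/readme.md": "Login flow overview.\n",
}


def grep(pattern):
    """Return sorted paths whose contents match the regex pattern."""
    rx = re.compile(pattern)
    return sorted(p for p, text in FILES.items() if rx.search(text))


def agentic_search(start_symbol, max_steps=5):
    """Search, read, assess, and search again until no leads remain."""
    context = []            # files gathered so far, in discovery order
    seen = set()            # paths already read
    queue = [start_symbol]  # symbols still to look up
    for _ in range(max_steps):
        if not queue:
            break  # assess: nothing left to chase, so the context is enough
        symbol = queue.pop(0)
        # search: find the file(s) that define the symbol
        for path in grep(r"def " + re.escape(symbol) + r"\("):
            if path in seen:
                continue
            seen.add(path)
            context.append(path)
            body = FILES[path]  # read
            # assess: every called name becomes a candidate for the next search
            for callee in re.findall(r"(\w+)\(", body):
                if callee != symbol and callee not in queue:
                    queue.append(callee)
    return context


print(agentic_search("login"))
```

Each pass narrows the corpus to the few files that matter for the task, which is the staged retrieval idea in miniature: the agent never needs the whole codebase in context, only the definitions it reached by following leads.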

The Breakdown

Cursor saw about a 12.5 to 13.5 percent accuracy gain from semantic search, while Claude Code dropped local vector search because repeated grep-style exploration fit its workflow better. Kuba Rogut's real point is that RAG is not dead at all. It has evolved into iterative, tool-rich retrieval where agents mix embeddings, full-text search, and filters to find the right context fast.
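
One way to picture "mixing embeddings, full-text search, and filters" is a small hybrid retriever. The sketch below is an assumption-heavy toy, not anything shown in the talk: `DOCS` is a made-up corpus, character-trigram cosine similarity stands in for a real embedding model, term counting stands in for BM25, and the two rankings are merged with reciprocal rank fusion, which is a choice made here for illustration.

```python
import fnmatch
import math
import re
from collections import Counter

# Hypothetical corpus: path -> short description (an assumption for this sketch).
DOCS = {
    "src/auth/login.py": "validate user password and create session token",
    "src/auth/oauth.py": "exchange oauth code for access credentials",
    "src/billing/invoice.py": "create invoice and charge customer card",
    "docs/auth.md": "how sign in and session token handling works",
}


def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_rank(query, paths):
    """Full-text stand-in: rank by number of query-term occurrences."""
    q = set(tokens(query))
    scores = {p: sum(1 for t in tokens(DOCS[p]) if t in q) for p in paths}
    ranked = sorted(paths, key=lambda p: (-scores[p], p))
    return [p for p in ranked if scores[p] > 0]


def trigram_vector(text):
    """Toy 'embedding': counts of character trigrams (not a real model)."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_rank(query, paths):
    """Semantic stand-in: rank every candidate by cosine similarity."""
    qv = trigram_vector(query)
    sims = {p: cosine(qv, trigram_vector(DOCS[p])) for p in paths}
    return sorted(paths, key=lambda p: (-sims[p], p))


def hybrid_search(query, path_glob="*", k=60):
    """Filter by path glob, then fuse both rankings with reciprocal rank fusion."""
    candidates = [p for p in DOCS if fnmatch.fnmatchcase(p, path_glob)]
    fused = Counter()
    for ranking in (keyword_rank(query, candidates), vector_rank(query, candidates)):
        for rank, path in enumerate(ranking, start=1):
            fused[path] += 1.0 / (k + rank)
    return sorted(fused, key=lambda p: (-fused[p], p))


print(hybrid_search("session token", "src/*"))
```

The filter runs first so both rankers only score candidates that can be returned, and fusing ranks rather than raw scores avoids having to calibrate keyword counts against similarity values.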