Devin’s 80% Moment: Background Agents, 7x PRs, & End of Hand-Held Coding — Walden Yan & Cole Murray
TL;DR
Devin’s usage hit an 80% tipping point — Walden says Devin-authored code in Cognition’s repos rose from 16% in January to 80% in March, while merged PR volume grew roughly 7x in 2-3 months.
The real challenge is testing, not computer use — Clicking UI elements is the easy part; the hard part is reasoning through how to run multi-service apps, enable flags, get the right permissions, and verify cross-stack changes end to end.
Out-of-the-box agents are safer, but much harder to build — Running the “brain” outside the sandbox protects secrets and supports cleaner permission boundaries, but forces you to manage state, orchestration, and more complex infra.
Most companies still aren’t actually ready for autonomous coding — Repo setup remains a bottleneck because many teams still rely on tribal knowledge like “go ask Bob for the secrets,” which breaks when an agent needs to boot and test a repo on its own.
AI coding quality now depends on guardrails as much as model intelligence — The guests call out repeated failure modes like backward-compatibility hacks, untyped tuples, getattr reward-hacking in Python, and codebase “slop” spreading from weak patterns unless linting and cleanup are enforced.
The fastest-growing use cases are outside classic engineering — SRE triage, customer support, PM-driven bug fixes in Slack, and internal knowledge workflows are all becoming agent-native because the value comes from integrating code, logs, docs, tickets, and chat in one loop.
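As a rough illustration of the guardrails point above, here is a minimal sketch of a lint-style check for two of the named patterns. It is not anything the guests describe using. It assumes that "getattr reward-hacking" means calling getattr with a default to mask a missing attribute, and that "untyped tuples" means functions returning a tuple literal with no return annotation. The function name find_slop and both messages are hypothetical.

```python
import ast


def find_slop(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, message) pairs for two assumed 'slop' patterns."""
    findings: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        # Assumed pattern 1: getattr(obj, "name", default) hides a missing
        # attribute instead of failing loudly.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "getattr"
            and len(node.args) == 3
        ):
            findings.append((node.lineno, "getattr with a default"))
        # Assumed pattern 2: a function with no return annotation that
        # returns a tuple literal. Nested functions are not separated out,
        # so this is approximate.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.returns is None:
            for child in ast.walk(node):
                if isinstance(child, ast.Return) and isinstance(child.value, ast.Tuple):
                    findings.append((node.lineno, f"{node.name} returns an unannotated tuple"))
                    break
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "def load(cfg):\n"
        "    return cfg, getattr(cfg, 'timeout', None)\n"
    )
    for line, message in find_slop(sample):
        print(f"line {line}: {message}")
```

A check like this could run in CI next to an ordinary linter, so weak patterns are rejected before they spread. Real rules would need tuning to each codebase.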
The Breakdown
Devin went from writing 16% of code in its own repos in January to 80% by March, while merged PRs grew 7x in a few months with barely any headcount growth — the clearest sign yet that background coding agents have crossed from novelty to serious infrastructure. Walden Yan and Cole Murray explain why the hard part isn’t clicking buttons but orchestrating real testing, repo setup, secrets, memory, and the messy company integrations that make autonomous coding actually work.