Alcreon
AI Engineer · 27m

Agentic Engineering: Working With AI, Not Just Using It — Brendan O'Leary

TL;DR

  • Agentic engineering is a workflow shift, not just better autocomplete — Brendan O'Leary says the leap from Copilot-style suggestions to 2025-era agents that edit files, run tests, and open pull requests means engineers are now “working with” machines, echoing Flask creator Armin Ronacher’s framing.

  • Treat the model like an enthusiastic, well-read, confidently wrong junior developer — O'Leary’s core mental model is that agents are fast, tireless, and broad in knowledge, but they lack judgment, business context, and architectural memory, so engineers must direct rather than defer.

  • Context engineering is the real skill bottleneck — He argues that once context windows get past roughly 50% full, quality can degrade, and extra MCP servers, stale instructions, or mixed tasks can poison output, which is why he recommends one task per session, aggressive summarization, and restarting when things drift.

  • A bad research path can become hundreds of lines of bad code — Borrowing a line from Dex Horthy, O'Leary pushes a strict research-plan-implement loop: first use chat-only “ask mode” to understand the system, then write an explicit plan file with tests and scope, and only then let the coding agent execute.

  • The best agent setups separate permanent rules from reusable workflows — He recommends a lightweight agents.md for always-on project conventions, commands, and test requirements, plus optional skills.md playbooks for recurring tasks like changelogs or motion graphics workflows.

  • MCP is useful but easy to overdo — Tools like GitHub MCP or Context7 can make agents dramatically more capable, but every server adds tokens and can distract the model, so if you’re doing front-end work, for example, a Postgres MCP should probably be turned off.

The Breakdown

From autocomplete to actual collaborators

O'Leary opens with a challenge: most engineers can say AI helps them code faster, but can't actually explain what they hand off, what they keep, and how they decide. He frames that as the real gap, especially when 90% of engineers have used AI tools and regular usage is still rising. His timeline is short and sharp: the early 2020s brought line completion, 2022 brought whole-function help via GitHub Copilot, and by 2025 the big break was agents that can decompose tasks, modify files, run tests, and return with a pull request.

The junior developer analogy that carries the whole talk

The memorable mental model is the point of the talk: an AI agent is “an energetic, enthusiastic, extremely well-read, often confidently wrong junior developer.” It has speed, patience, zero ego, and has effectively read every Stack Overflow post ever written, but it does not have judgment. That’s why Armin Ronacher’s claim that he got back more than 30% of his day matters here: the gain comes from knowing what to delegate, not from blindly accepting whatever the model spits out.

Context engineering is where good usage becomes great usage

He leans on Andrej Karpathy’s definition of context engineering as carefully filling the context window with only what the model needs for the next step. More context costs more, and worse, O'Leary says quality can actually decline once you pass about 50% full, especially when always-on MCP servers are stuffing in extra tool descriptions. His practical rule is simple: do one task per session, summarize aggressively, and if the agent starts going off the rails, don’t argue with it for 20 more turns; start a new session.
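The 50%-full heuristic can be sketched as a simple budget check. Everything here is an assumption for illustration: the context limit, the characters-per-token ratio, and the session shape are hypothetical, not part of Kilo or any real agent's implementation.

```python
# Minimal sketch of a context-budget guard for an agent session.
# The window size and token estimate below are assumed values.

CONTEXT_LIMIT = 200_000   # assumed model context window, in tokens
FILL_THRESHOLD = 0.5      # O'Leary's rule of thumb: quality degrades past ~50% full


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: roughly 4 characters per token for English."""
    return len(text) // 4


def should_restart(session_messages: list[str]) -> bool:
    """True when the session should be summarized and restarted fresh."""
    used = sum(estimate_tokens(m) for m in session_messages)
    return used / CONTEXT_LIMIT > FILL_THRESHOLD
```

The point of the sketch is the decision, not the arithmetic: instead of arguing with a drifting agent, a guard like this makes "summarize and start over" the default once the window is half gone.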

The iPad intern story makes the point painfully clear

To explain bad context, he tells a great management story from a healthcare software company when the iPad was brand new. He designed a patient-history app mockup in Balsamiq, handed it to interns, and got back a prototype with Comic Sans and goofy placeholder emoji because that’s what the wireframe showed. His point lands cleanly: that wasn’t the interns’ fault, it was a failure to provide the right context — exactly the same mistake engineers now make with agents.

Why he prefers research, plan, implement over “just build this”

O'Leary says the classic beginner mistake is asking an agent to implement a feature immediately because LLMs are very good at producing lots of code very quickly. That makes for flashy demos, but also for garbage-in, garbage-out frustration and the kind of bad early experiences that leave people dismissing AI entirely. His fix is a three-step loop: first research the codebase and edge cases with a chat-only mode, then write an explicit plan with scope and test strategy, and only then move to implementation.

Ask mode, plan files, and using Git as your first PR review

In Kilo, the research phase happens in “ask mode,” where the agent can’t edit files and is forced to help understand the system instead of rushing into code. That should produce a research doc a human can review, followed by a plan file with exact file changes, verification steps, and clear in-scope versus out-of-scope boundaries. Once implementation starts, he recommends a fresh session with only the plan, frequent commits, and treating local Git history like your own first pull request review before teammates ever see it.
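A plan file in this spirit might look like the following. The feature, file names, and headings are all hypothetical, made up to show the in-scope/out-of-scope and verification structure he describes, not a format Kilo prescribes.

```markdown
# Plan: add rate limiting to the public API

## In scope
- Token-bucket limiter in `api/middleware.py`
- Unit tests for burst and steady-state behavior

## Out of scope
- Per-customer quota configuration
- Any changes to the admin API

## File changes
1. `api/middleware.py` — new `RateLimiter` middleware
2. `tests/test_rate_limit.py` — new test module

## Verification
- `pytest tests/test_rate_limit.py` passes
- Manual check: requests past the burst limit return 429
```

Handing a fresh implementation session only a file like this is what keeps the agent inside the boundaries a human already reviewed.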

Modes, agents.md, skills, and the trap of too much MCP

He breaks agent setup into roles and memory: modes like ask, architect, and code; agents.md for always-on project rules and commands; and skills.md for reusable workflows the agent can load on demand. He also walks through practical IDE moves like @-mentioning files, adding selected code into context, and slash commands to compress or restart tasks. On MCP, he likes the power — GitHub MCP for PRs and issues, Context7 for newer docs — but warns that every extra server adds tokens and can nudge the model toward irrelevant actions, like a Postgres tool hanging around during purely front-end work.
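An agents.md along these lines could hold the always-on rules he describes; every command, convention, and path below is an invented example, not taken from the talk.

```markdown
# Project conventions (always-on)

- Language: TypeScript, strict mode; no `any` without an explaining comment
- Run `npm test` after every change; never commit with failing tests
- Commit messages follow Conventional Commits (`feat:`, `fix:`, `chore:`)

# Commands

- Build: `npm run build`
- Lint: `npm run lint`

# Boundaries

- Never edit files under `generated/`
- Ask before adding a new dependency
```

The split he recommends is that this file stays small and permanent, while skills.md playbooks (changelog generation, release notes, and the like) are loaded only when the matching task comes up, keeping tokens out of the window the rest of the time.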

The closing pitch: pick a tool, get reps, and make coding fun again

He closes with a light Kilo plug, mentioning expansion across surfaces and a focus on safe open-source agent use through OpenClaw and Kilo Claw. But the broader takeaway is less about product and more about reps: agentic engineering is “part art and part science,” so engineers need repeated practice to learn what to trust and what to keep. His hopeful ending is that with the tedious work pushed onto agents, some senior engineers are saying they’re having more fun programming than they’ve had in years.