YC Root Access · 5m

How to Build an Internal AI Agent That Evolves Itself

TL;DR

  • A two-person startup hit $2M ARR with an internal AI ops agent doing founder work — Ayush from Answer This says the system processes 100+ emails a day, has closed 400+ support tickets, updates the CRM, and makes business status instantly queryable across tools.

  • The key idea is not automation but self-extension — when the agent hits a repeated task it can’t do, it calls a coding sub-agent to build a new tool, and that tool becomes permanent for future sessions.

  • Their setup is intentionally simple: Claude Code CLI wrapped in Python with a task queue — Slack, email, and other inbound messages flow into the queue, and the agent iteratively works through them using a thin harness.

  • Business context comes from giving the agent read-only access to the codebase and database — with a cron job keeping those copies fresh on every release, the agent can answer support questions by inspecting actual subscription logic and app behavior.

  • The agent evolves through an editable memory file, not just more prompts — Answer This uses an instructions.md loaded on every turn, and even Ayush’s non-technical co-founder Ryan can correct support behavior in Slack so the agent updates itself and stops repeating that mistake class.

  • Ayush’s framework is three kinds of memory: factual, behavioral, and procedural — factual memory is code and database, behavioral memory is instructions and feedback, and procedural memory is the library of tools the agent creates for recurring tasks, now 45+ CLIs deep.

The Breakdown

The punchy opening: $2M ARR with basically two people

Ayush opens with a strong claim: Answer This has crossed $2 million in ARR with just him and his co-founder as full-time employees, plus a couple of contractors for design and outbound. He credits a big chunk of that leverage to an internal AI ops agent that absorbs the kind of operational work that usually eats founder time.

What the agent actually does all day

This isn’t a toy bot sitting in a demo environment. Ayush says it handles more than 100 emails a day, has closed 400+ customer support tickets, updates the CRM after meetings, and collects feedback across channels. The part he emphasizes most is that they can now ask plain-language business questions like “What’s the status of a lead?” or “What are the open issues for a customer?” without bouncing across a bunch of apps.

The real trick: it builds new tools for itself

Ayush says the most important feature is not that the agent does a fixed set of tasks, but that it is “self-extending.” When it encounters a repeated task it can’t yet handle, it asks a coding sub-agent to build the tool, and that new capability sticks around permanently. What started as a skeleton has, in his words, become a “full-blown tool” with 45+ CLIs it authored itself, including a cron job it created to monitor landing pages for ad uptime.
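The self-extension loop can be pictured as a lookup with a fallback: use the tool if it exists, otherwise have it built once and keep it. The following is a minimal sketch, not Answer This's actual code. The names `ensure_tool` and `tools_dir` are hypothetical, and the `build_tool` callback stands in for whatever coding sub-agent writes the script.

```python
import stat
from pathlib import Path
from typing import Callable


def ensure_tool(
    name: str,
    spec: str,
    tools_dir: Path,
    build_tool: Callable[[str], str],
) -> Path:
    """Return the path to a CLI tool, building it first if it is missing.

    `build_tool` is a hypothetical stand-in for the coding sub-agent:
    it takes a plain-language spec and returns the script's source text.
    """
    tools_dir.mkdir(parents=True, exist_ok=True)
    path = tools_dir / name
    if not path.exists():
        # First time this task shows up: ask the sub-agent for a tool.
        path.write_text(build_tool(spec))
        # Mark it executable so later sessions can run it directly.
        path.chmod(path.stat().st_mode | stat.S_IXUSR)
    # Every later call finds the file on disk and skips the build step.
    return path
```

Because the tool is a file on disk rather than something held in the conversation, it survives across sessions, which is what makes the capability permanent.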

The architecture: thin harness, Claude Code, and a task queue

His recommended setup is straightforward: wrap Claude Code CLI in Python, pipe in messages from Slack, email, and other channels, and place them in a task queue. The agent then pulls tasks and works through them iteratively. He skips re-explaining the “thin harness” argument because others already covered it, but says Claude Code works especially well because it already knows how to inspect files, run commands, and use CLIs.
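A thin harness of this shape might look like the sketch below. It is an illustration under stated assumptions, not the setup from the talk: `Task`, `run_claude`, and `drain` are hypothetical names, and `run_claude` assumes the Claude Code CLI is installed and that `claude -p` runs a single prompt non-interactively (check the flags against your installed version). The runner is passed in as a parameter so the queue logic can be exercised without the CLI.

```python
import queue
import subprocess
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    source: str  # where the message came from, e.g. "slack" or "email"
    text: str    # the inbound message body


def run_claude(prompt: str) -> str:
    """Run one prompt through the Claude Code CLI (assumed: `claude -p`)."""
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def drain(
    tasks: "queue.Queue[Task]",
    run_agent: Callable[[str], str] = run_claude,
) -> list[str]:
    """Pull tasks off the queue one at a time until it is empty."""
    replies = []
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            break
        # Tag the prompt with its channel so the agent knows how to reply.
        replies.append(run_agent(f"[{task.source}] {task.text}"))
        tasks.task_done()
    return replies
```

In a real deployment the Slack and email listeners would call `tasks.put(...)` from their own threads or processes while a worker loop calls `drain` repeatedly.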

How the agent learns your business without constant hand-holding

Ayush’s answer to business-specific logic is to give the agent read-only copies of both the codebase and the database, refreshed by cron whenever they ship a release. That means when a support question comes in, the agent can inspect the actual code to understand subscription logic or where something lives in the product, instead of relying on stale documentation or handcrafted prompts.
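One way to keep such a copy fresh is a small script that a cron entry or release hook runs. This is a sketch under assumptions, not the script described in the talk: `refresh_snapshot` and the paths are hypothetical, it covers only the codebase (a database copy would need its own dump and restore step), and it assumes a POSIX host where stripping file write bits is enough to stop accidental edits.

```python
import shutil
import stat
from pathlib import Path

# All three write-permission bits (owner, group, other).
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def refresh_snapshot(source: Path, snapshot: Path) -> int:
    """Replace `snapshot` with a fresh read-only copy of `source`.

    Returns the number of files in the new snapshot.
    """
    if snapshot.exists():
        # Drop the stale copy left over from the previous release.
        shutil.rmtree(snapshot)
    shutil.copytree(source, snapshot, ignore=shutil.ignore_patterns(".git"))
    count = 0
    for path in snapshot.rglob("*"):
        if path.is_file():
            # Clear write bits on files; directories stay writable so the
            # next refresh can delete and replace the tree.
            path.chmod(path.stat().st_mode & ~WRITE_BITS)
            count += 1
    return count
```

The agent is then pointed at the snapshot directory rather than the live checkout, so it reads current code without being able to change what ships.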

The editable personality file that makes it improve like an employee

The self-evolution loop also depends on what Ayush calls editable personality or memory: an instructions.md file loaded on every turn and editable by the agent itself. He gives a memorable example from customer support, where his non-technical co-founder Ryan noticed a recurring class of mistakes, messaged the agent in Slack about what was wrong, and the agent updated its own instructions and tool link so that entire category of errors stopped recurring.
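The mechanics of that loop can be sketched in a few lines. This is a hypothetical illustration, not the actual Answer This implementation: `build_prompt` and `record_correction` are made-up helper names, and in the setup described above the agent edits the file itself rather than calling a fixed function.

```python
from pathlib import Path


def build_prompt(instructions: Path, message: str) -> str:
    """Re-read the instructions file on every turn and prepend it."""
    rules = instructions.read_text() if instructions.exists() else ""
    return f"{rules}\n\n---\n\n{message}"


def record_correction(instructions: Path, rule: str) -> None:
    """Append a correction as a new rule, skipping exact duplicates."""
    existing = instructions.read_text() if instructions.exists() else ""
    line = f"- {rule.strip()}"
    if line in existing.splitlines():
        return
    with instructions.open("a") as f:
        # Make sure the new rule starts on its own line.
        if existing and not existing.endswith("\n"):
            f.write("\n")
        f.write(line + "\n")
```

Because the file is read fresh on each turn, a correction recorded in one conversation shapes every later one without redeploying anything.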

Ayush’s three-memory model and the copy-this blueprint

He closes with a clean framework: an internal agent needs factual memory, behavioral memory, and procedural memory. Factual memory is your codebase and database; behavioral memory is feedback and instructions; procedural memory is the repeatable work encoded as tools. His practical recipe is equally concise: use Claude Code or another coding-capable CLI as the main harness, and give it read-only codebase access, basic startup CLIs, a coding agent exposed as another CLI, and an instruction file that is loaded every turn and that the agent can edit. Then wire it to Slack or email over SSH.
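The three kinds of memory map naturally onto three locations on disk. The sketch below is one hypothetical way to write that mapping down. The `AgentMemory` class and its field names are illustrative assumptions, not something from the talk.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AgentMemory:
    factual: tuple[Path, ...]  # read-only codebase and database copies
    behavioral: Path           # the instructions file loaded every turn
    procedural: Path           # directory of tools the agent has built

    def tool_names(self) -> list[str]:
        """List the tools currently available as procedural memory."""
        if not self.procedural.is_dir():
            return []
        return sorted(p.name for p in self.procedural.iterdir() if p.is_file())

    def summary(self) -> str:
        """One-line overview, e.g. for a startup log message."""
        return (
            f"{len(self.factual)} factual source(s), "
            f"instructions at {self.behavioral.name}, "
            f"{len(self.tool_names())} tool(s)"
        )
```

Keeping the three apart makes it clear which one to change when the agent gets something wrong: stale facts mean refreshing the copies, bad behavior means editing instructions, and a missing capability means building a tool.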
