AI Engineer · 14m

Agents need more than a chat - Jacob Lauritzen, CTO Legora

TL;DR

  • Chat breaks down for long-running agents — Jacob Lauritzen opens with the familiar nightmare: 30 minutes of sub-agents, web searches, and file writes, then one small correction triggers “compaction,” context rot, and a revised output you can no longer trust clause-by-clause.

  • The bottleneck has shifted from doing work to planning and review — As CTO of Legora, which serves 1,000+ customers across 50+ markets, Lauritzen argues that AI has made execution cheap, so the hard part now is specifying non-functional requirements and reviewing giant outputs—like painful GitHub PR reviews.

  • Verifier’s rule explains where agents work well — Borrowing Jason Wei’s idea, he says tasks that are easy to verify will get solved by AI; in legal, checking contract definitions is easy, but drafting a contract or setting litigation strategy is hard because there’s no objective truth until, in the extreme, a judge tests it in court.

  • Trust improves when you make work more verifiable — Lauritzen gives concrete levers: browser access and test-driven development for coding, “golden contracts” as proxy tests in legal, task decomposition, and guardrails like limiting which files, directories, or websites an agent can touch.

  • Control comes from embedding human judgment inside the workflow, not just upfront planning — Skills and elicitation beat one-shot plans because they let the agent ask targeted questions mid-task, handle contingencies like a special EU-law termination clause, and keep a decision log instead of stalling.

  • Vertical agents need high-bandwidth artifacts, not one-dimensional chat — His core product point is that humans and agents should collaborate in persistent interfaces like documents and tabular reviews, where a lawyer can highlight clause three, leave comments, and only change clause three instead of wrestling with an infinitely long chat thread.

The Breakdown

The Friday-evening contract horror story

Lauritzen starts with a joke about it being 5:00 p.m. on a Friday, then drops into a painfully recognizable agent demo: ask for research and a contract draft, watch it spawn sub-agents and churn for 30 minutes, then discover clause three is wrong. The killer moment is “compaction” — the sign, in his telling, that the agent is entering context rot and you should probably give up on getting a clean, local fix.

Legora’s wedge: vertical AI for legal work

He introduces himself as CTO of Legora, a collaborative AI workspace for law firms, and notes the company has 1,000+ customers in 50+ markets, growing “maybe the fastest in history.” The pitch underneath the humble-brag is clear: vertical AI companies are trying to push agents toward more complex end-to-end work, but the interaction model has to change with that ambition.

AI made execution cheap, so planning and review got expensive

The big frame is that the economics of production have flipped in the last 6–12 months. Doing the work is now cheap; planning it, capturing specs, and reviewing outputs are the real bottlenecks, which he compares to the misery of reviewing huge PRs on GitHub.

Verifier’s rule, from contracts to lawsuits to consumer apps

Using Jason Wei’s “verifier’s rule,” Lauritzen says AI wins when a task is solvable and easy to verify. He makes it concrete with legal examples: checking contract definitions is easy; writing the contract is harder, because verification only really happens if a judge interprets it later; and litigation strategy is nearly impossible to verify, since five lawyers will give five different answers. He makes the same point with coding, where shipping a successful consumer app is much fuzzier than writing testable code.

Trust: proxy tests, decomposition, and guardrails

To raise trust, he suggests making tasks more verifiable: in coding, use browser access and test-driven development; in legal, compare a draft against prior “golden contracts” that already worked well. He also recommends decomposing work so humans keep judgment-heavy choices like risk profile and negotiation stance while agents handle formatting and definition checks. Finally, he suggests guardrails so an agent can only edit certain files or search certain sites: essentially the Claude Code spectrum, from constant permission prompts to full YOLO mode.
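The file-and-directory guardrail he mentions can be sketched as an allow-list check sitting in front of the agent's write tool. This is a minimal hypothetical illustration, not Claude Code's or Legora's actual mechanism; `ALLOWED_DIRS`, `is_allowed`, and `guarded_write` are invented names.

```python
from pathlib import Path

# Hypothetical allow-list: the agent may only write inside these roots,
# mirroring the talk's idea of limiting which files an agent can touch.
ALLOWED_DIRS = [Path("contracts/drafts"), Path("research/notes")]

def is_allowed(path: str) -> bool:
    """Return True if `path` resolves inside one of the allowed roots."""
    target = Path(path).resolve()
    return any(target.is_relative_to(root.resolve()) for root in ALLOWED_DIRS)

def guarded_write(path: str, text: str) -> None:
    """Refuse writes outside the allow-list instead of trusting the agent."""
    if not is_allowed(path):
        raise PermissionError(f"agent may not write to {path}")
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    Path(path).write_text(text)
```

The same pattern generalizes to web access: swap directory roots for an allow-list of domains the agent's search tool may hit.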

Control: why planning isn’t enough

He then shifts to control, visualizing agent work as a tree or DAG where a user usually only gets input at the root. Planning helps by aligning on the approach upfront, but he says it’s like a co-worker who discusses the plan and then disappears until the final deliverable; worse, planning can’t account for surprises like a special clause in one contract that only becomes visible during execution.

Skills and elicitation put human judgment where it matters

His preferred answer is skills: encode expert judgment at the node level, so “review confidentiality this way” or “handle this EU-law termination issue like this” becomes reusable guidance inside the workflow. When skills run out, the agent should use elicitation—ask the human targeted questions, keep moving if needed, and write uncertain choices to a decision log so a person can reverse them later.
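The decision-log pattern he describes can be sketched in a few lines. This is a hypothetical structure, not anything from Legora's product: the point is that each uncertain choice is recorded with enough context for a human to find and reverse it later, so the agent keeps moving instead of stalling.

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    node: str        # which step in the work tree made the call
    question: str    # what the agent was unsure about
    choice: str      # what it did in the absence of an answer
    reversible: bool = True

@dataclass
class DecisionLog:
    entries: list = field(default_factory=list)

    def record(self, node: str, question: str, choice: str,
               reversible: bool = True) -> None:
        """Log an uncertain choice instead of blocking on the human."""
        self.entries.append(Decision(node, question, choice, reversible))

    def to_review(self) -> list:
        """Entries a human should double-check after the run."""
        return [d for d in self.entries if d.reversible]
```

Elicitation fits on top: if a human is available, the agent asks the targeted question first; if not, it records the fallback choice here and continues.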

The real thesis: agents need artifacts, not endless chat

That leads to his central UX point: if the work tree gets 10x or 100x bigger, chat becomes a terrible interface, because it collapses complex structure into one linear thread. Legora’s answer is high-bandwidth, persistent artifacts: documents, where you can highlight clause three and change only clause three, or tabular review UIs, where the agent flags a handful of contract issues for fast human judgment. Chat is still great as input, he says, but agents aren’t humans, so we shouldn’t force them to collaborate only through human language.
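The clause-scoped editing he describes boils down to a simple invariant: the agent's write access is limited to the span the human highlighted, and everything else survives byte-for-byte. A minimal sketch (the `edit_clause` helper and clause-list representation are hypothetical, not Legora's API):

```python
def edit_clause(clauses: list[str], index: int, new_text: str) -> list[str]:
    """Return a copy of the document with exactly one clause replaced.

    Every clause other than `index` is carried over unchanged, so the
    reviewer only has to re-verify the clause they asked to change.
    """
    if not 0 <= index < len(clauses):
        raise IndexError(f"no clause at position {index}")
    updated = list(clauses)   # never mutate the original artifact
    updated[index] = new_text
    return updated
```

Contrast with chat-driven regeneration, where a correction to clause three can silently rewrite clauses one through ten and force a full re-review.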