AI Engineer · May 16, 2026 · 24m

How to Leverage Domain Expertise — Chris Lovejoy, Notius Labs

TL;DR

Winning in vertical AI is mostly an organizational problem, not a model race — Chris Lovejoy argues that with frontier models now “good enough,” the real moat is how a company operationalizes expert judgment around specific workflows, especially given Gartner’s stat that about 50% of generative AI projects were abandoned last year.
Lovejoy’s core framework is three roles for domain expertise: oracle, evaluator, architect — the oracle directly tweaks prompts and product behavior, the evaluator defines measurable quality and builds the review system, and the architect designs automated feedback loops so the product improves from usage at scale.
You usually do need domain expertise, but it doesn’t always mean hiring a traditional credentialed expert — the essential requirement is judgment about what “good” looks like in your use case, which can come from formal experts like doctors and lawyers or informal experts already inside the company.
The right structure depends on whether quality is measurable and whether manual iteration is fast enough — if quality is subjective, like meeting notes, a strong oracle can work for a long time; if quality is measurable and scale outpaces human fixes, you need to evolve toward evaluator and architect systems.
His case studies show the pattern in real companies: Granola stayed oracle-heavy, Tandem decentralized the oracle, and Anterior moved through all three stages — Joe at Granola still acts as the quality gatekeeper for AI meeting notes, Tandem hired doctors across specialties and geographies for prompt customization, and Anterior built clinician review dashboards and later automated improvement for prior authorization decisions.
The practical advice is to hire a principal domain expert early and give them real ownership — Lovejoy warns against treating experts as part-time advisors or splitting authority across committees, citing a company where two senior clinicians with ambiguous ownership moved slowly and both left after 12–18 months.

Summary

The thesis: domain expertise is the bottleneck now

Chris Lovejoy opens with a blunt claim: if you want to build better AI products, you need a “domain native AI organization.” He comes at this as a Cambridge-trained doctor who worked in the NHS before moving into AI at companies like Tandem and Anterior, where the recurring problem was always the same: how do you actually bake expert judgment into the product?

Vertical AI is huge, but too many projects still die in the last mile

He frames vertical AI as a massive opportunity, echoing the VC excitement around AI moving beyond software into labor itself — from a roughly $50 billion vertical SaaS market toward something much larger. But he points out the ugly reality too: Gartner says about 50% of generative AI projects were abandoned last year, and his explanation is that companies are trying to automate workflows they don’t deeply understand.

The three roles: oracle, evaluator, architect

Lovejoy’s framework is the heart of the talk. The oracle is the expert who both judges outputs and directly improves the product — often by tweaking prompts, adding documents, or changing tools. The evaluator still defines what quality means, but turns that judgment into metrics and review systems; the architect goes one step further and designs the machinery for automated learning and improvement, with much less human-in-the-loop work.

How to choose the right setup for your company

His decision tree is refreshingly practical: first ask whether quality can actually be captured in a meaningful metric, or whether it’s more a matter of taste. If it’s not measurable, you want an oracle. If it is measurable, ask whether manual iteration by engineers is fast enough: if yes, an evaluator may be enough, and if no, you’ll need architect-style automation. He also stresses that the setup can evolve over time, especially as a startup grows.
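
The two questions in that decision tree can be written down as a tiny function. This is only an illustrative sketch of the logic as summarized here, not something shown in the talk; the names `ExpertRole`, `choose_role`, and both parameters are hypothetical.

```python
from enum import Enum


class ExpertRole(Enum):
    """The three roles for domain expertise named in the talk."""
    ORACLE = "oracle"
    EVALUATOR = "evaluator"
    ARCHITECT = "architect"


def choose_role(quality_is_measurable: bool,
                manual_iteration_fast_enough: bool) -> ExpertRole:
    """Return the role suggested by the two decision-tree questions.

    Both arguments are judgment calls made by the team; the function
    only encodes the branching described in the summary.
    """
    if not quality_is_measurable:
        # Quality is a matter of taste: an expert judges and fixes directly.
        return ExpertRole.ORACLE
    if manual_iteration_fast_enough:
        # Quality is measurable and engineers can keep up by hand.
        return ExpertRole.EVALUATOR
    # Quality is measurable but scale outpaces manual fixes.
    return ExpertRole.ARCHITECT
```

Because the answers can change as a company grows, a team might re-run the same two questions periodically instead of treating the result as permanent.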

Granola: when one person’s taste is the product

His first case study is Granola, the AI meeting-notes company now valued at over $1 billion. He highlights Joe, an early employee with a writer/journalist background, who wrote all the prompts and spent “many, many hours” reading papers and talking to hundreds or thousands of users to understand what makes a good meeting note. That works as an oracle model because there’s no single objectively perfect meeting note — taste matters, and the product’s core output is narrow enough for one strong quality gatekeeper to matter.

Tandem: the oracle model, but decentralized across medicine

Tandem, which builds a medical AI scribe, started similarly with Roy — a doctor who had also been at McKinsey — reviewing notes and updating prompts himself. But scale broke the one-person model, so the company hired multiple doctors across specialties, countries, and note types, effectively creating a decentralized oracle system. The key detail here is the long tail: thousands of prompt variants tuned for different medical contexts, each needing someone who actually understands that slice of the workflow.

Anterior: moving from oracle to evaluator to architect

Lovejoy then uses his own experience at Anterior, a prior authorization startup, as the cleanest example of the full progression. He began as the oracle — building prompts and code, then putting on his “doctor hat” to assess whether approval decisions were clinically appropriate — but as customer variation grew, he defined metrics and failure modes, built a clinician review dashboard, and hired clinicians to produce scalable evaluation data. Even that eventually wasn’t enough, because insurers interpreted policies differently, so the product needed architect-style systems that could learn and adapt at the edge.
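
To make the evaluator step concrete, here is a minimal sketch of how clinician reviews could be rolled up into a metric and a tally of failure modes. It is an assumption-laden illustration, not Anterior's actual system: the `Review` fields, the decision labels, the failure-mode names, and `summarize_reviews` are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    """One clinician review of one AI decision (hypothetical schema)."""
    case_id: str
    ai_decision: str                     # e.g. "approve" or "escalate"
    clinician_decision: str              # the reviewer's own call
    failure_mode: Optional[str] = None   # set by the reviewer on disagreement


def summarize_reviews(reviews: list[Review]) -> dict:
    """Compute the agreement rate and count failure modes on disagreements."""
    total = len(reviews)
    agreed = sum(1 for r in reviews if r.ai_decision == r.clinician_decision)
    failure_modes = Counter(
        r.failure_mode
        for r in reviews
        if r.ai_decision != r.clinician_decision and r.failure_mode
    )
    return {
        "total": total,
        "agreement_rate": agreed / total if total else 0.0,
        "failure_modes": failure_modes,
    }
```

A summary like this gives engineers a number to track and a ranked list of what to fix first, which is the point at which judgment stops living only in one expert's head.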

The org advice: one accountable expert, real ownership, broad skills

He closes with the management lesson: designate a principal domain expert with actual accountability for AI quality. Don’t reduce them to an advisor, and don’t split authority so broadly that nobody really owns the call — he shares a cautionary example of a company with two senior clinicians, fuzzy ownership, slow progress, and both leaders leaving after 12 to 18 months. His hiring advice is to optimize for relevant domain experience first, then stack as many adjacent skills as possible — prompting, data science intuition, product sense, leadership, even engineering — so the person can grow from oracle into evaluator or architect as the company matures.
