⚡️Every product of the future will be a living system — Ronak Malde, Trajectory.ai
TL;DR
Every future AI product should be a living system: Malde's core thesis is that static models waste the richest signal in software, namely the edits, corrections, and expert interventions users make after an agent gets something 80 percent right.
Windsurf convinced him real usage beats offline benchmarks: At Codeium (later renamed Windsurf), the team moved from autocomplete models to post-training on agent and user data for their in-house model SWE-1, which he says helped them beat frontier models and proved the power of product-model loops.
The Google acquisition was a 24-hour shock: Windsurf staff expected OpenAI, got a secret hotel-room meeting with Demis Hassabis instead, turned in badges that Thursday, and were DeepMind employees by Friday while Cognition bought the rest the following Monday.
Trajectory starts with regulated verticals where 80 percent is useless: In legal workflows with Harvey, Malde argues partial success does not count, so Trajectory trains models on expert traces and corrections to improve metrics like issue spotting, citation, completeness, and coverage while using cheaper open models like Nemotron 3 Super.
Their technical bet is richer than thumbs-up feedback or plain RL: Trajectory's self-distillation policy optimization uses privileged hints from production data to teach a student model with actual text guidance, instead of compressing a whole trajectory into one reward number.
Continual learning needs a new systems stack, not just new algorithms: Trajectory open-sourced infrastructure built with SkyRL, Berkeley's Sky Computing Lab, and Anyscale to run concurrent training jobs, because always-on model improvement behaves more like a scheduling problem than a classic one-shot training run.
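To make the contrast between text-guided distillation and a single reward number concrete, here is a minimal toy sketch. It is not Trajectory's implementation: the summary only describes the idea at a high level, so the function names, the choice of KL(teacher || student) as the per-token loss, and all the numbers below are assumptions made for illustration. The "teacher" stands for the same model prompted with a privileged hint (such as an expert correction), and the "student" is the model without that hint.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary.

    Assumes every q[i] is strictly positive wherever p[i] is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def per_token_signal(teacher_dists, student_dists):
    """Return one loss value per generated token (hypothetical helper)."""
    return [kl_divergence(t, s) for t, s in zip(teacher_dists, student_dists)]

# Toy 3-word vocabulary and 2 generated tokens; all probabilities are made up.
# Teacher: next-token distributions when the model sees the privileged hint.
teacher = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
# Student: next-token distributions for the same prefix without the hint.
student = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]

signal = per_token_signal(teacher, student)

# A plain RL setup would instead collapse the whole trajectory into one
# scalar (hypothetical value), with no indication of which token was wrong.
trajectory_reward = 0.0

for position, loss in enumerate(signal):
    print(f"token {position}: distillation loss = {loss:.3f}")
```

In this toy case the first token carries no loss because the student already agrees with the hinted teacher, while the second token carries a positive loss. That per-position information is what a single trajectory-level reward cannot express.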
The Breakdown
Ronak Malde says the next big AI shift is continual learning: products should get smarter from every real user correction instead of repeating the same mistakes forever. After helping build Windsurf's model flywheel and watching the Google-DeepMind deal happen overnight, he left DeepMind to start Trajectory, which is already training domain-specific models for companies like Harvey in under a month.