How We Built Zeta2: Training an Edit Prediction Model in Production — Ben Kunkle, Zed
TL;DR
Zed 2 is a small, specialized model built for speed — Ben Kunkle says edit prediction has to run on every keystroke, so Zed fine-tuned a narrow model for one job instead of relying on a larger general-purpose system.
The training pipeline starts with real editor snapshots, not synthetic prompts — Zed uses opt-in production data including cursor position, recent edits, nearby types and definitions, and diagnostics, then asks a frontier model what edit it would make in that exact context.
Frontier-model outputs were unreliable enough that Zed added a repair pass — if a teacher prediction crosses the editable-region boundary or just undoes what the user typed, Zed sends it to another model with a “you failed in this way, fix it” prompt before using it as training data.
Most of the pipeline is reusable and cached as giant JSONL files — each stage just adds or moves fields in a single-line JSON object, which lets the team reuse the expensive teacher/distillation work across many prompt-format experiments.
“Settled data” is promising but noisy, so Zed looks for near-matches instead of exact final code — after a user stops editing a region for 10 seconds, Zed compares the final state to multiple model samples with Levenshtein-style distance and keeps the middle band: not obvious, not random, but actually learnable.
Offline metrics don’t decide the winner by themselves — along with delta-carf, reversal ratio, and diagnostic error counts, Zed ships models to 15%+ of production traffic and watches acceptance rate and latency because eval scores don’t always match what users want in the editor.
The Breakdown
Zed trained its edit-prediction model on opt-in production snapshots, then used frontier models to generate and repair predictions before distilling them into a tiny model fast enough to run on every keystroke. The most interesting twist is “settled data”: instead of trusting what the user eventually typed, they filter for examples where many model samples land near the final code, using the student model itself to avoid a million expensive teacher calls.
Was This Useful?
Share
Keep Reading
Make Alcreon Yours
Tune your feedFive quick questions, and the feed ranks what matters to you first.Or just get notified
The weekly Echo. Signal worth keeping in your inbox.
Every new piece, announced on X.
Read Next
See all
Playbook
Tasteful Skills
“Tasteful Skills” argues that the best agent skills are not documentation or best-practice lists.

Playbook
The Art of Tasteful Prompting
Learn how tasteful prompting helps you move beyond generic AI output by shaping context, style, and judgment from the start.

Playbook
The Codex /goal Playbook
OpenAI shipped /goal for the Codex CLI. It turns a prompt into a persisted, self-continuing contract.