Latent Space · June 25, 2026 · 41m

Cooking with OpenAI’s Research Chief: AGI, o1, Evals, and Scaling Laws — Mark Chen

TL;DR

Scaling laws still hold: Chen firmly believes in the exponential, arguing that every time people have said pre-training can't scale further, research and engineering breakthroughs have proven them wrong.
Reasoning was a hard internal sell: Even at OpenAI, the o1 reasoning project faced skepticism because the pre-training + post-training paradigm was working so well. It took conviction from Ilya Sutskever and others to push it forward.
The eval crisis is real: Chen says there are too few gold-standard benchmarks and teams must separate eval creation from model optimization to avoid "benchmaxing" or overfitting to test distributions.
Research taste comes from replication: To develop good research taste, Chen recommends fully replicating papers you admire; trying to match exact training curves teaches techniques authors don't write down.
Vibe research is becoming real: Researchers increasingly act as idea generators while models handle implementation. Chen predicts models will develop research taste within a three-year horizon.
High-risk bets are OpenAI's alpha: The lab consciously takes risky bets that often fail, but management avoids delusion by cutting losses. As in trading, one mega hit can justify many misses.

The Breakdown

OpenAI's Chief Research Officer Mark Chen defends scaling laws, says pre-training is definitely not dead, and reveals that internal research roadmaps stay stable while implementation details shift during compute allocation cycles.