florin_pop — LessWrong

167

2y

Self-prediction acts as an emergent regularizer

by Cameron Berg, Kvee, Mike Vaiana, Diogo de Lucena, florin_pop, and Trent Hodgeson

TL;DR: In our recent work with Professor Michael Graziano (arXiv, thread), we show that adding an auxiliary self-modeling objective to supervised learning tasks yields simpler, more regularized, and more parameter-efficient models. Across three classification tasks and two modalities, self-modeling consistently reduced complexity (lower RLCT, narrower weight distribution). This restructuring effect...

Oct 23, 2024 • 92

Key takeaways from our EA and alignment research surveys

by Cameron Berg, Kvee, florin_pop, and Trent Hodgeson

Many thanks to Spencer Greenberg, Lucius Caviola, Josh Lewis, John Bargh, Ben Pace, Diogo de Lucena, and Philip Gubbins for their valuable ideas and feedback at each stage of this project—as well as the ~375 EAs + alignment researchers who provided the data that made this project possible. Background Last...

May 3, 2024 • 114