Wilson Wu — LessWrong

Announcing the ARC White-Box Estimation Challenge

by Jacob_Hilton, paulfchristiano, and Wilson Wu

ARC has teamed up with AIcrowd to launch the ARC White-Box Estimation Challenge, a contest to improve upon our estimation algorithms for random MLPs. The warm-up round begins this week, and later rounds will have a total prize pool of at least $100,000. We are very grateful to Sharada Mohanty,...

Jun 2165

Mechanistic estimation for expectations of random products

by Jacob_Hilton, George Robinson, Eric Neyman, paulfchristiano, Mikewins, Victor Lecomte, Wilson Wu, and Gabriel Wu

We have developed some relatively general methods for mechanistic estimation competitive with sampling by studying problems that are expressible as expectations of random products. This includes several different estimation problems, such as random halfspace intersections, random #3-SAT and random permanents. In this post, we will give a high-level introduction to...

May 1552

ARC progress update: Competing with sampling

by Eric Neyman, Victor Lecomte, Wilson Wu, Mikewins, Jacob_Hilton, and George Robinson

In 2025, the Alignment Research Center (ARC) has been making conceptual and theoretical progress at the fastest pace that I (Eric) have seen since I first interned in 2022. Most of this progress has come about because of a re-orientation around a more specific goal: outperforming random sampling when it...

Nov 18, 2025 · 132

Ambiguous out-of-distribution generalization on an algorithmic task

Introduction: It's now well known that simple neural network models often "grok" algorithmic tasks. That is, when trained for many epochs on a subset of the full input space, the model quickly attains perfect train accuracy and then, much later, near-perfect test accuracy. In the former phase, the model memorizes...

Feb 13, 2025 · 84

The slingshot helps with learning

The slingshot effect is a late-stage training anomaly found in various adaptive gradient optimization methods. In particular, slingshots are present with AdamW, the optimizer most widely used for modern transformer training. The original slingshot paper observes that slingshots tend to occur alongside grokking, a phenomenon in which neural networks trained...

Oct 31, 2024 · 33