We have developed some relatively general methods for mechanistic estimation competitive with sampling by studying problems that are expressible as expectations of random products. This includes several different estimation problems, such as random halfspace intersections, random #3-SAT and random permanents. In this post, we will give a high-level introduction to these methods before sharing some more detailed notes. This is intended as an interim technical update and will be relatively light on motivation: for a broader discussion of this line of research, see our prior post.
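To make "expectation of a random product" concrete, here is a minimal sketch of the sampling baseline that such methods would be compared against. It assumes one particular reading of the halfspace problem, which is not taken from the post itself: the quantity of interest is E_x[∏_i 1[⟨w_i, x⟩ ≥ 0]] for x drawn from a standard Gaussian. The names `halfspace_product` and `sample_estimate` are hypothetical and chosen only for illustration.

```python
import random


def halfspace_product(x, normals):
    """Product of the indicators 1[<w, x> >= 0] over all normals w (0 or 1)."""
    for w in normals:
        if sum(wi * xi for wi, xi in zip(w, x)) < 0:
            return 0
    return 1


def sample_estimate(normals, dim, n_samples, seed=0):
    """Monte Carlo estimate of E_x[prod_i 1[<w_i, x> >= 0]] for x ~ N(0, I).

    ASSUMPTION: this formulation is an illustrative reading of "random
    halfspace intersections", not the definition used in the post.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        total += halfspace_product(x, normals)
    return total / n_samples


# Two orthogonal halfspaces in 2D: the exact answer is 1/4 (one quadrant).
normals = [(1.0, 0.0), (0.0, 1.0)]
estimate = sample_estimate(normals, dim=2, n_samples=20000)
```

The error of this estimator shrinks like 1/sqrt(n_samples), which is the bar a mechanistic estimate would need to match at comparable cost.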
All of the problems discussed in this post can be thought of as particular choices of "architecture"
Honestly for me it's more of a strike against RNNs. Real deep neural networks that have been trained don't have this property, so it's a bridge we're going to need to cross at some point regardless. From a derisking point of view I'd kind of like to get to that point ASAP. There's a lot of talk about looking at random boolean circuits (which very obviously don't have this property), narrow MLPs, or even jumping all the way to wide MLPs trained in some sort of mean-field/maximum update regime that gets rid of it.
I am affiliated with ARC and played a major role in the MLP stuff
I'm loosely familiar with Greg Yang's work, and very familiar with the 'Neural Network Gaussian Process' canon. It's definitely relevant, especially as an intuition pump, but it tends to answer a different question. They answer 'what is the distribution of quantities x, y, and z over the set of all NNs', where quantities x, y, and z might be preactivations on specific inputs. Knowing that they are jointly Gaussian with such-and-such covariance has been a powerful intuition pump for us. But the...
I am affiliated with ARC and played a major role in the MLP stuff
The particular infinite sum discussed in this post is used for approximating MLPs with just one hidden layer, so things like vanishing gradients can't matter.
We are now doing work on deeper MLPs. In this case, the vanishing gradients story definitely does seem relevant. We definitely don't fully understand every detail, but I'll mouth off anyways.
On one hand, there are hyperparameter choices where gradients explode. It turns out that in this regime, matching sampling: exponentially larg...
In 2025, the Alignment Research Center (ARC) has been making conceptual and theoretical progress at the fastest pace that I (Eric) have seen since I first interned in 2022. Most of this progress has come about because of a re-orientation around a more specific goal: outperforming random sampling when it comes to understanding neural network outputs. Compared to our previous goals, this goal has the advantage of being more concrete and more directly tied to useful applications.
The purpose of this post is to:
Does the 1/sqrt(N) error for SGD assume a single pass? It seems like if we're bottlenecked on few data points, we can use multi-pass and do nearly as well as Bayesian (at least for half-spaces).