Epistemic status: This post is a synthesis of ideas that are, in my experience, widespread among researchers at frontier labs and in mechanistic interpretability, but rarely written down comprehensively in one place - different communities tend to know different pieces of evidence. The core hypothesis - that deep learning is...
Thanks to Jesse Hoogland and George Wang for feedback on these exercises. In learning singular learning theory (SLT), I found it was often much easier to understand by working through examples, rather than by trying to work through the (fairly technical) theorems in their full generality. These exercises are an attempt...
What this is for

The learning coefficient (LC), or RLCT, is a quantity from singular learning theory that can help to quantify the "complexity" of deep learning models, among other things. This guide is primarily intended to help people interested in improving learning coefficient estimation get up to speed with...
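As a standard toy illustration of why the learning coefficient is interesting (not taken from this guide): for a regular model with $d$ parameters the learning coefficient equals $d/2$, but singular models can have a strictly smaller value. The two-parameter example below, with population loss $K(w) = (w_1 w_2)^2$, is the classic case worked out in Watanabe's zeta-function framework:

```latex
% Toy singular model: f(x) = w_1 w_2 x, so the population loss is
%   K(w_1, w_2) = (w_1 w_2)^2 .
% Its zero set \{ w_1 w_2 = 0 \} is the union of the two coordinate axes,
% which is singular at the origin. Watanabe's zeta function factorizes:
%   \zeta(z) = \int K(w)^z \, dw
%            = \left( \int |w_1|^{2z} \, dw_1 \right)
%              \left( \int |w_2|^{2z} \, dw_2 \right),
% with each factor having a pole at z = -1/2. Hence the largest pole sits at
%   z = -\lambda = -\tfrac{1}{2},
% giving learning coefficient
%   \lambda = \tfrac{1}{2} < \tfrac{d}{2} = 1 :
% the singular model is "simpler" than its raw parameter count suggests.
```

This is why the LC, rather than the parameter count, is the right notion of effective complexity for singular models such as neural networks.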
The polytope theory of neural networks (also known as the "spline theory of deep learning") seeks to explain (ReLU) neural networks in terms of their piecewise-linear regions. This gives a helpful intuition for how neural networks approximate functions, as well as a potential avenue for interpretability research. For anyone who...
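The core fact behind the polytope picture can be checked directly: within a single region (identified by which ReLU units are active), the network is exactly an affine map. A minimal sketch with NumPy, using an arbitrary randomly initialized two-layer network (the weights and the `activation_pattern` / `affine_map_for_region` helpers are illustrative, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer ReLU network R^2 -> R with arbitrary random weights.
W1 = rng.standard_normal((4, 2)); b1 = rng.standard_normal(4)
W2 = rng.standard_normal((1, 4)); b2 = rng.standard_normal(1)

def forward(x):
    h = np.maximum(W1 @ x + b1, 0.0)  # ReLU hidden layer
    return W2 @ h + b2

def activation_pattern(x):
    # Binary pattern of which hidden units are "on"; this indexes the
    # piecewise-linear region containing x.
    return W1 @ x + b1 > 0

def affine_map_for_region(pattern):
    # Within one region the ReLU mask is constant, so the network collapses
    # to a single affine map y = A x + c.
    D = np.diag(pattern.astype(float))
    A = W2 @ D @ W1
    c = W2 @ D @ b1 + b2
    return A, c

x = np.array([0.3, -0.7])
A, c = affine_map_for_region(activation_pattern(x))
# The network output agrees exactly with the region's affine map.
assert np.allclose(forward(x), A @ x + c)
```

Counting the distinct activation patterns that occur over the input space is one way to count the network's linear regions, which is the quantity the polytope theory studies.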
> Produced under the mentorship of Evan Hubinger as part of the SERI ML Alignment Theory Scholars Program - Winter 2022 Cohort.
>
> Thank you to @Mark Chiu and @Quintin Pope for feedback.

Machine learning is about finding good models: of the world and the things in it;...