Imagine using an ML-like training process to design two simple electronic components, in series. The parameters $\theta^1$ control the function performed by the first component, and the parameters $\theta^2$ control the function performed by the second component. The whole thing is trained so that the end-to-end behavior is that of a digital identity function: voltages close to logical 1 are sent to voltages close to logical 1, and voltages close to logical 0 are sent to voltages close to logical 0.
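To make that setup concrete, here is a minimal sketch of such a training process, with everything in it a hypothetical stand-in: each component is modeled as a two-parameter sigmoid (steepness and threshold), logical 0 and 1 are taken to be 0V and 5V, and a crude finite-difference gradient descent fits the end-to-end identity behavior on noisy inputs.

```python
import numpy as np

def component(theta, v):
    # Hypothetical 2-parameter component: a sigmoid with trainable
    # steepness theta[0] and threshold theta[1], rescaled to the 0V-5V range.
    return 5.0 / (1.0 + np.exp(-theta[0] * (v - theta[1])))

def system(theta1, theta2, v_in):
    # Two components in series: output of component 1 feeds component 2.
    return component(theta2, component(theta1, v_in))

rng = np.random.default_rng(0)
theta1 = np.array([1.0, 2.5])  # initial guesses
theta2 = np.array([1.0, 2.5])

lr, eps = 0.05, 1e-4
for step in range(2000):
    # Noisy logical inputs: logical 0 near 0V, logical 1 near 5V.
    bits = rng.integers(0, 2, size=64)
    v_in = bits * 5.0 + rng.normal(0.0, 0.5, size=64)
    target = bits * 5.0

    def loss(t1, t2):
        return np.mean((system(t1, t2, v_in) - target) ** 2)

    # Finite-difference gradient descent (crude, but keeps the sketch short).
    grad1 = np.array([(loss(theta1 + eps * e, theta2) - loss(theta1, theta2)) / eps
                      for e in np.eye(2)])
    grad2 = np.array([(loss(theta1, theta2 + eps * e) - loss(theta1, theta2)) / eps
                      for e in np.eye(2)])
    theta1 -= lr * grad1
    theta2 -= lr * grad2

print(system(theta1, theta2, np.array([0.3, 4.6])))  # should end up near [0, 5]
```

After training, the composed system typically pushes noisy logical-0 inputs down toward 0V and noisy logical-1 inputs up toward 5V.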
We’re imagining electronic components here because, for those with some electronics background, I want to summon to mind something like this:
This electronic component is called a signal buffer. Logically, it’s an identity function: it maps 0 to 0 and 1 to 1. But crucially, it maps a wider range of logical-0 voltages to a narrower (and lower) range of logical-0 voltages, and correspondingly for logical-1. So if noise in the circuit upstream might make a logical-1 voltage a little too low or a logical-0 voltage a little too high, the buffer cleans that up, pushing the voltages closer to their ideal values.
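As a toy numerical illustration of that narrowing (with made-up voltage numbers, not taken from any real buffer's datasheet), a buffer-like transfer curve might behave as follows:

```python
from math import exp

def buffer(v):
    # Toy buffer: a steep transfer curve around the 2.5V midpoint.
    # Voltages below the midpoint get pushed toward 0V, above it toward 5V.
    return 5.0 / (1.0 + exp(-4.0 * (v - 2.5)))

for v in [0.0, 1.2, 3.8, 5.0]:
    print(f"{v:.1f}V -> {buffer(v):.2f}V")
# 0.0V -> 0.00V, 1.2V -> 0.03V, 3.8V -> 4.97V, 5.0V -> 5.00V
```

A sloppy logical-0 input (1.2V) comes out much closer to the ideal 0V, and a sloppy logical-1 input (3.8V) comes out much closer to the ideal 5V.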
This is a generalizable point about interfaces in scalable systems: for robustness and scalability, components need to accept less-precise inputs and give more-precise outputs.
That’s the background mental picture I want to invoke. But now, I want to combine it with an ML-like mental picture of training a system to match particular input/output behavior.
Here’s a conceptual story.
There are three interfaces - “APIs”, we’ll call them. The first ($A_0$) is at the input of the whole system, the second ($A_1$) sits between the two components, and the last ($A_2$) is at the output of the whole system. At each of those APIs, there’s a set of “acceptable” voltages for each logical input to the full system (i.e. 0 or 1).
The APIs constrain the behavior of each component - e.g. component 1 is constrained by $A_0$ (which specifies its inputs) and $A_1$ (which specifies its outputs).
Let's put some math on that, with some examples.
A set of APIs might look like this (numbers chosen just for illustration):

- $A_0$ (system input): logical 0 is any voltage in [0V, 1.5V], logical 1 is any voltage in [3.5V, 5V]
- $A_1$ (between the components): logical 0 is any voltage in [0V, 1V], logical 1 is any voltage in [4V, 5V]
- $A_2$ (system output): logical 0 is any voltage in [0V, 0.5V], logical 1 is any voltage in [4.5V, 5V]

(For simplicity, we’ll assume all voltages are between 0V and 5V.) In order for the system to satisfy those particular APIs, component 1 must map every voltage in $A_0$’s logical-0 range into $A_1$’s logical-0 range and every voltage in $A_0$’s logical-1 range into $A_1$’s logical-1 range, and likewise component 2 must map $A_1$’s ranges into $A_2$’s ranges.
Using $f_i(\theta^i, \cdot)$ for the function computed by component $i$, and writing it out mathematically: the components satisfy a set of APIs $(A_0, A_1, A_2)$ if and only if

$$\forall\, i \in \{1,2\},\; \forall\, b \in \{0,1\},\; \forall\, V \in A_{i-1}[b]: \quad f_i(\theta^i, V) \in A_i[b]$$

where $A_i[b]$ denotes the set of voltages which API $A_i$ accepts for logical value $b$.

That’s a set of constraints on $\theta^i$, for each component $i$.
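Here's a small sketch of that constraint check. The interval endpoints and the sigmoid parameterization of the components are illustrative stand-ins; the point is just the shape of the check: for each logical value, every acceptable input voltage must be mapped into the acceptable output range.

```python
import numpy as np

# An API: for each logical value, an interval of acceptable voltages.
# Numbers here are purely illustrative.
A0 = {0: (0.0, 1.5), 1: (3.5, 5.0)}   # system input
A1 = {0: (0.0, 1.0), 1: (4.0, 5.0)}   # between the components
A2 = {0: (0.0, 0.5), 1: (4.5, 5.0)}   # system output

def component(theta, v):
    # Stand-in parameterized component: sigmoid with steepness and threshold.
    return 5.0 / (1.0 + np.exp(-theta[0] * (v - theta[1])))

def satisfies(theta, api_in, api_out, n_samples=1000):
    """Check (by dense sampling) that the component maps every acceptable
    input voltage for each logical value into the acceptable output range."""
    for b in (0, 1):
        lo, hi = api_in[b]
        out = component(theta, np.linspace(lo, hi, n_samples))
        if out.min() < api_out[b][0] or out.max() > api_out[b][1]:
            return False
    return True

theta1 = np.array([3.0, 2.5])
theta2 = np.array([5.0, 2.5])
# Component 1 is checked only against (A0, A1), component 2 only against (A1, A2).
print(satisfies(theta1, A0, A1), satisfies(theta2, A1, A2))
```

Notice that the decoupling discussed just below already shows up here: the check for component 1 only ever touches $A_0$ and $A_1$, and the check for component 2 only touches $A_1$ and $A_2$.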
So the APIs put constraints on the components. Furthermore, subject to those constraints, the different components decouple: component 1 can use any parameters internally so long as it satisfies the API set (specifically $A_0$ and $A_1$), and component 2 can use any parameters internally so long as it satisfies the API set (specifically $A_1$ and $A_2$).
Last big piece: putting on our stat mech/singular learning theory hats, we make an educated guess that the training process will probably end up with an API set which can be realized by many different parameter values. A near-maximal number of parameter values, probably.
The decoupling now becomes very handy. Let’s use the notation $H[A_0, A_1, A_2]$ - you can think of it as the log number of parameter values compatible with the constraints, or as the entropy or relative entropy of the parameters given the constraints (if we want to weight parameter values by some prior distribution, rather than uniformly). Because of the decoupling, we can write $H$ as

$$H[A_0, A_1, A_2] = H_1[A_0, A_1] + H_2[A_1, A_2]$$
So there’s one term ($H_1$) which depends only on component 1 and the two APIs adjacent to component 1, and another term ($H_2$) which depends only on component 2 and the two APIs adjacent to component 2.
Our stat-mech-ish prediction is then that the training process will end up with a set of APIs for which $H$ is (approximately) maximal.
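Here is a brute-force sketch of that prediction, under toy assumptions: parameters and the middle API $A_1$ are restricted to small discrete grids, each $H_i$ is literally the log-count of parameter settings compatible with that component's two adjacent APIs, and we pick the candidate $A_1$ which maximizes $H = H_1 + H_2$. None of the grids or numbers are pinned down by the story above; they just show the shape of the computation.

```python
import numpy as np
from itertools import product

A0 = {0: (0.0, 1.5), 1: (3.5, 5.0)}   # fixed input API (illustrative)
A2 = {0: (0.0, 0.5), 1: (4.5, 5.0)}   # fixed output API (illustrative)

def component(theta, v):
    return 5.0 / (1.0 + np.exp(-theta[0] * (v - theta[1])))

def satisfies(theta, api_in, api_out, n=200):
    for b in (0, 1):
        out = component(theta, np.linspace(*api_in[b], n))
        if out.min() < api_out[b][0] or out.max() > api_out[b][1]:
            return False
    return True

# Small discrete grid of candidate parameter values for each component.
thetas = [np.array([s, t]) for s, t in product(np.linspace(1, 10, 10),
                                               np.linspace(1, 4, 7))]

def log_count(api_in, api_out):
    # H_i: log of the number of parameter settings satisfying the adjacent APIs.
    count = sum(satisfies(th, api_in, api_out) for th in thetas)
    return -np.inf if count == 0 else np.log(count)

# Candidate middle APIs A1: symmetric cutoffs within the 0V-5V range.
candidates = [{0: (0.0, c), 1: (5.0 - c, 5.0)} for c in np.linspace(0.3, 2.0, 18)]

best = max(candidates, key=lambda A1: log_count(A0, A1) + log_count(A1, A2))
print("A1 maximizing H = H1 + H2:", best)
```

The sum inside the `key` is exactly the decomposition $H = H_1 + H_2$ from above: each term only looks at one component and its two adjacent APIs.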
What we like about this mental model is that it bridges the gap between stat mech/singular learning theory flavored intuitions (i.e. training finds structure compatible with the most parameters, subject to constraints) and internal structures in the net (i.e. internal interfaces). This feels to us like exactly the gap which needs to be crossed in order for stat mech flavored tools to start saying big things about interpretability.