Inner Optimization Mechanisms in Neural Nets

4Hastings


I've laid out a concrete example of this at https://www.lesswrong.com/posts/FgXjuS4R9sRxbzE5w/medical-image-registration-the-obscure-field-where-deep, following the "optimization on a scaffold level" route. I found a real example of a misaligned inner objective outside of RL, which is cool.

I believe that the current architecture of neural networks supports mesa-optimization: roughly speaking, searching across a set of candidate vectors in order to select the one that is most useful for producing an answer.

Three ways of doing inner optimization are already possible, and new ones will most likely appear.

If a subnetwork has inputs A and B, it is fairly easy to make it output max(A; B). Additional information attached to the winning option can be selected along with it, either by squeezing that information into the input numbers or by building a slightly bigger subnetwork.
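One possible reading of "squeezing it into the input numbers" is to pack a small payload into the low-order digits of each score, so that taking the maximum also selects the payload of the winner. The sketch below is an illustration under that assumption, not a construction given in the post: `BASE`, `pack`, `unpack`, and `select_with_payload` are hypothetical names, scores are assumed to be integers, and Python's built-in `max` stands in for the max-selecting subnetwork.

```python
BASE = 1000  # hypothetical bound: every payload must lie in [0, BASE)


def pack(score: int, payload: int) -> int:
    """Squeeze a payload into the low-order digits of an integer score."""
    if not 0 <= payload < BASE:
        raise ValueError("payload must lie in [0, BASE)")
    return score * BASE + payload


def unpack(packed: int) -> tuple[int, int]:
    """Recover (score, payload); floor division also handles negative scores."""
    return divmod(packed, BASE)


def select_with_payload(options: list[tuple[int, int]]) -> tuple[int, int]:
    """Return the (score, payload) pair with the highest score.

    The built-in max stands in for the max-selecting subnetwork.
    Ties on score are broken in favour of the larger payload.
    """
    best = max(pack(score, payload) for score, payload in options)
    return unpack(best)


winner = select_with_payload([(3, 17), (9, 4), (5, 250)])
```

Because the score occupies the high-order part of the packed number, comparing packed values orders options by score first, which is why a plain maximum is enough to carry the payload along.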

## Actual subnetwork design for point #3

Suppose the neural network consists of layers of the kind common today, each a matrix multiplication followed by an activation function, and that the activation function is ReLU(x) = max(0; x).

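Reading the definitions off the two cases below: C = ReLU(A − B), D = B, and E = C + D. Then:

A < B ⟶ C = 0, D = B ⟶ E = B = max(A; B)

A ≥ B ⟶ C = A − B, D = B ⟶ E = A = max(A; B)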
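The two cases can be checked numerically with a minimal sketch. It assumes C = ReLU(A − B) and E = C + D, which is what the case analysis implies. Passing D = B through a ReLU layer unchanged only works for B ≥ 0, so the sketch splits B into ReLU(B) − ReLU(−B); that split is an assumption added here, not something stated in the post. The function names are hypothetical.

```python
def relu(x: float) -> float:
    """ReLU(x) = max(0; x)."""
    return x if x > 0 else 0.0


def relu_max(a: float, b: float) -> float:
    """Sketch of max(a; b) as one ReLU layer followed by a linear sum."""
    # Hidden layer: three ReLU units fed by linear combinations of the inputs.
    c = relu(a - b)      # C = A - B when A >= B, otherwise 0
    b_pos = relu(b)      # carries B when B >= 0
    b_neg = relu(-b)     # carries -B when B < 0 (assumption: needed for negative B)
    # Output layer: a purely linear combination, E = C + D with D = b_pos - b_neg = B.
    return c + b_pos - b_neg


examples = [(3.0, 5.0), (5.0, 3.0), (-2.0, -7.0), (2.0, 2.0)]
results = [relu_max(a, b) for a, b in examples]
```

For every pair in `examples`, the value in `results` should agree with Python's built-in `max`.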

This construction can be extended to select the maximum out of 2^(k/2) options in k layers; possibly even 2^(k−1) options.
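A rough way to see the 2^(k/2) count is to arrange pairwise maxima as a tournament in which each round halves the number of candidates and costs two layers (one ReLU layer plus one linear sum). The sketch below assumes that layer accounting and uses hypothetical function names. It does not attempt the tighter 2^(k−1) variant.

```python
def relu(x: float) -> float:
    """ReLU(x) = max(0; x)."""
    return x if x > 0 else 0.0


def pairwise_max(a: float, b: float) -> float:
    """max(a; b) from ReLU units: ReLU(a - b) + ReLU(b) - ReLU(-b)."""
    return relu(a - b) + relu(b) - relu(-b)


def tournament_max(values: list[float]) -> tuple[float, int]:
    """Return (maximum, layers used), counting two layers per round."""
    if not values:
        raise ValueError("need at least one value")
    current = list(values)
    layers = 0
    while len(current) > 1:
        # One round: compare neighbours pairwise, halving the candidate list.
        nxt = [pairwise_max(current[i], current[i + 1])
               for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:
            nxt.append(current[-1])  # an odd element advances unchanged
        current = nxt
        layers += 2  # assumption: one ReLU layer plus one linear sum per round
    return current[0], layers


best, layers_used = tournament_max([3.0, -1.0, 7.0, 2.0, 0.0, 5.0, -4.0, 6.0])
```

With 8 = 2^3 options the tournament takes three rounds, so `layers_used` is 6, consistent with 2^(6/2) = 8 options in k = 6 layers.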

## Conclusion

I believe that inner optimization might already exist in current neural networks, and that this can serve as evidence for estimating what future AIs will be able to do at a given level of capability.