You are probably aware of this, but there is indeed a mathematical theory in which degeneracy (multiplicity) in the parameter-function map of neural networks is key to their simplicity bias: singular learning theory (SLT).

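The connection between degeneracy [SLT] and simplicity [algorithmic information theory] is surprisingly, delightfully simple. It's given by the padding/dead-code argument: a short program can be padded with dead code in many different ways, so a function with a short description is implemented by a larger share of all programs of a given length.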
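
A minimal sketch of the padding argument, under toy assumptions that are mine and not from the post: programs are fixed-length bitstrings, a hypothetical interpreter executes the first prefix-free "core program" it finds and ignores the remaining bits as dead code, and the names `CORE`, `run`, `degeneracy` and `f_a` ... `f_d` are made up for illustration. It counts how many length-`L` programs implement each function.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical prefix-free core programs (illustrative assumption).
# A shorter core program stands in for a "simpler" function.
CORE = {"0": "f_a", "10": "f_b", "110": "f_c", "111": "f_d"}


def run(program: str) -> str:
    """Return the function implemented by `program`.

    The toy interpreter reads bits until they match a core program;
    everything after that point is dead code and is ignored.
    """
    for end in range(1, len(program) + 1):
        name = CORE.get(program[:end])
        if name is not None:
            return name
    raise ValueError("no core program found")


def degeneracy(total_len: int) -> Counter:
    """Count how many bitstrings of length `total_len` implement each function."""
    counts = Counter()
    for bits in product("01", repeat=total_len):
        counts[run("".join(bits))] += 1
    return counts


L = 8  # total program length, including padding
counts = degeneracy(L)
fractions = {name: Fraction(c, 2**L) for name, c in counts.items()}

for core, name in CORE.items():
    print(name, "core length", len(core), "share of programs", fractions[name])
```

In this toy setup a function whose core program has length k is implemented by exactly 2^(L-k) of the 2^L programs, i.e. a 2^(-k) share, so degeneracy and description length are two views of the same quantity. Real parameter-function maps are of course not literally this interpreter; the sketch only illustrates the counting step.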

[-]Charlie Steiner3y20

First problem to come to mind: subagent stability (sometimes it's simple to build a new agent that does your task, even if it's complicated to explain what that agent would do).

[-]DavidHolmes3y10

Hi Charlie, if you can give a short (precise) description of an agent that does the task, then you have written a short programme that solves the task. If you then need more space to ‘explain what the agent would do’, you are saying that there also exists a less efficient, less compact way to specify the solution. From this perspective, I think the latter is not so relevant. David

[-]Thomas Kwa3y*20

The upper bound for Kolmogorov complexity of an input-output map in Dingle is not very interesting; iirc it's basically saying that a program can be constructed of the form "specify this exact input-output map using the definition already provided", and the upper bound is just the length of this program.

Also, one concern: it's not clear that thinking about this differentially advances alignment over capabilities.

[-]evhub3y53

I think it very clearly advantages alignment over capabilities—understanding SGD's inductive biases is one of the primary bottlenecks for inner alignment imo.

The stuff linked here is pretty old, though; e.g. it predates Mingard et al.

[-]DavidHolmes3y30

Thanks very much for the link!

[-]DavidHolmes3y10

P.S. The main thing I have taken so far from the link you posted is that the important part is not exactly about the biases of SGD. Rather, it is about the structure of the DNN itself; the algorithm used to find a (local) optimum plays less of a role than the overall structure. But probably I’m reading too much into your precise phrasing.

[-]DavidHolmes3y10

Hi Thomas, I agree the proof of the bound is not so interesting. What I found more interesting were the examples and discussion suggesting that, in practice, the upper bound often seems to be somewhat tight.

Concerning differential advancement, I agree this can advance capabilities, but I suspect that advancing alignment is somewhat hopeless unless we can better understand what is going on inside DNNs. On that basis I think it does differentially advance alignment, but of course other people may disagree.

Bias towards simple functions; application to alignment?
