[ Question ]

Positive Feedback -> Optimization?

by johnswentworth · 1 min read · 16th Mar 2020 · 6 comments



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

When I imagine the beginnings of life on earth, I imagine a handful of molecules which just so happen to catalyze reactions that produce more of those same molecules. The more of molecule X there is, the more molecule X is produced. The chemical kinetic equations contain a positive feedback loop (aka instability).

There might also be other molecules which catalyze their own production. If Y catalyzes its own production more efficiently than X, then we eventually expect to see more Y than X. Still at the level of chemical kinetics, we’d say that the equations contain multiple positive feedback loops, and the feedback loop for Y has a faster doubling time than that for X.

We haven’t used the word “fitness” here at all. We’re quite literally talking about eigenvalues of a matrix (i.e. the Jacobian of the kinetic equations of the chemical system) - those are what determine the relevant doubling times, at least close to ambient steady-state concentrations. As we move away from ambient concentrations, the math will get a bit more complicated, but the qualitative idea remains: we’re talking about positive feedback loops, without any explicit mention of fitness or optimization.
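To make the eigenvalue picture concrete, here is a minimal sketch in Python. The rate constants and initial concentrations are assumed for illustration, not taken from the post: two self-catalyzing species X and Y with linear kinetics dx/dt = a·x, dy/dt = b·y. The Jacobian's positive eigenvalues are the instabilities, and ln(2)/eigenvalue gives each loop's doubling time; simulating forward, the faster loop (Y) comes to dominate without any explicit "fitness" in the equations.

```python
import numpy as np

# Assumed autocatalysis rates (per unit time); b > a, so Y's loop is faster.
a, b = 0.5, 0.7
state = np.array([1.0, 1.0])  # initial concentrations [x, y]

# Jacobian of the kinetics. The system is linear, so it is constant;
# for nonlinear kinetics this would be evaluated at the steady state.
J = np.diag([a, b])
eigenvalues = np.linalg.eigvals(J)        # positive eigenvalues = positive feedback loops
doubling_times = np.log(2) / eigenvalues  # each loop's doubling time

# Forward-Euler integration out to t = 20: watch Y's share of the total grow.
dt, steps = 0.01, 2000
for _ in range(steps):
    state = state + dt * (J @ state)

y_share = state[1] / state.sum()
print("doubling times:", doubling_times)  # Y doubles faster than X
print("Y's share of mixture:", y_share)   # approaches 1 as Y wins out
```

The point of the sketch is that "Y has higher fitness" is recoverable purely from the kinetics: it is just the statement that Y's eigenvalue is larger.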

But it sure seems like there should be a higher level of abstraction, at which the positive feedback loops for X and Y are competing, Y eventually wins out due to higher fitness, and there’s some meaningful sense in which the system is optimizing for fitness.

More generally, whenever there’s a dynamical system containing multiple instabilities (i.e. positive feedback loops), it seems like there should be a canonical way to interpret that system as multiple competing subsystems, under selection, optimizing for some kind of fitness function. I’d like a way to take a dynamical system containing positive feedback loops, and say both (a) what the competing subsystems are, and (b) what fitness function it’s implicitly maximizing.
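One naive candidate formalization (my assumption, not something the post commits to): near a steady state, take the unstable eigenmodes of the Jacobian as the "competing subsystems" and their growth rates as the "fitness" values, since the fastest-growing mode eventually dominates the linearized dynamics. A sketch, with toy coefficients chosen for illustration:

```python
import numpy as np

def competing_modes(jacobian):
    """Return (growth_rate, direction) pairs for the unstable eigenmodes,
    sorted so the fastest-growing ("fittest") mode comes first."""
    vals, vecs = np.linalg.eig(jacobian)
    unstable = [(v.real, vecs[:, i].real)
                for i, v in enumerate(vals) if v.real > 0]
    return sorted(unstable, key=lambda pair: -pair[0])

# Toy kinetics: X and Y each self-catalyze, and weakly inhibit each other
# (e.g. by competing for a shared resource). Coefficients are assumptions.
J = np.array([[0.5, -0.1],
              [-0.1, 0.7]])

modes = competing_modes(J)
print("growth rates:", [round(rate, 3) for rate, _ in modes])
```

This only answers the question locally, near one steady state; the hard part the post is asking about is presumably the global, nonlinear version, where the "subsystems" and "fitness function" should be identifiable far from any fixed point.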

Something like this would likely be useful in a number of areas:

  • Alignment: notice implicit optimization by looking for dynamic instabilities (e.g. instabilities in imperfect search).
  • Agent foundations: formulate “agents” as self-reinforcing feedback loops in dynamical systems. Tying effective self-reinforcement to world-models would probably be a key piece (e.g. along these lines).
  • Biology: generalize evolutionary theory.
  • Economics: ground economic theory in selection effects (e.g. along these lines) rather than ideal agents, allowing it to apply much more broadly.
