Differential Optimization Reframes and Generalizes Utility-Maximization

Comments:

**Charlie Steiner:**

Neat. Why is it worth calling U "utility," or even utility-like, though? If I tell you the set of things that I observe that significantly change my behavior, this tells you a lot about me but it doesn't tell you which function of these observations I'm using to make decisions.

E.g. both teams in a soccer game will respond to the position of the ball (the ball is in U - or some relaxed notion of it, since I think your full notion might be too strong), but want to do different things with it.

**J Bostock:** I think the position of the ball is in V, since the players are responding to the position of the ball by forcing it towards the goal: it's difficult to predict the long-term position of the ball based on where it is now. The position of the opponent's goal would be an example of something in U for both teams. In this case both teams' utility functions contain a robust pointer to the goal's position.

Consider the following world: a set P of past nodes, a set of environment nodes E, a set of actor/agent nodes A, and a set of future nodes F. Consider our previously defined function Op(A; p, f) for p∈P, f∈F, which represents how much A is optimizing the value of f with respect to p. For some choices of A and f, we might find that the values Op(A; p, f) allow us to split the members of P into three categories: those for which Op≫0, those with Op≈0, and those with Op≪0.

## Picking P Apart

We can define Vi={p∈P∣Op(A; p, fi)≫0}. In other words, Vi is the subset of P which has no local influence on the future node fi whatsoever. If you want to be cute you can think of Vi as the victim of A with respect to fi, because it is causally erased from existence.

Remember that Op is high when "freezing" the information going out of A means that the value of fi depends much *more* on the value of p. What does it mean when "freezing" the information of A does the opposite?

Now let's define Ui={p∈P∣Op(A; p, fi)≪0}. This means that when we freeze A, these nodes have much *less* influence on the future. Their influence flows *through* A. We can call Ui the utility-function-like region of P with respect to fi. We might also think of the actions of A as amplifying the influence of Ui on fi.

For completeness we'll define Xi={p∈P∣Op(A; p, fi)≈0}≡P−Ui−Vi.
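As a concrete sketch, the three-way split can be written as plain threshold tests. Everything here is hypothetical scaffolding: Op is assumed to be available as a black-box function `op(A, p, f)`, and the cutoff value for "≫0" is arbitrary.

```python
# Hypothetical sketch: split the past nodes P into U_i, V_i, X_i for one
# future node f_i, given a black-box op(A, p, f) standing in for Op(A; p, f).

def partition_past(op, A, P, f, cutoff=1.0):
    """Return (U_i, V_i, X_i): amplified, erased, and inert past nodes."""
    U_i, V_i, X_i = set(), set(), set()
    for p in P:
        val = op(A, p, f)
        if val <= -cutoff:
            U_i.add(p)   # Op << 0: p's influence flows through A (utility-like)
        elif val >= cutoff:
            V_i.add(p)   # Op >> 0: A erases p's local influence (victim-like)
        else:
            X_i.add(p)   # Op ~ 0: A neither amplifies nor erases p
    return U_i, V_i, X_i
```

On the soccer example above, the opponent's goal position would land in U_i and the ball's current position in V_i.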

Let's now define U=U0∪...∪Un, i.e. the set of points which are utility-like for *any* future point. V=V0∪...∪Vn will be the set of points which are victim-like for *any* future point. We might want to define V′=V0∩...∩Vn as the set of "total" victims, and U′=U0∩...∩Un similarly. X=P−U−V, meaning X only contains nodes which really don't interact with A at all in an optimizing/amplifying capacity.

We can also think of U and V as the sets of past nodes which have an outsizedly *large* and *small* influence on the future as a result of the actions of A, respectively.

## Describing D

D={d∈E∪F∣∃v∈V: Op(A; v, d)≫0}, in other words D is the region of E∪F in which A is removing the influence of V. We can even define a smaller set D′={d∈E∪F∣∀v∈V′: Op(A; v, d)≫0}, in other words the region in which A is totally removing the (local) influence of the total victim nodes V′. We can call D the domain of A.
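These two sets can be sketched directly from their definitions. Again this is hypothetical scaffolding: `op(A, v, d)` stands in for Op(A; v, d), and the cutoff for "≫0" is an arbitrary threshold.

```python
# Hypothetical sketch: compute the domain D and the smaller set D' from a
# victim set V and a total-victim set V', given a black-box op(A, v, d).

def domain(op, A, EF, V, V_total, cutoff=1.0):
    # D: nodes where A removes the influence of at least one victim node
    D = {d for d in EF if any(op(A, v, d) >= cutoff for v in V)}
    # D': nodes where A removes the influence of EVERY total-victim node
    D_prime = {d for d in EF if all(op(A, v, d) >= cutoff for v in V_total)}
    return D, D_prime
```

Note that D′⊆D holds whenever V′ is non-empty and V′⊆V, matching the "smaller set" description.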

## Approaching A

One issue which we haven't dealt with is how to actually pick the set A! We sort of arbitrarily declared it to be relevant at the start, with no indication as to how we would do it.

There are lots of ways to pick out sets of nodes in a causal network. One involves thinking about minimum information-flow in a very John Wentworth-ish way. Another might be to just start with a very large choice of A and iteratively try moving nodes from A to E: if this shrinks D by a lot, then the node is probably important to A; otherwise it might not be. This heuristic might not always hold, though!
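The second, iterative suggestion might look something like the sketch below. `compute_domain` and the shrink tolerance are hypothetical stand-ins, and, as noted, the heuristic can fail.

```python
# Hypothetical sketch of the pruning heuristic: start with a large candidate
# agent set A and move a node out of A (into E) whenever doing so barely
# shrinks the domain D. compute_domain(A) is an assumed black box.

def prune_agent(A, compute_domain, tolerance=0.9):
    A = set(A)
    for node in sorted(A):           # snapshot; fixed order for determinism
        baseline = len(compute_domain(A))
        trial = A - {node}
        if len(compute_domain(trial)) >= tolerance * baseline:
            A = trial                # D barely shrank: node likely not part of the agent
        # else: D collapsed, so the node seems important for A; keep it
    return A
```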

Perhaps sensible choices of A are ones which make V and V′ more similar, and U and U′ more similar i.e. those for which nodes in P are cleanly split into utility-like and victim-like nodes.
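One way to score that (my construction, not something the post commits to) is to compare U with U′ and V with V′ via Jaccard similarity: a candidate A splits P cleanly when the "any" and "total" versions of each set nearly coincide.

```python
# Hypothetical scoring sketch: how cleanly does a candidate A split P?
# Inputs are the per-future-node sets U_0..U_n and V_0..V_n.

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 1.0

def split_cleanliness(U_sets, V_sets):
    U = set().union(*U_sets)                       # utility-like for any f_i
    V = set().union(*V_sets)                       # victim-like for any f_i
    U_total = set.intersection(*map(set, U_sets))  # U': utility-like for all f_i
    V_total = set.intersection(*map(set, V_sets))  # V': victim-like for all f_i
    # 1.0 means U = U' and V = V': a perfectly clean split
    return min(jaccard(U, U_total), jaccard(V, V_total))
```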

Either way, it seems like we can probably find ways to identify agents in a system.

## Conjectures!

I conjecture that if A is a powerful optimizer, this will be expressed both by D being a large set and by the values of Op(A; v, d) being large.

I conjecture that V nodes, and especially V′ nodes, get a pretty rough deal, and that it is bad to be a person living in V′.

I conjecture that the only nodes which get a good deal out of the actions of A are in U, and that for an AI to be aligned, U needs to contain the AI's creator(s).

## How does this help us think about optimizers?

The utility-function framework has not always been a great way to think about things. The differential optimization framework might be better, or failing that, different. The phrase "utility function" often implies a function which explicitly maps [world]→R and is explicitly represented in code. This framework defines U in a different way: as the set of regions of the past which have an oversized influence on the future via an optimizer A.

Thinking about the number of nodes in V and D (and how well the latter are optimized with respect to the former) also provides a link to Eliezer's definition of optimization power found here. The more nodes in V, the more information we are at liberty to discard; the more nodes in D, the more of the world we can still predict; and the more heavily optimized the nodes of D, the smaller our loss of predictive power.