# Probability is Real, and Value is Complex

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

(This post idea is due entirely to Scott Garrabrant, but it has been several years and he hasn't written it up.)

In 2009, Vladimir Nesov observed that probability can be mixed up with utility in different ways while still expressing the same preferences. The observation was conceptually similar to one made by Jeffrey and Bolker in the book The Logic of Decision, so I give them intellectual priority, and refer to the result as "Jeffrey-Bolker rotation".

Based on Nesov's post, Scott came up with a way to represent preferences as vector-valued measures, which makes the result geometrically clear and mathematically elegant.

## Vector Valued Preferences

As usual, we think of a space of events which form a sigma algebra. Each event $A$ has a probability $P(A)$ and an expected utility $U(A)$ associated with it. However, rather than dealing with $U$ directly, we define $Q(A) = U(A) \cdot P(A)$. Vladimir Nesov called $Q$ "shouldness", but that's fairly meaningless. Since it is graphed on the y-axis, represents utility times probability, and is otherwise fairly meaningless, a good name for it is "up". Here is a graph of probability and upness for some events, represented as vectors:

(The post title is a pun on the fact that this looks like the complex plane: events are complex numbers with real component P and imaginary component Q. However, it is better to think of this as a generic 2D vector space rather than the complex plane specifically.)

If we assume $A$ and $B$ are mutually exclusive events (that is, $A \cap B = \emptyset$), then calculating the P and Q of their union is simple. The probability of the union of two mutually exclusive events is just the sum:

$$P(A \cup B) = P(A) + P(B)$$

The expected utility is the weighted sum of the component parts, normalized by the sum of the probabilities:

$$U(A \cup B) = \frac{U(A)P(A) + U(B)P(B)}{P(A) + P(B)}$$

The numerator is just the sum of the shouldnesses, and the denominator is just the probability of the union:

$$U(A \cup B) = \frac{Q(A) + Q(B)}{P(A \cup B)}$$

But we can multiply both sides by the denominator to get a relationship on shouldness alone:

$$Q(A \cup B) = Q(A) + Q(B)$$

Thus, we know that both coordinates of $A \cup B$ are simply the sums of the component parts. This means the union of disjoint events is vector addition in our vector space, as illustrated in my diagram earlier.
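As a quick sanity check, the vector picture is easy to sketch in code. This is an illustrative sketch of my own (the `Event` class and the numbers are not from the post):

```python
from dataclasses import dataclass

@dataclass
class Event:
    p: float  # probability P
    q: float  # "up": Q = U * P

    def union(self, other: "Event") -> "Event":
        # For mutually exclusive events, both coordinates simply add:
        # union of disjoint events is vector addition.
        return Event(self.p + other.p, self.q + other.q)

    def u(self) -> float:
        # Expected utility is recovered as the slope Q / P (undefined at P = 0).
        return self.q / self.p

a = Event(p=0.25, q=0.25)    # an event with U = 1
b = Event(p=0.25, q=-0.125)  # an event with U = -0.5
ab = a.union(b)
print(ab.p, ab.q, ab.u())  # 0.5 0.125 0.25
```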

## Linear Transformations

When we represent preferences in a vector space, it is natural to think of them as basis-independent: the way we drew the axes was arbitrary; all that matters is the system of preferences being represented. What this ends up meaning is that we don't care about linear transformations of the space, so long as the preferences don't get reflected (which reverses the preference represented). This is a generalization of the usual "utility is unique up to affine transformations with positive coefficient": utility is no longer unique in that way, but the combination of probability and utility is unique up to non-reflecting linear transformations.

Let's look at that visually. Multiplying all the expected utilities by a positive constant doesn't change anything:

Adding a constant to expected utility doesn't change anything:

Slightly weird, but not too weird... multiplying all the probabilities by a positive constant (and doing the same to $Q$, since $Q = U \cdot P$) doesn't change anything (meaning we don't care if probabilities are normalized):

Here's the really new transformation, which combines with the other three to create all the valid transformations. The Jeffrey-Bolker rotation changes what parts of our preferences are represented in probabilities vs. utilities:

Let's pause for a bit on this one, since it is really the whole point of the setup. What does it mean to rotate our vector-valued measure?
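Concretely (and as an assumption on my part, since the convention isn't spelled out above), a rotation by angle $\theta$ acts on each event's $(P, Q)$ vector by the standard counterclockwise rotation matrix:

```latex
\begin{pmatrix} P' \\ Q' \end{pmatrix}
=
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} P \\ Q \end{pmatrix}
```

With $\theta = 30°$ this reproduces the numbers in the worked example below.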

A simple example: suppose that we can take a left path or a right path. There are two possible worlds, which are equally probable. In Left World, the left path leads to a golden city overflowing with wealth and charity, which we would like to reach (V=+1); the right path leads to a dangerous badlands full of bandits, which we would like to avoid (V=-1). On the other hand, Right World (so named because we would prefer to go right in this world) has a somewhat nice village on the right path (V=+.5) and a somewhat nasty swamp on the left (V=-.5). Supposing that we are (strangely enough) uncertain about which path we take, we calculate the events as follows:

• Go left in left-world:
  • P=.25
  • V=1
  • Q=.25
• Go left in right-world:
  • P=.25
  • V=-.5
  • Q=-.125
• Go right in left-world:
  • P=.25
  • V=-1
  • Q=-.25
• Go right in right-world:
  • P=.25
  • V=.5
  • Q=.125
• Go left (union of the two left-going cases):
  • P=.5
  • Q=.125
  • V=Q/P=.25
• Go right (union of the two right-going cases):
  • P=.5
  • Q=-.125
  • V=Q/P=-.25

We can calculate the V of each action and take the best. So, in this case, we sensibly decide to go left, since the Left-world is more impactful to us and both are equally probable.

Now, let's rotate 30° counterclockwise. (Hopefully I get the math right here.)

• Left in L-world:
  • P=.09
  • Q=.34
  • V=3.7
• Left in R-world:
  • P=.28
  • Q=.02
  • V=.06
• Right in L-world:
  • P=.34
  • Q=-.09
  • V=-.26
• Right in R-world:
  • P=.15
  • Q=.23
  • V=1.5
• Left overall:
  • P=.37
  • Q=.36
  • V=.97
• Right overall:
  • P=.49
  • Q=.14
  • V=.29

Now, it looks like going left is evidence for being in R-world, and going right is evidence for being in L-world! The disparity between the worlds has also gotten larger: L-world now has a difference of almost 4 utility between the two paths, rather than 2. R-world now evaluates both paths as positive, with a difference between the two of about 1.4. Also note that our probabilities have stopped summing to one (but, as mentioned already, this doesn't matter much; we could normalize the probabilities if we wanted).

In any case, the final decision is exactly the same, as we expect. I don't have a good intuitive explanation of what the agent is thinking, but roughly, the decreased control the agent has over the situation due to the correlation between its actions and which world it is in seems to be compensated for by the more extreme payoff differences in L-world.
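The rotated numbers above can be checked numerically. This is a sketch assuming the standard counterclockwise rotation convention (which matches the figures as given):

```python
import math

# The four base events from the example, as (P, Q) vectors.
events = {
    "left/L-world":  (0.25,  0.25),
    "left/R-world":  (0.25, -0.125),
    "right/L-world": (0.25, -0.25),
    "right/R-world": (0.25,  0.125),
}

def rotate(p, q, theta):
    # Standard counterclockwise rotation of the (P, Q) vector by theta radians.
    return (p * math.cos(theta) - q * math.sin(theta),
            p * math.sin(theta) + q * math.cos(theta))

theta = math.radians(30)
rotated = {name: rotate(p, q, theta) for name, (p, q) in events.items()}

# Unions of disjoint events are vector sums.
left  = tuple(map(sum, zip(rotated["left/L-world"],  rotated["left/R-world"])))
right = tuple(map(sum, zip(rotated["right/L-world"], rotated["right/R-world"])))

print("left:", left, "V =", left[1] / left[0])
print("right:", right, "V =", right[1] / right[0])
# Left still wins: its V = Q/P is higher, so the decision is unchanged.
```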

## Rational Preferences

Alright, so preferences can be represented as vector-valued measures in two dimensions. Does that mean arbitrary vector-valued measures in two dimensions can be interpreted as preferences?

No.

The restriction that probabilities be non-negative means that events can only appear in quadrants I and IV of the graph. We want to state this in a basis-independent way, though, since it is unnatural to have a preferred basis in a vector space. One way to state the requirement is that there must be a line passing through the (0,0) point, such that all of the events are strictly to one side of the line, except perhaps events at the (0,0) point itself:

As illustrated, there may be a single such line, or there may be multiple, depending on how closely preferences hug the (0,0) point. The normal vector of this line (drawn in red) can be interpreted as the probability dimension, if you want to pull out probabilities in a way which guarantees that they are non-negative. There may be a unique direction corresponding to probability, and there may not. Since $U = Q/P$, we get a unique probability direction if and only if we have events with both arbitrarily high utilities and arbitrarily low ones. So, Jeffrey-Bolker rotation is intrinsically tied up in the question of whether utilities are bounded.
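The "all events strictly to one side of some line through (0,0)" condition can be tested directly. This sketch (my own, with illustrative numbers) checks whether all vectors fit in an open half-plane by looking at the gaps between their angles:

```python
import math

def strictly_separable(vectors):
    # True iff some line through the origin has every vector strictly on one
    # side, i.e. all vectors fit inside an open half-plane. That holds exactly
    # when the largest circular gap between consecutive angles exceeds 180°.
    angles = sorted(math.atan2(q, p) for p, q in vectors)
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(angles[0] + 2 * math.pi - angles[-1])  # wrap-around gap
    return max(gaps) > math.pi

# Events in quadrants I and IV (non-negative P) are always separable:
print(strictly_separable([(0.25, 0.25), (0.25, -0.125), (0.5, 0.125)]))  # True
# Vectors spanning more than a half-plane are not:
print(strictly_separable([(1.0, 0.0), (-1.0, 0.5), (-1.0, -0.5)]))       # False
```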

Actually, Scott prefers a different condition on vector-valued measures: that they have a unique (0,0) event. This allows for either infinite positive utilities (not merely unbounded -- infinite), or infinite negative utilities, but not both. I find this less natural. (Note that we have to have an empty event in our sigma-algebra, and it has to get value (0,0) as a basic fact of vector-valued measures. Whether any other event is allowed to have that value is another question.)

How do we use vector-valued preferences to optimize? The expected value of a vector is its slope, $U = Q/P$. This runs into trouble for probability-zero events, though, which we may create as we rotate. Instead, we can prefer events which are less clockwise:

(Note, however, that the preference of a (0,0) event is undefined.)

This gives the same answers for positive $P$, but keeps making sense as we rotate into other quadrants. More and less clockwise always make sense as notions, since we assumed that the vectors all stay to one side of some line; we can't spin around in a full circle looking for the best option, because we will hit the separating line. This allows us to define a preference relation based on the angle of $A$'s vector being within 180° counterclockwise of $B$'s.
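The "less clockwise" comparison can be implemented without ever dividing by $P$, using the sign of the 2D cross product. The function name and example numbers here are mine:

```python
def prefer(a, b):
    # a and b are (P, Q) vectors. Returns True if a is strictly preferred to b:
    # a's angle is counterclockwise of b's (within the assumed half-plane),
    # detected by the sign of the 2D cross product b × a. No division by P,
    # so probability-zero events are handled.
    return b[0] * a[1] - b[1] * a[0] > 0

print(prefer((0.5, 0.125), (0.5, -0.125)))  # True: go-left beats go-right
print(prefer((0.0, 1.0), (0.5, 0.125)))     # True: a P=0, positive-up event is preferred
```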

## Conclusion

This is a fun picture of how probabilities and utilities relate to each other. It suggests that the two are inextricably intertwined, and meaningless in isolation. Viewing them in this way makes it somewhat more natural to think that probabilities are more like "caring measure" expressing how much the agent cares about how things go in particular worlds, rather than subjective approximations of an objective "magical reality fluid" which determines what worlds are experienced. (See here for an example of this debate.) More practically, it gives a nice tool for visualizing the Jeffrey-Bolker rotation, which helps us think about preference relations which are representable via multiple different belief distributions.

A downside of this framework is that it requires agents to be able to express a preference between any two events, which might be a little absurd. Let me know if you figure out how to connect this to complete-class style foundations which only require agents to have preferences over things which they can control.