Optimisation Measures: Desiderata, Impossibility, Proposals

Alexander Gietelink Oldenziel

I'm very confused about why we think zero for unchanged expected utility and strict monotonicity are reasonable.

A simple example: I want to maximize expected income. I have actions including "get a menial job," and "rob someone at gunpoint and get away with it," where the first gets me more money. Why would I assume that the second requires less optimization power than the first?

mattmacdermott (2y)

Is the general point that optimisation power should be about how difficult a state of affairs is to achieve, not how desirable it is?

I think that's very reasonable. The intuition going the other way is that maybe we only want to credit useful optimisation. If you neither enjoy robbing banks nor make much money from it, maybe I'm not that impressed by the fact that you can do it, even if it's objectively difficult to pull off.

Another point is that we can sort of use the desirability of the state of affairs someone manages to achieve as a proxy for how wide a range of options they had at their disposal. This doesn't apply to the difficulty of achieving the state of affairs, since we don't expect people to be optimising for difficulty. This is an afterthought, though, and maybe there would be better ways to try to measure someone's range of options.

Arthur Conmy (2y)

1. I would have thought that VNM utility is invariant under scaling by $\alpha > 0$, not $\alpha \geq 0$. Is this correct?

2. Is there any alternative to dropping convex-linearity (perhaps other than changing to convexity, as you mention)? Would the space of possible optimisation functions be too large in this case, or is this an exciting direction?
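A small numerical check of point 1, as a sketch: under the usual VNM statement, a utility function is only unique up to $u \mapsto \alpha u + \beta$ with $\alpha > 0$, because $\alpha = 0$ collapses every preference into indifference. The outcomes, utilities, and helper names below are hypothetical and chosen only for illustration.

```python
def expected_utility(lottery, u):
    """Expected utility of a lottery given as {outcome: probability}."""
    return sum(prob * u[outcome] for outcome, prob in lottery.items())

def prefers(lottery_a, lottery_b, u):
    """True if lottery_a has strictly higher expected utility than lottery_b."""
    return expected_utility(lottery_a, u) > expected_utility(lottery_b, u)

def affine(u, alpha, beta):
    """Return the transformed utility function alpha * u + beta."""
    return {outcome: alpha * value + beta for outcome, value in u.items()}

# Hypothetical outcomes and utilities (assumptions for illustration).
u = {"bad": 0.0, "ok": 1.0, "good": 3.0}
risky = {"bad": 0.5, "good": 0.5}  # expected utility 1.5 under u
safe = {"ok": 1.0}                 # expected utility 1.0 under u

original = prefers(risky, safe, u)
# alpha > 0 keeps the strict preference between the two lotteries.
strictly_positive = prefers(risky, safe, affine(u, alpha=2.0, beta=-7.0))
# alpha = 0 makes the utility constant, so the strict preference disappears.
zero_scale = prefers(risky, safe, affine(u, alpha=0.0, beta=-7.0))
```

Here `original` and `strictly_positive` agree, while `zero_scale` does not, which is the sense in which the invariance should be stated for $\alpha > 0$ only.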

Alexander Gietelink Oldenziel (2y)

1. Correct.

2. Convexity rather than linearity would make OP an infra-expectation. It's not something we've looked into, but perhaps somebody may find something interesting there.

mattmacdermott (2y)

Changed 1, thanks.

You definitely wouldn't want to drop invariance, I think. Probably zero for unchanged expected utility and strict monotonicity could go, but I think you would need a conceptual argument about what you want OP to measure in order to constrain the search space a bit.

Richard_Kennaway (2y)

Your formula in the proof of Proposition 1 is scaling invariant but not translation invariant:

Should it be this?

$\mathrm{rep}(p, [u]) = \frac{u(x) - u(x_2)}{u(x_1) - u(x_2)}$
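As a quick check (a sketch, assuming the equivalence class $[u]$ consists of the positive affine transformations $u' = \alpha u + \beta$ with $\alpha > 0$, and that $u(x_1) \neq u(x_2)$), the proposed expression takes the same value on every representative, because the translation $\beta$ cancels in both differences and the scale $\alpha$ cancels in the ratio:

$$\frac{u'(x) - u'(x_2)}{u'(x_1) - u'(x_2)} = \frac{(\alpha u(x) + \beta) - (\alpha u(x_2) + \beta)}{(\alpha u(x_1) + \beta) - (\alpha u(x_2) + \beta)} = \frac{\alpha \, (u(x) - u(x_2))}{\alpha \, (u(x_1) - u(x_2))} = \frac{u(x) - u(x_2)}{u(x_1) - u(x_2)}$$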

mattmacdermott (2y)

Thanks, should be fixed now.

It's not that we needed to add a translation here to end up with the right definition of in terms of $u$, but with the way we had written it, $\mathrm{rep}$ wasn't a well-defined function of equivalence classes. We had restated Proposition 1 to try to make things cleaner, but it turns out that messed things up, so we've reverted to the previous statement. Hopefully it should all work now.

Alex_Altair (2y)

Utility functions might already be the true name - after all, they do directly measure optimisation, while probability doesn't directly measure information.
The true name might have nothing to do with utility functions - Alex Altair has made the case that it should be defined in terms of preference orderings instead.

My vote here is for something between "Utility functions might already be the true name" and "The true name might have nothing to do with utility functions".

It sounds to me like you're chasing an intuition that is validly reflecting one of nature's joints, and that that joint is more or less already named by the concept of "utility function" (but where further research is useful).

And separately, I think there's another natural joint that I (and Yudkowsky and others) call "optimization", and this joint has nothing to do with utility functions. Or more accurately, maximizing a utility function is an instance of optimization, but has additional structure.

rotatingpaguro (2y)

I remembered this when I read the following excerpt in Meaning and Agency:

In Belief in Intelligence, Eliezer sketches the peculiar mental state which regards something else as intelligent:
Imagine that I'm visiting a distant city, and a local friend volunteers to drive me to the airport. I don't know the neighborhood. Each time my friend approaches a street intersection, I don't know whether my friend will turn left, turn right, or continue straight ahead. I can't predict my friend's move even as we approach each individual intersection - let alone, predict the whole sequence of moves in advance.
Yet I can predict the result of my friend's unpredictable actions: we will arrive at the airport.
[...]
I can predict the outcome of a process, without being able to predict any of the intermediate steps of the process.
In Measuring Optimization Power, he formalizes this idea by taking a preference ordering and a baseline probability distribution over the possible outcomes. In the airport example, the preference ordering might be how fast they arrive at the airport. The baseline probability distribution might be Eliezer's probability distribution over which turns to take -- so we imagine the friend turning randomly at each intersection. The optimization power of the friend is measured by how well they do relative to this baseline.
I think this can be a useful notion of agency, but constructing this baseline model does strike me as rather artificial. We're not just sampling from Eliezer's world-model. If we sampled from Eliezer's world-model, the friend would turn randomly at each intersection, but they'd also arrive at the airport in a timely manner no matter which route they took -- because Eliezer's actual world-model believes the friend is capably pursuing that goal.
So to construct the baseline model, it is necessary to forget the existence of the agency we're trying to measure while holding other aspects of our world-model steady. While it may be clear how to do this in many cases, it isn't clear in general. I suspect if we tried to write down the algorithm for doing it, it would involve an "agency detector" at some point; you have to be able to draw a circle around the agent in order to selectively forget it. So this is more of an after-the-fact sanity check for locating agents, rather than a method of locating agents in the first place.
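The measure described in the excerpt can be sketched for a finite outcome set as follows. This assumes the baseline distribution has already been constructed, which is exactly the step the excerpt flags as the hard part; the outcome names, probabilities, and function name are hypothetical.

```python
import math

def optimization_power_bits(baseline, rank, achieved):
    """Bits of optimisation for one achieved outcome: log2 of the reciprocal
    of the baseline probability of doing at least as well as that outcome."""
    mass = sum(prob for outcome, prob in baseline.items()
               if rank[outcome] >= rank[achieved])
    return math.log2(1.0 / mass)

# Hypothetical baseline: where random turning would leave us (made-up numbers).
baseline = {"on_time": 0.125, "late": 0.375, "lost": 0.5}
# Preference ordering encoded as ranks; higher rank means more preferred.
rank = {"on_time": 2, "late": 1, "lost": 0}

bits_on_time = optimization_power_bits(baseline, rank, "on_time")  # 3.0
bits_late = optimization_power_bits(baseline, rank, "late")        # 1.0
bits_lost = optimization_power_bits(baseline, rank, "lost")        # 0.0
```

Only the ordering of the ranks matters here, not their numerical values, which reflects the fact that this measure takes a preference ordering rather than a utility function.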

By measure we mean a standard unit used to express the size, amount, or degree of something, not a probability measure. Alexander voted for yardstick to avoid confusion; Matt vetoed.

Since we assume $\Omega$ is finite, there is only one reasonable topology on $\Delta\Omega$ and $U$, namely the Euclidean topology.

When $u(x_1) = u(x_2)$ we have to interpret $OP$ as negative infinity, zero, or positive infinity, depending on the sign of the numerator.

Here we distinguish de-optimisation, by which we mean something like accidental or collateral damage, from disoptimisation: deliberate pessimisation of a utility function. If we are instead interested in interpreting expected utility decreases as disoptimisation, it would be natural to define $OP_{SG}^{-} = \min_{p' \preceq p} D_{KL}(p \,\|\, p')$, i.e. the amount of disoptimisation that has taken place is the least amount of disturbance needed to do even worse.

Yudkowsky defined $OP$ as a function of default distribution, outcome, and preference ordering; we've made it a function of default distribution, achieved distribution, and utility function by taking an expectation under the achieved distribution and using the induced preference ordering of the utility function.
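Written out as a sketch (assuming finite $\Omega$, logarithms base 2, $p$ for the default distribution, $p_a$ for the achieved distribution, and Yudkowsky's "at least as good as" convention for the outcome-level measure), the construction described here reads:

$$OP(p, x, u) = -\log_2 \sum_{y \,:\, u(y) \geq u(x)} p(y), \qquad OP(p, p_a, u) = \mathbb{E}_{x \sim p_a}\big[OP(p, x, u)\big] = -\sum_{x} p_a(x) \log_2 \sum_{y \,:\, u(y) \geq u(x)} p(y)$$

The utility function enters only through the ordering it induces on outcomes, via the condition $u(y) \geq u(x)$.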

Related ideas are to consider a sequence of distributions $p_1, p_2, p_3$ and require something like $OP(p_1, p_3, u) = OP(p_1, p_2, u) + OP(p_2, p_3, u)$, or to get into more exotic operadic compositionality-style axioms like the one in Theorem 5.3 here.
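To illustrate the additivity requirement (a sketch only, not one of the measures proposed in the post): the raw change in expected utility satisfies it by telescoping, although it rescales with $u$ and so would not be invariant under positive scaling.

$$\big(\mathbb{E}_{p_2}[u] - \mathbb{E}_{p_1}[u]\big) + \big(\mathbb{E}_{p_3}[u] - \mathbb{E}_{p_2}[u]\big) = \mathbb{E}_{p_3}[u] - \mathbb{E}_{p_1}[u]$$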

Another avenue is to replace convex-linearity with convexity, in which case $OP(p, p_a, u)$ might arise as an infra-expectation of $OP(p, x, u)$, if not an expectation.
