Thoughts and problems with Eliezer's measure of optimization power

OP measures the system's ability to achieve its goals, not how hard these goals are.

Not really. It's often largely a measure of how big the specified solution space is - e.g.:

What's the smallest prime larger than a million, and smaller than a billion?
What's the smallest prime larger than a million, and smaller than a trillion?

Essentially the same problem, bigger solution space - more EOP [Eliezer Optimization Power].

Thanks, have updated the text with a variant of that problem.

A concept I've played with, coming off of Eliezer's initial take on the problem of formulating optimization power, is: Suppose something generated N options randomly and then chose the best. Given the observed choice, what is the likelihood function for N?

For continuously distributed utilities, this can be computed directly using beta distributions. Beta(N, 1) is the probability density for the highest of N uniformly distributed unit random numbers. This includes numbers which are cumulative probabilities for a continuous distribution at values drawn from that distribution, and therefore numbers which are cumulative probabilities at the goodness of an observed choice. (N doesn't have to be an integer, because beta distributions are defined for non-integer parameters.)

(The second step of this construction, where you attach a beta distribution to another distribution's CDF, I had to work out by myself; it's not directly mentioned in any discussions of extreme value statistics that I could find. The Mathworld page on order statistics, one step of generalization away, uses the formula for a beta CDF transformed by another CDF, but it still doesn't refer to beta distributions by name.)

If the utilities are discretely distributed, you have to integrate the beta density over the interval of cumulative probabilities that invert to the observed utility.

To handle choices of mixtures, I guess you could modify this slightly, and ask about the likelihood function for N given the observed outcome, marginalizing over (as User:Kindly also suggests) the (unobserved) choice of option. This requires a distribution over options and a conditional distribution over observations given options. This would also cover situations with composite options where you only observe one of the aspects of the chosen option.

Opposed optimization might be very crudely modeled by increasing the number on the opposite side of the beta distribution from N. Somewhat related to this is Warren Smith's "Fixed Point for Negamaxing Probability Distributions on Regular Trees", which examines the distributions of position values that result when two opponents take turns choosing the worst option for each other.

Alternatively, instead of a likelihood function for N, you could have a likelihood function for an exponential weighting λ on the expected utility of the option:

Pr(A was chosen)/Pr(B was chosen) ∝ exp(λ(U(A)-U(B))).

Higher values of λ would be hypotheses in which better options were more strongly randomly selected over worse ones. (This is something like a logit-response model, for which λ (or "β", or "1/μ") would be the "rationality" parameter. It might be more familiar as a Gibbs distribution.) But this would fail when the expected utility from the null distribution was heavy-tailed, because then for some λ≠0 the distribution of optimized expected utilities would be impossible to normalize. Better would be for the cumulative probability at the expected utility of the chosen option to be what was exponentially weighted by λ. In that case, in the limit λ = (N-1) >> 1, the two models give the same distribution.

All of these statistics, as well as Eliezer's original formulation, end up encoding equivalent information in the limit where the utility of each option is an independent sum of many identical light-tailed-distributed components and you're predicting a marginal distribution of utilities for one of those components. In this limit you can safely convert everything to a statistical mechanics paradigm and back again.

Of course, the real criterion for a good formulation of optimization power is whether it helps people who use it in an argument about things that might be optimizers, or who hear it used in such an argument, to come to truthful conclusions.

In this respect, likelihood functions can have the problem that most people won't want to use them: they're hard to compute with, or communicate, unless they belong to a low-dimensional family. The likelihood functions I suggested won't do that except under very simple conditions. I'm not sure what the best way would be to simplify them to something lower-dimensional. I guess you could just communicate a maximum-likelihood estimate and precision for the optimization power parameter. Or, if you chose a reference prior over optimization power, you could communicate its posterior mean and variance.

All of this presupposes that the problems with the unoptimized probability measure can be dealt with. Maybe it would work better to describe the optimization power of a system in terms of a series of levels of simpler systems leading up to that system, where each level's new amount of optimization was only characterized approximately, and only relative to something like a distribution of outputs from the previous level. (This would at least patch the problem where, if thermodynamics is somehow involved in the results of an action, that action can count as very powerful relative to the uniform measure over the system's microstates.) If optimizer B is sort of like choosing the best of N options generated by optimizer A, and optimizer C is sort of like choosing the best of M options generated by optimizer B, that might not have to mean that optimizer C is much like choosing the best of N*M options generated by optimizer A.

[-]jsalvatier14y40

Interesting discussion. What do want to do with this measure once you have it? I can see how it would be elegant to have it, but it seems useful to think about how you would use this measure to inform your decisions.

[-]timtyler14y00

Those building optimization algorithms need measures of their performance on the target problem. They generally measure space-time resources to find a solution. AFAIK, few bother with measuring solution quality Eliezer-style. Counting the better solutions not found is often extremely expensive. Dividing by the size of the soultion space is usually pointless.

[-]Stuart_Armstrong14y00

Well, it's always good to have a measure of intelligence, it we're worrying about high intelligence beings. Also, I was hoping that it might give a way of formulating "reduced impact AI". Alas, it seems to be insufficient.

[-]John_Maxwell13y20

Would it make sense for the measure of real-world intelligence to be "optimization power per unit time" or similar? Given arbitrarily large amounts of time, I could do an exhaustive search of the solution space or something like that, which isn't very intelligent or useful.

Another point: It doesn't seem like optimization power is usefully defined for every possible search space. Let's say our search space is countably infinite, each item corresponds to a single unique natural number, and each item's score is equal to its corresponding natural number. I think for a while and come up with the solution that has score 1 million. What's my optimization power? 0? Regardless of the solution I come up with, my optimization power is going to be 0, even though I vastly prefer some solutions to others. (I don't know how much of a problem this would be in practice... it seems like it might depend on the "shape of the infinitude" of a given solution space.)

(Let me know if these don't make sense for some reason; I haven't taken the time to understand EY's idea of optimization power in depth.)

[-]amcknight14y20

If OP were an entropy, then we'd simply do a weighted sum 1/2(OP(X4)+OP(X7))=1/2(1+3)=2, and then add one extra bit of entropy to represent our (binary) uncertainty as to what state we were in, giving a total OP of 3.

I feel like you're doing something wrong here. You're mixing state distribution entropy with probability distribution entropy. If you introduce mixed states, shouldn't each mixed state be accounted for in the phase space that you calculate the entropy over?

[-]Stuart_Armstrong14y10

If you down the "entropy is ignorance about the exact microstate" route, this makes perfect sense. And various people have made convincing sounding arguments that this is the right way to see entropy, though I'm not expert myself.

[-]amcknight14y00

I'm not an expert either. However, the OP function has nothing to do with ignorance or probabilities until you introduce them in the mixed states. It seems to me that this standard combining rule is not valid unless you're combining probabilities.

[-]Stuart_Armstrong14y00

Hence OP is not an entropy.

[-]Kindly14y20

I think a lot of the problems come from starting with final states. Instead, we can let Y1, Y2, ... be the possible outputs of the system S, each with a certain utility attached. There is no such thing as mixed states: you can't half-output of Y1 and half-output Y2 (in some cases, you might have a probabilistic strategy which outputs Y1 50% of the time, but the utility calculation for that is different). Furthermore, there is no need to deal with the coarse-graining issue.

[-]Shmi14y00

How do you assign utility to an output that is a mixed final state? As a weighted sum of utilities?

[-]Kindly14y00

Presumably. It's debatable whether or not this captures risk aversion, but in my opinion it does (and risk aversion falls out of nonlinear utility).

But one thing to keep in mind is that if there is an output Y1 that leads to a final state X1, and an output Y2 that leads to a final state X2, then there is not necessarily an output leading to 0.5X1+0.5X2.

[-]Martín Soto3y10

Even if we just restrict to a tiny, non-zero area, around X₇, we get arbitrarily high OP - it's not a fluke or a calculation error. Which means that if we followed the distributive law, Q=(1-10^-1000) X₀ + 10^-1000 X₇ must have a much larger OP than X₆ - despite the fact that nearly every possible outcome is better than Q.

It's not weird that the discrete probability distribution X_7 has infinite OP, because in the real world we never get absolute certainty.

And anyway, that's obviously not how probabilities on the simplex are supposed to work, right? Just define a probability density on the simplex of probabilities (for example, uniform on the simplex), and for any given p in the simplex, its OP(p) is the reciprocal of the integral (using that measure) of the set {q | EV(q)>= EV(p)} (we do need utility for this, not only a preference ordering). Drawing a simplex for U(X_n)=n shows why we wouldn't even naturally expect distributivity to hold.

[-]MinibearRex13y10

Similarly, OP measures the system's ability to achieve its very top goals, not how hard these goals are. A system that wants to compose a brilliant sonnet has more OP than exactly the same system that wants to compose a brilliant sonnet while embodied in the Andromeda galaxy. Even though the second is plausibly more dangerous. So OP is a very imperfect measure of how powerful a system is.

I'm confused. A system that has to compose a brilliant sonnet and make sure that it exists in the Andromeda galaxy has to hit a smaller target of possible worlds than a system that wants to compose a brilliant sonnet, and doesn't care where it ends up. Achieving more complex goals require more optimization power, in Eliezer's sense, than achieving simple goals.

[-]amcknight14y10

It seems that optimization power as it's currently defined would be a value that doesn't change with time (unless the agent's preferences change with time). This might be fine depending what you're looking for, but the definition of optimization power that I'm looking for would allow an agent to gain or lose optimization power.

[-]amcknight14y10

Notice that OP is defined in terms of the state that S achieved. So it give us a measure of how powerful S is in practice in our model, not some platonic measure of how good S is in general situations. This does not seem a drawback to OP: after all, we want to measure how powerful a system actually is, not how powerful it could be in other circumstances.

I think that if you're looking for a useful measure of optimization power, you will not want to use actual achievement rather than potential achievement if you want to have a nicely encapsulated concept that doesn't include properties of the rest of the environment. Clearly the dangerous optimizers are the ones with actual power rather than merely potential power, but I think it's much more clear to just talk about optimization power as potentially dangerous, given an environment conducive to that optimizer.

[-]Cyan14y10

Measuring optimization power in terms of achieved outcome is wrong for the same reason that attributing a highly desired outcome to rationality is wrong. Even if I win the lottery, it was still a negative expected dollars choice to play the lottery. What we really want is something like the minimum utility outcome a process will achieve where the minimum is taken over a given a set of starting states. Or we could have a probability distribution over starting states and get the expected utility of the outcome. ETA: oops, I don't mean utility of the outcome, I mean measure of the set of outcomes with greater utility than the minimum or expected or whatever.

The problem of the size of the solution space isn't a problem if we content ourselves with comparing optimization power of different processes (I think...).

[-]Stuart_Armstrong14y30

Even if I win the lottery, it was still a negative expected dollars choice to play the lottery.

At that point in the post, I hadn't introduced mixed states. So the only options are to choose among fully deterministic worlds. Obviously if you had the choice between "win the lottery or don't win the lottery", then you'd go for the first.

Later when I bring in mixed states you can model such things as "buy a lottery ticket" as the achieved state, which generally has expected disutility.

[-]Alex_Altair14y10

I'm confused.

If OP were an entropy, then we'd simply do a weighted sum 1/2(OP(X4)+OP(X7))=1/2(1+3)=2, and then add one extra bit of entropy to represent our (binary) uncertainty as to what state we were in

Why do we add the extra bit? Doesn't the weighted sum already represent that uncertainty?

[-]Manfred14y60

Suppose the X's had 0 entropy each - that is, they were states with no "internal moving parts," like an electron.

Now imagine that you introduced ignorance into the problem - now we don't know if the electron is in state 4 or state 7, so you assign each P=0.5. What is the entropy of this distribution?

Well, it turns out the entropy (amount of ignorance) is 1 bit. Which is 1 bit more than the 0 bits of entropy that states 4 and 7 had on their own.

[-]amcknight14y00

It turns out we can't. Imagine that, without changing anything else, the utility of X7 is suddenly set to ten trillion, rather than 7. The OP of X7 is still 3 - it's still the best option, still with probability 1/8.

I think a problem with this line of attack is that you are mixing preferences and utilities. You could imagine two types of optimization power. A preference-centric one and a utility-centric one, both of which can be useful depending what you're talking about. You can map preferences to utilities and utilities to preferences, but one may be more natural than the other for your purposes.

[-]amcknight14y00

An additional problem with the original definition of optimization power is that it requires the agent to have a rich set of preferences. If it doesn't prefer much but would have been very effective with its capacities had it had more preferences, I'd still say the optimization process was powerful. It seems to me that the most useful concept of optimization power will be independent of the optimizer's actual preferences (or utility).

LESSWRONG
is fundraising!
LW

LESSWRONG
is fundraising!
LW

36

Thoughts and problems with Eliezer's measure of optimization power

36

36

The system, the whole system, and everything else in the universe

Is it OP an entropy? Is it defined for mixed states?

Measures and coarse-graining