**Previously in series**: Aiming at the Target

Yesterday I spoke of how "When I think you're a powerful intelligence, and I think I know something about your preferences, then I'll predict that you'll steer reality into regions that are higher in your preference ordering."

You can quantify this, at least in theory, supposing you have (A) the agent or optimization process's preference ordering, and (B) a measure of the space of outcomes - which, for discrete outcomes in a finite space of possibilities, could just consist of counting them - then you can quantify how small a target is being hit, within how large a greater region.

Then we count the total number of states with equal or greater rank in the preference ordering to the outcome achieved, or integrate over the measure of states with equal or greater rank. Dividing this by the total size of the space gives you the relative smallness of the target - did you hit an outcome that was one in a million? One in a trillion?

Actually, most optimization processes produce "surprises" that are exponentially more improbable than this - you'd need to try far more than a trillion random reorderings of the letters in a book, to produce a play of quality equalling or exceeding Shakespeare. So we take the log base two of the reciprocal of the improbability, and that gives us optimization power in bits.

This figure - roughly, the improbability of an "equally preferred" outcome being produced by a random selection from the space (or measure on the space) - forms the foundation of my Bayesian view of intelligence, or to be precise, optimization power. It has many subtleties:

(1) The wise will recognize that we are calculating the entropy of something. We could take the figure of the relative improbability of "equally good or better" outcomes, and call this the *negentropy of the system relative to a preference ordering.* Unlike thermodynamic entropy, the entropy of a system relative to a preference ordering can easily decrease (that is, the negentropy can increase, that is, things can get better over time relative to a preference ordering).

Suppose e.g. that a single switch will determine whether the world is saved or destroyed, and you don't know whether the switch is set to 1 or 0. You can carry out an operation that coerces the switch to 1; in accordance with the second law of thermodynamics, this requires you to dump one bit of entropy somewhere, e.g. by radiating a single photon of waste heat into the void. But you don't care about that photon - it's not alive, it's not sentient, it doesn't hurt - whereas you care a very great deal about the switch.

For some odd reason, I had the above insight while watching *X TV.* (Those of you who've seen it know why this is funny.)

Taking physical entropy out of propositional variables that you *care about* - coercing them from unoptimized states into an optimized states - and dumping the entropy into residual variables that you *don't care about,* means that *relative to your preference ordering*, the total "entropy" of the universe goes down. This is pretty much what life is all about.

We care more about the variables we plan to alter, than we care about the waste heat emitted by our brains. If this were *not* the case - if our preferences didn't neatly compartmentalize the universe into cared-for propositional variables and everything else - then the second law of thermodynamics would prohibit us from ever doing optimization. Just like there are no-free-lunch theorems showing that cognition is impossible in a maxentropy universe, optimization will prove futile if you have maxentropy preferences. Having maximally disordered preferences over an ordered universe is pretty much the same dilemma as the reverse.

(2) The quantity we're measuring tells us *how improbable this event is, in the absence of optimization, relative to some prior measure that describes the unoptimized probabilities.* To look at it another way, the quantity is how surprised you would be by the event, conditional on the hypothesis that there were no optimization processes around. This plugs directly into Bayesian updating: it says that highly optimized events are strong evidence for optimization processes that produce them.

Ah, but how do you know a mind's preference ordering? Suppose you flip a coin 30 times and it comes up with some random-looking string - how do you know this wasn't because a mind *wanted* it to produce that string?

This, in turn, is reminiscent of the Minimum Message Length formulation of
Occam's Razor: if you send me a message telling me what a mind wants
and how powerful it is, then this should enable you to compress your
description of future events and observations, so that the total
message is shorter. *Otherwise there is no predictive benefit to
viewing a system as an optimization process*. This criterion tells us when to take the intentional stance.

(3) Actually, you need to fit another criterion to take the intentional stance - there can't be a better description that averts the need to talk about optimization. This is an epistemic criterion more than a physical one - a sufficiently powerful mind might have no need to take the intentional stance toward a human, because it could just model the regularity of our brains like moving parts in a machine.

(4) If you have a coin that always comes up heads, there's no need to say "The coin always wants to come up heads" because you can just say "the coin always comes up heads". Optimization will *beat alternative mechanical explanations* when our ability to perturb a system defeats our ability to predict its interim steps in detail, but not our ability to predict a narrow final outcome. (Again, note that this is an epistemic criterion.)

(5) Suppose you believe a mind exists, but you don't know its preferences? Then you use some of your evidence to infer the mind's preference ordering, and then use the inferred preferences to infer the mind's power, then use those two beliefs to testably predict future outcomes. The total gain in predictive accuracy should exceed the complexity-cost of supposing that "there's a mind of unknown preferences around", the initial hypothesis.

Similarly, if you're not sure whether there's an optimizer around, some of your evidence-fuel burns to support the hypothesis that there's an optimizer around, some of your evidence is expended to infer its target, and some of your evidence is expended to infer its power. The rest of the evidence should be well explained, or better yet predicted in advance, by this inferred data: this is your revenue on the transaction, which should exceed the costs just incurred, making an epistemic profit.

(6) If you presume that you know (from a superior epistemic vantage point) the probabilistic consequences of an action or plan, or if you measure the consequences repeatedly, and if you know or infer a *utility function* rather than just a p*reference ordering,* then you might be able to talk about the degree of optimization of an *action or plan* rather than just the negentropy of a final outcome. We talk about the degree to which a plan has "improbably" high expected utility, relative to a measure over the space of all possible plans.

(7) A similar presumption that we can measure the *instrumental value* of a device, relative to a terminal utility function, lets us talk about a Toyota Corolla as an "optimized" physical object, even though we attach little terminal value to it per se.

(8) If you're a human yourself and you take the measure of a problem, then there may be "obvious" solutions that don't count for much in your view, even though the solution might be very hard for a chimpanzee to find, or a snail. Roughly, because your own mind is efficient enough to calculate the solution without an apparent expenditure of internal effort, a solution that good will seem to have high probability, and so an equally good solution will not seem very *improbable.*

By presuming a base level of intelligence, we measure the improbability of a solution that "would take us some effort", rather than the improbability of the same solution emerging from a random noise generator. This is one reason why many people say things like "There has been no progress in AI; machines still aren't intelligent at all." There are legitimate abilities that modern algorithms entirely lack, but mostly what they're seeing is that AI is "dumber than a village idiot" - it doesn't even do as well as the "obvious" solutions that get most of the human's intuitive measure, let alone surprisingly better than that; it seems anti-intelligent, stupid.

To measure *the impressiveness of a solution to a human,* you've got to do a few things that are a bit more complicated than just measuring optimization power. For example, if a human sees an obvious computer program to compute many solutions, they will measure the total impressiveness of all the solutions as being no more than the impressiveness of writing the computer program - but from the internal perspective of the computer program, it might seem to be making a metaphorical effort on each additional occasion. From the perspective of Deep Blue's programmers, Deep Blue is a one-time optimization cost; from Deep Blue's perspective it has to optimize each chess game individually.

To measure human impressiveness you have to talk quite a bit about humans - how humans compact the search space, the meta-level on which humans approach a problem. People who try to take human impressiveness as their primitive measure will run into difficulties, because in fact the measure is not very primitive.

(9) For the vast majority of real-world problems we will not be able to calculate exact optimization powers, any more than we can do actual Bayesian updating over all hypotheses, or actual expected utility maximization in our planning. But, just like Bayesian updating or expected utility maximization, the notion of optimization power does give us a gold standard against which to measure - a simple mathematical idea of *what we are trying to do* whenever we essay more complicated and efficient algorithms.

(10) "Intelligence" is efficient cross-domain optimization.

"Intelligence" is efficient cross-domain optimization.It is not yet clear what distinguishes this from wealth and power, which also allow people to achieve apriori unlikely goals.

And there goes Caledonian making pointless arguments again... Couldn't you pick a more frivolous objection?

(Caledonian's comment was deleted.)

Robin, that's coming up in tomorrow's post "Efficient Cross-Domain Optimization" - for today, I just wanted to be clear that I don't directly equate intelligence to some number of bits of raw optimization power in an arbitrary domain.

Caledonian, you're not disputing anything Eliezer said yet you manage to find a way to be disrespectful. I wish you wouldn't.

I would really enjoy this post more if it were in the context of cognitive neuroscience. Or at least some phenomena actually extracted from the brain. For example, how could we detect a difference in intelligence biologically? Could this inspire a different kind of intelligence measure?

It seems to me that there's an aspect to this that isn't getting much attention: the domain.

Example domains include chess and Go, certainly. But probabilistic games surely should not be excluded. There is a spectrum of domains which go from "fair roulette" (which is not manipulatable by intelligence), though blackjack (slightly manipulable), and only at one end reach highly manipulatable games like chess and Go.

I'm sure Eliezer understands this, but his presentation doesn't spend much time on it.

For example, how do the calculations change when you admit that the domain may make some desirable situations impossible?

Im not sure if i am echoing another post by shane legg (cant remember where).

Consider a three dimensional space (topography) and a preference ordering given by height.

An optimizer that climbs a "hill space" would seems intuitively less powerful than one that finds the highest peak in a "multi hill space", even if relative to a random selection, and given that both spaces are the same size, both points are equally likely.

David, I would categorize this under the heading of "trying to measure how much effort something takes". I'm just trying to integrate over a measure, not describe the

structureof the space relative to a particular way of searching for solutions. One kind of search might bog down where another would succeed immediately - the neighborhood of a problem space as seen by a human is not like the neighborhood of DNA strands seen by natural selection. One mind's cheap problem that can be solved by a short program like water running downhill is another mind's stumper - to a transhuman mind, for example, chess might appear as a pointless exercise because you can solve it by a simple Deep-Blue like program.Indeed, there was recently developed a program that plays provably correct checkers from the canonical starting position - so it is now clear that playing checkers requires no optimization power, since you can solve it as deterministically as water running downhill. At least that's where this argument seems to me to lead.

I think you just have to construct "impressiveness" in a more complex way than "optimization power".

The quantity we're measuring tells us how improbable this event is, in the absence of optimization, relative to some prior measure that describes the unoptimized probabilities. To look at it another way, the quantity is how surprised you would be by the event, conditional on the hypothesis that there were no optimization processes around. This plugs directly into Bayesian updatingThis seems to me to suggest the same fallacy as the one behind p-values... I don't want to know the tail area, I want to know the probability for the event that actually happened (and only that event) under the hypothesis of no optimization divided by the same probability under the hypothesis of optimization. Example of how they can differ: if we know in advance that any optimizer would optimize at least 100 bits, then a 10-bit-optimized outcome is evidence against optimization even though the probability given no optimization of an event at least as preferred as the one that happened is only 1/1024.

I guess it works out if any number of bits of optimization being exerted given the existence of an optimizer is as probable as any other number, but if that is the prior we're starting from then this seems worth stating (unless it follows from the rest in a way that I'm overlooking).

haha, a new anthropic principle for the ID folks: the existence of a universe highly optimized for life implies the existence of an optimizing agent.

This concept of "optimisation power" was previously mentioned here. To recap on some of the objections raised at that time:

Optimisation powersuggests something useful - but the proposed metric contains no reference to the number of trials, the number of trials in series on the critical path - or most of the other common ways of measuring the worth of optimisation processes. It seems to be more a function of the size of the problem space than anything else - in which case, why "power" and not, say "factor".Before christening the notion, there are some basic questions: Is the proposed metric any use? What is the point of it?

Surely not! Just because a problem has a known, deterministic solution, that doesn't mean it doesn't require optimisation to produce that solution.

It would be

extremelyodd epistemological terminology to classify problems as optimisation problems only if we do not already have access to their solutions.I give this article a Mentifax rating of 7.0 for incoherency and pointlessness.

|------X---|

I agree with Tim that the presence of a deterministic solution shouldn't be enough to say whether there's an optimization process going on. But then, Eliezer didn't say "Since there's a deterministic solution, no optimization is going on": it's more like "it's possible that no optimization is going on". Optimization power isn't required, but might be useful.

From The Bedrock of Morality:

For every mind that thinks that terminal value Y follows from moral argument X, there will be an equal and opposite mind who thinks that terminal value not-Y follows from moral argument X.Does the same apply to optimisation processes? In other words, for every mind that sees you flicking the switch to save the universe, does another mind see only the photon of 'waste' brain heat and think 'photon maximiser accidentally hits switch'? Does this question have implications for impartial measurements of, say, 'impressiveness' or 'efficiency'?

Emile, that's what I thought when I read Tim's comment, but then I immediately asked myself at what point between water flowing and neurons firing does a process become simple and deterministic? As Eliezer says, to a smart enough mind, we would look pretty basic. I mean, we weren't even designed by a mind, we sprung from simple selection! But yes, it's possible that optimisation isn't involved at all in water, whereas it pretty obviously is with going to the supermarket etc.

peeper, you score 2 on the comment incoherency criterion but an unprecedented 12 for pointlessness, giving you also an average of 7.0. Congrats!

What I would suggest to begin with (besides any further technical problems) is that optimization power has to be defined relative to a given space or a given class of spaces (in addition to relative to a preference ordering and a random selection)

This allows comparisons between optimizers with a common target space to be more meaningful. In my example above, the hill climber would be less powerful than the range climber because given a "mountain range" the former would be stuck on a local maximum. So for both these optimizers, we would define the target space as the class of NxN topographies, and the range climber's score would be higher, as an average.

It's possible to do that

fairlyneatly and generally - using a Solomonoff prior with a "simple" reference machine.This is backwards, isn't it? Reverse engineering a system designed by a (human?) intelligence is a lot easier than reverse engineering an evolved system.

One problem is that the search space is often unbounded. Looking for the shortest program that performs some specified task? The search space consists of all possible programs. Obviously you start with the short ones - but until you have solved the problem, you don't know how much of the space you will wind up having to search.

Another problem is that enlarging the search space doesn't necessarily make the problem any harder. Compressing the human genome? It probably doesn't make any difference if you search the space of smaller-than-1gb programs, or the space of smaller-than-10gb programs. Beyond a certain point, the size of the search space is often an irrelevance.

It is pretty common for search spaces to have these properties - so defining metrics relative to the size of the search space will often mean that your metrics may not be very useful.

In practice, if you are assessing agents, what you

usuallywant to know is how "good" a specified agent is at solving randomly-selected members of a specified class of problems - where goodness is measured in evaluations, time, cost - or something like that.If you are assessing problems, what you

usuallywant to know is how easy they are to solve - either by a specified agent, or by a population of different agents.Often, in the real world the size of a target tells you how difficult it is to hit. In optimisationverse, that isn't true at all - much depends on the lay of the surrounding land.

Humans were

largelybuilt by sexual selection - which means that the selecting agentsdidhave minds, and that the selection process was oftenextremelycomplex. Details are on http://alife.co.uk/essays/evolution_sees/.Ben Jones: I think a process can be deterministic and (relatively) simple, yet still count as an optimization process. An AI that implements an A* algorithm to find the best path across a maze might quality as a (specialized) Optimization process. You can make more accurate predictions about it's final state than about which way it will turn at a particular intersection - and you can't always do so for a stone rolling down a hill.

But I'm not sure about this, because you could say something similar about the deterministic checkers player - maybe it uses A* too!

In case it wasn't clear, I consider the provable checkers solver to be an optimization process - indeed, the maximally powerful (if not maximally efficient) optimizer for the domain "checkers from the canonical starting point". That it is deterministic or provably correct is entirely irrelevant.

Tim, when I said relative to a space I did not mean relative to its size. This is clear in my example of a hill topography, where increasing the scale of the hill does not make it a qualitatively different problem, just move to positions that are higher will work. In fact, the whole motivation for my suggestion is the realization that the

structureof that space is what limits the results of a given optimizer. So it is relative toallthe properties of the space that the power of an optimizer should be defined, to begin with. I say begin with because there are many other technical difficulties left, but i think that measures of power for optimizers that operate on different spaces do not compare meaningfully.Sure -

youaren't making the same mistake as the original poster in that department.Comparing to the size of the search space is pretty daft - since the search space is often unbounded in optimisation problems.

I'm not sure that I get this. Perhaps I understand the maths, but not the point of it. Here are two optimization problems:

1) You have to output 10 million bits. The goal is to output them so that no two consecutive bits are different.

2) You have to output 10 million bits. The goal is to output them so that when interpreted as an MP3 file, they would make a nice sounding song.

Now, the solution space for (1) consists of two possibilities (all 1s, all 0s) out of 2^10000000, for a total of 9,999,999 bits. The solution space for (2) is millions of times wider, leading to fewer bits. However, intuitively, (2) is a much harder problem and things that optimized (2) are actually doing more of the work of intelligence, after all (1) can be achieved in a few lines of code and very little time or space, while (2) takes much more of these resources.

I agree with David's points about the roughness of the search space being a crucial factor in a meaningful definition of optimization power.

Toby, if you were too dumb to see the closed-form solution to problem 1, it might take an intense effort to tweak the bit on each occasion, or perhaps you might have trouble turning the global criterion of total success or failure into a local bit-fixer; now imagine that you are also a mind that finds it very easy to sing MP3s...

The reason you think one problem is simple is that you perceive a solution in closed form; you can imagine a short program, much shorter than 10 million bits, that solves it, and the work of inventing this program was done in your mind without apparent effort. So this problem is very trivial on the

meta-levelbecause the program that solves it optimally appears very quickly in the ordering of possible programs and is moreover prominent in that ordering relative to our instinctive transformations of the problem specification.But if you were trying random solutions and the solution tester was a black box, then the alternating-bits problem would indeed be harder - so you can't be measuring the

rawdifficulty of optimization if you say that one is easier than the other.This is why I say that the human notion of "impressiveness" is best constructed out of a more primitive notion of "optimization".

We also do, legitimately, find it more natural to talk about "optimized" performance on multiple problems than on a single problem - if we're talking about just a single problem, then it may not compress the message much to say "This is the goal" rather than just "This is the output."

Does this notion of intelligence contain any factor accounting for how many resources have to be burnt to get to a good outcome? Simulated annealing, random walk metropolis, and gradient descent will all find you the local and/or global minima of a particular function. But depending on the landscape of that function, each method will do so in a much more or less illuminating way, not to mention have extremely different computational burdens.

A stochastic optimization approach, like annealing, doesn't require any knowledge about the problem domain at all beyond the utility function (objective functional). You can "set it and forget it" so to speak, which is important when you are aware that you don't know enough to make use of specialized knowledge to narrow the search... or when you can bear to make millions of calls to a single function evaluation (the utility function itself tried on different candidate output states) but cannot bear to compute the derivatives or use other local geometric cues.

The 'value' of an optimization scheme, to me, does not only lie in the Shannon entropy of the subset of outcomes that it can hand to you. Maybe this is just a consequence of utility functions needing to be recursive, so that in trying to optimize my utility, I partially burn some resources on choosing the correct optimization procedures (this choosing being itself a meta-layer of the overall optimization process). But it seems to me that this definition of intelligence is too narrow to be useful, in many of the same ways that Shannon entropy is too narrow to be useful.

Shannon entropy is great if you start out with some information in your hands and you want to send it to that guy over there. But it's a terrible terrible terrible way to quantify information if instead you're trying to pick apart what Nature hands you and decide if there is anything about it worth extracting that you can call "information."

There's a growing body of research which I think you would enjoy called "actionable information theory." It's an idea going back to Gibson that proceeds along the lines that information is just whatever-is-at-hand-that-I-can-exploit-to-achieve-my-goal. It presupposes goals and treats problems of cognition as control theory problems where I learn ways to interpret the world (what things to call information) such that goals are achieved.

I think you touch on these things in other posts, but I can't find anything concrete that would serve as a good connection to stochastic optimization. Most of what I see here tends to assume that the output of an optimization process is the maximizing argument of some distribution, as in decision theory. But there are many cases where you don't want to make a decision that way, and turning to genetic algorithms or stochastic optimization are good alternatives with their own set of theorems about convergence.

One difference between optimization power and the folk notion of "intelligence": Suppose the Village Idiot is told the password of an enormous abandoned online bank account. The Village Idiot now has vastly more optimization power than Einstein does; this optimization power is not based on social status nor raw might, but rather on the actions that the Village Idiot can think of taking (most of which start with logging in to account X with password Y) that don't occur to Einstein. However, we wouldn't label the Village Idiot as more intelligent than Einstein.

Is the Principle of Least Action infinitely "intelligent" by your definition? The PLA consistently picks a physical solution to the n-body problem that

surprisesme in the same way Kasparov's brilliant moves surprise me: I can't come up with the exact path the n objects will take, but after I see the path that the PLA chose, I find (for each object) the PLA's path has a smaller action integral than the best path I could have come up with.An AI whose only goal is to make sure such-and-such coin will not, the next time it's flipped, turn up heads, can apply only (slightly less than) 1 bit of optimization pressure by your definition, even if it vaporizes the coin and then builds a Dyson sphere to provide infrastructure and resources for its ongoing efforts to probe the Universe to ensure that it wasn't tricked and that the coin actually was vaporized as it appeared to be.

Lest my comments here appear

entirelynegative, conventional ways of measuring the merit of optimisation processes onparticularproblems include: cost of a solution and time to find a solution. For info-theory metrics, there's the number of trials, and the number of 'generations' of trials (there's a general generational form of optimisation problems with discrete trials).These metrics have the advantage of being reasonable to calculate while not presuming knowlege of unexplored areas of the search space. Also, they do not depend heavily on the size of the search space (which is often unbounded).

Whether and how well an intelligence "steer[s] reality into regions that are higher in [its] preference ordering" is dependent not just on its intelligence but also on it's "power". A vastly intelligent AI that is running on a disk connected to all kinds of sensors and a power source, but not connected to anything that lets it effect the world outside of itself (internet, contact with humans, a robot arm, a nano-factory, etc.) won't steer reality anywhere. All it can do is think about what it would do if ti could do ANYthing.

Defining intelligence in this way (as an optimization process), would lead us to disregard this hyper-intelligent entity as not an intelligence at all. It seems like we have account for power (the ability to actualize one's will) somewhere. Granted, intelligence makes one powerful, but it is not the only attribute that does so.

Is there a particular formula for negentropy that OP has in mind? I am not seeing how the log of the inverse of the probability of observing an outcome as good or better than the one observed can be interpreted as the negentropy of a system with respect to that preference ordering.

Edit: Actually, I think I figured it out, but I would still be interested in hearing what other people think.

When I started writing this comment I was confused. Then I got myself fairly less confused I think. I am going to say a bunch of things to explain my confusion, how I tried to get less confused, and then I will ask a couple questions. This comment got really long, and I may decide that it should be a post instead.

Take a system X with 8 possible states. Imagine X is like a simplified Rubik's cube type puzzle. (Thinking about mechanical Rubik's cube solvers is how I originally got confused, but using actual Rubik's cubes to explain would make the math harder.) Suppose I want to measure the optimization power of two different optimizers that optimize X, and share the following preference ordering:

x1∼x2∼x3∼x4∼x5∼x6<x7<x8

When I let optimizer1 operate on X, optimizer1 always leaves X=x8. So on the first time I give optimizer1 X I get:

OP=log2(8/1)=3

If I give X to optimizer1 a second time I get:

OP(X1)=log2(8/1)=3

OP(X2)=log2(8/1)=3

OP=log2(64/1)=OP(X1)+OP(X2)=6

This seems a bit weird to me. If we are imagining a mechanical robot with a camera that solves a Rubik's cube like puzzle, it seems weird to say that the solver gets stronger if I let it operate on the puzzle twice. I guess this would make sense for a measure of optimization pressure exerted instead of a measure of the power of the system, but that doesn't seem to be what the post was going for exactly. I guess we could fix this by dividing by the number of times we give optimizer1 X, and then we would get 3 no matter how many times we let optimizer1 operate on X. This would avoid the weird result that a mechanical puzzle solver gets more powerful the more times we let it operate on the puzzle.

Say that when I let optimizer2 operate on X, it leaves X=x7 with probability p, and leaves X=x8 with probability 1−p, but I do not know p. If I let optimizer2 operate on X one time, and I observe X=x7, I get:

OP=log2(8/2)=2

If I let optimizer2 operate on X three times, and I observe X1=x7, X2=x7, X3=x8, then I get:

OP(X1)=log2(8/2)=2

OP(X2)=log2(8/2)=2

OP(X3)=log2(8/1)=3

OP=log2(512/4)=OP(X1)+OP(X2)+OP(X3)=7

Now we could use the same trick we used before and divide by the number of instances on which optimizer2 was allowed to exert optimization pressure, and this would give us 7/3. The thing is though that we do not know p and it seems like p is relevant to how strong optimizer2 is. We can estimate p to be 2/5 using Laplace's rule, but it might be that the long run frequency of times that optimizer2 leaves X=x8 is actually .9999 and we just got unlucky. (I'm not a frequentist, long run frequency just seemed like the closest concept. Feel free to replace "long run frequency" with the prob a solomonoff bot using the correct language assigns at the limit, or anything else reasonable.) If the long run frequency is in fact that large, then it seems like we are underestimating the power of optimizer2 just because we got a bad sample of its performance. The higher p is the more we are underestimating optimizer2 when we measure its power from these observations.

So it seems then like there is another thing that we need to know besides the preference ordering of an optimizer, the measure over the target system in the absence of optimization, and the observed state of the target system, in order to perfectly measure the optimization power of an optimizer. In this case, it seems like we need to know p. This is a pretty easy fix, we can just take the expectation of the optimization power as originally defined with respect to the probability of observing that state when the optimizer is present, but it

isseem more complicated, and itisdifferent.With o being the observed outcome, Ubeing the utility function of the optimization process, and P being the distribution over outcomes in the absence of optimization, I took the definition in the original post to be:

log2(∑i∈{A|U(Ai)≥U(o)}P(Ai))

The definition I am proposing instead is:

EP(o|optimizer)[log2(∑i∈{A|U(Ai)≥U(o)}P(Ai|∼optimizer))]

That is, you take the expectation of the original measure with respect to the distribution over outcomes you expect to observe in the presence of optimization. We could then call the original measure "optimization pressure exerted", and the second measure optimization power. For systems that are only allowed to optimize once, like humans, these values are very similar; for systems that might exert their full optimization power on several occasions depending on circumstance, like Rubik's cube solvers, these values will be different insofar as the system is allowed to optimize several times. We can think of the first measure as measuring the actual amount of optimization pressure that was exerted on the target system on a particular instance, and we can think of the second measure as the expected amount of optimization pressure that the optimizer exerts on the target system.

To hammer the point home, there is the amount of optimization pressure that I in fact exerted on the universe this time around. Say it was a trillion bits. Then there is the expected amount of optimization pressure that I exert on the universe in a given life. Maybe I just got lucky (or unlucky) on this go around. It could be that if you reran the universe from the point at which I was born several times while varying some things that seem irrelevant, I would on average only increase the negentropy of variables I care about by a million bits. If that were the case, then using the amount of optimization pressure that I exerted on this go around as an estimate of my optimization power in general would be a huge underestimate.

Ok, so what's up here? This seems like an easy thing to notice, and I'm sure Eliezer noticed it.

Eliezer talks about how from the

perspectiveof deep blue, it is exerting optimization pressure every time it plays a game, but from the perspective of the programmers, creating deep blue was a one time optimization cost. Is that a different way to cache out the same thing? It still seems weird to me to say that the more times deep blue plays chess, the higher its optimization power is. It does not seem weird to me to say that the more times a human plays chess, the higher its optimization power is. Each chess game is a subsystem of the target system of that human, eg, the environment over time. Whereas it does seem weird to me to say that if you uploaded my brain and let my brain operate on the same universe 100 times, that the optimization power of my uploaded brain would be 100 times greater than if you only did this once.This is a consequence of one of the nice properties of Eliezer's measure: OP sums for independent systems. It makes sense that if I think an optimizer is optimizing two independent systems, then when I measure their OP with respect to the first system and add it to their OP with respect to the second, I should get the same answer I would if I were treating the two systems jointly as one system. The Rubik's cube the first time I give it to a mechanical Rubik's cube solver, and the second time I give it to a mechanical Rubik's cube solver are in fact two such independent systems. So are the first time you simulate the universe after my birth and the second time. It makes sense to me that you should sum my optimization power for independent parts of the universe in a particular go around should sum to my optimization power with respect to the two systems taken jointly as one, but it doesn't make sense to me that you should just add the optimization pressure I exert on each go to get my total optimization power. Does the measure I propose here actually sum nicely with respect to independent systems? It seems like it might, but I'm not sure.

Is this just the same as Eliezer's proposal for measuring optimization power for mixed outcomes? Seems pretty different, but maybe it isn't. Maybe this is another way to extend optimization power to mixed outcomes? It does take into account that the agent might not take an action that guarantees an outcome with certainty.

Is there some way that I am confused or missing something in the original post that it seems like I am not aware of?