Followup to: What I Think, If Not Why

My esteemed co-blogger Robin Hanson accuses me of trying to take over the world.

Why, oh why must I be so misunderstood?

(Well, it's not like I don't enjoy certain misunderstandings.  Ah, I remember the first time someone seriously and not in a joking way accused me of trying to take over the world.  On that day I felt like a true mad scientist, though I lacked a castle and hunchbacked assistant.)

But if you're working from the premise of a hard takeoff - an Artificial Intelligence that self-improves at an extremely rapid rate - and you suppose such extra-ordinary depth of insight and precision of craftsmanship that you can actually specify the AI's goal system instead of automatically failing -

- then it takes some work to come up with a way not to take over the world.

Robin talks up the drama inherent in the intelligence explosion, presumably because he feels that this is a primary source of bias.  But I've got to say that Robin's dramatic story does not sound like the story I tell of myself.  There, the drama comes from tampering with such extreme forces that every single idea you invent is wrong.  The standardized Final Apocalyptic Battle of Good Vs. Evil would be trivial by comparison; then all you have to do is put forth a desperate effort.  Facing an adult problem in a neutral universe isn't so straightforward.  Your enemy is yourself, who will automatically destroy the world, or just fail to accomplish anything, unless you can defeat you.  That is the drama I crafted into the story I tell myself, for I too would disdain anything so cliched as Armageddon.

So, Robin, I'll ask you something of a probing question.  Let's say that someone walks up to you and grants you unlimited power.

What do you do with it, so as to not take over the world?

Do you say, "I will do nothing - I take the null action"?

But then you have instantly become a malevolent God, as Epicurus said:

Is God willing to prevent evil, but not able?  Then he is not omnipotent.
Is he able, but not willing?  Then he is malevolent.
Is he both able and willing?  Then whence cometh evil?
Is he neither able nor willing?  Then why call him God?

Peter Norvig said, "Refusing to act is like refusing to allow time to pass."  The null action is also a choice.  So have you not, in refusing to act, established all sick people as sick, established all poor people as poor, ordained all in despair to continue in despair, and condemned the dying to death?  Will you not be, until the end of time, responsible for every sin committed?

Well, yes and no.  If someone says, "I don't trust myself not to destroy the world, therefore I take the null action," then I would tend to sigh and say, "If that is so, then you did the right thing."  Afterward, murderers will still be responsible for their murders, and altruists will still be creditable for the help they give.

And to say that you used your power to take over the world by doing nothing to it, seems to stretch the ordinary meaning of the phrase.

But it wouldn't be the best thing you could do with unlimited power, either.

With "unlimited power" you have no need to crush your enemies.  You have no moral defense if you treat your enemies with less than the utmost consideration.

With "unlimited power" you cannot plead the necessity of monitoring or restraining others so that they do not rebel against you.  If you do such a thing, you are simply a tyrant who enjoys power, and not a defender of the people.

Unlimited power removes a lot of moral defenses, really.  You can't say "But I had to."  You can't say "Well, I wanted to help, but I couldn't."  The only excuse for not helping is if you shouldn't, which is harder to establish.

And let us also suppose that this power is wieldable without side effects or configuration constraints; it is wielded with unlimited precision.

For example, you can't take refuge in saying anything like:  "Well, I built this AI, but any intelligence will pursue its own interests, so now the AI will just be a Ricardian trading partner with humanity as it pursues its own goals."  Say, the programming team has cracked the "hard problem of conscious experience" in sufficient depth that they can guarantee that the AI they create is not sentient - not a repository of pleasure, or pain, or subjective experience, or any interest-in-self - and hence, the AI is only a means to an end, and not an end in itself.

And you cannot take refuge in saying, "In invoking this power, the reins of destiny have passed out of my hands, and humanity has passed on the torch."  Sorry, you haven't created a new person yet - not unless you deliberately invoke the unlimited power to do so - and then you can't take refuge in the necessity of it as a side effect; you must establish that it is the right thing to do.

The AI is not necessarily a trading partner.  You could make it a nonsentient device that just gave you things, if you thought that were wiser.

You cannot say, "The law, in protecting the rights of all, must necessarily protect the right of Fred the Deranged to spend all day giving himself electrical shocks."  The power is wielded with unlimited precision; you could, if you wished, protect the rights of everyone except Fred.

You cannot take refuge in the necessity of anything - that is the meaning of unlimited power.

We will even suppose (for it removes yet more excuses, and hence reveals more of your morality) that you are not limited by the laws of physics as we know them.  You are bound to deal only in finite numbers, but not otherwise bounded.  This is so that we can see the true constraints of your morality, apart from your being able to plead constraint by the environment.

In my reckless youth, I used to think that it might be a good idea to flash-upgrade to the highest possible level of intelligence you could manage on available hardware.  Being smart was good, so being smarter was better, and being as smart as possible as quickly as possible was best - right?

But when I imagined having infinite computing power available, I realized that no matter how large a mind you made yourself, you could just go on making yourself larger and larger and larger.  So that wasn't an answer to the purpose of life.  And only then did it occur to me to ask after eudaimonic rates of intelligence increase, rather than just assuming you wanted to immediately be as smart as possible.

Considering the infinite case moved me to change the way I considered the finite case.  Before, I was running away from the question by saying "More!"  But considering an unlimited amount of ice cream forced me to confront the issue of what to do with any of it.

Similarly with population:  If you invoke the unlimited power to create a quadrillion people, then why not a quintillion?  If 3^^^3, why not 3^^^^3?  So you can't take refuge in saying, "I will create more people - that is the difficult thing, and to accomplish it is the main challenge."  What, individually, is a life worth living?
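(For readers unfamiliar with the notation: 3^^^3 and 3^^^^3 use Knuth's up-arrow notation for iterated exponentiation. A minimal recursive sketch follows; it is illustrative only, since the actual values named above are astronomically too large to compute.)

```python
def up(a, n, b):
    """Knuth's up-arrow a ^...^ b with n arrows.

    One arrow is ordinary exponentiation; each extra arrow iterates the
    previous operation.  up(3, 3, 3) is 3^^^3; up(3, 4, 3) is 3^^^^3
    (both far too large to ever evaluate).
    """
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up(a, n - 1, up(a, n, b - 1))

# Small cases that actually fit in memory:
assert up(3, 1, 3) == 27      # 3^3
assert up(2, 2, 3) == 16      # 2^^3 = 2^(2^2)
assert up(2, 3, 3) == 65536   # 2^^^3 = 2^^(2^^2) = 2^^4
```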

You can say, "It's not my place to decide; I leave it up to others" but then you are responsible for the consequences of that decision as well.  You should say, at least, how this differs from the null act.

So, Robin, reveal to us your character:  What would you do with unlimited power?

97 comments

Don't bogart that joint, my friend.

I would be very interested to know how many (other) OB/LW readers use cannabis recreationally.

The one ring of power sits before us on a pedestal; around it stand a dozen folks of all races. I believe that whoever grabs the ring first becomes invincible, all powerful. If I believe we cannot make a deal, that someone is about to grab it, then I have to ask myself whether I would wield such power better than whoever I guess will grab it if I do not. If I think I'd do a better job, yes, I grab it. And I'd accept that others might consider that an act of war against them; thinking that way they may well kill me before I get to the ring.

With the ring,...

I'm not asking you if you'll take the Ring, I'm asking what you'll do with the Ring. It's already been handed to you.

Take advice? That's still something of an evasion. What advice would you offer you? You don't seem quite satisfied with what (you think is) my plan for the Ring - so you must already have an opinion of your own - what would you change?

Eliezer, I haven't meant to express any dissatisfaction with your plans to use a ring of power. And I agree that someone should be working on such plans even if the chances of it happening are rather small. So I approve of your working on such plans. My objection is only that if enough people overestimate the chance of such scenario, it will divert too much attention from other important scenarios. I similarly think global warming is real, worthy of real attention, but that it diverts too much attention from other future issues.

This is a great device for illustrating how devilishly hard it is to do anything constructive with such overwhelming power, yet not be seen as taking over the world. If you give each individual whatever they want, you've just destroyed every variety of collectivism or traditionalism on the planet, and those who valued those philosophies will curse you. If you implement any single utopian vision everyone who wanted a different one will hate you, and if you limit yourself to any minimal level of intervention everyone who wants larger benefits than you provid...

If you use your unlimited power to make everyone including yourself constantly happy by design, and reprogram the minds of everybody into always approving of whatever you do, nobody will complain or hate you. Make every particle in the universe cooperate perfectly to maximize the amount of happiness in all future spacetime (and in the past as well, if time travel is possible when you have unlimited power). Then there would be no need for free will or individual autonomy for anybody anymore.
Why was that downvoted by 3? What I did was, I disproved Billy Brown's claim that "If you implement any single utopian vision everyone who wanted a different one will hate you". Was it wrong of me to do so?
Perhaps they see you as splitting hairs between being seen as taking over the world, and actually taking over the world. In your scenario you are not seen as taking over the world because you eliminate the ability to see that - but that means that you've actually taken over the world (to a degree greater than anyone has ever achieved before). But in point of fact, you're right about the claim as stated. As for the downvotes - voting is frequently unfair, here and everywhere else.
Thanks for explaining! I didn't mean to split hairs at all. I'm surprised that so many here seem to take it for granted that, if one had unlimited power, one would choose to let other people remain having some say, and some autonomy. If I would have to listen to anybody else in order to be able to make the best possible decision about what to do with the world, this would mean I would have less than unlimited power. Someone who has unlimited power will always do the right thing, by nature.

And besides: Suppose I'd have less than unlimited power but still "rather complete" power over every human being, and suppose I'd create what would be a "utopia" only to some, but without changing anybody's mind against their will, and suppose some people would then hate me for having created that "utopia". Then why would they hate me? Because they would be unhappy. If I'd simply make them constantly happy by design - I wouldn't even have to make them intellectually approve of my utopia to do that - they wouldn't hate me, because a happy person doesn't hate. Therefore, even in a scenario where I had not only "taken over the world", but where I would also be seen as having taken over the world, nobody would still hate me.
I recommend reading this sequence. Suffice it to say that you are wrong, and power does not bring with it morality. What is your support for this claim? (I smell argument by definition...)
It seems to me that the claim that Uni is making is not the same as the claim that you think e's making, mostly because Uni is using definitions of 'best possible decision' and 'right thing' that are different from the ones that are usually used here. It looks to me (and please correct me if I'm wrong, Uni) that Uni is basing eir definition on the idea that there is no objectively correct morality, not even one like Eliezer's CEV - that morality and 'the right thing to do' are purely social ideas, defined by the people in a relevant situation. Thus, if Uni had unlimited power, it would by definition be within eir power to cause the other people in the situation to consider eir actions correct, and e would do so. If this is the argument that Uni is trying to make, then the standard arguments that power doesn't cause morality are basically irrelevant, since Uni is not making the kinds of claims about an all-powerful person's behavior that those apply to. E appears to be claiming that an all-powerful person would always use that power to cause all relevant other people to consider their actions correct, which I suspect is incorrect, but e's basically not making any other claims about the likely behavior of such an entity.
Thanks for recommending. I have never assumed that "power brings with it morality" if we with power mean limited power. Some superhuman AI might very well be more immoral than humans are. I think unlimited power would bring with it morality. If you have access to every single particle in the universe and can put it wherever you want, and thus create whatever is theoretically possible for an almighty being to create, you will know how to fill all of spacetime with the largest possible amount of happiness. And you will do that, since you will be intelligent enough to understand that that's what gives you the most happiness. (And, needless to say, you will also find a way to be the one to experience all that happiness.) Given hedonistic utilitarianism, this is the best thing that could happen, no matter who got the unlimited power and what was initially the moral standards of that person.

If you don't think hedonistic utilitarianism (or hedonism) is moral, it's understandable that you think a world filled with the maximum amount of happiness might not be a moral outcome, especially if achieving that goal took killing lots of people against their will, for example. But that alone doesn't prove I'm wrong. Much of what humans think to be very wrong is not in all circumstances wrong. To prove me wrong, you have to either prove hedonism and hedonistic utilitarianism wrong first, or prove that a being with unlimited power wouldn't understand that it would be best for him to fill the universe with as much happiness as possible and experience all that happiness.

Observation.
What I got out of this sentence is that you believe someone (anyone?), given absolute power over the universe, would be imbued with knowledge of how to maximize for human happiness. Is that an accurate representation of your position? Would you be willing to provide a more detailed explanation? Not everyone is a hedonistic utilitarian. What if the person/entity who ends up with ultimate power enjoys the suffering of others? Is your claim is that their value system would be rewritten to hedonistic utilitarianism upon receiving power? I do not see any reason why that should be the case. What are your reasons for believing that a being with unlimited power would understand that?
I'm not sure about 'proof' but hedonistic utilitarianism can be casually dismissed out of hand as not particularly desirable and the idea that giving a being ultimate power will make them adopt such preferences is absurd.
I'd be interested to hear a bit more detail as to why it can be dismissed out of hand. Is there a link I could go read?
This is a cliche and may be false but it's assumed true: "Power corrupts and absolute power corrupts absolutely". I wouldn't want anybody to have absolute power, not even myself; the only possible use of absolute power I would like to have would be to stop any evil person getting it. To my mind evil = coercion, and therefore any human who seeks any kind of coercion over others is evil. My version of evil is the least evil, I believe. EDIT: Why did I get voted down for saying "power corrupts" - the corollary of which is that rejection of power is less corrupt - whereas Eliezer gets voted up for saying exactly the same thing? Someone who voted me down should respond with their reasoning.
Given humanity's complete lack of experience with absolute power, it seems like you can't even take that cliche for weak evidence. Having glided through the article and comments again, I also don't see where Eliezer said "rejection of power is less corrupt". The bit about Eliezer sighing and saying the null-actor did the right thing? (No, I wasn't the one who downvoted.)
What's wrong with Uni's claim? If you have unlimited power, one possible choice is to put all the other sentient beings into a state of euphoric intoxication such that they don't hate you. Yes, that is by definition. Go figure out a state for each agent so that it doesn't hate you and put it into that state, then you've got a counter example to Billy's claim above. Maybe a given agent's former volition would have chosen to hate you if it was aware of the state you forced it into later on, but that's a different thing than caring if the agent itself hates you as a result of changes you make. This is a valid counter-example. I have read the coming of age sequence and don't see what you're referring to in there that makes your point. Perhaps you could point me back to some specific parts of those posts.
Suppose you'd say it would be wrong of me to make the haters happy "against their will". Why would that be wrong, if they would be happy to be happy once they have become happy? Should we not try to prevent suicides either? Not even the most obviously premature suicides, not even temporarily, not even only to make the suicide attempter rethink their decision a little more thoroughly? Making a hater happy "against his will", with the result that he stops hating, is (I think) comparable to preventing a premature suicide in order to give that person an opportunity to reevaluate his situation and come to a better decision (by himself). By respecting what a person wants right now only, you are not respecting "that person including who he will be in the future", you are respecting only a tiny fraction of that.

Strictly speaking, even the "now" we are talking about is in the future, because if you are now deciding to act in someone's interest, you should base your decision on your expectation of what he will want by the time your action would start affecting him (which is not exactly now), rather than what he wants right now. So, whenever you respect someone's preferences, you are (or at least should be) respecting his future preferences, not his present ones. (Suppose for example that you strongly suspect that, in one second from now, I will prefer a painless state of mind, but that you see that right now, I'm trying to cut off a piece of wood in a way that you see will make me cut me in my leg in one second if you don't interfere. You should then interfere, and that can be explained by (if not by anything else) your expectation of what I will want one second from now, even if right now I have no other preference than getting that piece of wood cut in two.)

I suggest one should respect another persons (expected) distant future preferences more than his "present" (that is, very close future) ones, because his future preferences are more numerous (since there is more tim
This is certainly true. If you have sufficient power, and if my existing values, preferences, beliefs, expectations, etc. are of little or no value to you, but my approval is, then you can choose to override my existing values, preferences, beliefs, expectations, etc. and replace them with whatever values, preferences, beliefs, expectations, etc. would cause me to approve of whatever it is you've done, and that achieves your goals.
While you are technically correct, the spirit of the original post and a charitable interpretation was, as I read it, "no matter what you decide to do with your unlimited power, someone will hate your plan". Of course if you decide to use your unlimited power to blow up the earth, no one will complain because they're all dead. But if you asked the population of earth what they think of your plan to blow up the earth, the response will be largely negative. The contention is that no matter what plan you try to concoct, there will be someone such that, if you told them about the plan and they could see what the outcome would be, they would hate it.
Incidentally, it is currently possible to achieve total happiness, or perhaps a close approximation. A carefully implanted electrode to the right part of the brain will be more desirable than food to a starving rat, for example. While this part of the brain is called the "pleasure center", it might rather be about desire and reward instead. Nevertheless, pleasure and happiness are by necessity mental states, and it should be possible to artificially create these. Why should a man who is perfectly content bother to get up to eat, or perhaps achieve something? He may starve to death, but would be happy to do so. And such a man will be content with his current state, which of course is contentment, and not at all resent his current state. Even a less invasive case, where a man is given almost everything he wants, yet not so much so that he does not eventually become dissatisfied with the amount of food in his belly and decide to put more in, even so there will be higher level motivations this man will lose. While I consider myself a utilitarian, and believe the best choices are those that maximize the values of everyone, I cannot agree with the above situation. For now, this is no problem because people in their current state would not choose to artificially fulfill their desires via electrode implants, nor is it yet possible to actually fulfill everyone's desires in the real world. I shall now go and rethink why I choose a certain path, if I cannot abide reaching the destination.
Welcome to Less Wrong! First, let me congratulate you on stopping to rethink when you realize that you've found a seeming contradiction in your own thinking. Most people aren't able to see the contradictions in their beliefs, and when/if they do, they fail to actually do anything about them. While it is theoretically possible to artificially create pleasure and happiness (which, around here, we call wireheading), converting the entire observable universe to orgasmium (maximum pleasure experiencing substance) seems to go a bit beyond that. In general, I think you'll find most people around here are against both, even though they'd call themselves "utilitarians" or similar. This is because there's more than one form of utilitarianism; many Less Wrongers believe other forms, like preference utilitarianism, are correct, instead of the original Millsian hedonistic utilitarianism. Edit: fixed link formatting
Unless, of course, anyone actually wants to participate in such systems, in which case you have (for commonly-accepted values of 'want' and 'everyone') allowed them to do so. Someone who'd rather stand in the People's Turnip-Requisitioning Queue for six hours than have unlimited free candy is free to do so, and someone who'd rather watch everyone else do so can have a private world with millions of functionally-indistinguishable simulacra. Someone who demands that other real people participate, whether they want to or not, and can't find enough quasi-volunteers, is wallowing so deep in their own hypocrisy that nothing within the realm of logic could be satisfactory.
If you invoke the unlimited power to create a quadrillion people, then why not a quadrillion?

One of these things is much like the other...

Infinity screws up a whole lot of this essay. Large-but-finite is way way harder, as all the "excuses", as you call them, become real choices again. You have to figure out whether to monitor for potential conflicts, including whether to allow others to take whatever path you took to such power. Necessity is back in the game.

I suspect I'd seriously consider just tiling the universe with happy faces (very complex ones, but still probably not what the rest of y'all think you want). At least it would be pleasant, and nobody would complain.

This question is a bit off-topic and I have a feeling it has been covered in a batch of comments elsewhere so if it has, would someone mind directing me to it. My question is this: Given the existence of the multiverse, shouldn't there be some universe out there in which an AI has already gone FOOM? If it has, wouldn't we see the effects of it in some way? Or have I completely misunderstood the physics?

And Eliezer, don't lie, everybody wants to rule the world.

Okay, you don't disapprove. Then consider the question one of curiosity. If Tyler Cowen acquired a Ring of Power and began gathering a circle of advisors, and you were in that circle, what specific advice would you give him?

Eliezer, I'd advise no sudden moves; think very carefully before doing anything. I don't know what I'd think after thinking carefully, as otherwise I wouldn't need to do it. Are you sure there isn't some way to delay thinking on your problem until after it appears? Having to have an answer now, when it seems an unlikely problem, is very expensive.


What about a kind of market system of states? The purpose of the states will be to provide a habitat matching each citizen's values and lifestyle.

- Each state will have its own constitution and rules.
- Each person can pick the state they wish to live in, assuming they are accepted in based on the state's rules.
- The amount of resources and territory allocated to each state is proportional to the number of citizens that choose to live there.
- There are certain universal meta-rules that supersede the states' rules, such as...
- A citizen may leave a sta...

I'm glad to hear that you aren't trying to take over the world. The less competitors I have, the better.

@lowly undergrad

Perhaps you're thinking of The Great Filter?

"Eliezer, I'd advise no sudden moves; think very carefully before doing anything."

But about 100 people die every minute!

100 people is practically nothing compared to the gazillions of future people whose lives are at stake. I agree with Robin Hanson: think carefully for very long. Sacrifice the 100 people per minute for some years if you need to. But you wouldn't need to. With unlimited power, it should be possible to freeze the world (except yourself, and your computer and the power supply and food you need, et cetera) to absolute zero temperature for indefinite time, to get enough time to think about what to do with the world. Or rather: with unlimited power, you would know immediately what to do, if unlimited power implies unlimited intelligence and unlimited knowledge by definition. If it doesn't, I find the concept "unlimited power" poorly defined. How can you have unlimited power without unlimited intelligence and unlimited knowledge? So, just like Robin Hanson says, we shouldn't spend time on this problem. We will solve it in the best possible way with our unlimited power as soon as we have got unlimited power. We can be sure the solution will be wonderful and perfect.
The entire point of this was an analogy for creating Friendly AI. The AI would have absurd amounts of power, but we have to decide what we want it to do using our limited human intelligence. I suppose you could just ask the AI for more intelligence first, but even that isn't a trivial problem. Would it be ok to alter your mind in such a way that it changes your personality or your values? Is it possible to increase your intelligence without doing that? And tons of other issues trying to specify such a specific goal.

PK: I like your system. One difficulty I notice is that you have thrust the states into the role of the omniscient player in the Newcomb problem. Since the states are unable to punish members beyond expelling them, they are open to 'hit and run' tactics. They are left with the need to predict accurately which members and potential members will break a rule, 'two-box', and be a net loss to the state with no possibility of punishment. They need to choose people who can one-box and stay for the long haul. Executions and life imprisonment are simpler, from a game-theoretic perspective.


James, it's ok. I have unlimited power and unlimited precision. I can turn back time. At least, I can rewind the state of the universe such that you can't tell the difference.


Tangentially, does anyone know what I'm talking about if I lament how much of Eliezer's thought stream ran through my head, prompted by Sparhawk?

Eliezer: Let's say that someone walks up to you and grants you unlimited power.

Let's not exaggerate. A singleton AI wielding nanotech is not unlimited power; it is merely a Big Huge Stick with which to apply pressure to the universe. It may be the biggest stick around, but it's still operating under the very real limitations of physics - and every inch of potential control comes with a cost of additional invasiveness.

Probably the closest we could come to unlimited power would be pulling everything except the AI into a simulation, and allowing for arbitra...


But about 100 people die every minute!

If you have unlimited power, and aren't constrained by current physics, then you can bring them back. Of course, some of them won't want this.

Now, if you have (as I'm interpreting this article) unlimited power, but your current faculties, then embarking on a program to bring back the dead could (will?) backfire.

I think Sparhawk was a fool. But you need to remember, internally he was basically medieval. Also externally you need to remember Eddings is only an English professor and fantasy writer.


It's probably not the worst tradeoff, being cursed only by those who feel their values should take precedence over those of other people.

Why should your values take precedence over theirs? It sounds like you're asserting that tyranny > collectivism.

@Cameron: Fictional characters with unlimited power sure act like morons, don't they?

Singularitarians: The Munchkins of the real universe.

Sorry for being off-topic, but has that 3^^^^3 problem been solved already? I just read the posts and, frankly, I fail to see why this caused so many problems.

Among the things that Jaynes repeats a lot in his book is that the sum of all probabilities must be 1. Hence, if you put probability somewhere, you must remove some elsewhere. What is the prior probability for "me being able to simulate/kill 3^^^^3 persons/pigs"? Let's call that nonzero number "epsilon". Now, I guess that the (3^^^^3)-1 case should have a probability greater or eq...

Pierre, it is not true that all probabilities sum to 1. Only for an exhaustive set of mutually exclusive events must the probability sum to 1.

Sorry, I have not been specific enough. Each of my 3^^^^3, 3^^^^3-1, 3^^^^3-2, etc. examples are mutually exclusive (but the sofa is part of the "0" case). While they might not span all possibilities (not exhaustive) and could thus sum to less than one, they cannot sum to higher than 1. As I see it, the weakest assumption here is that "more persons/pigs is less or equally likely". If this holds, the "worst case scenario" is epsilon=1/(3^^^^3) but I would guess for far less than that.
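Pierre's bound can be checked at toy scale. The sketch below is illustrative only, with N = 9 standing in for 3^^^^3 and made-up probability numbers: if the events are mutually exclusive and their probabilities are non-increasing, the first N+1 of them sum to at most 1, which forces P(D_N) ≤ 1/(N+1).

```python
# Toy check: if D_0 ... D_N are mutually exclusive and we assume
# P(D_0) >= P(D_1) >= ... >= P(D_N), then the N+1 probabilities sum to
# at most 1, so (N + 1) * P(D_N) <= 1, i.e. P(D_N) <= 1 / (N + 1).

N = 9

def worst_case_bound(n):
    # Largest possible P(D_n) under the monotonicity assumption: all of
    # D_0 ... D_n equally likely, together exhausting the probability mass.
    return 1.0 / (n + 1)

# Any concrete non-increasing assignment summing to <= 1 respects the bound
# (these particular numbers are hypothetical):
probs = [0.3, 0.2, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.03, 0.02]
assert sum(probs) <= 1.0 + 1e-9
assert all(probs[i] >= probs[i + 1] for i in range(N))
assert probs[N] <= worst_case_bound(N)
```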

To ask what God should do to make people happy, I would begin by asking whether happiness or pleasure are coherent concepts in a future in which every person had a Godbot to fulfill their wishes. (This question has been addressed many times in science fiction, but with little imagination.) If the answer is no, then perhaps God should be "unkind", and prevent desire-saturation dynamics from arising. (But see the last paragraph of this comment for another possibility.)

What things give us the most pleasure today? I would say, sex, creative activ... (read more)


"But considering an unlimited amount of ice cream forced me to confront the issue of what to do with any of it."

"If you invoke the unlimited power to create a quadrillion people, then why not a quintillion?"

"Say, the programming team has cracked the "hard problem of conscious experience" in sufficient depth that they can guarantee that the AI they create is not sentient - not a repository of pleasure, or pain, or subjective experience, or any interest-in-self - and hence, the AI is only a means to an end, and not an end i... (read more)



"If we are so unfortunate as to live in a universe in which knowledge is finite, then conflict may serve as a substitute for ignorance in providing us a challenge."

This is inconsistent. What conflict would really do is provide new information to process ("knowledge").

I guess I can agree with the rest of the post. What IMO is worth pointing out is that most pleasures, hormones and instincts excluded, are about processing 'interesting' information.

I guess, somewhere deep in all sentient beings, "interesting informations" ar... (read more)


Errr.... luzr, why would I assume that the majority of GAIs that we create will think in a way I define as 'right'?

Pierre, the proposition, "I am able to simulate 3^^^^3 people" is not mutually exclusive with the proposition "I am able to simulate 3^^^^3-1 people."

If you meant to use the propositions D_N: "N is the maximum number of people that I can simulate", then yes, all the D_N's would be mutually exclusive. Then if you assume that P(D_N) ≤ P(D_N-1) for all N, you can indeed derive that P(D_3^^^^3) ≤ 1/3^^^^3. But P("I am able to simulate 3^^^^3 people") = P(D_3^^^^3) + P(D_3^^^^3+1) + P(D_3^^^^3+2) + ..., which you don't have an upper bound for.
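This tail-sum point can be made concrete with a toy distribution (a sketch only: N = 1000 stands in for 3^^^^3, and the uniform prior over the first M cases is a hypothetical choice, not anything from the thread):

```python
# Toy check of the argument above: with mutually exclusive D_k and
# P(D_k) nonincreasing in k, each individual P(D_N) <= 1/N, yet the
# tail sum P(D_N) + P(D_N+1) + ... can still be close to 1.
N = 1000            # stands in for 3^^^^3
M = 100 * N         # spread the mass uniformly over the first M cases
p = [1.0 / M] * M   # nonincreasing (constant) and sums to 1

# Individual bound: N * P(D_N) <= P(D_1) + ... + P(D_N) <= 1
assert N * p[N - 1] <= 1.0

tail = sum(p[N - 1:])  # P("I can simulate at least N people")
print(p[N - 1], tail)  # the individual term is tiny; the tail is not
```

Here each single P(D_N) respects the 1/N bound, but the tail sum is about 0.99, which is why the derivation gives no small upper bound on "I am able to simulate at least 3^^^^3 people".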

An expected utility maximizer would know exactly what to do with unlimited power. Why do we have to think so hard about it? The obvious answer is that we are adaptation-executers, not utility maximizers, and we don't have an adaptation for dealing with unlimited power. We could try to extrapolate a utility function from our adaptations, but given that those adaptations deal only with a limited set of circumstances, we'll end up with an infinite set of possible utility functions for each person. What to do?

James D. Miller: But about 100 people die every... (read more)


"Errr.... luzr, why would I assume that the majority of GAIs that we create will think in a way I define as 'right'?"

It is not about what YOU define as right.

Anyway, considering that Eliezer is an existing self-aware sentient GI agent with obviously high intelligence, and that he is able to ask such questions despite his original biological programming, I suppose that some other powerful, strong, sentient, self-aware GI should reach the same point. I also believe that more general intelligence makes a GI converge to such "right thinking".

What m... (read more)

Could reach the same point. Said Eliezer agent is programmed genetically to value his own genes and those of humanity. An artificial Eliezer could reach the conclusion that humanity is worth keeping, but is by no means obliged to come to that conclusion. On the contrary, genetics determines that at least some of us humans value the continued existence of humanity.


Is it any safer to think ourselves about how to extend our adaptation-executer preferences than to program an AI to figure out what conclusions we would come to, if we did think a long time?

I'm thinking here of studies I half-remember about people preferring lottery tickets whose numbers they made up to randomly chosen lottery tickets, and about people thinking themselves safer if they have the steering wheel than if equally competent drivers have the steering wheel. (I only half-remember the studies; don't trust the details.) Do you think a bias like that is involved in your preference for doing the thinking ourselves, or is there reason to expect a better outcome?

Robin wrote: Having to have an answer now when it seems an likely problem is very expensive.

(I think you meant to write "unlikely" here instead of "likely".)

Robin, what is your probability that eventually humanity will evolve into a singleton (i.e., not necessarily through Eliezer's FOOM scenario)? It seems to me that competition is likely to be unstable, whereas a singleton by definition is. Competition can evolve into a singleton, but not vice versa. Given that negentropy increases as mass squared, most competitors have to remain in t... (read more)


Wei Dai, singleton-to-competition is perfectly possible, if the singleton decides it would like company.

I quote:

"The young revolutionary's belief is honest. There will be no betraying catch in his throat, as he explains why the tribe is doomed at the hands of the old and corrupt, unless he is given power to set things right. Not even subconsciously does he think, "And then, once I obtain power, I will strangely begin to resemble that old corrupt guard, abusing my power to increase my inclusive genetic fitness."

"no sudden moves; think very carefully before doing anything" - doesn't that basically amount to an admission that human minds aren't up to this, that you ought to hurriedly self-improve just to avoid tripping over your own omnipotent feet?

This presents an answer to Eliezer's "how much self improvement?": there has to be some point at which the question "what to do" becomes fully determined and further improvement is just re-proving the already proven. So you improve towards that point and stop.


This is a general point concerning Robin's and Eliezer's disagreement. I'm posting it in this thread because this thread is the best combination of relevance and recentness.

It looks like Robin doesn't want to engage with simple logical arguments if they fall outside of established, scientific frameworks of abstractions. Those arguments could even be damning critiques of (hidden assumptions in) those abstractions. If Eliezer were right, how could Robin come to know that?

I think Robin's implied suggestion is worth considering: don't be so quick to discard the option of building an AI that can improve itself in certain ways, but not to the point of needing to hardcode something like Coherent Extrapolated Volition. Is it really impossible to make an AI that can become "smarter" in useful ways (including by modifying its own source code, if you like), without it ever needing to take decisions itself that have severe nonlocal effects? If intelligence is an optimization process, perhaps we can choose more carefully what is being optim... (read more)

Oops, Julian Morrison already said something similar.

Just a note of thanks to Phil Goetz for actually considering the question.

What if creating a friendly AI isn't about creating a friendly AI?

I may prefer Eliezer to grab the One Ring over others who are also trying to grab it, but that does not mean I wouldn't rather see the ring destroyed, or divided up into smaller bits for more even distribution.

I haven't met Eliezer. I'm sure he's a pretty nice guy. But do I trust him to create something that may take over the world? No, definitely not. I find it extremely unlikely that selflessness is the causal factor behind his wanting to create a friendly AI, despite how much he may claim so or how much he may believe so. Genes and memes do not reproduce via selflessness.


Peter de Blanc: You are right and I came to the same conclusion while walking this morning. I was trying to simplify the problem in order to easily obtain numbers <=1/(3^^^^3), which would solve the "paradox". We now agree that I oversimplified it.

Instead of messing with a proof-like approach again, I will try to clarify my intuition. When you start considering events of that magnitude, you must consider a lot of events (including waking up with blue tentacles as hands to take Eliezer's example). The total probability is limited to 1 for exclu... (read more)

Wei, yes I meant "unlikely." Bo, you and I have very different ideas of what "logical" means. V.G., I hope you will comment more.

Grant: We did not evolve to handle this situation. It's just as valid to say that we have an opportunity to exploit Eliezer's youthful evolved altruism, get him or others like him to make an FAI, and thereby lock himself out of most of the potential payoff. Idealists get corrupted, but they also die for their ideals.

I have been granted almighty power, constrained only by the most fundamental laws of reality (which may, or may not, correspond with what we currently think about such things).

What do I do? Whatever it is that you want me to do. (No sweat off my almighty brow.)

You want me to kill thy neighbour? Look, he's dead. The neighbour doesn't even notice he's been killed ... I've got almighty power, and have granted his wish too, which is to live forever. He asked the same about you, but you didn't notice either.

In a universe where I have "almighty" power,... (read more)

Anna Salamon wrote: Is it any safer to think ourselves about how to extend our adaptation-executer preferences than to program an AI to figure out what conclusions we would come to, if we did think a long time?

First, I don't know that "think about how to extend our adaptation-executer preferences" is the right thing to do. It's not clear why we should extend our adaptation-executer preferences, especially given the difficulties involved. I'd backtrack to "think about what we should want".

Putting that aside, the reason that I prefer we d... (read more)

If living systems can unite, they can also be divided. I don't see what the problem with that idea could be.

Hmm, there are a lot of problems here.

"Unlimited power" is a non-starter. No matter how powerful the AGI is, it will be of finite power. Unlimited power is the stuff of theology, not of actually achievable minds. Thus the ditty from Epicurus about "God" does not apply. This is not a trivial point. I have a concern that Eliezer may get too caught up in these grand sagas and great dilemmas on precisely such a theological absolutist scale. Arguing as if unlimited power is real takes us well into the current essay.

"Wieldable with... (read more)

"Give it to you" is a pretty lame answer but I'm at least able to recognise the fact that I'm not even close to being a good choice for having it.

That's more or less completely ignoring the question but the only answers I could ever come up with at the moment are what I think you call cached thoughts here.

[This comment is no longer endorsed by its author]

Now this is the $64 google-illion question!

I don't agree that the null option (take the ring and do nothing with it) is evil. My definition of evil is coercion leading to loss of resources, up to and including loss of one's self. Thus absolute evil is loss of one's self across humanity, which includes as one use case humanity's extinction (but is not limited to humanity's extinction, obviously, because being converted into zimboes isn't technically extinction).

Nobody can argue that the likes of Gaddafi exist in the human population: those who are intereste... (read more)

What would you do with unlimited power?

Perhaps "Master, you now hold the ring, what do you wish me to turn the universe into?" isn't a question you have to answer all at once.

Perhaps the right approach is to ask yourself "What is the smallest step I can take that has the lowest risk of not being a strict improvement over the current situation?"

For example, are we less human or compassionate now we have Google available, than we were before that point?

Supposing an AI researcher, a year before the Google search engine was made availabl... (read more)

Perhaps. Personally, I suspect that if I had (something I was sufficiently confident was) an AI that can be trusted to give honest good advice without a hidden agenda and without unexpected undesirable side-effects, the opportunity costs of moving that slowly would weigh heavily on my conscience. And if challenged for why I was allowing humanity to bear those costs while I moved slowly, I'm not sure what I would say... it's not clear what the delays are gaining me in that case. Conversely, if I had something I was _in_sufficiently confident was trustworthy AI, it's not clear that the "cautious minimally risky small steps" you describe are actually cautious enough.
Freedom. The difference between the AI making a decision for humanity about what humanity's ideal future should be, and the AI speeding up humanity's own rise in good decision making capability to the point where humanity can make that same decision (and even, perhaps, come to the same conclusion the AI would have done, weeks earlier, if told to do the work for us), is that the choice was made by us, not an external force. That, to many people, is worth something (perhaps even worth the deaths that would happen in the weeks that utopia was delayed by). It is also insurance against an AI that is benevolent, but has imperfect understanding of humanity. (The AI might be able to gain a better understanding of humanity by massively boosting its own capability, but perhaps you don't want it to take over the internet and all attached computers - perhaps you'd prefer it to remain sitting in the experimental mainframe in some basement of IBM where it currently resides, at least initially)
In the case under discussion (an AI that can be trusted to give honest good advice without a hidden agenda and without unexpected undesirable side-effects) I don't see how the imperfect understanding of humanity matters. Conversely, an AI which would take over resources I don't want it to take over doesn't fall into that category. That aside though... OK, sure, if the difference to me between choosing to implement protocol A, and having protocol A implemented without my approval, is worth N happy lifetime years (or whatever unit you want to use), then I should choose to retain control and let people die and/or live in relative misery for it. I don't think that difference is worth that cost to me, though, or worth anything approaching it. Is it worth that cost to you?
Let's try putting some numbers on it. It is the difference between someone who goes house hunting then, on finding a house that would suit them perfectly, voluntarily decides to move to it; and that same person being forcibly relocated to that same new house, against their will by some well meaning authority. Using the "three week delay" figure from earlier, a world population of 7 billion, and an average lifespan of 70 years, that gives us approximately 6 million deaths during those three weeks. Obviously my own personal satisfaction and longing for freedom wouldn't be worth that. But there isn't just me to consider - it is also the satisfaction of whatever fraction of those 7 billion people share my attitude towards controlling our own destiny. If 50% of them shared that attitude, and would be willing to give up 6 weeks of their life to have a share of ownership of The Big Decision (a decision far larger than just which house to live in) then it evens out. Perhaps the first 10 minutes of the 'do it slowly and carefully' route (10 minutes = 2000 lives) should be to ask the AI to look up figures from existing human sources on what fraction of humanity has that attitude, and how strongly they hold it? And perhaps we need to take retroactive satisfaction from the future population of humanity into account? What if at least 1% of the humanity from centuries to come gains some pride and satisfaction from thinking the world they live in is one that humanity chose? Or at least 1% would feel resentment if it were not the case?
OK, sure. Concreteness is good. I would say the first step to putting numbers on this is to actually agree on a unit that those numbers represent. You seem to be asserting here that the proper unit is weeks of life (I infer that from "willing to give up 6 weeks of their life to have a share of ownership of The Big Decision"), but if so, I think your math is not quite right. For example, suppose implementing the Big Decision has a 50% chance of making the average human lifespan a thousand years, and I have a 1% chance of dying in the next six weeks, then by waiting six weeks I'm accepting a .01 chance of losing a .5 chance of 52000 life-weeks... that is, I'm risking an expected value of 260 life-weeks, not 6. Change those assumptions and the EV goes up and down accordingly. So perhaps it makes sense to immediately have the AI implement temporary immortality... that is, nobody dies between now and when we make the decision? But then again, perhaps not? I mean, suspending death is a pretty big act of interference... what about all the people who would have preferred to choose not to die, rather than having their guaranteed continued survival unilaterally forced on them? There's other concerns I have with your calculations here, but that's a relatively simple one so I'll pause here and see if we can agree on a way to handle this one before moving forward.
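The expected-value arithmetic in the comment above, sketched with the same illustrative figures (all of them assumptions made for the sake of argument, not real estimates):

```python
# Expected life-weeks risked per person by waiting six weeks,
# using the commenter's illustrative assumptions.
p_die_while_waiting = 0.01   # chance of dying in the next six weeks
p_big_payoff = 0.5           # chance the Big Decision yields 1000-year lifespans
weeks_gained = 1000 * 52     # 52,000 life-weeks per person if it does

ev_risked = p_die_while_waiting * p_big_payoff * weeks_gained
print(ev_risked)  # about 260 life-weeks per person, not 6
```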
How about QALYs?
Interesting question. Economists put a price on a life by looking at things like how much the person would expect to earn (net) during the remainder of their life, and how much money it takes for them to voluntarily accept a certain percentage chance of losing that amount of money. (Yes, that's a vast simplification and inaccuracy). But, in terms of net happiness, it doesn't matter that much which 7 billion bodies are experiencing happiness in any one time period. The natural cycle of life (replacing dead grannies with newborn babies) is more or less neutral, with the grieving caused by the granny dying being balanced by the joy the newborn brings to those same relatives. It matters to the particular individuals involved, but it isn't a net massive loss to the species, yes? Now no doubt there are things a very very powerful AI (not necessarily the sort we initially have) could do to increase the number of QALYs being experienced per year by the human species. But I'd argue that it is the size of the population and how happy each member of the population is that affects the QALYs, not whether the particular individuals are being replaced frequently or infrequently (except as far as that affects how happy the members are, which depends upon their attitude towards death). But, either way, unless the AI does something to change overnight how humans feel about death, increasing their life expectancy won't immediately change how much most humans fear a 0.01% chance of dying (even if, rationally, perhaps it ought to).
(shrug) OK. This is why it helps to get clear on what unit we're talking about. So, if I've understood you correctly, you say that the proper unit to talk about -- the thing we wish to maximize, and the thing we wish to avoid risking the loss of -- is the total number of QALYs being experienced, without reference to how many individuals are experiencing it or who those individuals are. Yes? All right. There are serious problems with this, but as far as I can tell there are serious problems with every choice of unit, and getting into that will derail us, so I'm willing to accept your choice of unit for now in the interests of progress. So, the same basic question arises: doesn't it follow that if the AI is capable of potentially creating N QALYs over the course of six weeks then the relevant opportunity cost of delay is N QALYs? In which case it seems to follow that before we can really decide if waiting six weeks is worth it, we need to know what the EV of N is. Right?
Over three weeks, but yes: right. If the AI makes dramatic changes to society on a very short time scale (such as uploading everyone's brains to a virtual reality, then making 1000 copies of everyone) then N would be very very large. If the AI makes minimal immediate changes in the short term (such as, for example, eliminating all nuclear bombs and putting in place measures to prevent hostile AIs from being developed, i.e. acting as insurance against threats to the existence of the human species) then N might be zero. What the expected value of N is depends on what you think the comparative chance is of those two sorts of scenarios. But you can't assume, in absence of knowledge, that the chances are 50:50. And, like I said, you could use the first 10 minutes to find out what the AI predicts N would be. If you ask the AI "If I gave you the go-ahead to do what you thought humanity would ask you to do, were it wiser but still human, give me the best answer you can fit into 10 minutes, without taking any actions external to your sandbox, to the questions: what would your plan of action over the next three weeks be, and what improvement in number of QALYs experienced by humans would you expect to see happen in that time?" and the AI answers "My plans are X, Y and Z, and I'd expect N to be of an order of magnitude between 10 and 100 QALYs," then you are free to take the nice slow route with a clear conscience.
Sure, agreed that if I have high confidence that letting the AI out of its sandbox doesn't have too much of an upside in the short term (for example, if I ask it and that's what it tells me and I trust its answer), then the opportunity costs of leaving it in its sandbox are easy to ignore. Also agreed that N can potentially be very very large. In which case the opportunity costs of leaving it in its sandbox are hard to ignore.
As a separate point, I think the fact that there isn't a consensus on what ought to be maximised is relevant. Suppose the human species were to spread out onto 1,000,000 planets, and last for 1,000,000 years. What happens to just one planet of humans for one year is very small compared to that. Which means that anything that has even a 1% chance of making a 1% difference in the species-lifespan happiness experienced by our species is still 100,000,000 times more important than a year-long delay for our one planet. It would still be 100 times more important than a year off the lifespan of the entire species. Suppose I were the one who held the ring and, feeling the pressure of 200 lives being lost every minute, I told the AI to do whatever it thought best, or to do whatever maximised the QALYs for humanity, and thereby set the AI's core values and purpose. An AI being benevolently inclined towards humanity, even a marginally housetrained one that knows we frown upon things like mass murder (despite that being in a good cause), is not the same as a "safe" AI or one with perfect knowledge of humanity. It might develop better knowledge of humanity later, as it grows in power, but we're talking about a fledgling just-created AI that's about to have its core purpose expounded to it. If there's any chance that the holder of the ring is going to give the AI a sub-optimal purpose (maximise the wrong thing) or leave off sensible precautions that the 'small step, cautious milestone' approach might catch, then that's worth the delay. But, more to the point, do we know there is a single optimal purpose for the AI to have? A single right or wrong thing to maximise? A single destiny for all species? A genetic (or computer code) template that all species will bioengineer themselves to, with no cultural differences? If there is room for a value to diversity, then perhaps there are multiple valid routes humanity might choose (some, perhaps, involving more sacrifice on humanity's part
Well, it's formulating a definition for the Q in QALY good enough for an AI to understand it without screwing up that's the hard part.
Yes. To be fair, we also don't have a great deal of clarity on what we really mean by L, either, but we seem content to treat "you know, lives of systems sufficiently like us" as an answer.
Throwing large numbers around doesn't really help. If the potential upside of letting this AI out of its sandbox is 1,000,000 planets × 10 billion lives/planet × 1,000,000 years × N quality = N×10^22 QALY, then if there's as little as a .00000001% chance of the device that lets the AI out of its sandbox breaking within the next six weeks, then I calculate an EV of -N×10^12 QALY from waiting six weeks. That's a lot of QALY to throw away. The problem with throwing around vast numbers in hypothetical outcomes is that suddenly vanishingly small percentages of those outcomes happening or failing to happen start to feel significant. Humans just aren't very good at that sort of math. That said, I agree completely that the other side of the coin of opportunity cost is that the risk of letting it out of its sandbox and being wrong is also huge, regardless of what we consider "wrong" to look like. Which simply means that the moment I'm handed that ring, I'm in a position I suspect I would find crushing... no matter what I choose to do with it, a potentially vast amount of suffering results that might plausibly have been averted had I chosen differently. That said, if I were as confident as you sound to me that the best thing to maximize is self-determination, I might find that responsibility less crushing. Ditto if I were as confident as you sound to me that the best thing to maximize is anything in particular, including paperclips. I can't imagine being as confident about anything of that sort as you sound to me, though.
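For what it's worth, the orders of magnitude in the comment above do check out (a sketch; every input is the commenter's illustrative assumption, not a real estimate):

```python
# Upside if the AI's intervention matters for the whole species-lifespan,
# per unit of the quality factor N:
planets = 1e6
lives_per_planet = 1e10
years = 1e6
upside = planets * lives_per_planet * years  # 1e22 QALY (times N)

# EV lost to a tiny chance of the sandbox-release device breaking:
p_failure = 1e-10                            # ".00000001%" as a fraction
ev_lost = p_failure * upside                 # 1e12 QALY (times N)
print(upside, ev_lost)
```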
The only thing I'm confident of is that I want to hand the decision over to a person or group of people wiser than myself, even if I have to make them in order for them to exist, and that in the meantime I want to avoid doing things that are irreversible (because of the chance the wiser people might disagree and want those things not to have been done) and take as few risks as possible of humanity being destroyed or enslaved in the meantime. Doing things swiftly is on the list, but lower down the order of my priorities. Somewhere in there too is not being needlessly cruel to a sentient being (the AI itself): I'd prefer to be a parental figure than a slaver or jailer. Yes, that's far from being a clear-cut 'boil your own' set of instructions on how to cook up a friendly AI, and it is trying to maximise, minimise or optimise multiple things at once. Hopefully, though, it is at least food for thought, upon which someone else can build something closer to a coherent plan.
You can get away with (in fact, strictly improve the algorithm by) using only the second of the two caution-optimisers there, so: "What is the step I can take that has the lowest risk of not being a strict improvement over the current situation?" Naturally when answering the question you will probably consider small steps - and in the unlikely event that a large step is safer, so much the better!
Assuming the person making the decision is perfect at estimating risk. However, since the likelihood is that it won't be me creating the first-ever AI, but rather that the person who does is reading this advice, I'd prefer to stipulate that they should go for small steps even if, in their opinion, there is some larger step that's less risky. The temptation exists for them to ask, as their first step, "AI of the ring, boost me to god-like wisdom and powers of thought", but that has a number of drawbacks they may not think of. I'd rather my advice contain redundant precautions, as a safety feature. "Of the steps of the smallest size that still advance things, which has the lowest risk?" Another way to think about it is to take the steps (or give the AI orders) that can be effectively accomplished with the AI boosting itself by the smallest amount. Avoid, initially, making requests that the AI would need to massively boost itself to accomplish, if you can improve your decision-making position just through requests that the AI can handle with its current capacity.
Or merely aware of the same potential weakness that you are. I'd be overwhelmingly uncomfortable with someone developing a super-intelligence without the awareness of their human limitations at risk assessment. (Incidentally 'perfect' risk assessment isn't required. They make the most of whatever risk assessment ability they have either way.) I consider this a rather inferior solution - particularly in as much as it pretends to be minimizing two things. Since steps will almost inevitably be differentiated by size the assessment of lowest risks barely comes into play. An algorithm that almost never considers risk rather defeats the point. If you must artificially circumvent the risk assessment algorithm - presumably to counter known biases - then perhaps make the "small steps" a question of satisficing rather than minimization.
Good point. How would you word that?

I assume we're actually talking NEARLY unlimited power: no actually time-traveling to when you were born and killing yourself, just to solve the grandfather paradox once and for all. Given information theory, and the power to bypass the "no quantum xerox" limitation, I could effectively reset the relevant light cone and run it in read-only mode to gather the information needed to make an afterlife for everyone who's died... if I could also figure out how to pre-prune the run to ensure it winds up at exactly the same branch.

But move one is to hit... (read more)