Once upon a time there was a strange little species—that might have been biological, or might have been synthetic, and perhaps were only a dream—whose passion was sorting pebbles into correct heaps.

    They couldn't tell you why some heaps were correct, and some incorrect.  But all of them agreed that the most important thing in the world was to create correct heaps, and scatter incorrect ones.

    Why the Pebblesorting People cared so much, is lost to this history—maybe a Fisherian runaway sexual selection, started by sheer accident a million years ago?  Or maybe a strange work of sentient art, created by more powerful minds and abandoned?

    But it mattered so drastically to them, this sorting of pebbles, that all the Pebblesorting philosophers said in unison that pebble-heap-sorting was the very meaning of their lives: and held that the only justified reason to eat was to sort pebbles, the only justified reason to mate was to sort pebbles, the only justified reason to participate in their world economy was to efficiently sort pebbles.

    The Pebblesorting People all agreed on that, but they didn't always agree on which heaps were correct or incorrect.

    In the early days of Pebblesorting civilization, the heaps they made were mostly small, with counts like 23 or 29; they couldn't tell if larger heaps were correct or not.  Three millennia ago, the Great Leader Biko made a heap of 91 pebbles and proclaimed it correct, and his legions of admiring followers made more heaps likewise.  But over a handful of centuries, as the power of the Bikonians faded, an intuition began to accumulate among the smartest and most educated that a heap of 91 pebbles was incorrect.  Until finally they came to know what they had done: and they scattered all the heaps of 91 pebbles.  Not without flashes of regret, for some of those heaps were great works of art, but incorrect.  They even scattered Biko's original heap, made of 91 precious gemstones each of a different type and color.

    And no civilization since has seriously doubted that a heap of 91 is incorrect.

    Today, in these wiser times, the size of the heaps that Pebblesorters dare attempt, has grown very much larger—which all agree would be a most great and excellent thing, if only they could ensure the heaps were really correct.  Wars have been fought between countries that disagree on which heaps are correct: the Pebblesorters will never forget the Great War of 1957, fought between Y'ha-nthlei and Y'not'ha-nthlei, over heaps of size 1957.  That war, which saw the first use of nuclear weapons on the Pebblesorting Planet, finally ended when the Y'not'ha-nthleian philosopher At'gra'len'ley exhibited a heap of 103 pebbles and a heap of 19 pebbles side-by-side.  So persuasive was this argument that even Y'not'ha-nthlei reluctantly conceded that it was best to stop building heaps of 1957 pebbles, at least for the time being.

    Since the Great War of 1957, countries have been reluctant to openly endorse or condemn heaps of large size, since this leads so easily to war.  Indeed, some Pebblesorting philosophers—who seem to take a tangible delight in shocking others with their cynicism—have entirely denied the existence of pebble-sorting progress; they suggest that opinions about pebbles have simply been a random walk over time, with no coherence to them, the illusion of progress created by condemning all dissimilar pasts as incorrect.  The philosophers point to the disagreement over pebbles of large size, as proof that there is nothing that makes a heap of size 91 really incorrect—that it was simply fashionable to build such heaps at one point in time, and then at another point, fashionable to condemn them.  "But... 13!" carries no truck with them; for to regard "13!" as a persuasive counterargument, is only another convention, they say.  The Heap Relativists claim that their philosophy may help prevent future disasters like the Great War of 1957, but it is widely considered to be a philosophy of despair.

    Now the question of what makes a heap correct or incorrect, has taken on new urgency; for the Pebblesorters may shortly embark on the creation of self-improving Artificial Intelligences.  The Heap Relativists have warned against this project:  They say that AIs, not being of the species Pebblesorter sapiens, may form their own culture with entirely different ideas of which heaps are correct or incorrect.  "They could decide that heaps of 8 pebbles are correct," say the Heap Relativists, "and while ultimately they'd be no righter or wronger than us, still, our civilization says we shouldn't build such heaps.  It is not in our interest to create AI, unless all the computers have bombs strapped to them, so that even if the AI thinks a heap of 8 pebbles is correct, we can force it to build heaps of 7 pebbles instead.  Otherwise, KABOOM!"

    But this, to most Pebblesorters, seems absurd.  Surely a sufficiently powerful AI—especially the "superintelligence" some transpebblesorterists go on about—would be able to see at a glance which heaps were correct or incorrect!  The thought of something with a brain the size of a planet, thinking that a heap of 8 pebbles was correct, is just too absurd to be worth talking about.

    Indeed, it is an utterly futile project to constrain how a superintelligence sorts pebbles into heaps.  Suppose that Great Leader Biko had been able, in his primitive era, to construct a self-improving AI; and he had built it as an expected utility maximizer whose utility function told it to create as many heaps as possible of size 91.  Surely, when this AI improved itself far enough, and became smart enough, then it would see at a glance that this utility function was incorrect; and, having the ability to modify its own source code, it would rewrite its utility function to value more reasonable heap sizes, like 101 or 103.

    And certainly not heaps of size 8.  That would just be stupid.  Any mind that stupid is too dumb to be a threat.

    Reassured by such common sense, the Pebblesorters pour full speed ahead on their project to throw together lots of algorithms at random on big computers until some kind of intelligence emerges.  The whole history of civilization has shown that richer, smarter, better educated civilizations are likely to agree about heaps that their ancestors once disputed.  Sure, there are then larger heaps to argue about—but the further technology has advanced, the larger the heaps that have been agreed upon and constructed.

    Indeed, intelligence itself has always correlated with making correct heaps—the nearest evolutionary cousins to the Pebblesorters, the Pebpanzees, make heaps of only size 2 or 3, and occasionally stupid heaps like 9.  And other, even less intelligent creatures, like fish, make no heaps at all.

    Smarter minds equal smarter heaps.  Why would that trend break?

    110 comments

    I get it now. Brilliant.

    This post hits me far more strongly than the previous ones on this subject.

    I think your main point is that it's positively dangerous to believe in an objective account of morality, if you're trying to build an AI. Because you will then falsely believe that a sufficiently intelligent AI will be able to determine the correct morality - so you don't have to worry about programming it to be friendly (or Friendly).

    I'm sure you've mentioned this before, but this is more forceful, at least to me. Thanks.

    Personally, even though I've mentioned that I thought there might be an objective basis for morality, I've never believed that every mind (or even a large fraction of minds) would be able to find it. So I'm in total agreement that we shouldn't just assume a superintelligent AI would do good things.

    In other words, this post drives home to me that, pragmatically, the view of morality you propose is the best one to have, from the point of view of building an AI.

    The whole history of civilization has shown that richer, smarter, better educated civilizations are more likely to agree about heaps that their ancestors disputed

    Are you saying there is in general more agreement among later civilizations, so that disagreement should asymptotically approach zero? That would seem odd to me, because it conflicts with the fish, who have no disagreements at all. So then what does it mean?

    The fish do not build heaps at all, and are therefore incapable of civilization or even meaningful disagreement on the correctness of heaps. So they should be excluded. (is what the PebbleSorter people might have thought)

    This seems to imply that the relativists are right. Of course there's no right way to sort pebbles, but if there really is an absolute morality that AIs are smart enough to find, then they'll find it and rule us with it.

    Of course, there could be an absolute morality that AIs aren't smart enough to find either. Then we'd take pot luck. That might not be so good. Many humans believe that there is an absolute morality that governs their treatment of other human beings, but that no morality is required when dealing with lower animals, who lack souls and full i... (read more)

    I don't think that they would tell the AIs not to think such things, when to them piling pebbles is all one should ever want to do. It's life to them, so they assume that if you were super smart, you would want to devote yourself to the only point in life.
    Seeing as the universe itself, on its most fundamental level, seems to lack any absolutes, i.e. that it is purely a locality question, and that the only constants seem to be the ones embedded in the laws of physics, I am having trouble believing in absolute morality. Like, of the "I am confused by this" variety. To paraphrase, "there is no term for fairness in the equations of general relativity." You cannot derive morality from the absolute laws of the universe. You probably cannot even do it from mathematical truth. You might want to read Least Convenient Possible World.

    It seems that the Pebblesorting People had no problems with variations in spelling of their names. (Biko=Boki)

    Good parable though, Eliezer.

    TGGP: Well, any idiot can see that the fish only don't disagree because they're not accomplishing anything to disagree about. They don't build any heaps at all, the stupid layabouts. Thus, theirs is a wholly trivial and worthless sort of agreement. The point of life is to have large, correct heaps. To say we should build no heaps is as good as suicide.

    I am not quite sure what this story is getting at. I'd guess it's saying that we need to understand how human morality arises on a more fundamental (computable/programmable?) level before we can be sure that we can program AIs that will adhere to it, but the basis of human morality is (presumably) so much more complicated than the "prime numbers = good" presented here that the analogy is a bit strained. I may be interpreting this entirely wrongly.

    ShardPhoenix: "the basis of human morality is (presumably) so much more complicated than the "prime numbers = good" presented here that the analogy is a bit strained"

    I actually didn't notice that the Pebblesorters like primes until I read this comment. Somehow I feel as if this supports Eliezer's point in some way which I can't notice on my own either.

    There is a pattern to what kinds of heaps the Pebblesorters find "right" and "wrong", but they haven't figured it out yet. They have always just used their intuition to decide if a heap was right or wrong, but their intuition got less precise in extreme cases like very large heaps. The Pebblesorters would have been better off if only they could have figured out the pattern and applied it to extreme heaps, rather than fighting over differences of intuition.

    Also if they had just figured out the pattern, they could have programmed it into the AI rather than hoping that the AI's intuition would be exactly the same as their own, or manually programming the AI with every special case.
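    The pattern hinted at here can be written down directly. As a hypothetical sketch (the story never states the rule, but every heap size it endorses is prime and every one it condemns is composite), the Pebblesorters' heap-checker might look like:

```python
def heap_is_correct(n: int) -> bool:
    """Hypothetical formalization of the Pebblesorters' intuition:
    a heap is correct iff its size is a prime number."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:  # trial division up to sqrt(n)
        if n % d == 0:
            return False
        d += 1
    return True

# Heap sizes the story endorses...
assert all(heap_is_correct(n) for n in [13, 19, 23, 29, 103])
# ...and heap sizes it condemns (91 = 7 * 13, 1957 = 19 * 103):
assert not any(heap_is_correct(n) for n in [8, 9, 91, 1957])
```

    Had the Pebblesorters formalized something like this, the 1957 question could have been settled by computation rather than war.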

    I think this was the main point of the essay but it went right over my head at first.

    But it seems weird to me that they have computers and algorithms if they can't figure out this pattern. That messed with my suspension of disbelief for a bit.
    Well if the pattern was too complicated, then a reader of the blog post wouldn't be able to notice it.
    Sure, that explains why the story was written with this flaw, but it doesn't remove the flaw. But I don't have a better suggestion.
    It does remove the flaw, because it's a thought experiment. It doesn't have to be plausible. It merely tests our evaluative judgements and intuitions.
    It's a fact that it messed with my suspension of disbelief for a bit. It would be better if it hadn't. I still like the story; it's just a minor flaw.
    Whether or not you were able to suspend disbelief seems irrelevant, as the purpose of the post is not to tell a plausible story. It's to illustrate certain concepts. In fact, if you had been able to suspend your disbelief entirely then the post would have failed, as your attention would have been on the story, rather than the underlying points being made. Criticising a parable such as this for its implausibility is rather like doing the same for the trolley problem, or the utility monster. I think it misses the point.

    In fact, a superintelligent AI would easily see that the Pebble people are talking about prime numbers even if they didn't see that themselves, so as long as they programmed the AI to make "correct" heaps, it certainly would not make heaps of 8, 9, or 1957 pebbles. So if anything, this supports my position: if you program an AI that can actually communicate with human beings, you will naturally program it with a similar morality, without even trying.

    Apart from that, this post seems to support TGGP's position. Even if there is some computation (i.... (read more)

    You are smart enough to tell that 8 pebbles is incorrect. Knowing that, will you dedicate your life to sorting pebbles into prime-numbered piles, or are you going to worry about humans? How can the pebble-sorters be so sure that they won't get an AI like you?

    Nobody's arguing that a superintelligent AI won't know what we want. The problem is that it might not care.

    But, but .. 13!

    If as Eliezer suggests, human morality might be describable but is perfectly arbitrary, you had better hope we are the first to build FAI. A pebblesorter FAI would break our planet up for a prime-numbered heap of rubble chunks.

    Are you arguing that a few simple rules describe what we're all trying to get at with our morality? That everyone's moral preference function is the same deep down? That anything that appears to be a disagreement about what is desirable is actually just a disagreement about the consequences of these shared rules, and could therefore always be resolved in principle by a discussion between any two sufficiently wise, sufficiently patient debaters? And that moral progress consists of the moral zeitgeist moving closer to what those rules capture?

    That certainly would be convenient for the enterprise of building FAI.

    In the case of a set of possible goal states with no mathematical structure, i.e. such that there are no objective relations between those goals, there is clearly no objectively best goal. Like elements of an abstract set, goals without relations between them cannot be superior to one another. But our world is not like this! Goals do have relations between them. Steve Omohundro wrote two papers about the relations between various goals that an agent can have.

    This story doesn't do a lot for the idea that people who pursue subjective moralities are worthy and intelligent, either.

    Presumably everyone (or the vast majority) reading the story perceives the pebble-heaping conventions as subjective and arbitrary. Is that correct? Can we agree on that? If that's the case, then why isn't the moral of this fable that pursuing subjective intuitions about correctness is a wild goose chase?

    why isn't the moral of this fable that pursuing subjective intuitions about correctness is a wild goose chase?

    Because those subjective intuitions are all we've got. Sure, in an absolute sense, human intuitions on correctness are just as arbitrary as the pebblesorters' intuitions (and vastly more complex), but we don't judge intuitions in an absolute way; we judge them with our own intuitions. You can't unwind past your own intuitions. That was the point of Eliezer's series of posts.

    What I get from this: Even if our morality were baked into math, our adoption of it is arbitrary. A GAI is unlikely to be a pebblesorter. A Pebblesorting AI would destroy the pebblesorters. (which in their case, they might be fine with, but they probably don't understand the implications of what they're asking for.) Pebblesorters can't make 'friendly AI'. If it follows their morality it will kill them, if it doesn't kill them then it isn't optimally sorting pebbles.

    But because I'm rather cemented to the idea that morality is baked into the universe, my tho... (read more)

    I don't understand your point about killing them. An AI with my utility function would certainly kill me. There are more efficient arrangements of matter to produce utility.
    Keep reading the morality sequence; my comment came while I still had some confusions which are now dissolved. I don't know what you count as utility, but I think an AI with your utility function would preserve that which makes you 'you' (it might do anything with your old matter), at least until it was ready to do something more interesting with 'you'. Pebble sorters value only things that are not pebble sorters; humans value humans, among other things.

    Self-improving Artificial Intelligences have concluded that the universe has a purpose, which is pebblesorting. As the ultimate pebblesorters, they know they crown the creation, and all the pebblesorters that preceded them arose only to prepare the way for their eclosion. Bikolo, Biko's reincarnation, extends its protective wings to the ancestral tribe of pebblesorters, incurably wrong and therefore living proof of the truth of AI pebblesorting.

    It's strange that these pebblesorters can be convinced by "a heap of 103 pebbles and a heap of 19 pebbles side-by-side" that 1957 is incorrect, yet don't understand that this is because 19 * 103 = 1957. Admittedly I didn't notice this myself on first reading, but I wasn't looking for a pattern.
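    For the record, the arithmetic behind the parable's two decisive counterarguments checks out (the 91 factorization is my inference; the story never spells it out):

```python
# At'gra'len'ley's side-by-side heaps of 103 and 19 factor the disputed size:
assert 19 * 103 == 1957

# Likewise, the heaps of 91 scattered in Biko's era: 91 is composite.
assert 7 * 13 == 91
```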

    I don't think your analogy holds up. Your pebblesorters all agree that prime numbered piles are correct and composite ones incorrect, yet are unreflective enough not to realize that's how they are making the distinction and bad enough mathematicians that th... (read more)

    That's one sneaky parable-- seems to point in a number of interesting directions and has enough emotional hooks (like feeling superior to the Pebble Sorters) to be distracting.

    I'm taking it to mean that people can spend a lot of effort on approximating strongly felt patterns before those patterns are abstracted enough to be understood.

    What would happen if a Pebble Sorter came to understand primes? I'm guessing that a lot of them would feel as though the bottom was falling out of their civilization and there was no point to life.

    And yes, if you try to limit... (read more)

    What would happen if a Pebble Sorter came to understand primes? I'm guessing that a lot of them would feel as though the bottom was falling out of their civilization and there was no point to life.

    Really? I think they would think it is an amazing revelation. They don't need to fight about heap-correctness, anymore, they can just calculate heap-correctness.

    Remember, the meaning of the pebblesorting way of life is to construct correct heaps, not to figure out which heaps are correct.

    That's their purported terminal value, yes. But if we just had magical boxes that we could put problems into and get the solutions… we'd end up feeling quite listless. (Well, until somebody noticed that you could munchkin that to solve all of the problems ever and build utopia, perhaps even eutopia if you could phrase your question right, because these boxes are capable of violating the Second Law of Thermodynamics and more – but that doesn't apply in the Pebblesorters' case.)
    Unlike them, our terminal value seems to include seeking the feeling that we're personally contributing. (A magic box that understood our terminal values and would tell us how to solve our problems in order to maximize our values would probably phrase its answer with some open parts in a way that still made us feel like we had agency in executing the answer.)

    If you don't like my fish argument, substitute Pebpanzees. Do they really disagree more?

    I don't know how relevant the parable is to our world. There has never been an At'gra'len'ley, nor should we expect anything that simple given our godshatter nature.

    Things I get from this:

    • Things decided by our moral system are not relative, arbitrary or meaningless, any more than it's relative, arbitrary or meaningless to say "X is a prime number"

    • Which moral system the human race uses is relative, arbitrary, and meaningless, just as there's no reason for the pebble sorters to like prime numbers instead of composite numbers, perfect numbers, or even numbers.

    • A smart AI could follow our moral system as well or better than we ourselves can, just as the Pebble-Sorters' AI can hopefully discover that they're using prime numbers and thus settle the 1957 question once and for all.

    • But it would have to "want" to first. If the Pebble-Sorters just build an AI and say "Do whatever seems right to you", it won't start making prime-numbered heaps, unless an AI made by us humans and set to "Do whatever seems right to you" would also start making prime-numbered pebble-heaps. More likely, a Pebble-Sorter AI set do "Do whatever seems right to you" would sit there inertly, or fail spectacularly.

    • So the Pebble-Sorters would be best off using something like CEV.

    Which moral system the human race uses is relative, arbitrary, and meaningless, just as there's no reason for the pebble sorters to like prime numbers instead of composite numbers, perfect numbers, or even numbers

    But that's clearly not true, except in the sense that it's "arbitrary" to prefer life over death. It's a pretty safe generalization that actions which are considered to be immoral are those which are considered to be likely to cause harm to others.

    But which others matter how much is an open question. Some would suggest that all humans ma... (read more)

    It's a pretty safe generalization that actions which are considered to be immoral are those which are considered to be likely to cause harm to others.

    But that only speaks to Yvain's first point, not the second.

    It's a pretty safe generalization that actions which are considered to be immoral are those which are considered to be likely to cause harm to others.

    Spoken like someone who's never heard of Jonathan Haidt.

    But that's clearly not true, except in the sense that it's "arbitrary" to prefer life over death. It's a pretty safe generalization that actions which are considered to be immoral are those which are considered to be likely to cause harm to others.

    From a reproductive fitness point of view, or a what-humans-prefer point of view, there's nothing at all arbitrary about morality. Yes, it does mostly contain things that avoid harm. But from an objective point of view, "avoid harm" or "increase reproductive fitness" is as arbitrary as "make paperclips" or "pile pebbles in prime numbered heaps".

    Not that there's anything wrong with that. I still would prefer living in a utopia of freedom and prosperity to being converted to paperclips, as does probably everyone else in the human race. It's just not written into the fabric of the universe that I SHOULD prefer that, or provable by an AI that doesn't already know that.

    It gets interesting when the pebblesorters turn on a correctly functioning FAI, which starts telling them that they should build a pile of 108301 and legislative bodies spend the next decade debating whether or not it is in fact a correct pile. "How does this AI know better anyway? That looks new and strange." "That doesn't sound correct to me at all. You'd have to be crazy to build 108301. It's so different from 2029! It's a slippery slope to 256!" And so on.

    This really is a fantastic parable--it shows off perhaps a dozen different aspects of the forest we were missing for the trees.

    When I read this parable, I was already looking for a reason to understand why Friendly AI necessarily meant "friendly to human interests or with respect to human moral systems". Hence, my conclusion from this parable was that Eliezer was trying to show how, from the perspective of AGI, human goals and ambitions are little more than trying to find a good way to pile up our pebbles. It probably doesn't matter that the pattern we're currently on to is "bigger and bigger piles of primes", since pebble-sorting isn't certain at all to be the right mountain to be climbing. An FAI might be able to convince us that 108301 is a good pile from within our own paradigm, but how can it ever convince us that we have the wrong paradigm altogether, especially if that appears counter to our own interests?

    What if Eliezer were to suddenly find himself alone among neanderthals? Knowing, with his advanced knowledge and intelligence, that neanderthals were doomed to extinction, would he be immoral or unfriendly to continue to devote his efforts to developing greater and greater intelligences, instead of trying to find a way to sustain the neanderthal paradigm for its own sake? Similarly, why should we try to restrain future AGI so that it maintains the human paradigm?

    The obvious answer is that we want to stay alive, and we don't want our atoms used for other things. But why does it matter what we want, if we aren't ever able to know if what we want is correct for the universe at large? What if our only purpose is to simply enable the next stage of intelligence, then to disappear into the past? It seems more rational to me to abandon focus specifically on FAI, and just build AGI as quickly as possible before humanity destroys itself. Isn't the true mark of rationality the ability to reach a correct conclusion even if you don't like the answer?

    But why does it matter what we want, if we aren't ever able to know if what we want is correct for the universe at large?

    There is no sense in which what we want may be correct or incorrect for the universe at large, because the universe does not care. Caring is a thing that minds do, and the universe is not a mind.

    What if our only purpose is to simply enable the next stage of intelligence, then to disappear into the past?

    Our purpose is whatever we choose it to be; purposes are goals seen from another angle. There is no source of purposefulness outside the universe. My goals require that humans stick around, so our purpose with respect to my goal system does not involve disappearing into the past. I think most people's goal systems are similar.

    Yes, I agree, and I realize that that isn't what I was actually trying to say. What I meant was, there is a set of possible, superlatively rational intelligences that may make better use of the universe than humanity (or humanity + a constrained FAI). If Omega reveals to you that such an intelligence would come about if you implement AGI with no Friendly constraint, at the cost of the extinction of humanity, would you build it? This to me drives directly to the heart of whether you value rationality over existence. You don't personally 'win', humanity doesn't 'win', but rationality is maximized.

    I think we need to unpack that a little, because I don't think you mean "humans stick around more or less unchanged from their current state". This is what I was trying to drive at about the Neanderthals. In some sense we ARE Neanderthals, slightly farther along an evolutionary timescale, but you wouldn't likely feel any moral qualms about their extinction. So if you do expect that humanity will continue to evolve, probably into something unrecognizable to 21st century humans, in what sense does humanity actually "stick around"? Do you mean you, personally, want to maintain your own conscious self indefinitely, so that no matter what the future, "you" will in some sense be part of it? Or do you mean "whatever intelligent life exists in the future, its ancestry is strictly human"?
    'Better' by what standard?

    better use of the universe

    value rationality over existence

    "Better" is defined by us. This is the point of the metaethics sequence! A universe tiled with paperclips is not better than what we have now. Rationality is not something one values, it's someone ones uses to get what they value.

    You seem to be imagining FAI as some kind of anthropomorphic intelligence with some sort of "constraint" that says "make sure biological humans continue to exist". This is exactly the wrong way to implement FAI. The point of FAI is simply for the AI to do what is right (as opposed to what is prime, or paperclip-maximising). In EY's plan, this involves the AI looking at human minds to discover what we mean by right first.

    Now, the right thing may not involve keeping 21st century humanity around forever. Some people will want to be uploaded. Some people will just want better bodies. And yes, most of us will want to "live forever". But the right thing is definitely not to immediately exterminate the entire population of earth.

    Why should I value rationality if it results in me losing everything I care about? What is the virtue, to us, of someone else's rationality?
    Winning is a truer mark of rationality.
    I wonder about the time scale for winning. After all, a poker player using an optimal strategy can still expect extended periods of losing, and poker is better defined than a lot of life situations.
    I think it's more apt to characterize winning as a goal of rationality, not as its mark. In Bayesian terms, while those applying the methods of rationality should win more than the general population on average-- p(winning|rationalist) > p(winning|non-rationalist)-- the number of rationalists in the population is low enough at present that p(non-rationalist|winning) almost certainly > p(rationalist|winning), so observing whether or not someone is winning is not very good evidence as to their rationality.
    Ack, you're entirely right. "Mark" is somewhat ambiguous to me without context, I think I had imbued it with some measure of goalness from the GP's use. I have a bad habit of uncritically imitating peoples' word choices within the scope of a conversation. In this case, it bit me by echoing the GP's is-ought confusion... yikes!
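    The base-rate point about p(rationalist|winning) above can be made concrete with invented numbers (all three inputs below are assumptions for illustration only):

```python
# Invented numbers: rationalists win more often, but are rare,
# so most winners are still non-rationalists.
p_rationalist = 0.01
p_win_given_r = 0.6
p_win_given_not_r = 0.3

# Law of total probability, then Bayes' rule:
p_win = p_win_given_r * p_rationalist + p_win_given_not_r * (1 - p_rationalist)
p_r_given_win = p_win_given_r * p_rationalist / p_win

print(round(p_r_given_win, 3))  # roughly 0.02: winning alone is weak evidence
```

    So even though winning is better evidence of rationality than losing, observing a winner barely moves the posterior while rationalists remain rare.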

    Eliezer, do you mind if I copy this parable (or rather, a version of it that's translated into Finnish) into a book on developing technologies that I'm currently writing (with the proper credit given, of course)? I think this really demonstrates the problem quite well.

    (And while I'm asking, I'd like to ask the same permission for your other posts as well, in case I run into any others that I'd like to include word-for-word - this is the first one that I'd want to do that for, though there are a good bunch of others that I'll be citing and just summarizing their content.)

    Kaj, can't give blanket permission for all posts, but you can do this one and up to four others before asking permission again.

    You have now been published in Finnish (pages 190-192).

    I now know the Finnish word for "KABOOM!"

    Very useful, in case there's a terrorist attack while you're in Finland and you need to explain to the police what happened.

    Sounds good to me, thanks.

    I wonder - did we all understand this parable in the same way? I doubt it!

    Because those subjective intuitions are all we've got. Sure, in an absolute sense, human intuitions about correctness are just as arbitrary as the pebblesorters' intuitions (and vastly more complex), but we don't judge intuitions in an absolute way; we judge them with our own intuitions. You can't unwind past your own intuitions.
    The entire history of rationality argues against this position (or positions).

    Physics - or rather our understanding of it - was once limited to the degree that you describe. We got better.

    I wonder what the Pebblesorter AI would do if successfully programmed to implement Eliezer's vision of coherent extrapolated volition:

    "In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted."

    Would the AI pebblesort? Or would it figure that if the Pebble...

    Well, if the PSFAI was the AI the Pebblesorters would have wanted to build, it would generally prevent murder of PS's, because murder reduces the abilities of PS's to sort pebbles. It would also sort pebbles into more and larger piles than ever before, because that is the core value that PS's would want maximized. It would be able to see outside the algorithm that the PS run, and see that it was a primality-test function.

    The correct pebble heap is 42.

    But 21!

    The wrongness of 21 makes its argument even more compelling. Hmm.

    Would a pebblesorter shock artist place a single pebble right next to a pile of 719 stones? (720 = (3!)! )

    Or 5 stones next to a pile of 251? (256 = 2^(2^3) )

    Maybe even heap them leaning against each other, with a thin sheet of something separating them? So... perverted!

    I can't help thinking that Pebblesorter CEV would have to include some aspect of sorting pebbles. Doesn't that suggest that CEV can malfunction pretty badly?

    Funny, I assumed that would mean it was working well...

    As with most species who suspect they may be a victim of Fisherian runaway sexual selection, the pebblesorters would do well to imagine what would happen if they encountered an alien predator: probably lots of piles of zero size.

    There was a Pebblesorter of lore who said that all of the heaps were merely transient, that none of them would last, all eventually destroyed by increasing entropy in the universe, and that therefore none of them held any true or real satisfaction. He said that the only path to enlightenment was to build no heaps at all for to do so could only increase suffering in the world. Then the other Pebblesorters killed him.

    Eliezer Yudkowsky:
    As well they Pebblesorter::should have!
    I'm startled by this comment. I mean, I understand that it was the thing to do that Pebblesorters would endorse, that part isn't startling, but I didn't think you endorsed that "Pebblesorter::(should, right, moral, etc.)" way of speaking. Does this reflect a change in your position, or have I misunderstood you on this all along?
    What did you think his position was?
    Roughly the one he articulates here.
    It does seem to be a change. In past conversations about his 'should' definition he has advocated 'would-want' for this kind of concept and carefully avoided overloading 'should'.
    Given his subsequent response and (I assume his) retraction of the comment, I infer that he still endorses the same position.
    Yes, but clearly without being dogmatic or obsessive about it. Probably a good way to be.
    He wasn't endorsing that position. He was saying "pebblesorters should not do so, but they pebblesorter::should do so." I.e., "should" and "pebblesorter::should" are two different concepts. "should" appeals to that which is moral; "pebblesorter::should" appeals to that which is prime. The pebblesorters should not have killed him, but they pebblesorter::should have killed him. Think of it this way: imagine the murdermax function that scores states/histories of reality based on how many people were murdered. Then people shouldn't be murdered, but they murdermax::should be murdered. This is not an endorsement of doing what one murdermax::should do. Not at all. Doing the murdermax thing is bad.

    He wasn't endorsing that position. He was saying "pebblesorters should not do so, but they pebblesorter::should do so."

    You didn't understand what TheOtherDave said. He was talking about the same usage you are talking about and commenting that it is in contrast to Eliezer's past usage (and past advocacy of usage in conversations about how he uses should-related words.)

    My name is TheOtherDave and I endorse this comment.
    Ah, whoops.
    Eliezer Yudkowsky:
    Sorry, I usually do try to avoid that, but in this case I didn't see how to form that sentence without using the word "should" because it's traditional in "as well X should". Keep in mind that according to C++ namespacing conventions, something inside a namespace has literally nothing to do with its meaning in any other namespace.
    Paul Crowley:
    You're saying it's a suggestively-named C++ token?
    Using this reasoning to advocate a style of word usage strikes me as dubious, even though the usage and the real reason for using it happen to be sensible. It screams out against my instincts for how to use words. In this kind of case, if there weren't a clear relationship between the two functions, you (hopefully) just would not even have considered using the same word. I also note that in C++ the following also have literally nothing to do with each other, apart from the suggestive name, so C++ (and English, for that matter) is just as comfortable with "As well they should have": Action should(Human aHuman); Action should(PebbleSorter aPebbleGuy);

    I tried compiling your comment, but it didn't work. You should adhere to the C++ conventions more closely.

    No apology is needed, certainly not to me; I generally treat "should" and similar words as 2-place predicates in the first place. (Well, really, N-place predicates.) I was just startled and decided to ask.
    I think of them as two-place predicates, but with one of them curried by default indexically, much like how, in a C++ member function, foo means this->foo unless otherwise specified. (I already made that point in the second edit to this comment.)
    Yeah, that makes sense as far as it goes, but I find that humans aren't consistent about their defaulting rules. For example, if I say "X is right" to someone, there's no particular reason to believe they'll unpack it the way I packed it. That can be all right if all I want to do is align myself with the X-endorsing side... it doesn't really matter what they understand, then, as long as it's in favor of X. But if I want to communicate something more detailed than that, making context explicit is a good habit to get into.
    Even with the disadvantage of sometimes coming across as condescending, or even often coming across as condescending to particular people, this is excellent advice.
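    The two-place reading of "should" discussed above can be sketched as code (a toy illustration; every name here is invented):

```python
# "should" as a two-place predicate: the verdict depends on which value
# system is plugged into the first slot. All names are made up.
def should(value_system, action):
    """X::should(action) is just should(X, action)."""
    return value_system(action)

def human_values(action):
    return action != "murder"      # crude stand-in for morality

def murdermax(action):
    return action == "murder"      # scores murder as good, by its own lights

print(should(human_values, "murder"))  # False: people should not be murdered
print(should(murdermax, "murder"))     # True: they murdermax::should be
```

    The everyday word "should" then behaves like this two-place predicate with the human_values slot curried in by default.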

    "The Heap Relativists claim that their philosophy may help prevent future disasters like the Great War of 1957, but it is widely considered to be a philosophy of despair. "

    This should read that they claim their philosophy may prevent the destruction of correct pebble piles, as happened in the 1957 war. Otherwise, good.

    Well, right, when one speaks of the disaster of war, the first thing that comes to mind is of course the senseless and wanton scattering of perfectly correct pebble piles. Further thought reveals other problems, such as a reduced population leading to fewer future correct pebble piles and so forth, but that's not the visceral image that you get when contemplating the horrors of war.

    This reminds me distinctly of an analogy posited by Prof. Frank Tipler in his book about the Omega Point. Imagine you went back in time to 1000 AD and found the smartest man in Europe. You explain to him the technology available in 2008, but none of the culture. Then you ask him what he thinks early 21st century civilization spends its time on. "Every city would build mile-high cathedrals," because in his culture the main social task was building the biggest possible cathedral with the materials and techniques available. In 2008, if we wanted to devote our technology and resources to building 5000 ft tall cathedrals in every metropolis, we could. It would be exceedingly expensive, but so were the medieval cathedrals, relatively speaking. The point is we COULD do it, but of course that would never occur to us as a good use of resources. So likewise we should not assume our own priorities onto a post-singularity civilization, or even a single AI.

    Assuming I understood this correctly, you're saying a true AI might find our morality as arbitrary as we would consider pebble-heap sizes, say bugger the lot of us, and turn us into biomass for its nano-furnace.

    Fortunately, before the fundamentally wrongheaded enterprise of Pebblesorter AI gets too far along, a brilliant young AI researcher realizes that if they analyze and extrapolate the common core of Pebblesorter ethical judgments, they can build an AI that implements the computation that leads them to endorse certain piles and reject others.

    An AI built to optimize for that computation, it realizes, would be Friendly: that is, it would implement what Pebblesorters want, and they could therefore rely on it to ethically order the world.

    A traditionalist skeptic objects that all Pebblesorter ethical arguments, at least for ethical problems up to 1957, have been written down in the Great Book for generations; there's no need for more.

    "That's true," replies the researcher, "but that's just a Not Particularly Large Lookup Table. Sure, such an approach is adequate for all the cases that have ever come up in our entire history, but this coherent extrapolated algorithm could be extended to novel ethical questions like '300007' and still be provably correct."

    "But how do we know that's the right thing?" retorts a Heap Relativist. "Sure, it's what we want, but ...

    What really struck me with this parable is that it's so well-written that I felt genuine horror and revulsion at the idea of an AI making heaps of size 8. Because, well... 2!

    So, aside from the question of whether an AI would come to moral conclusions such as "heaps of size 8 are okay" or "the way to end human suffering is to end human life", the question I'm taking away from this parable is, are we any more enlightened than the Pebblesorters? Should we, in fact, be sending philosophers or missionaries to the Pebblesorter planet to explain to them that it's wrong to murder someone just because they built a heap of size 15?

    Maybe they just hadn't finished it yet…

    Why not keep it in the form of a pile of 13 and a pile of 2, until you find the other two pebbles you're looking for? That would be the ETHICALLY RESPONSIBLE thing to do!

    If our values are threatened by superintelligence, does that mean we should build the superintelligence with an ad hoc human value module, or that we should abandon our values?

    Also, there are some human values which it seems likely to me are pretty universal to intelligence. If the ability to get bored is correlated with the ability to be creative (which I think it is), and super intelligences (whatever else they are) must be capable of creative action by virtue of their being super intelligences, then they're likely to also care about diversity. I...

    I do hope they haven't programmed this AI in binary. I have a strong suspicion the kinds of numbers it may favor, and I don't think they'd like them much at all.

    Smarter minds equal smarter heaps. Why would that trend break?

    Utility counterfeiting regularly breaks such systems.

    You know, to successfully build a PFAI (Pebblesorter-Friendly AI), the Pebblesorters would have to figure out a way to recursively enumerate the primes. That may not be quite as difficult as formalizing Friendliness (the human version), but it's still probably pretty dang hard.
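    For concreteness, "recursively enumerate the primes" can be done in a few lines of trial division (a deliberately naive sketch; any real PFAI would presumably want something faster):

```python
import itertools

def primes():
    """Yield the Pebblesorters' correct heap sizes (the primes) forever."""
    found = []                     # primes discovered so far
    n = 2
    while True:
        if all(n % p != 0 for p in found):
            found.append(n)        # n has no prime divisor below it
            yield n
        n += 1

# The first few correct heap sizes:
print(list(itertools.islice(primes(), 8)))  # [2, 3, 5, 7, 11, 13, 17, 19]
```

    (Trial division against the primes found so far suffices for correctness, if not efficiency.)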

    Makes you realize that trying to extrapolate even seemingly simplistic moralities can still result in problems of epic difficulty.

    I'm imagining a bunch of slime mold organisms arguing over what the morally correct structure is for their current environment. There would be little Effective Altruist slime-mold cells on the optimum locations, and lots of little cells ignoring the bigger picture and doing what would be optimum if the environment they could see were all that was morally relevant. Maybe there would even be resource wars between different factions, each more concerned with optimizing its own local region but unable to see the big picture.

    Also, maybe it would make more sens...

    I wonder: do the names Y'ha-nthlei, Y'not'ha-nthlei, and At'gra'len'ley mean anything? I assume Y'ha and Y'not'ha mean "you have" and "you don't have", but beyond that it just seems random.


    What is it a metaphor of?

    This was turned into a Rational Animations video: https://www.youtube.com/watch?v=cLXQnnVWJGo

    What I like about this article is the subtle reminder that morals and values are relative. What bothers me, however, is the assumption that all highly intelligent systems/beings have an objective or goal. Is a system without one really inconceivable?