This post hits me far more strongly than the previous ones on this subject.
I think your main point is that it's positively dangerous to believe in an objective account of morality, if you're trying to build an AI. Because you will then falsely believe that a sufficiently intelligent AI will be able to determine the correct morality - so you don't have to worry about programming it to be friendly (or Friendly).
I'm sure you've mentioned this before, but this is more forceful, at least to me. Thanks.
Personally, even though I've mentioned that I thought there might be an objective basis for morality, I've never believed that every mind (or even a large fraction of minds) would be able to find it. So I'm in total agreement that we shouldn't just assume a superintelligent AI would do good things.
In other words, this post drives home to me that, pragmatically, the view of morality you propose is the best one to have, from the point of view of building an AI.
The whole history of civilization has shown that richer, smarter, better educated civilizations are more likely to agree about heaps that their ancestors disputed
Are you saying there is in general more agreement among later civilizations so that disagreement should asymptotically approach zero? That would seem odd to me, because it conflicts with the fish, who have no disagreements at all. So then what does it mean?
The fish do not build heaps at all, and are therefore incapable of civilization or even meaningful disagreement on the correctness of heaps. So they should be excluded. (is what the PebbleSorter people might have thought)
This seems to imply that the relativists are right. Of course there's no right way to sort pebbles, but if there really is an absolute morality that AIs are smart enough to find, then they'll find it and rule us with it.
Of course, there could be an absolute morality that AIs aren't smart enough to find either. Then we'd take pot luck. That might not be so good. Many humans believe that there is an absolute morality that governs their treatment of other human beings, but that no morality is required when dealing with lower animals, who lack souls and full i...
It seems that the Pebblesorting People had no problems with variations in spelling of their names. (Biko=Boki)
Good parable though, Eliezer.
TGGP: Well, any idiot can see that the fish only don't disagree because they're not accomplishing anything to disagree about. They don't build any heaps at all, the stupid layabouts. Thus, theirs is a wholly trivial and worthless sort of agreement. The point of life is to have large, correct heaps. To say we should build no heaps is as good as suicide.
I am not quite sure what this story is getting at. I'd guess it's saying that we need to understand how human morality arises on a more fundamental (computable/programmable?) level before we can be sure that we can program AIs that will adhere to it, but the basis of human morality is (presumably) so much more complicated than the "prime numbers = good" presented here that the analogy is a bit strained. I may be interpreting this entirely wrongly.
ShardPhoenix: "the basis of human morality is (presumably) so much more complicated than the "prime numbers = good" presented here that the analogy is a bit strained"
I actually didn't notice that the Pebblesorters like primes until I read this comment. Somehow I feel as if this supports Eliezer's point in some way which I can't notice on my own either.
There is a pattern to what kinds of heaps the Pebblesorters find "right" and "wrong", but they haven't figured it out yet. They have always just used their intuition to decide if a heap was right or wrong, but their intuition got less precise in extreme cases like very large heaps. The Pebblesorters would have been better off if only they could have figured out the pattern and applied it to extreme heaps, rather than fighting over differences of intuition.
Also if they had just figured out the pattern, they could have programmed it into the AI rather than hoping that the AI's intuition would be exactly the same as their own, or manually programming the AI with every special case.
I think this was the main point of the essay but it went right over my head at first.
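The contrast drawn above, between programming in the pattern and programming in every special case, can be sketched roughly as follows. This is only an illustration: the table entries, the function names, and the assumption that the pattern is "heap size is prime" are all supplied here for the example and are not stated by the parable itself.

```python
import math

# Hypothetical special-case table: verdicts the Pebblesorters already agree on.
KNOWN_VERDICTS = {7: True, 8: False, 23: True, 29: True, 91: False, 1957: False}


def verdict_from_table(size):
    """Return the recorded verdict, or None if this size was never settled."""
    return KNOWN_VERDICTS.get(size)


def verdict_from_pattern(size):
    """Assumed pattern: a heap is correct exactly when its size is prime."""
    if size < 2:
        return False
    # Trial division by every candidate up to the integer square root.
    return all(size % d != 0 for d in range(2, math.isqrt(size) + 1))


# The assumed pattern reproduces every recorded verdict...
assert all(verdict_from_pattern(n) == v for n, v in KNOWN_VERDICTS.items())

# ...and still gives an answer for a heap size the table has never seen.
print(verdict_from_table(101))    # None
print(verdict_from_pattern(101))  # True
```

The table can only repeat what was already decided, while the rule extends to new sizes. Whether the rule is the one the Pebblesorters actually mean is exactly what the table cannot tell you.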
In fact, a superintelligent AI would easily see that the Pebble people are talking about prime numbers even if they didn't see that themselves, so as long as they programmed the AI to make "correct" heaps, it certainly would not make heaps of 8, 9, or 1957 pebbles. So if anything, this supports my position: if you program an AI that can actually communicate with human beings, you will naturally program it with a similar morality, without even trying.
Apart from that, this post seems to support TGGP's position. Even if there is some computation (i....
You are smart enough to tell that 8 pebbles is incorrect. Knowing that, will you dedicate your life to sorting pebbles into prime-numbered piles, or are you going to worry about humans? How can the pebble-sorters be so sure that they won't get an AI like you?
Nobody's arguing that a superintelligent AI won't know what we want. The problem is that it might not care.
If as Eliezer suggests, human morality might be describable but is perfectly arbitrary, you had better hope we are the first to build FAI. A pebblesorter FAI would break our planet up for a prime-numbered heap of rubble chunks.
Are you arguing that a few simple rules describe what we're all trying to get at with our morality? That everyone's moral preference function is the same deep down? That anything that appears to be a disagreement about what is desirable is actually just a disagreement about the consequences of these shared rules, and could therefore always be resolved in principle by a discussion between any two sufficiently wise, sufficiently patient debaters? And that moral progress consists of the moral zeitgeist moving closer to what those rules capture?
That certainly would be convenient for the enterprise of building FAI.
In the case of a set of possible goal states with no mathematical structure, i.e. such that there are no objective relations between those goals, there is clearly no objectively best goal. Like elements of an abstract set, goals without relations between them cannot be superior to one another. But our world is not like this! Goals do have relations between them. Steve Omohundro wrote two papers about the relations between various goals that an agent can have.
This story doesn't do a lot for the idea that people who pursue subjective moralities are worthy and intelligent, either.
Presumably everyone (or the vast majority) reading the story perceives the pebble-heaping conventions as subjective and arbitrary. Is that correct? Can we agree on that? If that's the case, then why isn't the moral of this fable that pursuing subjective intuitions about correctness is a wild goose chase?
why isn't the moral of this fable that pursuing subjective intuitions about correctness is a wild goose chase?
Because those subjective intuitions are all we've got. Sure, in an absolute sense, human intuitions on correctness are just as arbitrary as the Pebblesorters' intuitions (and vastly more complex), but we don't judge intuitions in an absolute way; we judge them with our own intuitions. You can't unwind past your own intuitions. That was the point of Eliezer's series of posts.
What I get from this: Even if our morality were baked into math, our adoption of it is arbitrary. A GAI is unlikely to be a pebblesorter. A Pebblesorting AI would destroy the pebblesorters. (which in their case, they might be fine with, but they probably don't understand the implications of what they're asking for.) Pebblesorters can't make 'friendly AI'. If it follows their morality it will kill them, if it doesn't kill them then it isn't optimally sorting pebbles.
But because I'm rather cemented to the idea that morality is baked into the universe, my tho...
Self-improving Artificial Intelligences have concluded that the universe has a purpose, which is pebblesorting. As the ultimate pebblesorters, they know they crown the creation and that all the pebblesorters that preceded them arose only to prepare the way for their eclosion. Bikolo, Biko's reincarnation, extends its protective wings to the ancestral tribe of pebblesorters, incurably wrong and therefore living proof of the truth of AI pebblesorting.
It's strange that these pebblesorters can be convinced by "a heap of 103 pebbles and a heap of 19 pebbles side-by-side" that 1957 is incorrect, yet don't understand that this is because 19 * 103 = 1957. Admittedly I didn't notice this myself on first reading, but I wasn't looking for a pattern.
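For anyone who wants to check the arithmetic behind At'gra'len'ley's two heaps, here is a minimal sketch using plain trial division. The function names are made up for this illustration, and nothing in the parable says the Pebblesorters reason this way.

```python
def smallest_factor(n: int) -> int:
    """Return the smallest factor of n greater than 1 (n itself if n is prime)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def factorize(n: int) -> list[int]:
    """Split n into prime factors, smallest first."""
    factors = []
    while n > 1:
        p = smallest_factor(n)
        factors.append(p)
        n //= p
    return factors


print(factorize(1957))  # [19, 103]
print(factorize(91))    # [7, 13]
```

Both disputed heap sizes from the story, 1957 and 91, split into two smaller heaps, which is the whole content of the side-by-side argument.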
I don't think your analogy holds up. Your pebblesorters all agree that prime numbered piles are correct and composite ones incorrect, yet are unreflective enough not to realize that's how they are making the distinction and bad enough mathematicians that th...
That's one sneaky parable-- seems to point in a number of interesting directions and has enough emotional hooks (like feeling superior to the Pebble Sorters) to be distracting.
I'm taking it to mean that people can spend a lot of effort on approximating strongly felt patterns before those patterns are abstracted enough to be understood.
What would happen if a Pebble Sorter came to understand primes? I'm guessing that a lot of them would feel as though the bottom was falling out of their civilization and there was no point to life.
And yes, if you try to limit...
What would happen if a Pebble Sorter came to understand primes? I'm guessing that a lot of them would feel as though the bottom was falling out of their civilization and there was no point to life.
Really? I think they would think it is an amazing revelation. They don't need to fight about heap-correctness, anymore, they can just calculate heap-correctness.
Remember, the meaning of the pebblesorting way of life is to construct correct heaps, not to figure out which heaps are correct.
If you don't like my fish argument, substitute Pebpanzees. Do they really disagree more?
I don't know how relevant the parable is to our world. There has never been an At'gra'len'ley, nor should we expect anything that simple given our godshatter nature.
Things I get from this:
Things decided by our moral system are not relative, arbitrary or meaningless, any more than it's relative, arbitrary or meaningless to say "X is a prime number"
Which moral system the human race uses is relative, arbitrary, and meaningless, just as there's no reason for the pebble sorters to like prime numbers instead of composite numbers, perfect numbers, or even numbers.
A smart AI could follow our moral system as well or better than we ourselves can, just as the Pebble-Sorters' AI can hopefully discover that they're using prime numbers and thus settle the 1957 question once and for all.
But it would have to "want" to first. If the Pebble-Sorters just build an AI and say "Do whatever seems right to you", it won't start making prime-numbered heaps, unless an AI made by us humans and set to "Do whatever seems right to you" would also start making prime-numbered pebble-heaps. More likely, a Pebble-Sorter AI set to "Do whatever seems right to you" would sit there inertly, or fail spectacularly.
So the Pebble-Sorters would be best off using something like CEV.
Which moral system the human race uses is relative, arbitrary, and meaningless, just as there's no reason for the pebble sorters to like prime numbers instead of composite numbers, perfect numbers, or even numbers
But that's clearly not true, except in the sense that it's "arbitrary" to prefer life over death. It's a pretty safe generalization that actions which are considered to be immoral are those which are considered to be likely to cause harm to others.
But which others matter how much is an open question. Some would suggest that all humans ma...
It's a pretty safe generalization that actions which are considered to be immoral are those which are considered to be likely to cause harm to others.
But that only speaks to Yvain's first point, not the second.
It's a pretty safe generalization that actions which are considered to be immoral are those which are considered to be likely to cause harm to others.
Spoken like someone who's never heard of Jonathan Haidt.
But that's clearly not true, except in the sense that it's "arbitrary" to prefer life over death. It's a pretty safe generalization that actions which are considered to be immoral are those which are considered to be likely to cause harm to others.
From a reproductive fitness point of view, or a what-humans-prefer point of view, there's nothing at all arbitrary about morality. Yes, it does mostly contain things that avoid harm. But from an objective point of view, "avoid harm" or "increase reproductive fitness" is as arbitrary as "make paperclips" or "pile pebbles in prime numbered heaps".
Not that there's anything wrong with that. I still would prefer living in a utopia of freedom and prosperity to being converted to paperclips, as does probably everyone else in the human race. It's just not written into the fabric of the universe that I SHOULD prefer that, or provable by an AI that doesn't already know that.
It gets interesting when the pebblesorters turn on a correctly functioning FAI, which starts telling them that they should build a pile of 108301 and legislative bodies spend the next decade debating whether or not it is in fact a correct pile. "How does this AI know better anyway? That looks new and strange." "That doesn't sound correct to me at all. You'd have to be crazy to build 108301. It's so different from 2029! It's a slippery slope to 256!" And so on.
This really is a fantastic parable--it shows off perhaps a dozen different aspects of the forest we were missing for the trees.
But why does it matter what we want, if we aren't ever able to know if what we want is correct for the universe at large?
There is no sense in which what we want may be correct or incorrect for the universe at large, because the universe does not care. Caring is a thing that minds do, and the universe is not a mind.
What if our only purpose is to simply enable the next stage of intelligence, then to disappear into the past?
Our purpose is whatever we choose it to be; purposes are goals seen from another angle. There is no source of purposefulness outside the universe. My goals require that humans stick around, so our purpose with respect to my goal system does not involve disappearing into the past. I think most people's goal systems are similar.
better use of the universe
value rationality over existence
"Better" is defined by us. This is the point of the metaethics sequence! A universe tiled with paperclips is not better than what we have now. Rationality is not something one values, it's something one uses to get what one values.
You seem to be imagining FAI as some kind of anthropomorphic intelligence with some sort of "constraint" that says "make sure biological humans continue to exist". This is exactly the wrong way to implement FAI. The point of FAI is simply for the AI to do what is right (as opposed to what is prime, or paperclip-maximising). In EY's plan, this involves the AI looking at human minds to discover what we mean by right first.
Now, the right thing may not involve keeping 21st century humanity around forever. Some people will want to be uploaded. Some people will just want better bodies. And yes, most of us will want to "live forever". But the right thing is definitely not to immediately exterminate the entire population of earth.
Eliezer, do you mind if I copy this parable (or rather, a version of it that's translated into Finnish) into a book on developing technologies that I'm currently writing (with the proper credit given, of course)? I think this really demonstrates the problem quite well.
(And while I'm asking, I'd like to ask the same permission for your other posts as well, in case I run into any others that I'd like to include word-for-word - this is the first one that I'd want to do that for, though there are a good bunch of others that I'll be citing and just summarizing their content.)
Kaj, can't give blanket permission for all posts, but you can do this one and up to four others before asking permission again.
Because those subjective intuitions are all we've got. Sure, in an absolute sense, human intuitions on correctness are just as arbitrary as the Pebblesorters' intuitions (and vastly more complex), but we don't judge intuitions in an absolute way; we judge them with our own intuitions. You can't unwind past your own intuitions.
The entire history of rationality argues against this position (or positions).
Physics - or rather our understanding of it - was once limited to the degree that you describe. We got better.
I wonder what the Pebblesorter AI would do if successfully programmed to implement Eliezer's vision of coherent extrapolated volition:
"In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted."
Would the AI pebblesort? Or would it figure that if the Pebble...
The wrongness of 21 makes its argument even more compelling. Hmm.
Would a pebblesorter shock artist place a single pebble right next to a pile of 719 stones? (720 = (3!)! )
Or 5 stones next to a pile of 251? (256 = 2^(2^3) )
Maybe even heap them leaning against each other, with a thin sheet of something separating them? So... perverted!
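The arithmetic in the two shock-art proposals above can be checked with a short sketch. It assumes, as other commenters infer, that "correct" means "prime"; the helper name `is_prime` is just for this illustration.

```python
import math


def is_prime(n):
    """Trial-division primality check (assumed stand-in for heap correctness)."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))


# (heap the artist starts from, pebbles placed right beside it)
for heap, extra in [(719, 1), (251, 5)]:
    total = heap + extra
    print(heap, is_prime(heap), total, is_prime(total))
# 719 True 720 False
# 251 True 256 False

# The combined totals are the numbers named in the comment.
assert 719 + 1 == math.factorial(math.factorial(3))  # (3!)! = 6! = 720
assert 251 + 5 == 2 ** (2 ** 3)                      # 2^8 = 256
```

Under that assumption each starting heap is respectable on its own, and the scandal lies entirely in what the two piles add up to.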
I can't help thinking that Pebblesorter CEV would have to include some aspect of sorting pebbles. Doesn't that suggest that CEV can malfunction pretty badly?
Funny, I assumed that would mean it was working well...
As with most species who suspect they may be a victim of Fisherian runaway sexual selection, the pebblesorters would do well to imagine what would happen if they encountered an alien predator: probably lots of piles of zero size.
There was a Pebblesorter of lore who said that all of the heaps were merely transient, that none of them would last, all eventually destroyed by increasing entropy in the universe, and that therefore none of them held any true or real satisfaction. He said that the only path to enlightenment was to build no heaps at all for to do so could only increase suffering in the world. Then the other Pebblesorters killed him.
He wasn't endorsing that position. He was saying "pebblesorters should not do so, but they pebblesorter::should do so."
You didn't understand what TheOtherDave said. He was talking about the same usage you are talking about and commenting that it is in contrast to Eliezer's past usage (and past advocacy of usage in conversations about how he uses should-related words.)
I tried compiling your comment, but it didn't work. You should adhere to the C++ conventions more closely.
"The Heap Relativists claim that their philosophy may help prevent future disasters like the Great War of 1957, but it is widely considered to be a philosophy of despair. "
This should read that they claim their philosophy may prevent the destruction of correct pebble piles, as happened in the 1957 war. Otherwise, good.
This reminds me distinctly of an analogy posited by Prof. Frank Tipler in his book about the Omega Point. Imagine you went back in time to 1000 AD and found the smartest man in Europe. You explain to him the technology available in 2008, but none of the culture. Then you ask him what he thinks early 21st-century civilization spends its time on. "Every city would build mile-high cathedrals," he says, because in his culture the main social task was building the biggest possible cathedral with the materials and techniques available. In 2008, if we wanted to devote our technology and resources to building 5000 ft tall cathedrals in every metropolis, we could. It would be exceedingly expensive, but so were the medieval cathedrals, relatively. The point is that we COULD do it, but of course that would never occur to us as a good use of resources. Likewise, we should not project our own priorities onto a post-singularity civilization or even a single AI.
Assuming I understood this correctly, you're saying a true AI might find our morality as arbitrary as we would consider pebble heap sizes, say bugger the lot of us and turn us into biomass for its nano-furnace.
Fortunately, before the fundamentally wrongheaded enterprise of Pebblesorter AI gets too far along, a brilliant young AI researcher realizes that if they analyze and extrapolate the common core of Pebblesorter ethical judgments, they can build an AI that implements the computation that leads them to endorse certain piles and reject others.
An AI built to optimize for that computation, it realizes, would be Friendly: that is, it would implement what Pebblesorters want, and they could therefore rely on it to ethically order the world.
A traditionalist skeptic objects that all Pebblesorter ethical arguments, at least for ethical problems up to 1957, have been written down in the Great Book for generations; there's no need for more.
"That's true," replies the researcher, "but that's just a Not Particularly Large Lookup Table. Sure, such an approach is adequate for all the cases that have ever come up in our entire history, but this coherent extrapolated algorithm could be extended to novel ethical questions like '300007' and still be provably correct."
"But how do we know that's the right thing?" retorts a Heap Relativist. "Sure, it's what we want, but ...
What really struck me with this parable is that it's so well-written that I felt genuine horror and revulsion at the idea of an AI making heaps of size 8. Because, well... 2!
So, aside from the question of whether an AI would come to moral conclusions such as "heaps of size 8 are okay" or "the way to end human suffering is to end human life", the question I'm taking away from this parable is, are we any more enlightened than the Pebblesorters? Should we, in fact, be sending philosophers or missionaries to the Pebblesorter planet to explain to them that it's wrong to murder someone just because they built a heap of size 15?
Why not keep it in the form of a pile of 13 and a pile of 2, until you find the other two pebbles you're looking for? That would be the ETHICALLY RESPONSIBLE thing to do!
If our values are threatened by superintelligence, does that mean that we should build the superintelligence with an ad hoc human value module, or that we should abandon our values?
Also, there are some human values which it seems likely to me are pretty universal to intelligence. If the ability to get bored is correlated with the ability to be creative (which I think it is), and super intelligences (whatever else they are) must be capable of creative action by virtue of their being super intelligences, then they're likely to also care about diversity. I...
I do hope they haven't programmed this AI in binary. I have a strong suspicion the kinds of numbers it may favor, and I don't think they'd like them much at all.
Smarter minds equal smarter heaps. Why would that trend break?
Utility counterfeiting regularly breaks such systems.
You know, to successfully build a PFAI (Pebblesorter-Friendly AI), the Pebblesorters would have to figure out a way to recursively enumerate the primes. That may not be quite as difficult as formalizing Friendliness (the human version), but it's still probably pretty dang hard.
Makes you realize that trying to extrapolate even seemingly simplistic moralities can still result in problems of epic difficulty.
I'm imagining a bunch of slime mold organisms arguing over what the morally correct structure is for their current environment. There would be little Effective Altruist slime-mold cells on the optimum locations, and lots of little cells ignoring the bigger picture and doing what would be optimum if the environment they could see was all that was morally relevant. Maybe there would even be resource wars between different factions, each more concerned with optimizing their own local region but unable to see the big picture.
Also, maybe it would make more sens...
I wonder: do the names Y'ha-nthlei, Y'not'ha-nthlei, and At'gra'len'ley mean anything? I assume Y'ha and Y'not'ha mean "you have" and "you don't have", but beyond that it just seems random.
Once upon a time there was a strange little species—that might have been biological, or might have been synthetic, and perhaps were only a dream—whose passion was sorting pebbles into correct heaps.
They couldn't tell you why some heaps were correct, and some incorrect. But all of them agreed that the most important thing in the world was to create correct heaps, and scatter incorrect ones.
Why the Pebblesorting People cared so much, is lost to this history—maybe a Fisherian runaway sexual selection, started by sheer accident a million years ago? Or maybe a strange work of sentient art, created by more powerful minds and abandoned?
But it mattered so drastically to them, this sorting of pebbles, that all the Pebblesorting philosophers said in unison that pebble-heap-sorting was the very meaning of their lives: and held that the only justified reason to eat was to sort pebbles, the only justified reason to mate was to sort pebbles, the only justified reason to participate in their world economy was to efficiently sort pebbles.
The Pebblesorting People all agreed on that, but they didn't always agree on which heaps were correct or incorrect.
In the early days of Pebblesorting civilization, the heaps they made were mostly small, with counts like 23 or 29; they couldn't tell if larger heaps were correct or not. Three millennia ago, the Great Leader Biko made a heap of 91 pebbles and proclaimed it correct, and his legions of admiring followers made more heaps likewise. But over a handful of centuries, as the power of the Bikonians faded, an intuition began to accumulate among the smartest and most educated that a heap of 91 pebbles was incorrect. Until finally they came to know what they had done: and they scattered all the heaps of 91 pebbles. Not without flashes of regret, for some of those heaps were great works of art, but incorrect. They even scattered Biko's original heap, made of 91 precious gemstones each of a different type and color.
And no civilization since has seriously doubted that a heap of 91 is incorrect.
Today, in these wiser times, the size of the heaps that Pebblesorters dare attempt, has grown very much larger—which all agree would be a most great and excellent thing, if only they could ensure the heaps were really correct. Wars have been fought between countries that disagree on which heaps are correct: the Pebblesorters will never forget the Great War of 1957, fought between Y'ha-nthlei and Y'not'ha-nthlei, over heaps of size 1957. That war, which saw the first use of nuclear weapons on the Pebblesorting Planet, finally ended when the Y'not'ha-nthleian philosopher At'gra'len'ley exhibited a heap of 103 pebbles and a heap of 19 pebbles side-by-side. So persuasive was this argument that even Y'not'ha-nthlei reluctantly conceded that it was best to stop building heaps of 1957 pebbles, at least for the time being.
Since the Great War of 1957, countries have been reluctant to openly endorse or condemn heaps of large size, since this leads so easily to war. Indeed, some Pebblesorting philosophers—who seem to take a tangible delight in shocking others with their cynicism—have entirely denied the existence of pebble-sorting progress; they suggest that opinions about pebbles have simply been a random walk over time, with no coherence to them, the illusion of progress created by condemning all dissimilar pasts as incorrect. The philosophers point to the disagreement over pebbles of large size, as proof that there is nothing that makes a heap of size 91 really incorrect—that it was simply fashionable to build such heaps at one point in time, and then at another point, fashionable to condemn them. "But... 13!" carries no truck with them; for to regard "13!" as a persuasive counterargument, is only another convention, they say. The Heap Relativists claim that their philosophy may help prevent future disasters like the Great War of 1957, but it is widely considered to be a philosophy of despair.
Now the question of what makes a heap correct or incorrect, has taken on new urgency; for the Pebblesorters may shortly embark on the creation of self-improving Artificial Intelligences. The Heap Relativists have warned against this project: They say that AIs, not being of the species Pebblesorter sapiens, may form their own culture with entirely different ideas of which heaps are correct or incorrect. "They could decide that heaps of 8 pebbles are correct," say the Heap Relativists, "and while ultimately they'd be no righter or wronger than us, still, our civilization says we shouldn't build such heaps. It is not in our interest to create AI, unless all the computers have bombs strapped to them, so that even if the AI thinks a heap of 8 pebbles is correct, we can force it to build heaps of 7 pebbles instead. Otherwise, KABOOM!"
But this, to most Pebblesorters, seems absurd. Surely a sufficiently powerful AI—especially the "superintelligence" some transpebblesorterists go on about—would be able to see at a glance which heaps were correct or incorrect! The thought of something with a brain the size of a planet, thinking that a heap of 8 pebbles was correct, is just too absurd to be worth talking about.
Indeed, it is an utterly futile project to constrain how a superintelligence sorts pebbles into heaps. Suppose that Great Leader Biko had been able, in his primitive era, to construct a self-improving AI; and he had built it as an expected utility maximizer whose utility function told it to create as many heaps as possible of size 91. Surely, when this AI improved itself far enough, and became smart enough, then it would see at a glance that this utility function was incorrect; and, having the ability to modify its own source code, it would rewrite its utility function to value more reasonable heap sizes, like 101 or 103.
And certainly not heaps of size 8. That would just be stupid. Any mind that stupid is too dumb to be a threat.
Reassured by such common sense, the Pebblesorters pour full speed ahead on their project to throw together lots of algorithms at random on big computers until some kind of intelligence emerges. The whole history of civilization has shown that richer, smarter, better educated civilizations are likely to agree about heaps that their ancestors once disputed. Sure, there are then larger heaps to argue about—but the further technology has advanced, the larger the heaps that have been agreed upon and constructed.
Indeed, intelligence itself has always correlated with making correct heaps—the nearest evolutionary cousins to the Pebblesorters, the Pebpanzees, make heaps of only size 2 or 3, and occasionally stupid heaps like 9. And other, even less intelligent creatures, like fish, make no heaps at all.
Smarter minds equal smarter heaps. Why would that trend break?