Today's post, Sorting Pebbles Into Correct Heaps was originally published on 10 August 2008. A summary (taken from the LW wiki):

 

A parable about an imaginary society that doesn't understand what its values actually are.


Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Inseparably Right; or, Joy in the Merely Good, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.


An old comment from Unknown, in that thread:

In fact, a superintelligent AI would easily see that the Pebble people are talking about prime numbers even if they didn't see that themselves, so as long as they programmed the AI to make "correct" heaps, it certainly would not make heaps of 8, 9, or 1957 pebbles. So if anything, this supports my position: if you program an AI that can actually communicate with human beings, you will naturally program it with a similar morality, without even trying.

Actually, it's you who can easily see that the Pebble people are talking about prime numbers even though they don't know it themselves. It's easy for you to see that an AI would figure out prime numbers; the Pebblesorters have no such confidence.

To put it another way:

The Pebblesorters build an AI, and a critic questions whether to turn it on. The topic is "Will the AI make correct heaps?"

An elder stands up and says, "Yes, it will." The critic says, "How do you know?". The elder replies, "It must make correct heaps!". The critic asks, "Why must it make correct heaps?". The elder says, "Well, it's obvious! It's so easy to see that a heap is correct or incorrect, how could something so smart miss it?".

Then you stand up and say, "Yes, it will." The critic says, "How do you know?". You reply, "Well, prime numbers are a fundamental part of reality; more fundamental still for a mind that is more like a computer than ours. In order for an AI to be powerful, it has to perform some task isomorphic to detecting complex patterns; it seems extremely unlikely that any pattern-finding mechanism that misses prime numbers could possibly support powerful optimisation processes. And so we can be pretty sure that the AI will build heaps of prime numbers only." The critic responds, "What the hell are prime numbers?". You say, "Oh! Some unimportant mathematical property, but it turns out that no pile of pebbles that has this property is incorrect, and no pile of pebbles that lacks it is correct, so it acts as a good constraint on the AI."

Some morals from my story:

  • Notice how the elder is not justified in his belief and should not turn on the AI, but you are justified, maybe even enough to turn it on. Notice also that when it comes to human morality, we are more like the elder.

  • "Prime numbers" is an exceedingly simple concept, yet I was only capable of getting to the "we can be pretty sure" level of certainty.

  • Your explanation is longer and more complicated and has many more ways of failing to be true. Indeed, it's only the simplicity of the concept that lets me formulate such an explanation and remotely expect it to be correct. (Even then, I'm pretty sure there's a few nits to pick.)

The response to Unknown sums up the issue already, though.

You may be justified in ascertaining that the AI will figure out what they're doing. You're not justified in assuming it will then act on this knowledge instead of identifying and pursuing its own purposes (presuming you've codified "purpose" enough for it to not just sit there and modify its own utility function to produce the computer equivalent of shooting up heroin).

Until you know what you're doing, you can't get something else to do it for you. The AI programmed without knowledge of what they wanted it to do might cooperate, might not. It would be better to start over, programming it specifically to do what you want it to do.

This is definitely my favourite post in the meta-ethics sequence, and the second favourite (the first being "The simple truth") of all the sequences...

the Pebblesorters will never forget the Great War of 1957, fought between Y'ha-nthlei and Y'not'ha-nthlei, over heaps of size 1957. That war finally ended when the Y'not'ha-nthleian philosopher At'gra'len'ley exhibited a heap of 103 pebbles and a heap of 19 pebbles side-by-side. So persuasive was this argument that even Y'not'ha-nthlei reluctantly conceded that it was best to stop building heaps of 1957 pebbles, at least for the time being.

Interesting potential-parallels: the argument of 103 and 19 is easy to check (multiplication) but hard to formulate (prime factorisation). Evaluating the statement by itself is in between (primality test).
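The three levels of difficulty in that comment can be made concrete. A minimal sketch (the function names are mine, not from the thread), using naive algorithms to show the ordering: verifying At'gra'len'ley's factorisation is one multiplication, testing primality takes trial division up to √n, and finding a factorisation from scratch requires searching.

```python
def check_factorisation(n, p, q):
    """Verifying a claimed factorisation: a single multiplication."""
    return p * q == n

def is_prime(n):
    """Primality by trial division: more work than one multiplication,
    but only divisors up to sqrt(n) need checking."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def factorise(n):
    """Naive factorisation: search for the smallest nontrivial divisor.
    This is the hard direction (returns None if n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None

print(check_factorisation(1957, 103, 19))  # True: the argument checks out
print(is_prime(1957))                      # False: 1957-pebble heaps are incorrect
print(factorise(1957))                     # (19, 103)
```

For numbers this small all three are instant; the gap only opens up asymptotically, where verification stays trivial, primality testing stays polynomial (AKS), and no polynomial-time factorisation algorithm is known.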

[anonymous]:

Primality testing is easy in the sense that if someone discovered that factorization was that easy, they would win the Nobel Prize in Math. Which doesn't even exist.

Right, a primality test is not hard like factorisation, but it's harder than multiplication. Our pebblesorters are clearly somewhere between multiplication and prime testing. If a pebblesorter proved something like the AKS algorithm, they would win more than the Gödel prize!

they would win the Nobel Prize in Math. Which doesn't even exist.

Have you been watching Teen Wolf?

[anonymous]:

Nope - didn't even know that was a TV series until I wikipediaed it just now.

[Meta] I suspect the discussion around this post would be more productive if the piles weren't prime but entirely random*, so commenters weren't distracted by noticing the pattern and could focus on what it says about moral intuitions. Unless Eliezer was making a point by the use of primes that I'm missing, of course.

*(To a species tragically lacking the pile-sorters' intuition, of course)

The point is that there's a computation that describes which heaps are correct and which aren't. In the same way there's a computation that describes which actions are human!right and human!wrong. Humans don't know the exact algorithm for this computation, in the same way that pebblesorters don't know any algorithm that tests primality.
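That point can be written down directly: the pebblesorters' notion of "correct" is, unknown to them, a short computable predicate. A hypothetical sketch (the predicate itself is the thread's gloss, not anything the pebblesorters could state):

```python
def correct_heap(pebbles):
    """The computation the pebblesorters never discovered:
    a heap is correct iff its size is prime."""
    n = len(pebbles)
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

# Heap sizes from the story:
for size in (13, 8, 9, 1957, 103):
    print(size, correct_heap(["pebble"] * size))
```

The pebblesorters evaluate this predicate by intuition and argument rather than by running it, just as humans evaluate human!right without knowing its algorithm.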

Eliezer isn't a moral relativist, and does believe that there is a pattern to morality.

Eliezer isn't a moral relativist, and does believe that there is a pattern to morality.

Eliezer is technically not a moral relativist but this is mostly a matter of that label being a terrible way to carve reality. Unless I am very much mistaken, in terms of practical connotations Eliezer's beliefs would be closer to those of a naive philosophy student who professes moral relativism than a similarly naive philosopher who professes the contrary position.

[anonymous]:

Now the question of what makes a heap correct or incorrect, has taken on new urgency.

Questions of what makes a heap incorrect will reveal more useful answers than questions of what makes a heap correct. Questions of correct and incorrect are not of equal utility in minimizing error.

Is it better to make more smaller heaps, or fewer bigger ones?

Based on their behavior, it looks like fewer bigger ones. Otherwise they could just sit around pairing up pebbles all day and that would be ideal.

Either that, or they need to use bigger heaps to accommodate all the pebbles, because otherwise you begin stacking the heaps, and that doesn't do at all.
