Failure By Affective Analogy

Eliezer Yudkowsky

Previously in series: Failure By Analogy

Alchemy is a way of thinking that humans do not instinctively spot as stupid. Otherwise alchemy would never have been popular, even in medieval days. Turning lead into gold by mixing it with things that seemed similar to gold, sounded every bit as reasonable, back in the day, as trying to build a flying machine with flapping wings. (And yes, it was worth trying once, but you should notice if Reality keeps saying "So what?")

And the final and most dangerous form of failure by analogy is to say a lot of nice things about X, which is similar to Y, so we should expect nice things of Y. You may also say horrible things about Z, which is the polar opposite of Y, so if Z is bad, Y should be good.

Call this "failure by affective analogy".

Failure by affective analogy is when you don't just say, "This lemon glazing is yellow, gold is yellow, QED." But rather say:

"And now we shall add delicious lemon glazing to the formula for the Philosopher's Stone the root of all wisdom, since lemon glazing is beautifully yellow, like gold is beautifully yellow, and also lemon glazing is delightful on the tongue, indicating that it is possessed of a superior potency that delights the senses, just as the beauty of gold delights the senses..."

That's why you find people saying things like, "Neural networks are decentralized, just like democracies" or "Neural networks are emergent, just like capitalism".

A summary of the Standard Prepackaged Revolutionary New AI Paradigm might look like the following - and when reading, ask yourself how many of these ideas are affectively laden:

The Dark Side is Top-Down. The Light Side is Bottom-Up.
The Dark Side is Centralized. The Light Side is Distributed.
The Dark Side is Logical. The Light Side is Fuzzy.
The Dark Side is Serial. The Light Side is Parallel.
The Dark Side is Rational. The Light Side is Intuitive.
The Dark Side is Deterministic. The Light Side is Stochastic.
The Dark Side tries to Prove things. The Light Side tries to Test them.
The Dark Side is Hierarchical. The Light Side is Heterarchical.
The Dark Side is Clean. The Light Side is Noisy.
The Dark Side operates in Closed Worlds. The Light Side operates in Open Worlds.
The Dark Side is Rigid. The Light Side is Flexible.
The Dark Side is Sequential. The Light Side is Realtime.
The Dark Side demands Control and Uniformity. The Light Side champions Freedom and Individuality.
The Dark Side is Lawful. The Light Side, on the other hand, is Creative.

By means of this tremendous package deal fallacy, lots of good feelings are generated about the New Idea (even if it's thirty years old). Enough nice words may even manage to start an affective death spiral. Until finally, via the standard channels of affect heuristic and halo effect, it seems that the New Idea will surely be able to accomplish some extremely difficult end -

- like, say, true general intelligence -

- even if you can't quite give a walkthrough of the internal mechanisms which are going to produce that output.

(Why yes, I have seen AGIfolk trying to pull this on Friendly AI - as they explain how all we need to do is stamp the AI with the properties of Democracy and Love and Joy and Apple Pie and paint an American Flag on the case, and surely it will be Friendly as well - though they can't quite walk through internal cognitive mechanisms.)

From such reasoning as this came the string of false promises that were published in the newspapers (and that left futurists who grew up during that era so disappointed in AI that the resulting negative affect now causes them to put AI a hundred years in the future).

Let's say it again: Reversed stupidity is not intelligence - if people are making bad predictions you should just discard them, not reason from their failure.

But there is a certain lesson to be learned. A bounded rationalist cannot do all things, but the true Way should not overpromise - it should not (systematically/regularly/on average) hold out the prospect of success, and then deliver failure. Even a bounded rationalist can aspire to be well calibrated, to not assign 90% probability unless they really do have good enough information to be right nine times out of ten. If you only have good enough information to be right six times out of ten, just say 60% instead. A bounded rationalist cannot do all things, but the true Way does not overpromise.
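To make "well calibrated" concrete, here is a minimal sketch of how one might check a track record of stated probabilities against outcomes. The function names (`calibration_table`, `overpromise`) and the two track records are hypothetical, invented purely for illustration; they are not drawn from any real forecasting data.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Group (stated_probability, came_true) pairs by stated probability.

    Returns {stated_probability: (observed_frequency, number_of_claims)}.
    """
    buckets = defaultdict(list)
    for stated, came_true in forecasts:
        buckets[stated].append(1 if came_true else 0)
    return {
        stated: (sum(hits) / len(hits), len(hits))
        for stated, hits in sorted(buckets.items())
    }


def overpromise(forecasts):
    """Average of (stated probability - outcome) over all claims.

    Positive means more success was promised, on average, than was delivered.
    """
    pairs = list(forecasts)
    return sum(stated - (1 if ok else 0) for stated, ok in pairs) / len(pairs)


# Hypothetical track record: ten claims made at 90%, of which six came true.
overconfident = [(0.9, True)] * 6 + [(0.9, False)] * 4

# The same six-in-ten evidence, stated as 60% instead.
honest = [(0.6, True)] * 6 + [(0.6, False)] * 4
```

On these made-up records, `calibration_table(overconfident)` shows the 90% bucket coming true only six times in ten, and `overpromise(overconfident)` comes out at about 0.3, while `overpromise(honest)` is approximately zero. The second forecaster knows no more than the first; they simply did not hold out more success than they could deliver.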

If you want to avoid failed promises of AI... then history suggests, I think, that you should not expect good things out of your AI system unless you have a good idea of how specifically it is going to happen. I don't mean writing out the exact internal program state in advance. But I also don't mean saying that the refrigeration unit will cool down the AI and make it more contemplative. For myself, I seek to know the laws governing the AI's lawful uncertainty and lawful creativity - though I don't expect to know the full content of its future knowledge, or the exact design of its future inventions.

Don't want to be disappointed? Don't hope!

Don't ask yourself if you're allowed to believe that your AI design will work.

Don't guess. Know.

For this much I do know - if I don't know that my AI design will work, it won't.

There are various obvious caveats that need to be attached here, and various obvious stupid interpretations of this principle not to make. You can't be sure a search will return successfully before you have run it -

- but you should understand on a gut level: If you are hoping that your AI design will work, it will fail. If you know that your AI design will work, then it might work.

And on the Friendliness part of that you should hold yourself to an even higher standard - ask yourself if you are forced to believe the AI will be Friendly - because in that aspect, above all, you must constrain Reality so tightly that even Reality is not allowed to say, "So what?" This is a very tough test, but if you do not apply it, you will just find yourself trying to paint a flag on the case, and hoping.
