Expected Creative Surprises


Eliezer_Yudkowsky

Imagine that I'm playing chess against a smarter opponent.  If I could predict exactly where my opponent would move on each turn, I would automatically be at least as good a chess player as my opponent.  I could just ask myself where my opponent would move, if they were in my shoes; and then make the same move myself.  (In fact, to predict my opponent's exact moves, I would need to be superhuman - I would need to predict my opponent's exact mental processes, including their limitations and their errors.  It would become a problem of psychology, rather than chess.)

So predicting an exact move is not possible, but neither is it true that I have no information about my opponent's moves.

Personally, I am a very weak chess player - I play an average of maybe two games per year.  But even if I'm playing against former world champion Garry Kasparov, there are certain things I can predict about his next move.  When the game starts, I can guess that the move P-K4 is more likely than P-KN4.  I can guess that, if Kasparov has a move which would allow me to checkmate him on my next move, he will not make that move.

Much less reliably, I can guess that Kasparov will not make a move that exposes his queen to my capture - but here, I could be greatly surprised; there could be a rationale for a queen sacrifice which I have not seen.

And finally, of course, I can guess that Kasparov will win the game...

Supposing that Kasparov is playing black, I can guess that the final position of the chess board will occupy the class of positions that are wins for black.  I cannot predict specific features of the board in detail; but I can narrow things down relative to the class of all possible ending positions.

If I play chess against a superior opponent, and I don't know for certain where my opponent will move, I can still endeavor to produce a probability distribution that is well-calibrated - in the sense that, over the course of many games, legal moves that I label with a probability of "ten percent" are made by the opponent around 1 time in 10.

You might ask:  Is producing a well-calibrated distribution over Kasparov's moves beyond my abilities as an inferior chess player?

But there is a trivial way to produce a well-calibrated probability distribution - just use the maximum-entropy distribution representing a state of total ignorance.  If my opponent has 37 legal moves, I can assign a probability of 1/37 to each move.  This makes me perfectly calibrated:  I assigned 37 different moves a probability of 1 in 37, and exactly one of those moves will happen; so I applied the label "1 in 37" to 37 different events, and exactly 1 of those events occurred.
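Here is one way to check that bookkeeping in code - a minimal Python sketch, with an illustrative calibration_table helper and a made-up simulation of the 37-move case:

```python
import random
from collections import defaultdict

def calibration_table(predictions):
    """Bucket predictions by their probability label and compare each
    bucket's label with the fraction of those events that occurred.
    `predictions` is a list of (assigned_probability, event_occurred) pairs."""
    buckets = defaultdict(list)
    for p, occurred in predictions:
        buckets[round(p, 2)].append(occurred)
    return {label: (sum(hits) / len(hits), len(hits))
            for label, hits in sorted(buckets.items())}

# Maximum-entropy forecaster: 37 legal moves, each labeled "1 in 37".
# Exactly one of the 37 labeled events occurs on each turn.
random.seed(0)
predictions = []
for _ in range(10_000):                      # 10,000 simulated turns
    actual = random.randrange(37)            # whichever move actually happens
    for move in range(37):
        predictions.append((1 / 37, move == actual))

print(calibration_table(predictions))
# {0.03: (0.027..., 370000)} -- the "1 in 37" label comes true
# exactly 1 time in 37, by construction: perfect calibration.
```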

Total ignorance is not very useful, even if you confess it honestly.  So the question then becomes whether I can do better than maximum entropy.  Let's say that you and I both answer a quiz with ten yes-or-no questions.  You assign probabilities of 90% to your answers, and get one answer wrong.  I assign probabilities of 80% to my answers, and get two answers wrong.  We are both perfectly calibrated, but you exhibited better discrimination - your answers more strongly distinguished truth from falsehood.
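One way to make "better discrimination" concrete is a proper scoring rule such as the Brier score - a standard choice, used here purely as an illustration.  Both quiz-takers are perfectly calibrated, but the 90% forecaster's sharper probabilities earn a strictly better (lower) score:

```python
def brier(assigned, correct):
    """Mean squared gap between the probability assigned to each answer and
    whether that answer turned out right (1) or wrong (0).  Lower is better."""
    return sum((p - c) ** 2 for p, c in zip(assigned, correct)) / len(assigned)

# You: 90% on every answer, 9 right out of 10.
you = brier([0.9] * 10, [1] * 9 + [0])
# Me: 80% on every answer, 8 right out of 10.
me = brier([0.8] * 10, [1] * 8 + [0] * 2)

print(you, me)   # ~0.09 vs ~0.16 -- same calibration, but the sharper
                 # forecaster scores better
```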

Suppose that someone shows me an arbitrary chess position, and asks me:  "What move would Kasparov make if he played black, starting from this position?"  Since I'm not nearly as good a chess player as Kasparov, I can only weakly guess Kasparov's move, and I'll assign a non-extreme probability distribution to Kasparov's possible moves.  In principle I can do this for any legal chess position, though my guesses might approach maximum entropy - still, I would at least assign a lower probability to what I guessed were obviously wasteful or suicidal moves.

If you put me in a box and feed me chess positions and get probability distributions back out, then we would have - theoretically speaking - a system that produces Yudkowsky's guess for Kasparov's move in any chess position.  We shall suppose (though it may be unlikely) that my prediction is well-calibrated, if not overwhelmingly discriminating.

Now suppose we turn "Yudkowsky's prediction of Kasparov's move" into an actual chess opponent, by having a computer randomly make moves at the exact probabilities I assigned.  We'll call this system RYK, which stands for "Randomized Yudkowsky-Kasparov", though it should really be "Random Selection from Yudkowsky's Probability Distribution over Kasparov's Move."
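Mechanically, RYK is nothing more than sampling from whatever distribution the predictor-in-a-box returns.  A minimal Python sketch, with a made-up predict function and invented probabilities standing in for my actual guesses:

```python
import random

def predict(position):
    """Stand-in for 'Yudkowsky's guess of Kasparov's move': map a position
    to a probability distribution over legal moves.  These numbers are
    purely illustrative."""
    return {"P-K4": 0.35, "P-Q4": 0.30, "N-KB3": 0.20,
            "P-QB4": 0.13, "P-KN4": 0.02}

def ryk_move(position, rng=random):
    """RYK: pick a move at random, weighted by the predicted probabilities.
    It draws on no more chess skill than the predictor itself has."""
    dist = predict(position)
    moves, weights = zip(*dist.items())
    return rng.choices(moves, weights=weights, k=1)[0]

print(ryk_move("start of game"))  # usually P-K4; occasionally the dreadful P-KN4
```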

Will RYK be as good a player as Kasparov?  Of course not.  Sometimes the RYK system will randomly make dreadful moves which the real-life Kasparov would never make - start the game with P-KN4.  I assign such moves a low probability, but sometimes the computer makes them anyway, by sheer random chance.  The real Kasparov also sometimes makes moves that I assigned a low probability, but only when the move has a better rationale than I realized - the astonishing, unanticipated queen sacrifice.

Randomized Yudkowsky-Kasparov is definitely no smarter than Yudkowsky, because RYK draws on no more chess skill than I myself possess - I build all the probability distributions myself, using only my own abilities.  Actually, RYK is a far worse player than Yudkowsky.  I myself would always make the best move I could see; RYK only occasionally makes that move, since I won't be very confident that Kasparov would make exactly the same move I would.

Now suppose that I myself play a game of chess against the RYK system.

RYK has the odd property that, on each and every turn, my probabilistic prediction for RYK's move is exactly the same prediction I would make if I were playing against world champion Garry Kasparov.

Nonetheless, I can easily beat RYK, where the real Kasparov would crush me like a bug.

The creative unpredictability of intelligence is not like the noisy unpredictability of a random number generator.  When I play against a smarter player, I can't predict exactly where my opponent will move against me.  But I can predict the end result of my smarter opponent's moves, which is a win for the other player.  When I see the randomized opponent make a move that I assigned a tiny probability, I chuckle and rub my hands, because I think the opponent has randomly made a dreadful move and now I can win.  When a superior opponent surprises me by making a move to which I assigned a tiny probability, I groan because I think the other player saw something I didn't, and now I'm about to be swept off the board.  Even though it's exactly the same probability distribution!  I can be exactly as uncertain about the actions, and yet draw very different conclusions about the eventual outcome.

(This situation is possible because I am not logically omniscient; I do not explicitly represent a joint probability distribution over entire games.)

When I play against a superior player, I can't predict exactly where my opponent will move against me.  If I could predict that, I would necessarily be at least that good at chess myself.  But I can predict the consequence of the unknown move, which is a win for the other player; and the more the player's actual action surprises me, the more confident I become of this final outcome.

The unpredictability of intelligence is a very special and unusual kind of surprise, which is not at all like noise or randomness.  There is a weird balance between the unpredictability of actions and the predictability of outcomes.