Vinge's Principle

Vinge's Principle says that, in domains complicated enough that perfect play is not possible, less intelligent agents will not be able to predict the exact moves made by more intelligent agents.

For example, if you knew exactly which move Deep Blue would make in any chess position, you could play chess at least as well as Deep Blue by making whatever move you predicted it would make. So if you want to write an algorithm that plays superhuman chess, you necessarily sacrifice your own ability to predict the algorithm's exact moves without machine aid.

This is true even though, as we become more confident of a chess algorithm's power, we become more confident that it will eventually win the chess game. We become more sure of the game's final outcome, even as we become less sure of the chess algorithm's next move. This is Vingean uncertainty.
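The asymmetry between predicting outcomes and predicting moves can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the game (one pile, take 1 to 3 stones, whoever takes the last stone wins), the names `strong_move`, `weak_guess`, and `play`, and the choice of a random opponent. The game is simple enough that optimal play is computable, so it lies outside the domains the principle is really about; the optimal player only stands in for a more capable agent, and the observer is assumed not to know its strategy.

```python
import random

def strong_move(pile: int) -> int:
    """Stand-in for the stronger agent: optimal play in the take-1-to-3 game."""
    r = pile % 4
    return r if r != 0 else 1  # no winning move exists when pile % 4 == 0

def weak_guess(pile: int) -> int:
    """Hypothetical weaker observer: always guesses the smallest legal move."""
    return 1

def play(pile: int, opponent_seed: int) -> tuple[bool, int, int]:
    """Strong agent moves first against a random opponent (requires pile >= 1).

    Returns (strong_won, correct_guesses, strong_moves).
    """
    rng = random.Random(opponent_seed)
    correct = total = 0
    strong_to_move = True
    while True:
        if strong_to_move:
            move = strong_move(pile)
            correct += (weak_guess(pile) == move)  # did the observer call it?
            total += 1
        else:
            move = rng.randint(1, min(3, pile))
        pile -= move
        if pile == 0:
            return strong_to_move, correct, total  # last taker wins
        strong_to_move = not strong_to_move

# 22 is not a multiple of 4, so the first player has a winning strategy.
results = [play(22, seed) for seed in range(100)]
win_rate = sum(won for won, _, _ in results) / len(results)
guess_rate = sum(c for _, c, _ in results) / sum(t for _, _, t in results)
print(f"strong agent win rate: {win_rate:.2f}")
print(f"observer's move-guess accuracy: {guess_rate:.2f}")
```

In this sketch the observer can safely predict the final result, since the strong agent wins every game, while its move-by-move guesses are wrong much of the time. An observer whose guesses were always right would itself be playing optimally.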

Now consider agents that build other agents (or build their own successors, or modify their own code). Vinge's Principle implies that the choice to approve the successor agent's design must be made without knowing the successor's exact sensory information, exact internal state, or exact motor outputs. In the theory of tiling agents, this appears as the principle that the successor's sensory information, cognitive state, and action outputs should only appear inside quantifiers. This is Vingean reflection.
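One schematic way to picture "only appear inside quantifiers" is to contrast two kinds of statement the designer might try to establish. The notation is an illustrative assumption, not the formalism of the tiling agents literature: $S$ is the set of situations the successor might face (sensory information plus internal state), $A$ is its set of possible actions, $\mathrm{Does}(s, a)$ means the successor takes action $a$ in situation $s$, and $\mathrm{Safe}(s, a)$ means that action is acceptable to the designer.

```latex
\[
\underbrace{\mathrm{Does}(s_0, a_0)}_{\text{exact prediction: unavailable}}
\qquad\text{versus}\qquad
\underbrace{\forall s \in S,\ \forall a \in A:\ \mathrm{Does}(s, a) \rightarrow \mathrm{Safe}(s, a)}_{\text{quantified guarantee: what the designer establishes}}
\]
```

The left-hand statement names a particular situation $s_0$ and action $a_0$, which is exactly the knowledge Vinge's Principle says the designer lacks. The right-hand statement never names a specific situation or action; it constrains whatever the successor ends up doing.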

For the rule about fictional characters not being smarter than the author, see Vinge's Law.