Superintelligence isn’t Approximated by a Rational Agent
(Thank you to Jessica Taylor and others who provided feedback on this essay.)

The practice of coherently ordering the good and the bad began with Plato and Socrates. In the era of the ancient Greeks, this ordering was accomplished through the method of classical dialectics: essentially, debate through dialogue about the definitions and nature of things. Today, among neo-rationalists, a similar operation, the creation of coherent preferences over what is better or worse for a given goal, is accomplished through various kinds of calculation. Regardless of the function used to order the good and the bad, this ordering and its coherence have remained the basis for rationality. Even Plato would say that the ordering must be free from contradiction, such that an item cannot take both 1st and 5th place on the list.

Within the context of AI safety debates, a number of arguments have used this idea of coherent ordering of the good in the following way:

1. Agents which do not maximize utility according to coherent preferences are vulnerable to dominated strategies.
2. Superintelligent agents are unlikely to pursue dominated strategies.
3. Therefore, superintelligent agents will have coherent preferences, or some approximation thereof.

I intend to show that there are properties most agree superintelligent AI will have to possess which are incompatible with coherent preferences. Specifically, if superintelligent AI is open-ended and capable of producing novel artifacts, as Google DeepMind researchers put it, it will necessarily have incoherent preferences, because it makes functions, rather than specific representations of reality (i.e. world states), its goal.

I've made an argument very close to this one in previous posts on my blog; however, I believe I used concepts and language that were unfamiliar to those in the AI Safety/LessWrong world. So, after several long conversations with rationalists on this topic, I'm writing this post in a
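To make premise 1 concrete, here is a minimal sketch (my own illustration, not from the argument's original sources) of the standard "money pump": an agent whose preferences are cyclic, and therefore incoherent, can be led through a sequence of individually acceptable trades that leaves it strictly worse off. The preference cycle and the fee are hypothetical choices for the example.

```python
# Hypothetical illustration: an agent with cyclic (incoherent) preferences
# over items A, B, C can be "money pumped" into a dominated strategy.

# Cyclic preferences: the agent prefers B to A, C to B, and A to C.
prefers = {("B", "A"), ("C", "B"), ("A", "C")}

def accepts_trade(current, offered):
    """The agent accepts any trade up to a strictly preferred item."""
    return (offered, current) in prefers

def money_pump(start_item, start_money, fee=1, rounds=3):
    """A trader repeatedly offers the agent's preferred item for a small fee."""
    item, money = start_item, start_money
    next_offer = {"A": "B", "B": "C", "C": "A"}  # always offer the "upgrade"
    for _ in range(rounds):
        offer = next_offer[item]
        if accepts_trade(item, offer):
            item, money = offer, money - fee
    return item, money

item, money = money_pump("A", 10)
# After three trades the agent holds "A" again but has paid 3 in fees,
# so the whole sequence is dominated by simply refusing to trade.
```

A coherent (acyclic, transitive) preference ordering would block at least one trade in the cycle, which is exactly the force of the argument's first premise.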