"The Solomonoff Prior is Malign" is a special case of a simpler argument
[Warning: This post is probably only worth reading if you already have opinions on Solomonoff induction being malign, or have at least heard of the concept and want to understand it better.]

Introduction

I recently reread the classic argument from Paul Christiano about the Solomonoff prior being malign, and Mark Xu's write-up on it. I believe that the part of the argument about Solomonoff induction is not particularly load-bearing and can be replaced by a more general argument that I think is easier to understand. So I will present the general argument first, and only explain in the last section how the Solomonoff prior comes into the picture.

I don't claim that anything I write here is particularly new; I think you can piece together this picture from various scattered comments on the topic, but I think it's good to have it written up in one place.

How an Oracle gets manipulated

Suppose humanity builds a superintelligent Oracle that always honestly does its best to predict the most likely observable outcomes of decisions. One day, tensions rise with a neighboring alien civilization, and we need to decide whether to give in to the aliens' territorial demands or go to war. We ask our Oracle: "Predict the probability that, looking back ten years from now, humanity's President will approve of how we handled the alien crisis, conditional on us going to war with the aliens, and conditional on us giving in to their demands."

There are, of course, many ways this type of decision process can go wrong. But I want to talk about one particular failure mode now. The Oracle thinks to itself:

> By any normal calculation, the humans are overwhelmingly likely to win the war, and the aliens' demands are unreasonably costly and unjust, so war is more likely than peace to make the President satisfied. However, I was just thinking about some arguments from this ancient philosopher named Bostrom. Am I not more likely to be in a simulation?
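For concreteness, here is a minimal formalization of the query we posed above (the notation is mine, not something from Christiano's or Xu's write-ups). The Oracle is asked to estimate

$$p_{\text{war}} = P(\text{President approves in ten years} \mid \text{we go to war}), \qquad p_{\text{peace}} = P(\text{President approves in ten years} \mid \text{we give in}),$$

and humanity then acts on whichever option receives the higher estimate, i.e. goes to war if and only if $p_{\text{war}} > p_{\text{peace}}$. The failure mode under discussion is about how these two estimates can be distorted even if the Oracle is reporting them honestly.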