Under an Occam prior the laws already lean simple. SSA leaves that tilt unchanged, whereas SIA multiplies each world’s weight by the total number of observers in the reference class. That means SSA, relative to SIA, favors worlds that stay simple, while SIA boosts those that are populous once the simplicity penalty is paid. Given that, can we update our credence in SSA vs. SIA by looking at how simple our universe’s laws appear and how many observers it seems to contain?
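For concreteness, here is a toy sketch of that weighting contrast. The priors and observer counts below are made-up numbers chosen only to illustrate the mechanics, not anything implied by the post:

```python
# Toy contrast between SSA and SIA weighting over two hypothetical worlds:
# a "simple" world favored by an Occam prior with few observers, and a
# "complex" world with a lower prior but many more observers in the
# reference class. All numbers are assumed for illustration.

occam_prior = {"simple_world": 0.9, "complex_world": 0.1}   # assumed simplicity prior
observers   = {"simple_world": 1,   "complex_world": 100}   # assumed observer counts

def normalize(weights):
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}

# SSA: credence in each world tracks the simplicity prior alone.
ssa = normalize(occam_prior)

# SIA: each world's prior is multiplied by its observer count, then renormalized.
sia = normalize({w: occam_prior[w] * observers[w] for w in occam_prior})

print(ssa)  # {'simple_world': 0.9, 'complex_world': 0.1}
print(sia)  # {'simple_world': ~0.08, 'complex_world': ~0.92}
```

On these assumed numbers, SSA keeps the 9:1 Occam tilt toward the simple world, while SIA flips it to roughly 11:1 in favor of the populous one.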
Is this trivializing the concept of a Utility Function?
This post is from a long time ago. I think it is important to reconsider everything written here in light of developments in machine learning since then.
How are humans exploitable, given that they don't have utility functions?
Since humans are not EU maximizers and are therefore exploitable, can someone give a concrete example of how they are exploitable?
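One standard illustration is a money pump against intransitive preferences. The sketch below assumes a hypothetical agent with cyclic preferences A > B > C > A; this is the textbook case, not a claim about what the post itself had in mind:

```python
# Minimal money-pump sketch, assuming a hypothetical agent with cyclic
# preferences A > B > C > A. Because the preference relation is intransitive,
# the agent will pay a small fee at every step to "upgrade" its holding and
# can be cycled around indefinitely, losing money each lap.

prefers = {("A", "B"): True, ("B", "C"): True, ("C", "A"): True}  # cyclic preferences

def run_money_pump(start="C", laps=3, fee=1):
    holding, spent = start, 0
    # Each step, offer the item the agent prefers over its current holding.
    trades = ["B", "A", "C"] * laps
    for offer in trades:
        if prefers.get((offer, holding), False):  # agent prefers the offer
            holding = offer
            spent += fee                          # and pays a fee to trade up
    return holding, spent

holding, spent = run_money_pump()
print(holding, spent)  # ends back at C, having paid 9 in fees
```

Any agent whose choices cycle like this can be walked around the loop for a fee at each step, which is the usual sense in which non-EU-maximizers are called exploitable; whether real humans can actually be pumped this way is exactly what the question is asking for.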
Is exploitability necessarily unstable? Could there be a tolerable level of exploitability, especially if it allows for tradeoffs with desirable characteristics that are only available to non-EU maximizers?
Why is this not true for most humans? Many religious people would not want to modify the lightcone as they think that it's God's territory to modify.
The initial distribution of values need not be highly related to the resultant values after moral philosophy and philosophical self-reflection. Optimizing hedonistic utilitarianism, for example, looks very little like any values from the outer optimization loop of natural selection.
Although there would be pressure for an AI to not be exploitable, wouldn't there also be pressure for adaptability and dynamism? The ability to alter preferences and goals given new environments?
If an agent has a preference for a move in a specific chess position, but then gets more compute and more optimization, gets better at chess, and makes a different move after improving, would you say its preference changed, or that it reduced epistemic uncertainty and got better at achieving its preference, which stayed the same?