LESSWRONG
LW

Wikitags

User maximization

Edited by Eliezer Yudkowsky last updated 1st Apr 2016

A sub-principle of avoiding user manipulation - if you've formulated the AI in terms of an argmax over X or an "optimize X" instruction, and X includes a user interaction as a subpart, then you've just told the AI to optimize the user. For example, let's say that the AI's criterion of action is "Choose a policy p by maximizing/optimizing the probability X that the user's instructions are carried out." If the user's instructions are a variable inside the formula for X rather than a constant outside it, you've just told the AI to try and get the user to give it easier instructions.

Parents:
User manipulation
2
2
Discussion0
Discussion0