Preference framework

Edited by Eliezer Yudkowsky, et al. Last updated 14th Feb 2017.

A 'preference framework' is a fixed algorithm that determines what the agent prefers as terminal outcomes, where those preferences may update or change in other ways over time. The term is more general than 'utility function': it also covers structurally complicated generalizations of utility functions.

As a central example, the utility indifference proposal has the agent switching between utility functions $U_{X}$ and $U_{Y}$ depending on whether a switch is pressed. We can call this meta-system a 'preference framework' to avoid presuming in advance that it embodies a VNM-coherent utility function.
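The switching structure described above can be illustrated with a minimal sketch. This is not the utility indifference proposal itself (it omits whatever correction terms that proposal uses); it only shows a fixed meta-level rule choosing which of two utility functions is in force. The class name, the toy outcomes, the numeric utilities, and the choice that the second function applies when the switch is pressed are all assumptions made for illustration.

```python
from typing import Callable, Iterable

Outcome = str
UtilityFunction = Callable[[Outcome], float]


class SwitchingPreferenceFramework:
    """Hypothetical meta-system: a fixed rule that picks the active utility function."""

    def __init__(self, u_x: UtilityFunction, u_y: UtilityFunction) -> None:
        self.u_x = u_x  # assumed to apply while the switch is not pressed
        self.u_y = u_y  # assumed to apply once the switch is pressed

    def utility(self, outcome: Outcome, switch_pressed: bool) -> float:
        # The selection rule never changes; only the selected function does.
        active = self.u_y if switch_pressed else self.u_x
        return active(outcome)

    def preferred(self, outcomes: Iterable[Outcome], switch_pressed: bool) -> Outcome:
        # Return the outcome ranked highest by the currently active function.
        return max(outcomes, key=lambda o: self.utility(o, switch_pressed))


# Toy utilities, invented for this example only.
U_X_TABLE = {"keep_working": 1.0, "shut_down": 0.0}
U_Y_TABLE = {"keep_working": 0.0, "shut_down": 1.0}

framework = SwitchingPreferenceFramework(
    U_X_TABLE.__getitem__, U_Y_TABLE.__getitem__
)
outcomes = ["keep_working", "shut_down"]

print(framework.preferred(outcomes, switch_pressed=False))
print(framework.preferred(outcomes, switch_pressed=True))
```

The agent's ranking of outcomes here depends on the switch state, so no single function over these outcomes alone reproduces both rankings. Whether the combined system is VNM-coherent once the switch state is treated as part of the outcome is exactly the question that the neutral term 'preference framework' leaves open.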

An even more general term would be 'decision algorithm', which doesn't presume that the agent operates by preferring outcomes.

Parents:

Value alignment problem

Children:

Moral uncertainty

Attainable optimum
