
Gloss: An expected utility agent has some way of scoring the consequences of its actions (e.g., rescuing a burning orphanage is worth 20 points), and it weighs actions according to their expected scores. This simple-sounding assumption has a lot of consequences.

Summary: An expected utility agent has some way of consistently scoring all the possible outcomes of its actions, and it weighs each action by estimating the average score of that action's possible consequences. For example, an action with a 50% chance of leading to an outcome with utility 20, a 25% chance of leading to an outcome with utility 35, and a 25% chance of leading to an outcome with utility 45, would have an expected utility of 30. By assumption, this expected utility summarizes everything the agent cares about regarding the action; the agent is indifferent between any two actions that have the same expected utility. These utilities can potentially reflect any sort of morality or values: selfishness, altruism, or paperclips. Thus, being an expected utility agent shouldn't be confused with being a utilitarian or psychological hedonist. Several famous mathematical theorems suggest that if you can't be viewed as some type of expected utility agent, you must be going in circles, making bad bets, or exhibiting other detrimental behaviors. Several famous experiments show that human beings do exhibit those behaviors, and so can't be viewed as expected utility agents.
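As a minimal sketch of that arithmetic (the function and variable names here are illustrative, not from any particular library):

```python
# Expected utility: the probability-weighted average of outcome utilities.
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

# The worked example from the summary above.
gamble = [(0.50, 20), (0.25, 35), (0.25, 45)]
print(expected_utility(gamble))  # 0.5*20 + 0.25*35 + 0.25*45 = 30.0
```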

Technical summary: An expected utility agent has a coherent utility function over outcomes, and a coherent counterfactual probability function relating each of its accessible actions to that action's probable outcomes. Combining the utility function over outcomes with the probability function from actions to outcomes yields each action's expectation of utility. Most such agents treated in the literature are maximizers, but other forms of optimization could also qualify, so long as the decision rule treats equivalently any two actions with equal expected utility. Several famous coherence theorems suggest that any agent not exhibiting stupid behavior must be viewable as an expected utility agent.
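In standard notation (a conventional formulation, not taken verbatim from this page): writing $\mathcal{A}$ for the agent's accessible actions, $P(o \mid a)$ for the counterfactual probability that action $a$ leads to outcome $o$, and $U(o)$ for the utility of outcome $o$:

```latex
\mathbb{E}[U \mid a] \;=\; \sum_{o} P(o \mid a)\, U(o),
\qquad
a^{*} \;\in\; \operatorname*{arg\,max}_{a \in \mathcal{A}} \mathbb{E}[U \mid a].
```

The second formula is the maximizer's decision rule; a non-maximizing rule would still qualify so long as it depends on actions only through $\mathbb{E}[U \mid a]$.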

An agent whose decision rule treats two actions equivalently whenever they have the same expected utility.
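To illustrate that definition, here is a hedged sketch of a maximizing decision rule (redefining the illustrative expected_utility helper so the snippet stands alone) that compares actions only through their expected utilities, and is therefore indifferent between equal-scoring actions:

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def choose(actions):
    """Pick an action of maximal expected utility from {name: outcome list}."""
    return max(actions, key=lambda name: expected_utility(actions[name]))

actions = {
    "safe":   [(1.00, 30)],                          # utility 30 with certainty
    "gamble": [(0.50, 20), (0.25, 35), (0.25, 45)],  # expected utility also 30
}
# Both actions have expected utility 30, so the agent is indifferent between
# them; max() merely breaks the tie by returning the first such key it sees.
print(choose(actions))
```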
