Applying utility functions to humans considered harmful

by Kaj_Sotala, 3rd Feb 2010


There's a lot of discussion on this site that seems to be assuming (implicitly or explicitly) that it's meaningful to talk about the utility functions of individual humans. I would like to question this assumption.

To clarify: I don't question that you could, in principle, model a human's preferences by building some insanely complex utility function. But there are infinitely many ways to model a human's preferences. The question is which model is the most useful, and which models carry the fewest underlying assumptions that will lead your intuitions astray.

Utility functions are a good model to use if we're talking about designing an AI. We want an AI to be predictable, to have stable preferences, and to do what we want. They are also a good tool for building agents that are immune to Dutch book tricks. Utility functions are a bad model for beings that do not meet these criteria.

To quote Van Gelder (1995):

Much of the work within the classical framework is mathematically elegant and provides a useful description of optimal reasoning strategies. As an account of the actual decisions people reach, however, classical utility theory is seriously flawed; human subjects typically deviate from its recommendations in a variety of ways. As a result, many theories incorporating variations on the classical core have been developed, typically relaxing certain of its standard assumptions, with varying degrees of success in matching actual human choice behavior.

Nevertheless, virtually all such theories remain subject to some further drawbacks:

(1) They do not incorporate any account of the underlying motivations that give rise to the utility that an object or outcome holds at a given time.
(2) They conceive of the utilities themselves as static values, and can offer no good account of how and why they might change over time, and why preferences are often inconsistent and inconstant.
(3) They offer no serious account of the deliberation process, with its attendant vacillations, inconsistencies, and distress; and they have nothing to say about the relationships that have been uncovered between time spent deliberating and the choices eventually made.

Curiously, these drawbacks appear to have a common theme; they all concern, one way or another, temporal aspects of decision making. It is worth asking whether they arise because of some deep structural feature inherent in the whole framework which conceptualizes decision-making behavior in terms of calculating expected utilities.

One model that attempts to capture actual human decision making better is called decision field theory. (I'm no expert on this theory, having encountered it two days ago, so I can't vouch for how good it actually is. Still, even if it's flawed, it's useful for getting us to think about human preferences in what seems to be a more realistic way.) Here's a brief summary of how it's constructed from traditional utility theory, based on Busemeyer & Townsend (1993). See the article for the mathematical details, fuller justifications, and the failures of classical rationality that each stage explains.

Stage 1: Deterministic Subjective Expected Utility (SEU) theory. Basically classical utility theory. Suppose you can choose between two different alternatives, A and B. If you choose A, there is a payoff of 200 utilons with probability S1, and a payoff of -200 utilons with probability S2. If you choose B, the payoffs are -500 utilons with probability S1 and +500 utilons with probability S2. You'll choose A if the expected utility of A, S1 * 200 + S2 * (-200), is higher than the expected utility of B, S1 * (-500) + S2 * 500, and B otherwise.
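As a minimal sketch of the stage 1 rule, here is the example above in Python. The function names are my own illustration, not anything from Busemeyer & Townsend; the payoffs are the ones given in the text.

```python
def expected_utility(payoffs, probabilities):
    """Probability-weighted sum of payoffs."""
    return sum(p * u for p, u in zip(probabilities, payoffs))


def choose(s1, s2):
    """Pick A if its expected utility is strictly higher, and B otherwise."""
    eu_a = expected_utility([200, -200], [s1, s2])
    eu_b = expected_utility([-500, 500], [s1, s2])
    return "A" if eu_a > eu_b else "B"


print(choose(0.7, 0.3))  # A: EU(A) = 80, EU(B) = -200
print(choose(0.3, 0.7))  # B: EU(A) = -80, EU(B) = 200
```

Given the same probabilities, this agent makes the same choice every time, which is exactly the property the later stages relax.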

Stage 2: Random SEU theory. In stage 1, we assumed that the probabilities S1 and S2 stay constant across many trials. Now, we assume that sometimes the decision maker might focus on S1, producing a preference for action A. On other trials, the decision maker might focus on S2, producing a preference for action B. According to random SEU theory, the attention weight for variable Si is a continuous random variable, which can change from trial to trial because of attentional fluctuations. Thus, the SEU for each action is also a random variable, called the valence of an action. Deterministic SEU is a special case of random SEU, one where the trial-by-trial fluctuation of valence is zero.
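A rough simulation of the stage 2 idea follows. The noise model is purely my assumption for illustration: the attention weight on S1 is drawn from a Gaussian clipped to [0, 1], and the weight on S2 is whatever remains. The source article does not commit to this particular distribution.

```python
import random

# Payoffs from the running example: (payoff under S1, payoff under S2).
PAYOFFS = {"A": (200, -200), "B": (-500, 500)}


def sample_valences(mean_w1, sigma, rng):
    """Draw one trial's attention weights and return each action's valence."""
    # Assumed noise model: clipped Gaussian around the mean weight on S1.
    w1 = min(1.0, max(0.0, rng.gauss(mean_w1, sigma)))
    w2 = 1.0 - w1
    return {a: w1 * u1 + w2 * u2 for a, (u1, u2) in PAYOFFS.items()}


def choice_frequency(mean_w1, sigma, trials=10_000, seed=0):
    """Fraction of trials on which A has the higher valence."""
    rng = random.Random(seed)
    picks_a = 0
    for _ in range(trials):
        valences = sample_valences(mean_w1, sigma, rng)
        picks_a += valences["A"] > valences["B"]
    return picks_a / trials


print(choice_frequency(0.7, 0.0))  # no fluctuation: A on every trial
print(choice_frequency(0.7, 0.3))  # with fluctuation: A on most trials, not all
```

With sigma set to zero this collapses back to the deterministic case; with any fluctuation, the same person facing the same gamble sometimes picks B.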

Stage 3: Sequential SEU theory. In stage 2, we assumed that one's decision was based on just one sample of a valence difference on any trial. Now, we allow a sequence of one or more samples to be accumulated during the deliberation period of a trial. The attention of the decision maker shifts between different anticipated payoffs, accumulating weight for the different actions. Once the weight of one of the actions reaches some critical threshold, that action is chosen. Random SEU theory is a special case of sequential SEU theory, where the number of samples is one.

Consider a scenario where you're trying to make a very difficult, but very important decision. In that case, your inhibitory threshold for any of the actions is very high, so you spend a lot of time considering the different consequences of the decision before finally arriving at the (hopefully) correct decision. For less important decisions, your inhibitory threshold is much lower, so you pick one of the choices without giving it too much thought.
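The threshold idea can be sketched as a simple accumulator. Everything numeric here is an assumption of mine: each sampled valence difference is Gaussian with a small positive mean (so A is the better action on average) and a large spread, and the preference state is just a running sum.

```python
import random


def deliberate(mean_diff, noise, threshold, rng, max_steps=100_000):
    """Sum noisy valence differences until one action reaches the threshold."""
    preference = 0.0  # positive favors A, negative favors B
    for step in range(1, max_steps + 1):
        preference += rng.gauss(mean_diff, noise)
        if preference >= threshold:
            return "A", step
        if preference <= -threshold:
            return "B", step
    return None, max_steps  # no decision within the step budget


def summarize(threshold, runs=2000, seed=0):
    """Return (share of runs choosing A, mean number of samples taken)."""
    rng = random.Random(seed)
    results = [deliberate(1.0, 5.0, threshold, rng) for _ in range(runs)]
    share_a = sum(choice == "A" for choice, _ in results) / runs
    mean_steps = sum(steps for _, steps in results) / runs
    return share_a, mean_steps


print(summarize(5.0))   # low threshold: quick, but B gets picked fairly often
print(summarize(50.0))  # high threshold: slow, but A is picked far more reliably
```

Setting the threshold to zero forces a decision after a single sample, which is the random SEU case from stage 2.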

Stage 4: Random Walk SEU theory. In stage 3, we assumed that we begin to consider each decision from a neutral point, without any of the actions being the preferred one. Now, we allow prior knowledge or experiences to bias the initial state. The decision maker may recall previous preference states that are influenced in the direction of the mean difference. Sequential SEU theory is a special case of random walk theory, where the initial bias is zero.

Under this model, decisions favoring the status quo tend to be chosen more frequently under a short time limit (low threshold), but a superior decision is more likely to be chosen as the threshold grows. Also, if previous outcomes have already biased decision A very strongly over B, then the mean time to choose A will be short while the mean time to choose B will be long.
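Both claims in the paragraph above can be checked with a toy random walk that starts from a biased preference state. The parameter values and the Gaussian step distribution are my own assumptions, chosen only to make the effect visible; the walk requires noise greater than zero and a starting point strictly inside the thresholds.

```python
import random


def random_walk_choice(start, threshold, mean_diff, noise, rng):
    """Walk from an initial preference `start` until a threshold is crossed."""
    preference, steps = start, 0
    while -threshold < preference < threshold:
        preference += rng.gauss(mean_diff, noise)
        steps += 1
    return ("A" if preference >= threshold else "B"), steps


def simulate(start, threshold, mean_diff=1.0, noise=5.0, runs=2000, seed=0):
    """Return {action: (share of runs, mean steps taken when it was chosen)}."""
    rng = random.Random(seed)
    steps_by_choice = {"A": [], "B": []}
    for _ in range(runs):
        choice, steps = random_walk_choice(start, threshold, mean_diff, noise, rng)
        steps_by_choice[choice].append(steps)
    return {
        action: (len(s) / runs, sum(s) / len(s) if s else float("nan"))
        for action, s in steps_by_choice.items()
    }


# A is better on average (mean_diff > 0), but the start is biased toward B,
# standing in for the status quo.
print(simulate(start=-4.0, threshold=5.0))   # hurried: B wins often
print(simulate(start=-4.0, threshold=50.0))  # patient: A wins most of the time

# Strong prior bias toward A and no mean difference: A is reached quickly,
# while the rare B choices take much longer.
print(simulate(start=40.0, threshold=50.0, mean_diff=0.0))
```

With `start=0.0` this is the stage 3 accumulator again.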

Stage 5: Linear System SEU theory. In stage 4, we assumed that previous experiences all contribute equally. Now, we allow the impact of a valence difference to vary depending on whether it occurred early or late (a primacy or recency effect). Each previous experience is assigned a weight determined by a growth-decay rate parameter. Random walk SEU theory is a special case of linear system SEU theory, where the growth-decay rate is set to zero.
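One way to picture the growth-decay rate, assuming the preference state is updated as new = (1 - s) * old + valence difference (my reading of the linear-system idea; check the article for the exact form). The helper names are hypothetical.

```python
def accumulate(valence_diffs, s):
    """Fold valence differences into a preference state with growth-decay rate s."""
    preference = 0.0
    for v in valence_diffs:
        preference = (1.0 - s) * preference + v
    return preference


def sample_weights(n, s):
    """Weight that each of n samples (oldest first) carries in the final state."""
    return [(1.0 - s) ** (n - 1 - j) for j in range(n)]


print(accumulate([1, 2, 3], 0.0))  # 6.0: s = 0 is a plain running sum
print(sample_weights(3, 0.0))      # [1.0, 1.0, 1.0]: all samples count equally
print(sample_weights(3, 0.5))      # [0.25, 0.5, 1.0]: recency, old samples fade
print(sample_weights(3, -1.0))     # [4.0, 2.0, 1.0]: primacy, old samples grow
```

Under this assumed update, a positive s discounts early samples and a negative s amplifies them, while s = 0 recovers the equal weighting of stage 4.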

Stage 6: Approach-Avoidance Theory. In stage 5, we assumed that, for example, the average amount of attention given to the payoff (+500) only depended on event S2. Now, we allow the average weight to be affected by another variable, called the goal gradient. The basic idea is that the attractiveness of a reward or the aversiveness of a punishment is a decreasing function of distance from the point of commitment to an action. If there is little or no possibility of taking an action, its consequences are ignored; as the possibility of taking an action increases, the attention to its consequences increases as well. Linear system theory is a special case of approach-avoidance theory, where the goal gradient parameter is zero.

There are two different goal gradients, one for gains and rewards and one for losses or punishments. Empirical research suggests that the gradient for rewards tends to be flatter than that for punishments. One of the original features of approach-avoidance theory was the distinction between rewards versus punishments, closely corresponding to the distinction of positively versus negatively framed outcomes made by more recent decision theorists.

Stage 7: Decision Field Theory. In stage 6, we assumed that the time taken to process each sample is the same. Now, we allow this to change by introducing into the theory a time unit h, representing the amount of time it takes to retrieve and process one pair of anticipated consequences before shifting attention to another pair of consequences. If h is allowed to approach zero in the limit, the preference state evolves in an approximately continuous manner over time. Approach-avoidance is a spe... you get the picture.
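To see what shrinking h does, here is a sketch that tracks only the expected preference state and ignores the noise. It assumes a discretized update of the form P(t + h) = (1 - s*h) * P(t) + delta*h, where delta is a mean valence difference per unit time and s is the growth-decay rate; both the update form and the parameter values are my assumptions for illustration.

```python
import math


def mean_preference(t, h, delta, s):
    """Expected preference at time t when updating once every h time units."""
    steps = round(t / h)
    p = 0.0
    for _ in range(steps):
        p = (1.0 - s * h) * p + delta * h
    return p


def continuous_limit(t, delta, s):
    """Value the discrete update approaches as h goes to zero."""
    return (delta / s) * (1.0 - math.exp(-s * t))


target = continuous_limit(2.0, delta=1.0, s=0.8)
for h in (0.5, 0.1, 0.01):
    gap = abs(mean_preference(2.0, h, delta=1.0, s=0.8) - target)
    print(h, gap)  # the gap shrinks as h gets smaller
```

The smaller the time unit, the closer the stepwise trajectory gets to a smooth curve, which is the sense in which the preference state becomes approximately continuous.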



Now, you could argue that all of the steps above are just artifacts of being a bounded agent without enough computational resources to calculate all the utilities precisely. And you'd be right. And maybe it's meaningful to talk about the "utility function of humanity" as the outcome that occurs when a CEV-like entity calculates what we'd decide if we could collapse Decision Field Theory back into Deterministic SEU Theory. Or maybe you just say that all of this is low-level mechanical stuff that gets included in the "probability of outcome" computation of classical decision theory. But which approach do you think gives us more useful conceptual tools in talking about modern-day humans?

You'll also note that even DFT (or at least the version of it summarized in a 1993 article) assumes that the payoffs themselves do not change over time. Attentional considerations might lead us to attach a low value to some outcome, but if we were to actually end up in that outcome, we'd always value it the same amount. This we know to be untrue. There's probably some even better way of looking at human decision making, one which I suspect might be very different from classical decision theory.

So be extra careful when you try to apply the concept of a utility function to human beings.