# 16

The VNM theorem is constructive; given a rational preference relation, it is easy to construct a corresponding utility function. But in many cases, it may be computationally intractable to choose between two lotteries by computing each of their expected utilities and picking the bigger one, even when it is easy to decide between them in other ways.

Consider a human deciding whether or not to buy a sandwich. There are several factors that go into this decision, like the price of substitutes, the human's future income, and how tasty the sandwich will be. But these factors are still fairly limited; the human does not consider, for instance, the number of third cousins they have, the outcome of a local election in Siberia, or whether there is intelligent life elsewhere in the galaxy. The predictable effects of buying a sandwich are local in nature, and the human's preferences also have an at least somewhat local character, so they need not think too much about too many details of the world in order to make their decision.

Compare this with how difficult it would be for the human to compute the expected utility of the world if they buy the sandwich, and the expected utility of the world if they don't, each to enough precision that they can tell which is bigger. The way to construct a utility function corresponding to a given rational preference relation is to pick two lotteries as reference points, declare that the preferred lottery has utility 1 and the dispreferred lottery has utility 0, and compute the utility of any other lottery by looking at what the preference relation says about weighted averages of that lottery with the two reference points. But the reference points could differ in many ways from the lotteries resulting from buying a sandwich and from not buying one; perhaps they differ from the real world in the number of third cousins the human has, the outcome of a local election in Siberia, and the presence of intelligent life elsewhere in the galaxy. In order to compute the expected utility of buying a sandwich and of not buying a sandwich, both to high enough precision to tell which is bigger, the human must consider how all of these factors differ from the reference points, and decide how much they care about each of them. Doing this would require heroic feats of computation, introspection, and philosophical progress, none of which are really needed for the simple decision of whether or not to buy a sandwich. They might try picking realistic reference points, but if they learn more about the world after picking them, then there might be systematic differences between the reference points and the available options anyway.
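The construction described above can be sketched in code. This is a minimal illustration, not a definitive implementation: it only handles lotteries that fall between the two reference points, and the names `mix`, `utility_between`, `prefers`, and `toy_prefers` are hypothetical. The toy oracle cheats by using a hidden utility table; for a real agent, each call to `prefers` is a judgment about lotteries involving the reference points, which is exactly the expensive step.

```python
from typing import Callable, Dict

Lottery = Dict[str, float]  # maps outcome name -> probability


def mix(p: float, x: Lottery, y: Lottery) -> Lottery:
    """Return the compound lottery p*x + (1-p)*y."""
    outcomes = set(x) | set(y)
    return {o: p * x.get(o, 0.0) + (1 - p) * y.get(o, 0.0) for o in outcomes}


def utility_between(
    lottery: Lottery,
    best: Lottery,
    worst: Lottery,
    prefers: Callable[[Lottery, Lottery], bool],
    tol: float = 1e-6,
) -> float:
    """Utility of `lottery` on the scale u(best) = 1, u(worst) = 0.

    Assumes worst <= lottery <= best in the preference order.
    Binary-searches for the p at which the agent is indifferent
    between `lottery` and the mixture p*best + (1-p)*worst.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        if prefers(lottery, mix(p, best, worst)):
            lo = p  # the mixture is too weak; raise the weight on `best`
        else:
            hi = p  # the mixture is at least as good; lower the weight
    return (lo + hi) / 2


# Toy oracle for demonstration only (an assumption, not part of the argument):
# it answers preference queries using a hidden table of outcome values.
_hidden = {"great": 10.0, "ok": 4.0, "bad": 0.0}


def toy_prefers(x: Lottery, y: Lottery) -> bool:
    """True if x is strictly preferred to y under the hidden values."""
    value = lambda l: sum(p * _hidden[o] for o, p in l.items())
    return value(x) > value(y)


# Prints a value within `tol` of 0.4.
print(utility_between({"ok": 1.0}, {"great": 1.0}, {"bad": 1.0}, toy_prefers))
```

Each loop iteration asks the agent to compare the lottery against a mixture of the reference points, so every factor on which the reference points differ from the lottery enters every query.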

Some decisions have much more far-reaching consequences than the decision to buy a sandwich (such as deciding whether or not to quit your job), and this is particularly true for very powerful agents (such as a political leader deciding whether or not to invade another country). And decisions with farther-reaching consequences are more difficult to make. But even the actions of a superintelligent AI only affect its future light-cone (neglecting strategies like acausal trade, which may extend its reach somewhat). And even for very powerful agents whose actions are quite far-reaching, it seems likely that most of their decisions can be broken into components that each have more localized consequences.

For the same reasons that a rational agent wouldn't want to use its utility function to make decisions, a utility function might also not be useful to someone else trying to model the agent's preferences. Someone trying to predict the behavior of a rational agent doesn't want to do a bunch of unnecessary computation to determine the utility of each possible action the agent could take, any more than the agent does.

A possible counterargument is that one might want to know the magnitude of a preference, rather than just its sign, for the purposes of estimating how likely the preference is to be reversed after further thought, or under random perturbations of the available options. But even then, the difference between the expected utilities of two lotteries combines information about the strength of the preference between those two lotteries with information about the strength of the preference between the two reference points that were used to define utility. If an agent changes its estimate of the difference in expected utility between two lotteries after thinking for longer, that doesn't tell you whether it was thinking more about these two lotteries or about the two reference points. So it might be better to model the robustness of a value judgment directly, instead of as a difference of expected utilities. This could be done by estimating how large certain changes in value judgments in expected directions, or how large certain perturbations of the lotteries in expected directions, would have to be in order to reverse the preference between the two lotteries.
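A short worked equation makes this entanglement explicit. The symbols are introduced here for illustration only: suppose $U$ is any utility function representing the preferences, $A$ and $B$ are the preferred and dispreferred reference points, and $u$ is the normalized utility with $u(A) = 1$ and $u(B) = 0$. Then for lotteries $L_1$ and $L_2$:

$$
u(L) = \frac{U(L) - U(B)}{U(A) - U(B)},
\qquad
u(L_1) - u(L_2) = \frac{U(L_1) - U(L_2)}{U(A) - U(B)}.
$$

The numerator depends only on the two lotteries being compared, while the denominator depends only on the reference points. A revised estimate of $u(L_1) - u(L_2)$ could therefore come from a change in either one, and the difference alone does not say which.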


Yeah, it makes sense to study perfect theories separately from approximations. That applies to probability theory too.

Imagine there's a prophet who predicts tomorrow's weather and is always right. Probability theory says you should have a prior over the prophet's possible mechanisms before you decide to trust him. But your mind is too small for that, and after seeing the prophet succeed for ten years in a row you'll start trusting him anyway. That comes from something different from probabilistic reasoning, let's call it "approximate reasoning".

Why is that relevant? Well, I think logical induction is in the same boat. Solving logical uncertainty doesn't mean finding an analogue of perfect probability theory that would work in logic: we already have that, it's called logic! Instead, solving logical uncertainty is finding an analogue of approximate reasoning that would work in logic. It feels like a messy problem for the same reason that approximate reasoning with ordinary probabilities is messy. So if we ever find the right theory of reasoning when your mind is too small to have a prior over everything, it might well apply to both ordinary and logical uncertainty, and sit on top of probability instead of looking like probability.

I think this is related to a general class of mistakes, so I just wrote up a post on it.

This case is a bit different from what that post discusses, in that you're not focused on a non-critical assumption, but on a non-critical method. We can use VNM rationality for decision-making just fine without computing full utilities for every decision; we just need to compute enough to be confident that we're making the higher-utility choice. For that purpose we can use tricks such as changing the unit of valuation on the fly, making approximations (as long as we keep track of the error bars), etc.
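One way to picture "computing enough to be confident" is with interval bounds. The sketch below is only an illustration under strong assumptions: utility is treated as additive across factors, each factor's contribution is known only as a (low, high) interval in some arbitrary unit, and factors common to both options are omitted because they cancel. The names `add_intervals` and `choose` and all the numbers are hypothetical.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound) on a utility contribution


def add_intervals(*parts: Interval) -> Interval:
    """Sum interval bounds, assuming contributions are additive."""
    return (sum(lo for lo, _ in parts), sum(hi for _, hi in parts))


def choose(first: Interval, second: Interval) -> Optional[str]:
    """Return the option that is certainly better, or None if the bounds overlap."""
    if first[0] > second[1]:
        return "first"
    if second[0] > first[1]:
        return "second"
    return None  # error bars too wide: refine the estimates before deciding


# Only the factors that differ between the options are estimated.
tastiness: Interval = (3.0, 5.0)
price: Interval = (-2.5, -2.0)

buy = add_intervals(tastiness, price)  # (0.5, 3.0)
skip: Interval = (0.0, 0.0)

print(choose(buy, skip))  # prints "first"
```

When `choose` returns `None`, the bounds need tightening before the comparison is settled; no absolute utility for either option is ever computed.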

This seems like a strawman. There's a naive EU calculation that you can do just based on price, tastiness of the sandwich, etc. that gives you what you want. And this naive EU calculation can be understood as an approximation of a global EU calculation. Of course, we should always use computationally tractable approximations whenever we don't have enough computing power to compute an exact value. This doesn't seem to have anything to do with utility functions in particular.

Regarding the normalization of utility differences by picking two arbitrary reference points, obviously if you want to systematize things then you should be careful to choose good units. QALYs are a good example of this. It seems unlikely to me that a re-evaluation of how many QALYs buying a sandwich is worth would arise from a re-evaluation of how valuable QALYs are, rather than a re-evaluation of how much buying the sandwich is worth.

> It seems unlikely to me that a re-evaluation of how many QALYs buying a sandwich is worth would arise from a re-evaluation of how valuable QALYs are, rather than a re-evaluation of how much buying the sandwich is worth.

I disagree with this. The value of a QALY could depend on other features of the universe (such as your lifespan) in ways that are difficult to explicitly characterize, and thus are subject to revision upon further thought. That is, you might not be able to say exactly how valuable the difference between living 50 years and living 51 years is, denominated in units of the difference between living 1000 years and living 1001 years. Your estimate of this ratio might be subject to revision once you think about it for longer. So the value of a QALY isn't stable under re-evaluation, even when expressed in units of QALYs under different circumstances. In general, I'm skeptical that the concept of good reference points whose values are stable in the way you want is a coherent one.

Maybe there is no absolutely stable unit, but it seems that there are units that are more or less stable than others. I would expect a reference unit to be more stable than the unit "the difference in utility between two options in a choice that I just encountered".