[ Question ]

Expected utility and repeated choices

by Marco Discendenti, 1 min read, 27th Dec 2019, 5 comments


Maybe this is a well-known kind of problem, but I am a novice and it looks puzzling to me.

Here is a lottery in which I have two choices:

  • (a) get 0.5$ for sure
  • (b) win 1$ with probability 2/3 or nothing with probability 1/3

My utility function is .

What should I choose?

Let's compute the expected utilities:

  • for one single game, the expected utility of (a) is higher than that of (b), so I maximize expected utility with choice (a)
  • if I compute the expected utility for two games I get a different prescription:
    • the utility for choosing (a) two times is the utility of getting 1$ for sure
    • the expected utility for choosing (b) two times is a probability-weighted sum of the utilities of winning 0$, 1$ or 2$ in total

This last quantity is greater than the utility of double (a) (i.e. 1), so in order to maximize expected utility I should actually prefer to play (b) two times rather than playing (a) two times.
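The comparison can be checked numerically. This is only a sketch under stated assumptions: the win probability p = 2/3 and the utility U(x) = sqrt(x) are illustrative choices (they give a utility of 1 for double (a) and reproduce the reversal described above), not necessarily the exact values intended in the question, and the function names are hypothetical.

```python
from math import comb, sqrt

P_WIN = 2 / 3  # ASSUMPTION: probability of winning 1$ with choice (b)

def utility(x):
    """ASSUMPTION: concave utility U(x) = sqrt(x), for illustration only."""
    return sqrt(x)

def utility_a(n, u=utility):
    """Utility of choosing (a) n times: 0.5$ per game, for sure."""
    return u(0.5 * n)

def expected_utility_b(n, p=P_WIN, u=utility):
    """Expected utility of choosing (b) n times.

    The number of wins k is binomial(n, p), and the utility is applied
    to the total amount won (k dollars), not to each game separately.
    """
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * u(k) for k in range(n + 1))

for n in (1, 2):
    print(n, round(utility_a(n), 3), round(expected_utility_b(n), 3))
```

With these assumed values, (a) comes out ahead for one game (about 0.707 against 0.667) and (b) comes out ahead for two games (1.0 against about 1.073).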

So we have this apparent inconsistency:

  • for one single game it's better to choose (a)
  • for two games it's better to choose (b) both times

This result is puzzling to me because I would expect that utility maximization for one single game should be enough to make the decision, regardless of what I am allowed to do in future choices. It seems instead that the mere possibility of playing this same lottery another time changes which choice is best in the first game. If this is the case then utility theory seems almost useless: I would be forced to put the whole list of my possible future choices into my computation!

Am I missing something or is this an actual problem?


2 Answers

The intuitive result you would expect only holds for utility functions that are linear in x (I believe), since we could then apply the utility function at each step and it would yield the same value as if applied to the whole amount.

Another case would be if you were to receive your utility immediately after playing each game (like in a reinforcement learning algorithm). In that case U is also applied to each outcome separately and would yield the result you would expect.
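The two cases above can be sketched in code. As an illustration only, this assumes a win probability of 2/3 for (b) and uses sqrt as a stand-in for a risk-averse U; the helper names are hypothetical.

```python
from math import comb, sqrt

P_WIN = 2 / 3  # ASSUMPTION: probability of winning 1$ with choice (b)

def linear(x):
    """Linear utility: U(x) = x."""
    return x

def eu_b_on_total(n, u, p=P_WIN):
    """Expected utility of n plays of (b), with u applied to the total winnings."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * u(k) for k in range(n + 1))

def eu_b_per_game(n, u, p=P_WIN):
    """Expected utility of n plays of (b), with u applied after each game."""
    return n * (p * u(1) + (1 - p) * u(0))

def best_on_total(n, u):
    """Preferred choice when utility is taken on the total amount."""
    return "b" if eu_b_on_total(n, u) > u(0.5 * n) else "a"

def best_per_game(n, u):
    """Preferred choice when utility is received game by game."""
    return "b" if eu_b_per_game(n, u) > n * u(0.5) else "a"

games = range(1, 6)
print("linear U, total:   ", [best_on_total(n, linear) for n in games])
print("sqrt U, per game:  ", [best_per_game(n, sqrt) for n in games])
print("sqrt U, total:     ", [best_on_total(n, sqrt) for n in games])
```

In the first two settings the preferred choice does not depend on the number of games; only in the third (non-linear utility on the total) does it switch between one game and two.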

Also: (b) has a better EV in terms of raw $, and by the law of large numbers we would expect the average amount won per game by repeatedly playing (b) to approach that EV. So we should expect any monotonically increasing utility function to favor (b) over (a) as the number of games approaches infinity. The only reason your U favors (a) over (b) for a single game is that it is risk-averse, i.e. sub-linear in x. As the number of games approaches infinity, the risk of choosing (b) becomes smaller and smaller, until it is the choice between (essentially) winning 0.5$ for sure or 0.67$ for sure in every game. If you think about it in these terms, it becomes more intuitive why the behaviour you observed is reasonable.
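A rough numerical illustration of this argument, again assuming a win probability of 2/3 and sqrt as an example of a risk-averse U (both are assumptions, and the function names are made up for this sketch): the spread of the average winnings per game shrinks as the number of games grows, and the ratio of the two utilities moves toward what you would get by comparing 0.67$ for sure with 0.5$ for sure.

```python
from math import comb, sqrt

P_WIN = 2 / 3  # ASSUMPTION: probability of winning 1$ with choice (b)

def std_of_average(n, p=P_WIN):
    """Standard deviation of the average winnings per game over n plays of (b)."""
    return sqrt(p * (1 - p) / n)

def utility_ratio(n, p=P_WIN, u=sqrt):
    """Expected utility of n plays of (b) divided by the utility of n plays of (a).

    Utility is applied to the total amount won; a ratio above 1 favors (b).
    """
    eu_b = sum(comb(n, k) * p**k * (1 - p)**(n - k) * u(k) for k in range(n + 1))
    return eu_b / u(0.5 * n)

# Ratio in the idealized "no risk" comparison: U(2n/3) / U(n/2) for U = sqrt.
riskless_limit = sqrt(P_WIN / 0.5)

for n in (1, 2, 10, 100):
    print(n, round(std_of_average(n), 3), round(utility_ratio(n), 3))
print("riskless ratio:", round(riskless_limit, 3))
```

The ratio is below 1 for a single game and above 1 from two games on, under these assumptions.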

In other words: Yes! You do have to think about the number of games you play if your utility function is not linear (or you have a strong discount factor).

<Tried to retract this comment since I no longer agree with it, but it doesn't seem to be working>