Suppose you know what outcomes are better than what other outcomes, but not by how much. Given two possible actions you could take, $A$ and $B$, if you know what probability distributions over outcomes each of them results in, this doesn't necessarily help you pick which is better. Being able to compare outcomes isn't enough for you to be able to compare probability distributions over outcomes.

But what if I told you that the worst outcome that could possibly occur if you pick $A$ is better than the best outcome that could possibly occur if you pick $B$? Now the choice is very easy. There are no trade-offs, and you don't need to do a risk analysis. $A$ is just better.

Now what if it's not quite true that every possible outcome of $A$ is better than every possible outcome of $B$, but it's almost true, in the sense that there's some threshold such that the probability that $A$ is better than that threshold is almost 1, and the probability that $B$ is better than that threshold is almost 0? For instance, let's say the 0.01%ile outcome if you pick $A$ is better than the 99.99%ile outcome if you pick $B$. Then it's not quite as clear-cut. Maybe most of the time $A$ is only slightly better than $B$, but the tiny possibility of $A$ being worse than usual involves $A$ being enormously worse than usual, or the tiny possibility of $B$ being better than usual involves $B$ being enormously better than usual. But this is still a pretty big hint that $A$ is better in expectation. $A$ is better unless your expected utility calculations are radically altered by negligible-probability tail events.
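
To make this concrete, here's a minimal simulation sketch (all of the distributions and numbers are made up for illustration): $A$'s 0.01%ile outcome beats $B$'s 99.99%ile outcome, yet a sufficiently enormous negligible-probability tail event still flips the comparison of expectations.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000  # samples per action

a = rng.normal(1.0, 0.1, n)          # A: usually around 1
disaster = rng.random(n) < 1e-5      # negligible-probability tail event
a[disaster] -= 1e6                   # A enormously worse than usual

b = rng.normal(0.0, 0.1, n)          # B: usually around 0

print(np.quantile(a, 1e-4))      # A's 0.01%ile: still about 0.6
print(np.quantile(b, 1 - 1e-4))  # B's 99.99%ile: about 0.4
print(a.mean(), b.mean())        # the tail drags A's mean below B's
```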

When to act like your utility is linear

Let's suppose from now on that the outcomes you're comparing are how much money you have. Of course, how rich you are is far from a perfect metric for measuring the extent to which you're achieving your goals, but that's okay; money is still a thing that people generally want more of, and we can assume that everything important that's not downstream of wealth is being held constant.

Suppose you're given a long sequence of opportunities to make bets. On each step, you can choose among some set of options for what bet to place, and there will be a maximum stake you'll be able to bet each time, because whoever or whatever you're betting against will only accept small bets. Assume that only the stakes that will be accepted, and not your own financial resources, constrain what bets you can take (perhaps you have unlimited access to credit, or perhaps the stakes are low enough relative to your starting wealth that you are vanishingly unlikely to go broke), and there is no tendency for the stakes that will be accepted to change over time.

In this case, the optimal strategy is to pick whatever option maximizes the expected value of your wealth on each step. This is because of the central limit theorem: After a large number of steps, the total wealth gained will, with very high probability, be very close to the number of steps times the expected profit per step, so in the long run, whatever strategy maximizes expected profit per step wins. Specifically, if the optimal strategy gets you expected profit $a$ per step, and you instead opt for a strategy that gets you expected profit $b < a$ per step, then, after $n$ steps, you lose out on $(a-b)n$ profit on average, with a standard deviation proportional to $\sqrt{n}$. Thus the number of standard deviations away from average it takes for your alternative strategy to do better is proportional to $\sqrt{n}$. The probability of this approaches $0$ as $n \to \infty$.
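
For concreteness, here's a minimal simulation sketch of this argument (the per-step profit distributions and all specific numbers are illustrative assumptions): as the number of steps grows, the fraction of runs in which the higher-EV-per-step strategy ends up ahead climbs toward 1.

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 100_000

def total_profit(ev, steps):
    # The sum of `steps` independent per-step profits with mean `ev` and
    # standard deviation 1 is Normal(ev * steps, sqrt(steps)) (exactly for
    # normal per-step profits; approximately, by the CLT, otherwise).
    return rng.normal(ev * steps, np.sqrt(steps), trials)

for steps in (10, 10_000, 1_000_000):
    best = total_profit(0.10, steps)   # optimal: expected profit 0.10/step
    alt = total_profit(0.09, steps)    # alternative: expected profit 0.09/step
    print(steps, (best > alt).mean())  # fraction of runs the optimum wins
```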

When to act like your utility is logarithmic (the Kelly criterion)

Now suppose you're given a long sequence of opportunities to make bets, where there is no limit to how much you can bet at each step except that you can't put at risk more money than you have (you have no access to credit). Now the range of betting opportunities available to you at each step is proportional to your wealth. So your choices of probability distribution over what factor your wealth gets multiplied by remain constant when your wealth changes. That is, your choices of probability distribution over what gets added to your log wealth remain constant when your wealth changes.

So, on a log scale, we're in the situation described in the previous section, where we're adding up the results of a large number of gambles, and the gambles available don't depend on time or on your current wealth. Thus making whatever bet maximizes your expected log wealth on each step is optimal when the number of steps is large, for the same reasons as in the previous section.
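
Here's a minimal simulation sketch (the repeated 60%-win double-or-nothing bet and all other numbers are made-up assumptions): when bet sizes scale with your wealth, betting the log-wealth-maximizing fraction on each step (for this bet, the Kelly fraction $2p - 1 = 0.2$) beats both smaller and larger fractions in the long run.

```python
import numpy as np

rng = np.random.default_rng(0)
p, steps, trials = 0.6, 10_000, 1_000
wins = rng.random((trials, steps)) < p  # win each even-odds bet w.p. 0.6

def median_final_log_wealth(f):
    # Betting a fraction f of current wealth multiplies wealth by (1 + f)
    # on a win and (1 - f) on a loss, so log wealth changes additively.
    step_logs = np.where(wins, np.log(1 + f), np.log(1 - f))
    return np.median(step_logs.sum(axis=1))

for f in (0.1, 0.2, 0.4):  # the Kelly fraction here is 2p - 1 = 0.2
    print(f, median_final_log_wealth(f))  # highest median at f = 0.2
```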

Some unimportant caveats

In each of the scenarios above, when comparing the optimal strategy to any given other strategy, the optimal strategy gets the better result with probability approaching $1$, but that probability never actually equals $1$ after any finite number of steps. So it is possible for negligible-probability tail events to affect which strategy is actually better in expectation.

For example, if your utility function is linear, and you're given the opportunity to bet all your money on a gamble of positive expected value, you'll keep doing it over and over again. This seems better than Kelly betting because, although you almost always end up broke, the expected value of going all in every time is enormous, thanks to the negligible-probability tail event in which you win every bet. Trying to take this expected value calculation seriously becomes increasingly difficult as the number of bets increases and the probability of winning all of them goes to $0$, because you don't actually get linear utility from money.
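
To put rough numbers on this (a made-up 60%-win double-or-nothing bet, purely for illustration): expected wealth grows like $1.2^n$ while the probability of not going broke shrinks like $0.6^n$.

```python
# All-in on a 60%-win double-or-nothing bet, n times in a row
# (made-up numbers for illustration).
for n in (10, 100, 1_000):
    expected_wealth = 1.2 ** n  # E[wealth] = (2 * 0.6)^n times your stake
    p_not_broke = 0.6 ** n      # you must win every single bet
    print(n, expected_wealth, p_not_broke)
```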

After a moderate number of steps, when the probability of these tail events is merely small rather than negligible, it's not crazy to be moved by them; there are unlikely events that are nonetheless worth planning around because they are tremendously more important than other much more likely events that we also care about to a nontrivial degree. But as the number of steps becomes very large and the probability of the alternative strategy ending up better becomes vanishingly small, letting negligible-probability tail events jerk you around becomes increasingly ridiculous; no one actually pays Pascal's mugger. So I think the assumption that negligible-probability tail events don't have a large effect on expected value calculations is reasonable. This is the case, for instance, if your utility function is bounded, and the derivative of utility with respect to money is non-negligible at likely outcomes of the optimal strategy.

Another issue is that, under the assumptions of either of the previous sections, while for any given alternative strategy, the asymptotically optimal strategy will eventually be better in expectation than the alternative strategy (according to a utility function that doesn't get jerked around by negligible-probability tail events), it is not the case that there is a sufficient number of steps after which the asymptotically optimal strategy is the best in expectation among all strategies (unless your utility function actually is linear or logarithmic, respectively, as the asymptotically optimal strategy acts as if it is on each step). I don't think this is important, because as the number of steps approaches infinity, the optimal-in-expectation strategy approaches the asymptotically optimal strategy. So if the number of steps is large, you can just follow the asymptotically optimal strategy and not worry too much about the negligible amounts of expected value you lose by not slightly adjusting your strategy.

Some important caveats

The two models presented above have assumptions, and assumptions don't always hold in every real-world situation. Maybe the number of steps you have to make decisions on just isn't that large, leaving more room for your actual utility function to be important. Maybe the constraints you face on what stakes you can take bets at don't match the assumptions in either of the above models. Or maybe you're a dumb human instead of a perfect Bayesian, and your expected profit calculations are systematically biased, and you won't update fast enough for this not to cause problems. Because of these sorts of considerations, fractional Kelly betting is often recommended in situations where the Kelly criterion might look like it applies.
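
As a rough illustration of what fractional Kelly trades off (the same kind of made-up 60%-win double-or-nothing bet as above, not a recommendation): half Kelly keeps about three quarters of the per-step expected log growth while halving the volatility of your log wealth.

```python
import math

p, kelly = 0.6, 0.2  # made-up even-odds bet; Kelly fraction is 2p - 1

def log_growth_and_sd(f):
    # Mean and standard deviation of the one-step change in log wealth
    # when betting a fraction f at even odds.
    up, down = math.log(1 + f), math.log(1 - f)
    mean = p * up + (1 - p) * down
    sd = math.sqrt(p * (1 - p)) * (up - down)
    return mean, sd

print(log_growth_and_sd(kelly))      # ~(0.0201, 0.199)
print(log_growth_and_sd(kelly / 2))  # ~(0.0150, 0.098)
```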

Comments

Another reason is that you can laugh at Pascal's Mugger. $f = p - \frac{1-p}{b}$, and when $b$ (the payoff) is enormous and $p$ (the probability of winning) is tiny, but not as tiny as $b$ is enormous, then $f$ (the fraction of your wealth to bet) is close to $p$, i.e. almost nothing. The Kelly bet is never a greater fraction than the probability of winning.
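
A quick sketch of that arithmetic (the specific numbers are made up): even an astronomical payoff at a tiny win probability yields a Kelly fraction barely below the win probability itself.

```python
def kelly_fraction(p, b):
    # p: probability of winning; b: net payoff per unit staked.
    # f = p - (1 - p) / b, which can never exceed p.
    return p - (1 - p) / b

print(kelly_fraction(p=1e-9, b=1e12))  # ~1e-9: bet almost nothing
```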

Then it's not quite as clear-cut. Maybe most of the time $A$ is only slightly better than $B$, but the tiny possibility of $A$ being worse than usual involves $A$ being enormously worse than usual, or the tiny possibility of $B$ being better than usual involves $B$ being enormously better than usual. But this is still a pretty big hint that $A$ is better in expectation. $A$ is better unless your expected utility calculations are radically altered by negligible-probability tail events.

But this is exactly the situation of comparing Kelly betting to max-EV betting! When you're betting your whole stack each timestep, what you're doing is squeezing all of the value into that tiny possibility that $B$ is better than usual, such that it outstrips $A$ all on its own. If you assume that this is a bad idea and we shouldn't do it, then sure, something like Kelly betting pops out, but I find it unsatisfying.

I took the main point of the post to be that there are fairly general conditions (on the utility function and on the bets you are offered) in which you should place each bet like your utility is linear, and fairly general conditions in which you should place each bet like your utility is logarithmic. In particular, the conditions are much weaker than your utility actually being linear, or than your utility actually being logarithmic, respectively, and I think this is a cool point. I don't see the post as saying anything beyond what's implied by this about Kelly betting vs max-linear-EV betting in general.

I basically endorse what kh said. I do think it's wrong to think you can fit enormous amounts of expected value or disvalue into arbitrarily tiny probabilities.

Yes, I would agree with this. If we suppose our utility function is bounded, then when given unlimited access to a gamble in our favor, we should basically be asking "Having been handed this enormous prize, how do I maximize the probability that I max out on utility?"

Hm, but that actually doesn't give back any specific criterion, since basically any strategy that never bets your whole stack will win. What happens if you try to minimize the expected time until you hit the maximum?

Optimal play will definitely diverge from the Kelly criterion when your stack is close to the maximum. But in the limit of a large maximum I think you recover the Kelly criterion, for basically the reason you give in this post.

"Having been handed this enormous prize, how do I maximize the probability that I max out on utility?" Hm, but that actually doesn't give back any specific criterion, since basically any strategy that never bets your whole stack will win.

That's not quite true. If you bet more than double Kelly, your wealth decreases. But yes, Kelly betting isn't unique in growing your wealth to infinity in the limit as the number of bets increases.
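
A small sketch of the double-Kelly claim (using the same illustrative 60%-win even-odds bet): the per-step expected log growth $g(f) = p \log(1+f) + (1-p) \log(1-f)$ is positive at the Kelly fraction, roughly zero near double Kelly, and clearly negative beyond it.

```python
import math

p, kelly = 0.6, 0.2  # made-up even-odds bet; Kelly fraction is 2p - 1

def growth(f):
    # Per-step expected log growth of wealth when betting a fraction f.
    return p * math.log(1 + f) + (1 - p) * math.log(1 - f)

for f in (kelly, 2 * kelly, 2.5 * kelly):
    print(f, growth(f))  # positive, near zero, clearly negative
```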

If the number of bets is very large, but due to some combination of low starting wealth relative to the utility bound and slow growth rate, it is not possible to get close to maximum utility, then Kelly betting should be optimal.

Is this just a lot of words to say "many wagers are constrained such that the Kelly-recommended amount is out of range, so it becomes a binary 'bet or don't'"?  I don't see the linearity there - you're still using logarithmic valuation to make fixed bets (or not, if the minimum is much higher than the Kelly amount).

No. The point of the model where acting like your utility is linear is optimal wasn't that this is a more realistic model than the assumptions behind the Kelly criterion; it's just another simplified model, which is slightly easier to analyze, so I was using it as a step in showing why you should follow the Kelly criterion when it is your wealth that constrains the bet sizes you can make. It's also not true that the linear-utility model I described is still just maximizing log wealth; for instance, if the reason that you're never constrained by available funds is that you have access to credit, then your wealth could go negative, and then its log wouldn't even be defined.

Sure, I'm a fan of simplified calculations.  But it's not either-or.  Kelly simplifies to "bet your edge" for even-money wagers, and that's great.  It simplifies to "bet the max on +EV wagers" in cases where "the max" is a small portion of your net worth. 

It's great, but it's not a different model, it's just a simplified calculation for special cases.

Again, the max being a small portion of your net worth isn't the assumption behind the model; the assumption is just that you don't get constrained by lack of funds, so it is a different model. It's true that if the reason you don't get constrained by lack of funds is that the maximum bets are small relative to your net worth, then this is also consistent with maximizing log wealth on each step. But this isn't relevant to what I brought it up for, which was to use it as a step in explaining the reason for the Kelly criterion in the section after it.

I suspect I'm dense, or at least missing something. If the ability to make future bets isn't impacted by losing earlier bets, that implies that you cannot bet more than Kelly.  Or are there other ways to not be constrained by lack of funds?

An example of a bet you'd make in the linear model, which you wouldn't make in the logarithmic (bankroll-preserving) model, would help a lot.

Access to credit. In the logarithmic model, you never make bets that could make your net worth zero or negative.

Access to credit, presuming it's finite, just moves the floor, it doesn't change the shape.  It gets complicated when credit has a cost, because it affects the EV of bets that might force you to dip into credit for future bets.  If it's zero-interest, you can just consider it part of your bankroll.  Likewise future earnings - just part of your bankroll (though also complicated if future earnings are uncertain).

It is true that in practice, there's a finite amount of credit you can get, and credit has a cost, limiting the practical applicability of a model with unlimited access to free credit, if the optimal strategy according to the model would end up likely making use of credit which you couldn't realistically get cheaply. None of this seems important to me. The easiest way to understand the optimal strategy when maximum bet sizes are much smaller than your wealth is that it maximizes expected wealth on each step, rather than that it maximizes expected log wealth on each step. This is especially true if you don't already understand why following the Kelly criterion is instrumentally useful, and I hadn't yet gotten to the section where I explained that, and in fact used the linear model in order to show that Kelly betting is optimal by showing that it's just the linear model on a log scale.

One could similarly object that since currency is discrete, you can't go below 1 unit of currency and continue to make bets, so you need to maintain a log-scale bankroll where you prevent your log wealth from going negative, and you should really be maximizing your expected log log wealth, which happens to give you the same results when your wealth is a large enough number of currency units that the discretization doesn't make a difference. Like, sure, I guess, but it's still useful to model currency as continuous, so I see no need to account for its discreteness in a model. Similarly, in situations where the limitations on funds available to place bets with don't end up affecting you, I don't think it needs to be explicitly included in the model.