Kelly *is* (just) about logarithmic utility

For sure - both my titles were clickbait compared to what I was saying.

I think if I were trying to explain Kelly, I would definitely talk in terms of time-averaging and maximising returns. I hope I wouldn't do this as an "argument for" Kelly. If I were to make an argument for Kelly which is trying to persuade people, it would be something close to my post (whereby I would say "Here are a bunch of nice properties Kelly has + it's simple + there are easy modifications if it seems too aggressive" and try to gauge from their reactions what I need to talk about).

I will definitely be more careful about how I phrase this stuff though. I think if I wrote both posts again I would think harder about which bits were an "argument" and which bits were guides for intuition. 

I actually wouldn't make very much of a defence for the Peters stuff. I (personally) put little stock in it. (At least, I haven't found the "Aha!" moment where what they seem to be selling clicks for me).

I think the most interesting thing about Kelly (which has definitely come through over our posts) is that Kelly is a very useful lens into preferences and utilities. (Regardless of which perspective you come from).

Kelly *is* (just) about logarithmic utility

Thanks for writing this! I feel like we're now much closer to each other in terms of what we actually think. I roughly suspect we agree:

  • Kelly is a litmus test for utilities
  • For a Bayesian with log-utility Kelly is the end of the story

You think the important bit is the utility; I think the important bit is what it says about people's utilities.
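
To make the second bullet concrete, here's a minimal numerical check (the bet parameters are hypothetical): for a log-utility agent facing a single bet, maximising one-period expected utility lands exactly on the Kelly stake.

```python
import numpy as np

# One-period expected log-utility of staking a fraction f of wealth on a bet
# that wins with probability p at odds b:1 (numbers are hypothetical).
p, b = 0.6, 1.0
f = np.linspace(0.0, 0.99, 10_000)
expected_log_utility = p * np.log1p(f * b) + (1 - p) * np.log1p(-f)

print("numeric optimum:", f[np.argmax(expected_log_utility)])  # ~0.20
print("Kelly formula  :", p - (1 - p) / b)                     # 0.20
```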

What is the VIX?

The simplest product (at least from an understanding point of view) would be VIX futures. These are futures which are (to a first approximation) cash settled to the VIX Index. (You can view the specs here).

One thing to notice is that they expire. This means that if you buy a future to gain exposure, when it expires you lose your exposure. (The same is true of options - when they expire, you lose your optionality. (Actually, you lose some optionality on a daily basis, which is part of why you can't own / replicate the VIX Index.)) This means you have to come up with a strategy to "roll" your exposure before it expires. You can have a look at the term structure of VIX futures here.
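
To make "rolling" concrete, here's a stylised sketch (the prices and the 30-day roll window are made up; real contracts and ETFs follow their documented schedules): a constant-maturity strategy shifts weight from the front future to the next one as expiry approaches.

```python
# Stylised constant-maturity roll between the front two VIX futures.
# All numbers are hypothetical; real products document their exact roll schedule.
def rolled_level(front_px: float, second_px: float,
                 days_to_front_expiry: int, roll_period: int = 30) -> float:
    w = days_to_front_expiry / roll_period      # weight still held in the front contract
    return w * front_px + (1 - w) * second_px

# Ten days before the front future expires, two thirds of the exposure
# has already migrated to the second contract:
print(rolled_level(front_px=19.5, second_px=21.0, days_to_front_expiry=10))
```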

Another thing to notice is that VIX futures are the expected value of the index - NOT the index. Typically when vol explodes, the VIX Index goes very high, the front future goes high, the next future less high and so on... Depending on which futures you own, you will make money, but not as much as the index will have moved.

Retail investors typically trade the VIX via ETFs. These tend to formalise a strategy of buying and rolling VIX futures. Generally you can find the details in the ETF docs.

What is the VIX?

The VIX isn't tradeable. 

There are futures which are based off of the VIX. And there are ETFs which hold portfolios of those futures. These products are very different from "buying" the VIX, and I would be very careful when "trading" or "investing" in these products. There are lots of products in this space, and they won't necessarily behave like you think they will.

Never Go Full Kelly

Just to be pedantic, I wanted to mention: if we take Fractional Kelly as the average-with-market-beliefs thing, it's actually full Kelly in terms of our final probability estimate, having updated on the market :)

Yes - I absolutely should have made that clearer.
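
For what it's worth, the equivalence is exact for a fixed-odds bet, because the Kelly stake is linear in the probability and is zero at the market's implied probability. A quick check with made-up numbers:

```python
def kelly(p: float, b: float) -> float:
    """Kelly stake for win probability p at odds b:1."""
    return p - (1 - p) / b

b = 2.0
p_mkt = 1 / (1 + b)        # probability implied by the odds; kelly(p_mkt, b) == 0
p_me, c = 0.5, 0.4         # my estimate, and the weight I put on my own beliefs

blended = c * p_me + (1 - c) * p_mkt
print(kelly(blended, b))   # full Kelly on the blended probability: 0.1
print(c * kelly(p_me, b))  # c-fractional Kelly on my original estimate: 0.1
```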

Concerning your first argument, that uncertainty leads to fractional Kelly -- is the idea: 

  1. We have a probability estimate $\hat{p}$, which comes from estimating the true frequency $p$,
  2. Our uncertainty follows a Beta distribution,
  3. We have to commit to a fractional Kelly strategy based on our $\hat{p}$ and never update that strategy ever again

Sort of? 1. Yes, 2. no, 3. kinda.

I don't think it's an argument which leads to fractional Kelly. It's an argument which leads to "less than Kelly, with a fraction which varies with your uncertainty". This (to be clear) is not fractional Kelly, by which I think we mean a strategy where the fraction is constant.

The chart I presented (copied from the Baker-McHale paper) does assume a beta distribution, and the "rule-of-thumb" which comes from that paper also assumes a beta distribution. The result that "uncertainty => go sub-Kelly" is robust to different models of uncertainty.

The first argument doesn't really make a case for fractional Kelly. It makes a case for two things:

  • Strong case: you should (unless you have really skewed uncertainty) be betting sub-Kelly
  • Rule-of-thumb: you can approximate how much sub-Kelly you should go using this formula. (Which isn't a fixed fraction - it varies with your uncertainty.) A minimal simulation of the strong case is sketched below.
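
Here is that simulation (all parameters hypothetical): the true $p$ is fixed, the bettor only sees a noisy estimate $\hat{p}$, and stakes a multiple $\lambda$ of the full-Kelly stake implied by $\hat{p}$. With this much estimation noise, expected log-growth should peak at $\lambda < 1$.

```python
import numpy as np

rng = np.random.default_rng(0)
p_true, b, n = 0.55, 1.0, 50   # hypothetical true edge, even odds, sample size behind the estimate

# Noisy (but unbiased) estimates: empirical frequencies from n observations.
p_hat = rng.binomial(n, p_true, size=200_000) / n

def mean_growth(lam):
    """Expected log-growth per bet when staking lam * Kelly(p_hat)."""
    f = np.clip(lam * (p_hat - (1 - p_hat) / b), 0.0, 0.99)    # no negative stakes
    return np.mean(p_true * np.log1p(f * b) + (1 - p_true) * np.log1p(-f))

for lam in (0.25, 0.5, 0.75, 1.0, 1.25):
    print(f"lambda={lam:.2f}  E[log-growth]={mean_growth(lam):+.6f}")
```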

So the graph shows what happens if we take our uncertainty and keep it as-is, not updating on data, as we continue to update?

Yes. Think of it as having a series of bets on different events with the same uncertainty each time.

Also, I don't understand the graph. (The third graph in your post.) You say that it shows growth rate vs Kelly fraction. Yet it's labeled "expected utility". I don't know what "expected utility" means, since the expected utility should grow unboundedly as we increase the number of iterations.

Or maybe the graph is of a single step of Kelly investment, showing expected log returns? But then wouldn't Kelly be optimal, given that Kelly maximizes log-wealth in expectation, and in this scenario the estimate $\hat{p}$ is going to be right on average, when we sample from the prior?

Yeah - the latter - I will edit this to make it clearer. This is "expected utility" for one period. (Which is equivalent to growth rate.) I just took the chart from their paper and didn't want to edit it. (Although that would have made things clearer. I think I'll just generate the graph myself.)

Looking at the bit I've emphasised. No! This is the point. When $\hat{p}$ is too large, this error costs you more than when it's too small.

I think our confusion is coming from the fact we're thinking about two different scenarios:

Here I am considering $\mathbb{E}_p\left[u\left(f(\hat{p})\right)\right]$ (notice the Kelly fraction depending on $\hat{p}$ inside the utility but not outside): "What is my expected utility, if I bet according to Kelly given my estimate?" (Ans: Not Full Kelly)

I think you are talking about the scenario $\mathbb{E}_{\hat{p}}\left[u\left(f(\hat{p})\right)\right]$? (Ans: Full Kelly)

I'm struggling to extract the right quotes from your dialogue, although there are several places where I don't think I've managed to get my message across:

OTHER: In those terms, I'm examining the case where probabilities aren't calibrated.

I'm trying to find the right Bayesian way to express this, without saying the words "true probability". Consider a scenario where we're predicting a lot of (different) sports events. We could both be perfectly calibrated (what you say happens 20% of the time happens 20% of the time, and so on), but I could be more "uncertain" with my predictions. If my prediction is always 50-50 I am calibrated, but I really shouldn't be betting. This is about adjusting your strategy for this uncertainty.

OTHER: So all I'm trying to do is examine the same game. But this time, rather than assuming we know the frequency of success from the beginning, I'm assuming we're uncertain about that frequency.

BAYESIAN: Right... look, when I accepted the original Kelly argument, I wasn't really imagining this circumstance where we face the exact same bet over and over. Rather, I was imagining I face lots of different situations. So long as my probabilities are calibrated, the long-run frequency argument works out the same way. Kelly looks optimal. So what's your beef with me going "full Kelly" on those estimates?

No, my views were always closer to BAYESIAN's here. I think we're looking at a variety of different bets, where my probabilities are calibrated but uncertain. Being calibrated isn't the same as being right. I have always assumed here that you are calibrated.

BAYESIAN: Not precisely, but I could put more work into it if I wanted to. Is this your crux? Would you be happy for me to go Full Kelly if I could show you a perfect x=y line on my calibration graph? Are you saying you can calculate the value of the fraction for my fractional Kelly strategy from my calibration graph?

OTHER: ... maybe? I'd have to think about how to do the calculation. But look, even if you're perfectly calibrated in terms of past data, you might be caught off guard by a sudden change in the state of affairs.

No, definitely not. Your calibration graph really isn't relevant to me here.

BAYESIAN (who at this point regresses to just being Abram again): See, that's my problem. I don't understand the graph. I'm kind of stuck thinking that it represents someone with their hands tied behind their back, like they can't perform a Bayes update to improve their estimate $\hat{p}$, or they can't change their fraction after the start, or something.

This is almost certainly "on me". I really don't think I'm talking about a person who can't update their estimate, and I advocate people adjusting their fraction. I think there's something I've not made clear, but I'm not 100% sure we've found what it is yet.

The strawman of your argument (and I'm struggling to understand where you differ) is: "A Bayesian with log-utility is repeatedly offered bets (mechanism for choosing bets unclear) against an unfair coin. His prior on the probability that the coin comes up heads is uniform on [0,1]. He should bet full Kelly with p = 1/2 (or slightly less than full Kelly once he's updated for the odds he's offered)." I don't think he should take any bets. (I'm guessing you would say that he would update his strategy each time to the point where he no longer takes any bets - but what would he do the first time? Would he take the bet?)
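
For what it's worth, here's a quick numerical check of the strawman's first bet under the stated uniform prior (even odds are my assumption): expected log-growth is maximised at a stake of zero, i.e. he takes no bet.

```python
import numpy as np

# Expected log-growth of an even-odds bet at stake f, averaged over a
# uniform prior on the coin's heads probability (even odds are an assumption).
f = np.linspace(0.0, 0.5, 501)
p = np.linspace(0.0, 1.0, 1001)[:, None]                 # grid over the uniform prior
growth = (p * np.log1p(f) + (1 - p) * np.log1p(-f)).mean(axis=0)

print("optimal stake:", f[np.argmax(growth)])            # -> 0.0, i.e. no bet
# Growth is linear in p, so this is just the p = 1/2 curve: 0.5*log(1 - f^2) <= 0.
```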

Never Go Full Kelly

I linked several papers, is there one in particular you are referring to and a section I could make clearer?

What is the VIX?

Roughly speaking, it's about "when" you take square roots and what that means for the product you are trading. Here is a handy guide on a zoo of vol/var swap/forward/future products.

The key thing is less about what "volatility" and "variance" have been (realised volatility is the square root of realised variance) and more about the expectation for the next month's volatility or variance.

The "mathematician" way to think about this (although I think this is a little unhelpful) is . If "X" is (future) realised variance (as yet unknown), then the former is "volatility" and the latter is "square root of variance" (what I call "variance in vol units"). Therefore "expected volatility" is lower than "square root expected variance". The difference is what needs compensating

The more practical way to think about this is that variance is dominated much more by the tails (or by volatility of volatility). When you trade variance, you need a premium over volatility to compensate you for these tails (even if they don't realise very often).

Another way to think about this is that there is "convexity" in variance (when measured in units of volatility). If you are long and volatility goes up, you make much more (because it's squared), but if it goes down, you don't lose as much.
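
A quick sketch of that convexity (the strike and the moves are made up): for the same move in realised vol either side of the strike, a variance position gains more on the way up than it loses on the way down.

```python
strike = 0.20                  # hypothetical variance-swap strike, quoted in vol units
for realised in (0.15, 0.25):  # realised vol 5 points below / above the strike
    vol_pnl = realised - strike                                # linear (vol) P&L
    var_pnl_in_vol = (realised**2 - strike**2) / (2 * strike)  # variance P&L, rescaled to vol units
    print(f"realised={realised:.2f}  vol P&L={vol_pnl:+.3f}  var P&L (vol units)={var_pnl_in_vol:+.5f}")
```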

What is the VIX?

What unit of information does the VIX track? The volatility of the S&P 500 index over the next 30 days, annualized. What does this mean?

VIX tracks the variance, not the volatility, of the S&P. (Slightly more subtly, it measures the variance in vol units.) (This twitter thread does a decent job of explaining the difference and why it matters.)

Kelly isn't (just) about logarithmic utility

This was fascinating. Thanks for taking the time to write it. I agree with the vast majority of what you wrote, although I don't think it actually applies to what I was trying to do in this post. I don't disagree that a full Bayesian finds this whole thing a bit trivial, but I don't believe people are fully Bayesian (to the extent that they know their utility function), and therefore I think coming up with heuristics to help them think about things is valuable.

So, similarly, I see the Peters justification of Kelly as ultimately just a fancy way of saying that taking the logarithm makes the math nice. You're leaning on that argument to a large extent, although you also cite some other properties which I have no beef with.

I don't really think of it so much as an "argument". I'm not trying to "prove" the Kelly criterion. I'm trying to help people get some intuition for where it might come from, and some other reasons to consider it if they aren't utility-maximising.

It's interesting to me that you brought up the exponential St Petersburg paradox, since MacLean, Thorp, and Ziemba claim that the Kelly criterion can also handle it, although I personally haven't gone through the math.

Kelly isn't (just) about logarithmic utility

Yeah, I think I'm about to write a reply to your massive comment, but I think I'm getting closer to understanding. I think what I really need to do is write my "Kelly is Black-Scholes for utility" post.

I think that (roughly) this post isn't aimed at someone who has already decided what their utility is. Most of the examples you didn't like / saw as a non-sequitur were explicitly given to help people think about their utility.
