If board members have an obligation not to criticize their organization in an academic paper, then they should also have an obligation not to discuss anything related to their organization in an academic paper. The ability to be honest is important, and if a researcher can't say anything critical about an organization, then non-critical things they say about it lose credibility.
Yeah, I wasn't trying to claim that the Kelly bet size optimizes a nonlogarithmic utility function exactly, just that, when the number of rounds of betting left is very large, the Kelly bet size sacrifices a very small amount of utility relative to optimal betting under some reasonable assumptions about the utility function. I don't know of any precise mathematical statement that we seem to disagree on.
Well, we've established the utility-maximizing bet gives different expected utility from the Kelly bet, right? So it must give higher expected utility or it wouldn't be utility-maximizing.
Right, sorry. I can't read, apparently, because I thought you had said the utility-maximizing bet size would be higher than the Kelly bet size, even though you did not.
Yeah, I was still being sloppy about what I meant by near-optimal, sorry. I mean the optimal bet size will converge to the Kelly bet size, not that the expected utility from Kelly betting and the expected utility from optimal betting converge to each other. You could argue that the latter is more important, since getting high expected utility in the end is the whole point. But on the other hand, when trying to decide on a bet size in practice, there's a limit to the precision with which it is possible to measure your edge, so the difference between optimal bet and Kelly bet could be small compared to errors in your ability to determine the Kelly bet size, in which case thinking about how optimal betting differs from Kelly betting might not be useful compared to trying to better estimate the Kelly bet.
Even in the limit as the number of rounds goes to infinity, by the time you get to the last round of betting (or last few rounds), you've left the n→∞ limit, since you have some amount of wealth and some small number of rounds of betting ahead of you, and it doesn't matter how you got there, so the arguments for Kelly betting don't apply. So I suspect that Kelly betting until near the end, when you start slightly adjusting away from Kelly betting based on some crude heuristics, and then doing an explicit expected value calculation for the last couple rounds, might be a good strategy to get close to optimal expected utility.
Incidentally, I think it's also possible to take a limit where Kelly betting gets you optimal utility in the end by making the favorability of the bets go to zero simultaneously with the number of rounds going to infinity, so that improving your strategy on a single bet no longer makes a difference.
I think that for all finite n, the expected utility at timestep n from utility-maximizing bets is higher than that from Kelly bets. I think this is the case even if the difference converges to 0, which I'm not sure it does.
Why specifically higher? You must be making some assumptions on the utility function that you haven't mentioned.
I do want to note though that this is different from "actually optimal"
By "near-optimal", I meant converges to optimal as the number of rounds of betting approaches infinity, provided initial conditions are adjusted in the limit such that whatever conditions I mentioned remain true in the limit. (e.g. if you want Kelly betting to get you a typical outcome of money≈1 in the end, then when taking the limit as the number N of bets goes to infinity, you better have starting money r−N, where r is the geometric growth rate you get from bets, rather than having a fixed starting money while taking the limit N→∞). This is different from actually optimal because in practice, you get some finite amount of betting opportunities, but I do mean something more precise than just that Kelly betting tends to get decent outcomes.
The reason I brought this up, which may have seemed nitpicky, is that I think this undercuts your argument for sub-Kelly betting. When people say that variance is bad, they mean that because of diminishing marginal returns, lower variance is better when the mean stays the same. Geometric mean is already the expectation of a function that gets diminishing marginal returns, and when it's geometric mean that stays fixed, lower variance is better if your marginal returns diminish even more than that. Do they? Perhaps, but it's not obvious. And if your marginal returns diminish but less than for log, then higher variance is better. I don't think any of median, mode, or looking at which thing more often gets a higher value are the sorts of things that it makes sense to talk about trading off against lowering variance either. You really want mean for that.
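To put a number on that trade-off, here's a quick closed-form check (the distributions and utility functions are toy choices of mine, not from the thread): two lognormal wealth outcomes with the same geometric mean but different variance, evaluated under a utility less concave than log (√W) and one more concave than log (−1/W).

```python
from math import exp

mu = 0.0                    # fixed geometric mean of wealth: E[log W] = mu (assumed)
low_s2, high_s2 = 0.1, 1.0  # two variances of log wealth (assumed)

def eu_power(c, s2):
    """E[W^c] when log W ~ Normal(mu, sqrt(s2)): exp(c*mu + c^2 * s2 / 2)."""
    return exp(c * mu + c * c * s2 / 2)

# sqrt utility (less concave than log) prefers the HIGH-variance option:
assert eu_power(0.5, high_s2) > eu_power(0.5, low_s2)
# -1/W utility (more concave than log) prefers the LOW-variance option:
assert -eu_power(-1.0, high_s2) < -eu_power(-1.0, low_s2)
```

Log utility itself is exactly indifferent between the two (E[log W] = μ for both), which is the knife-edge described above.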
Correct. This utility function grows fast enough that it is possible for the expected utility after many bets to be dominated by negligible-probability favorable tail events, so you'd want to bet super-Kelly.
If you expect to end up with lots of money at the end, then you're right; marginal utility of money becomes negligible, so expected utility is greatly affected by negligible-probability unfavorable tail events, and you'd want to bet sub-Kelly. But if you start out with very little money, so that at the end of whatever large number of rounds of betting, you only expect to end up with ∼1 money in most cases if you bet Kelly, then I think the Kelly criterion should be close to optimal.
(The thing you actually wrote is the same as log utility, so I substituted what you may have meant). The Kelly criterion should optimize this, and more generally E(log(money)^x) for any x>0, if the number of bets is large. At least if x is an integer, then, if log(money) is normally distributed with mean μ and standard deviation σ, then E(log(money)^x) is some polynomial in μ and σ that's homogeneous of degree x. After a large number N of bets, μ scales proportionally to N and σ scales proportionally to √N, so the value of this polynomial approaches its μ^x term, and maximizing it becomes equivalent to maximizing μ, which the Kelly criterion does. I'm pretty sure you get something similar when x is noninteger.
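For what it's worth, the x = 2 case of this scaling argument is easy to check numerically (the per-bet drift and variance below are numbers I made up for illustration): for a normal variable, E[(log money)²] = μ² + σ², and with μ ∝ N and σ² ∝ N, the ratio of that to the leading μ² term goes to 1 as N grows.

```python
a, b = 0.02, 0.05  # assumed per-bet drift and variance of log wealth

ratios = []
for N in [10, 1_000, 100_000]:
    mu, var = a * N, b * N
    second_moment = mu**2 + var      # E[(log money)^2] when log(money) ~ Normal(mu, sqrt(var))
    ratios.append(second_moment / mu**2)

print(ratios)  # ratio to the mu^2 term shrinks toward 1 as N grows
```

So for large N, whichever bet size maximizes μ also (approximately) maximizes E[(log money)²], matching the claim above.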
It depends how much money you could end up with compared to x. If Kelly betting usually gets you more than x at the end, then you'll bet sub-Kelly to reduce tail risk. If it's literally impossible to exceed x even if you go all-in every time and always win, then utility is effectively linear over the reachable range, and you'll bet super-Kelly. But if Kelly betting will usually get you less than x at the end, though not by too many orders of magnitude, after a large number of rounds of betting, then I think it should be near-optimal.
If there's many rounds of betting, and Kelly betting will get you money≈x as a typical outcome, then I think Kelly betting is near-optimal. But you might be right if money≪x.
If you bet more than Kelly, you'll experience lower average returns and higher variance.
No. As they discovered in the dialog, average returns are maximized by going all-in on every bet with positive EV. It is typical returns that will be lower if you don't bet Kelly.
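A toy calculation (my own numbers, not from the dialog) makes the distinction concrete: for repeated even-money bets won with probability p = 0.6, the Kelly fraction is 2p − 1 = 0.2. The mean wealth multiplier per round, 1 + 0.2f, keeps growing all the way to all-in, while the median (typical) multiplier, (1+f)^p (1−f)^(1−p), peaks exactly at the Kelly fraction.

```python
p = 0.6  # assumed win probability for even-money bets; Kelly fraction is 2p - 1 = 0.2

def mean_growth(f):
    """Expected (mean) wealth multiplier per round at bet fraction f."""
    return p * (1 + f) + (1 - p) * (1 - f)

def typical_growth(f):
    """Median (typical) wealth multiplier per round at bet fraction f."""
    return (1 + f) ** p * (1 - f) ** (1 - p)

fracs = [i / 100 for i in range(100)]
best_mean = max(fracs, key=mean_growth)        # pushed toward f = 1 (all-in)
best_typical = max(fracs, key=typical_growth)  # lands at the Kelly fraction
print(best_mean, best_typical)  # prints: 0.99 0.2
```

So "bet more than Kelly" raises the mean while lowering the median, which is exactly the average/typical distinction being made here.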
The Kelly criterion can be thought of in terms of maximizing a utility function that depends on your wealth after many rounds of betting (under some mild assumptions about that utility function that rule out linear utility). See https://www.lesswrong.com/posts/NPzGfDi3zMJfM2SYe/why-bet-kelly
For two, your specific claims about the likely confusion that Eliezer's presentation could induce in "laymen" are empirically falsified to some degree by the comments on the original post: in at least one case, a reader noticed the issue and managed to correct for it when they made up their own toy example, and the first comment to explicitly mention the missing unitarity constraint was left over 10 years ago.
Some readers figuring out what's going on is consistent with many of them being unnecessarily confused.
I don't think this one works. In order for the channel capacity to be finite, there must be some maximum number of bits N you can send. Even if you don't observe the type of the channel, you can communicate a number n from 0 to N by sending n 1s and N-n 0s. But then even if you do observe the type of the channel (say, it strips the 0s), the receiver will still just see some number of 1s that is from 0 to N, so you have actually gained zero channel capacity. There's no bonus for not making full use of the channel; in johnswentworth's formulation of the problem, there's no such thing as some messages being cheaper to transmit through the channel than others.
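A concrete version of this argument (my own toy construction, not from johnswentworth's post): with messages of at most N symbols, encode a number n from 0 to N in unary, as n ones followed by N − n zeros. Decoding just counts ones, so it works identically whether or not the channel strips the zeros, and observing which channel you got buys you nothing.

```python
N = 8  # assumed maximum message length

def encode(n):
    """Unary encoding: n ones padded with zeros to length N."""
    assert 0 <= n <= N
    return "1" * n + "0" * (N - n)

def strip_zeros(msg):
    """A channel that deletes every 0 from the message."""
    return msg.replace("0", "")

def decode(received):
    """Count the ones; insensitive to how many zeros survived."""
    return received.count("1")

for n in range(N + 1):
    assert decode(encode(n)) == n                # clean channel
    assert decode(strip_zeros(encode(n))) == n   # zero-stripping channel
```

This communicates one of N+1 values either way, which is why knowing the channel type adds zero capacity in this formulation.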