Sorted by New

Wiki Contributions



As a minor addendum, I asked microcovid what it thought about spending 4 hours a day indoors unmasked with 20 people, but with a community incidence of 0.5%, attempting to simulate 2019 living with broad acquired immunity. It thinks this is 19,000 microcovids, suggesting it would still lead to infection in 50 days. This is depressing, I had hoped for a lot more gain than that.


Trevor Bedford took a crack at estimating the steady state back in October (so pre-Omicron). He came up with estimates of 20-30% of the population infected annually and deaths of 40K-100K per year in the US. . Unfortunately, he didn't show enough of his work for me to understand where the 20-30% number comes from. Deaths is just multiplying number of infections by IFR. The big question mark here is whether high risk people will continue to get boosters; Bedford is guessing yes.

Here is my own attempt to estimate number of infections. I googled "how long does covid acquired immunity last" and looked for useful studies. My impression is that no one really knows, but experts generally give numbers in a range of 6 months - 5 years, and tend to center their guesses on 2-3 years. (Since this is presumably an exponential decay process, it isn't crazy to think that we could have estimates on the half life of immunity, even though the disease has only been around 2 years.) A very naive guess would then be to expect people to get infected every 2-3 years, so 30-50% of the population each year.

But life is better than this! The infectious period is about 5 days, so if the natural immunity period is 2 years then one can't expect more than (5 days)/(2 years) = 0.6% of the population to be infected at a time. In a population with only 0.6% infected, one could live a 2019 lifestyle for months or probably years -- without getting infected. And this forms a virtuous cycle where the small number of infected people means one can go on longer without being infected. I didn't attempt to solve for equilibrium in the mathematical sense, but 3-5 years between infections doesn't seem crazy.

If people kept getting boosters, the numbers would be much better. And we could start timing them to the seasonality of the disease, just like with the flu. To me personally, it is clear that the cost-benefit of getting a COVID booster every Fall for the rest of my life is worth it. But most people I talk to, even in my liberal bubble, think that is crazy, so I'm going to assume this is not a significant effect.

I'd love to see more intelligent attempts to solve for the steady state.


I am curious what kind of analysis you plan to run on the calibration questions. Obvious things to do:

For each user, compute the correlation between their probabilities and the 0-1 vector of right and wrong answers. Then display the correlations in some way (a histogram?).

For each question, compute the mean (or median) of the probability for the correct answers and for the wrong answers, and see how separated they are.

But neither of those feels like a really satisfactory measure of calibration.


"Naively in the actual Newcombe's problem if omega is only correct 1/999,000+epsilon percent of the time…"

I'd like to argue with this by way of a parable. The eccentric billionaire, Mr. Psi, invites you to his mansion for an evening of decision theory challenges. Upon arrival, Mr. Psi's assistant brings you a brandy and interviews you for hours about your life experiences, religious views, favorite philosophers, ethnic and racial background … You are then brought into a room. In front of you is a transparent box with a $1 bill in it, and an opaque box. Mr. Psi explains:

"You may take just the solid box, or both boxes. If I predicted you take one box, then that box contains $1000, otherwise it is empty. I am not as good at this game as my friend Omega, but out of my last 463 games, I predicted "one box" 71 times and was right 40 times out of 71; I picked "two boxes" 392 times and was right 247 times out of 392. To put it another way, those who one-boxed got an average of (40$1000+145$0)/185 = $216 and those who two-boxed got an average of (31$1001+247$1)/278=$113. "

So, do you one-box?

"Mind if I look through your records?" you say. He waves at a large filing cabinet in the corner. You read through the volumes of records of Mr. Psi's interviews, and discover his accuracy is as he claims. But you also notice something interesting (ROT13): Ze. Cfv vtaberf nyy vagreivrj dhrfgvbaf ohg bar -- ur cynprf $1000 va gur obk sbe gurvfgf naq abg sbe ngurvfgf. link.

Still willing to say you should one-box?

By the way, if it bothers you that the odds of $1000 are less than 50% no matter what, I also could have made Mr. Psi give money to 99/189 one boxers (expected value $524) and only to 132/286 two boxers (expected value $463) just by hfvat gur lrne bs lbhe ovegu (ROT13). This strategy has a smaller difference in expected value, and a smaller success rate for Mr. Psi, but might be more interesting to those of you who are anchoring on $500.


A few years ago, I tried to write a friendly introduction to this technical part.


The grammar of the sentence is a bit hard to follow. When I am presenting this paradox to friends (I have interesting friends), I hand them a piece of paper with the following words on it:

Take another piece of paper and copy these words:

"Take another piece of paper and copy these words: "QQQ" Then replace the three consecutive capital letters with another copy of those words. The resulting paragraph will make a false claim."

Then replace the three consecutive capital letters with another copy of those words. The resulting paragraph will make a false claim.

I urge you to carry out the task. You should wind up with a paper that has the exact same words on it as the paper I gave you.

If you believe that the statement on my paper is true, then you should believe that the statement on your paper is false, and vice versa. Yet they are the same statement! Assuming that you think truth or falsehood is a property of grammatical sentences, independent of where they are written, this should bother you. Moreover, unlike the standard liar paradox, the paper I gave never talks about itself, it only talks about a message you will write on some other piece of paper (which does not, in turn, talk about the original message) when you perform some simple typographical operations.

Quine constructed this example to demonstrate the sort of subtleties that come up in order to invent a mathematical formalism that can talk about truth, and can talk about manipulating symbols, without bringing in the liar paradox. (To learn how this problem is solved, take a course on mathematical logic and Goedel's theorem.)


Well, I was trying to make the simplest possible example. We can of course add the monkey to our pool of experts. But part of the problem of machine learning is figuring out how long we need to watch an expert fail before we go to the monkey.


Suppose there are two experts, and two horses. Expert 1 always predicts horse A, expert 2 always predicts horse B, the truth is that the winning horse cycles ABABABABABA... The frequentist randomizes choice of expert according to weights; the Bayesian always chooses the expert who currently has more successes, and flips a coin when the experts are tied. (Disclaimer: I am not saying that this is the only possible strategies consistent with these philosophies, I am just saying that that these seem like the simplest possible instantiations of "when I actually choose which person to follow on a given round, I randomize according to my weights, whereas a Bayesian would always want to deterministically pick the person with the highest expected value.")

If the frequentist starts with weights (1,1), then we will have weights (1/2^k, 1/2^k) half the time and (1/2^k, 1/2^{k+1}) half the time. In the former case, we will guess A and B with equal probability and have a success rate of 50%; in the latter case, we will guess A (the most recent winner) with probability 2/3 and B with probability 1/3, for a success rate of 33%. On average, we achieve 5/12 = 41.7% success. Note that 0.583/0.5 = 1.166... < 1.39, as predicted.

On half the other horses, expert 1 has one more correct guess than expert 2, so the Bayesian will lose everyone of these bets. In addition, the Bayesian is guessing at random on the other horses, so his or her total success rate is 25%. Note that the experts are getting 50% success rates. Note that 0.75/0.5 = 1.5 < 2.41, as we expect.

As is usually the case, reducing the penalty from 1/2 to (for example) 0.9, would yield to slower convergence but better ultimate behavior. In the limit where the penalty goes to 1, the frequentist is achieving the same 50% that the "experts" are, while the Bayesian is still stuck at 25%.

Now, of course, one would hope that no algorithm would be so Sphexish as to ignore pure alternation like this. But the point of the randomized weighted majority guarantee is that it holds (up to the correctness of the random number generator) regardless of how much more complicated reality may be than the experts' models.


I thought it was interesting too. As far as I can tell, your result is special to the situation of two bettors and two events. The description I gave describes a betting method when there are more than two alternatives, and that method is strategy proof, but it is not fair, and I can't find a fair version of it.

I am really stumped about what to do when there are three people and a binary question. Naive approaches give no money to the person with the median opinion.


Here is another attempt to present the same algorithm, with the goal of making it easier to memorize:

"Each puts in the square of their surprise, then swap."

To spell this out, I predict that some event will happen with probability 0.1, you say it is 0.25. When it happens, I am 0.9 surprised and you are only 0.75 surprised. So I put down (0.9)^2 D, you put down (0.75)^2 D, and we swap our piles of money. Since I was more surprised, I come out the loser on the deal.

"Square of the surprise" is a quantity commonly used to measure the failure rate of predicative agents in machine learning; it is also known as Brier score. So we could describe this rule as "each bettor pays the other his or her Brier score." There was some discussion of the merits of various scoring systems in an earlier post of Coscott's.

Load More