Previously: Evaluating Predictions in Hindsight

Epistemic Status: Having fun

Evaluating predictions is hard, especially about the future. Let’s do it.

The most frustrating part of predictions is defining them carefully. A lot of Scott’s 2020 predictions seem like they have a high enough probability of a disputed outcome that they’d require clarification before betting on them. A bunch of others say they’re explicitly Scott’s decision. Thus, I’ll try to clarify how I interpret such proposals as part of my evaluation.

I’ll be looking at the predictions as if they were markets, and asking whether I would buy (bet on the thing happening at those odds plus some fee), sell (bet against the thing happening at those odds plus some fee) or hold (not inclined to wager), and about where I’d put my fair. Note that this doesn’t mean I’d actually get to bet against Scott at these prices: Scott believes the prices are fair, so we’d have to give him good enough odds that he’d be willing to bet.
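For concreteness, here is a minimal sketch of that buy/sell/hold rule. The function name `decide` and the 2-point `fee` are hypothetical, chosen only for illustration; the actual fee would depend on the venue.

```python
def decide(market_price: float, my_fair: float, fee: float = 0.02) -> str:
    """Classify a binary contract as 'buy', 'sell' or 'hold'.

    Prices are probabilities in [0, 1]. The fee is a hypothetical
    per-trade cost in probability points, not a number from the post.
    """
    if my_fair > market_price + fee:
        return "buy"   # the event looks underpriced even after the fee
    if my_fair < market_price - fee:
        return "sell"  # the event looks overpriced even after the fee
    return "hold"      # the edge is too small to be worth a wager


if __name__ == "__main__":
    print(decide(0.60, 0.40))  # market at 60%, fair at 40%
    print(decide(0.80, 0.90))  # market at 80%, fair at 90%
    print(decide(0.10, 0.10))  # market and fair agree
```

The point of the fee band is that a small disagreement with the posted number is a hold, not a trade.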

First up, we have the Coronavirus predictions. You’d pay to know what you really think! Hence, betting markets.

1. Bay Area lockdown (eg restaurants closed) will be extended beyond June 15: 60%

Sell to 40%, if I’m interpreting this correctly. I’m reading this as “no major relaxation of lockdown conditions” with things extended as they are or harsher. Certainly allowing restaurants to open at any level of capacity would mean it fails.

Right now, California is running in place at very low levels. Almost no herd immunity is being built, and most hospital capacity is not being used at all. The economy is being sacrificed in the hopes that conditions improve, but how long can that continue? How long should it continue? How long would people continue to abide it under such conditions, with no end in sight?

This is soon enough that there’s a decent chance that these realizations have not yet come at that time. And there’s some chance that there’s a treatment or vaccine that looks sufficiently promising that ‘tough it out until the end’ becomes reasonable. But I’m guessing, as I noted last time, that a partial reopening does little or no damage if done wisely, and I expect California to end up doing something of that type.

2. …until Election Day: 10%

Hold. If anything, that seems high, assuming it means continuous lockdown until then rather than being locked down on election day. This has to both be necessary and sufficient to large enough extents to justify waiting an incredibly long time. But there’s also a chance that this happens without a good justification. We interpret California’s actions as ‘good decision making’ and that is a possible explanation but it can also be seen as ‘abundance of caution’ or ‘California is really good at telling people they can’t do things’ which would point in a different direction when the right decision goes the other way.

There’s also the argument that, if it holds through June, that’s kind of a decision to hold indefinitely so the conditional chance it lasts a lot longer can’t be that low.

3. Fewer than 100,000 US coronavirus deaths: 10%

Sell a lot. The official count is 57,000 now and we are not substantially past the peak outside of a few areas. The way down won’t be faster than the way up. Under 100,000 is close to a Can’t Happen even if we get a best case style scenario.

4. Fewer than 300,000 US coronavirus deaths: 50%

Sell to 30%. This is in 2020 only, and official counts, and would require lower than current levels on average for the rest of the year. Right now, hospitals are not overwhelmed and states are looking to reopen soon. We’d need to hit this level to have substantial overall help from herd immunity. We’d need to make a lot of progress on many fronts, or have a strong treatment or vaccine quickly, to have a burn slow enough to stay under this number.
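To see how tight these death thresholds are, here is a rough sketch of the average daily pace that would keep the official count under each one. The 245 days remaining is my assumption (roughly May 1 through December 31); the 57,000 starting count is the figure quoted above.

```python
def max_average_daily_deaths(threshold: int, current_count: int, days_remaining: int) -> float:
    """Average deaths per day that would land exactly on the threshold."""
    if days_remaining <= 0:
        raise ValueError("days_remaining must be positive")
    return (threshold - current_count) / days_remaining


CURRENT_COUNT = 57_000   # official count quoted in the post
DAYS_REMAINING = 245     # assumption: about May 1 through December 31

pace_100k = max_average_daily_deaths(100_000, CURRENT_COUNT, DAYS_REMAINING)
pace_300k = max_average_daily_deaths(300_000, CURRENT_COUNT, DAYS_REMAINING)

if __name__ == "__main__":
    print(f"Stay under 100,000: fewer than {pace_100k:.0f} deaths/day on average")
    print(f"Stay under 300,000: fewer than {pace_300k:.0f} deaths/day on average")
```

Compare those averages to whatever the current daily count is when you read this, and the prices above should make more sense.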

5. Fewer than 3 million US coronavirus deaths: 90%

Hold. Given what I think is the IFR, killing almost 1% of the population requires full system collapse. New York managed to get through a fifth of its population in a month without seeing a spike in the IFR, so I think this is worse than the high-death-rate scenarios that I think are plausible, especially given this is presumably the official death count. In the scenario where we get close, I expect a severe undercount. I expect a lot of people to be able to protect themselves even in a full out-of-control scenario (and in fact, in that scenario it makes more sense for people to take extreme measures and burn through savings and create debt to do so) and I expect herd immunity effects to protect us by 50% infection or so at most under realistic conditions. Giving this a 10% chance therefore seems like a lot, but betting at long odds on ‘not a complete disaster’ requires more confidence than I’d be willing to display here.

6. US has highest official death toll of any country: 80%

Buy to 90%. Realistically who is it going to be if it isn’t us? China or India. No one else has a big enough population. India seems not that vulnerable due to physical conditions and likely won’t be able to track things properly even when things get bad. So this comes down to how often China ends up with a higher death count than we have by end of year, and they admit it. Given what would happen to China if this did happen to them sufficiently that they’d be forced to admit it, I see such a scenario as highly unlikely.

7. US has highest death toll as per expert guesses of real numbers: 70%

Buy to 80%. Logic above applies. I can see assigning 10% to ‘China gets a real problem bigger than ours and refuses to admit it, but expert guesses realize this’ but it seems more like 5% to me, because it’s a narrow window where it actually is sufficiently bigger that experts pick up on it, but it’s not so much bigger that they cannot hide it.

8. NYC widely considered worst-hit US city: 90%

Buy to 95%. Widely considered is a strange term. The story is that NYC is the place that got hit, and that’s likely to stay the same even if something worse later happened to another city. Or if another city already has been harder hit (for example New Orleans or Chicago) but it’s smaller and less visible. Plus NYC is larger than these other cities, so even if in percentage terms they get hit harder, it won’t change the narrative unless it’s a huge difference. And I don’t think it’s easy to get hit that much harder than NYC already has been, because you can only be at most 100% infected.

9. China’s (official) case number goes from its current 82,000 to 100,000 by the end of the year: 70%

Sell to 40%. They seem committed to not admitting this. Not going lower because 100,000 is only 18,000 more cases, so they could go that high without losing much face, but it still doesn’t seem likely to me. Also worth noting that in the scenarios where China can’t keep up face here, it seems clear that USA is over 300,000 dead. Otherwise, what forced China’s hand?

10. A coronavirus vaccine has been approved for general use and given to at least 10,000 people somewhere in the First World: 50%

Sell to 40% but stop there. If you had asked me this before Oxford announced it had a timeline that would make this work I would have sold down to 20%. The first world has proven time and again it is unwilling to do such things. Civilization made it clear it would rather die, in both economic and literal terms, before bending its rules in such ways. But perhaps a way has or can be found, and I do expect us to be in dire need. 10,000 people isn’t a lot so this could be one small country defying the general suicide consensus and doing it anyway. Indeed do many things come to pass. Note that I wouldn’t buy this unless it was much lower than 40%.

11. Best scientific consensus ends up being that hydroxychloroquine was significantly effective: 20%

Sell to 15% or so, while noting that I think the chance of it actually being effective is much higher than that. I am cynical enough to think that scientific consensus is looking to declare this ineffective, or at least avoid declaring it effective, because of who would stand to benefit. There’s also a good chance that it stays ‘we don’t know’ indefinitely. The reason I think it has a higher chance of actually being effective is anecdotal based on people I am aware of who have used it.

12. I personally will get coronavirus (as per my best guess if I had it; positive test not needed): 30%

Sell to 20% at least, and also what the hell? Is this Scott thinking he will be paranoid and think he had the virus when he hasn’t had the virus? Let’s set that aside for now and assume Scott would simply get an antibody test in such a world, which should be easy to get by December. So despite living in Berkeley, and being unusually scrupulous, he expects a 30% chance to personally be infected. That sounds a lot like he thinks there’s a mean infection rate for that area a lot above 30%. But he thinks we’re only 50% to have 300,000 deaths in the United States, which represents less than a 10% overall infection rate, and California is doing way better than other areas. This one doesn’t make sense to me, unless it’s implicitly endorsing a high probability that Covid-19 has a substantially-sub-1% IFR and a ton of mild cases, and even then it’s tough.
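The back-of-envelope arithmetic behind that complaint can be sketched as follows. The US population of 330 million is my rough assumption, and the 0.3% IFR is a hypothetical stand-in for "substantially sub-1%", not an estimate I'm endorsing.

```python
US_POPULATION = 330_000_000  # assumption: rough 2020 US population


def implied_infection_rate(deaths: float, ifr: float, population: int = US_POPULATION) -> float:
    """Share of the population infected, given total deaths and an assumed IFR."""
    if not 0 < ifr <= 1:
        raise ValueError("ifr must be in (0, 1]")
    return deaths / (ifr * population)


# 300,000 deaths at an IFR of 1% versus a hypothetical 0.3%.
rate_at_1pct = implied_infection_rate(300_000, 0.01)
rate_at_0_3pct = implied_infection_rate(300_000, 0.003)

if __name__ == "__main__":
    print(f"IFR 1.0%: {rate_at_1pct:.1%} of the country infected")
    print(f"IFR 0.3%: {rate_at_0_3pct:.1%} of the country infected")
```

Under the 1% assumption the national rate stays in single digits, so a 30% personal risk only lines up with the death numbers if the IFR is several times lower.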

13. Someone I am close to (housemate or close family member) will get coronavirus: 60%

Sell to 40%. Secondary household attack rates have not been that high, and Scott presumably has multiple close family members that count for this, so if he was 30% to get infected, the chance of at least one infection in this category would be well above 60%. The reason I go the other way is that there’s sufficient uncertainty in the overall infection rate. If there are worlds where the USA is 3% infected and worlds where it’s 75% infected, then extra exposures add much less in relative terms. In the worlds where infection rates stay low, neither group is at much risk. In worlds where infection rates go high, Scott is likely infected and someone is all but certain to get it. But I don’t think there are enough worlds where the rate is that high contributing to this, and I think that Scott is reasonably likely to stay negative even in worlds with 75% infection rates, so this number likely should be double or more of the previous number.
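Here is a toy sketch of why uncertainty about the overall infection rate pulls the "someone close to me" number down relative to a naive independence calculation. The 3% and 75% worlds come from the paragraph above; the 80/20 weights on those worlds and the count of four close contacts are hypothetical, purely for illustration.

```python
def at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n people is infected, each independently with probability p."""
    return 1 - (1 - p) ** n


def mixture_at_least_one(worlds, n: int) -> float:
    """Same question, averaged over (weight, infection_rate) worlds."""
    return sum(weight * at_least_one(rate, n) for weight, rate in worlds)


# Hypothetical weights: 80% on the low-infection world, 20% on the high one.
WORLDS = [(0.8, 0.03), (0.2, 0.75)]
N_CLOSE = 4  # hypothetical number of housemates and close family members

# One person's overall risk, averaged across worlds.
marginal_risk = sum(weight * rate for weight, rate in WORLDS)

# Naive: treat each close contact as an independent draw at the marginal risk.
naive = at_least_one(marginal_risk, N_CLOSE)

# Correlated: everyone lives in the same world, so risks rise and fall together.
correlated = mixture_at_least_one(WORLDS, N_CLOSE)

if __name__ == "__main__":
    print(f"Scott at 30%, three independent contacts: {at_least_one(0.30, 3):.1%}")
    print(f"Marginal individual risk: {marginal_risk:.1%}")
    print(f"Naive independence: {naive:.1%}")
    print(f"Averaged over worlds: {correlated:.1%}")
```

With these toy numbers the world-averaged answer comes out well below the naive one, which is the sense in which extra exposures add less than you'd think.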

14. General consensus is that we (April 2020 US) were overreacting: 50%

15. General consensus is that we (April 2020 US) were underreacting: 20%

General consensus will be that we were reacting stupidly. We reacted wrong. That’s an easy call. The question is, will that be widely seen as an underreaction, an overreaction, something that’s neither, or will there be a lack of consensus? What does it take to get a ‘consensus’? Who counts?

My guess is that there flat out won’t be consensus. There will be an argument. Partisan lines will be drawn. The public and the scientists will have different interpretations. And there will be those who think we reacted in the wrong ways rather than too much or too little. We’re clearly underreacting in the sense that we are not doing enough to expand testing or tracing capacity, and we’re not doing enough experimentation or data collection, and we’re not doing enough to get vaccines ready quickly or prepare for potential variolation. I expect some of that to become part of the consensus view, to the extent one exists. I also presume we’re overreacting in the sense that some of our lockdown tactics are ineffective or even counterproductive, and I expect us to realize that too. And so on.

Then again, it could be that this is simple – if death counts are higher than we expect we’ll be thought of as having ‘underreacted’ whether or not that cashes out into action. If things are contained by July and there’s no second wave, the ‘consensus’ will be that we ‘overreacted’ regardless of whether or not that makes any sense. That’s another way to look at this.

I don’t think we can be seen as by consensus overreacting unless things get contained and stay contained soon, and don’t see that as especially likely, so I’m going to sell the overreacting contract down to 30%, but stop there because people are bad at such things and find ways to rewrite history to suit their narratives. I’m going to hold the 20% on underreacting, because I expect things to be worse than the current general expectation, but I don’t see how doing more similar things (“reacting more”) is going to look like a great alternative. But it’s all murky.

16. General consensus is that summer made coronavirus significantly less dangerous: 70%

Hold, because it takes so little change to make things ‘significantly’ less dangerous, and there are a lot of ways to get to this ‘consensus’ without it being true.

17. …and there is a catastrophic (50K+ US deaths, or more major lockdowns, after at least a month without these things) second wave in autumn: 30%

That’s a very low bar for catastrophic but a high bar for how much things cleared up. It requires things to get fully better, then for them to get worse again, within the year, so I think that’s too many conditional things and I’m selling this down to 20%.

18. I personally am back to working not-at-home: 90%

Sell to 80%. There’s a 10% chance by Scott’s own prediction that there’s a lockdown preventing this (if it lasts until November the chance it lasts through December is very high, as it’s only getting colder at that point and absent a very specific vaccine timeline the length should follow Lindy rules). I’d assume there are plenty of worlds where restaurants are open but Scott keeps working from home. That’s the world I think we should be in now, as I think reopening restaurants at reduced capacity is probably net positive.

19. At least half of states send every voter a mail-in ballot in 2020 presidential election: 20%

Sell a little, maybe to 15%. That seems a bit high but I’m too anchored to know for sure, unfortunately. To get over half we need either red states to do this by choice, letting Democrats get a big boost, or to get this made mandatory via a congressional deal. I don’t see that as likely on either end, but if things are sufficiently bad there might be no choice. Note that there’s a big gap between everyone-gets-a-ballot and everyone-can-request-a-ballot.

20. PredictIt is uncertain (less than 95% sure) who won the presidential election for more than 24 hours after Election Day. 20%

Sell to 10%. This is based on the last few elections being very close. That seems less likely this year. Covid-19 will have big effects. Those effects could go either way, but it’s really hard for there to be serious doubt about who won a day after the polls close. The election has to be close, and there have to be a lot of mail ballots that prevent the count from working, or it has to be so close that a ‘recount’ actually might turn things around like in 2000, but that requires a very, very narrow window. Alternatively, in theory, there could be accusations of fraud, or Amash could have carried a few states. I still see this as unlikely.

21. Democrats nominate Biden, and he remains nominee on Election Day: 90%

Hold. Biden is trading at 78 right now to be the Democratic nominee. This market is completely insane. You should buy him. Also, Michelle Obama is still at 9% to run, and you should sell that. Hillary Clinton is at 13% to run, and you should sell that too. Also note that Biden is 43% to win the general election and Trump is 50% to win the general, which implies an 86% chance Biden gets the nomination while giving 0% to him withdrawing after nomination and 0% to third parties. Arbitrage ho!
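The arbitrage arithmetic, sketched out. The function name is hypothetical, and the calculation makes the same simplifying assumptions as the text: 0% for third parties and 0% for Biden withdrawing after the nomination.

```python
def implied_nomination_probability(p_candidate_wins_general: float,
                                   p_opponent_wins_general: float) -> float:
    """Back out P(nominated) from general-election prices.

    Assumes the candidate's party wins whenever the opponent loses (no
    third parties) and that the nominee never withdraws, so that
    P(candidate wins) = P(nominated) * P(party wins).
    """
    p_party_wins = 1.0 - p_opponent_wins_general
    if p_party_wins <= 0:
        raise ValueError("opponent cannot be priced at 100% or more")
    return p_candidate_wins_general / p_party_wins


implied = implied_nomination_probability(0.43, 0.50)  # Biden 43%, Trump 50%
market = 0.78                                         # Biden nomination price
gap = implied - market

if __name__ == "__main__":
    print(f"Implied nomination probability: {implied:.0%}")
    print(f"Nomination market price:        {market:.0%}")
    print(f"Gap between the two markets:    {gap:.0%}")
```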

22. Balance of evidence available on Election Day supports (as per my opinion) Tara Reade accusation: 90%

Hold, based on Scott being able to predict Scott’s evaluations of such evidence better than I can, and not expecting things to change much.

23. Conditional on me asking about Reade on SSC survey, average survey-taker’s credence in her accusation is greater than 50%: 70%

24. …greater than 75%: 10%
25. …greater than credence in Kavanaugh accusation asked in the same format: 40%

I think that given the nature of who is being asked, >50% isn’t that high a bar, and I think that Scott asks mainly in the worlds where we should expect a >50% answer. And I think all the anti-Biden people on both sides will answer super high regardless of the strength of the evidence and the pro-Biden people will evaluate the evidence, so I’m going to buy to 80%, and buy the >75% up to 40% for similar reasons, again without having looked at the evidence.

On the greater than Kavanaugh question, it’s really weird. I think people have a lot of cognitive dissonance, so asking both questions together will cause weird things to happen and people will remember the Kavanaugh situation in light of the current one and not give the same answers they’d have given before. So here I have to model who is answering, what their politics are, and lots of other things. 40% is probably fine, maybe a little low? Because, again, I expect asymmetric partisan adjustments.

26. Trump is re-elected President: 50%

Hold. Agree it’s roughly this.

27. Democrats keep the House: 70%

Hold. That’s moderately lower than the odds at PredictIt, and I give that market some credit so I’m not inclined to mess with it, but it seems too high to me. Conditional on Trump winning re-election, it seems hard (although definitely not impossible) to hold the House.

28. Republicans keep the Senate: 50%

Buy to 60%. I want to be on the other side of PredictIt here. The Senate seems harder than the Presidency.

29. Trump approval rating higher than 43% on June 1: 30%

Buy to 40%. This is one month from now and it’s currently 43.3%. It takes a while after reopenings for things to get worse even if they are going to get worse. So I do think things looking worse is more likely than things looking better, but I’m getting a 0.2% head start (I’m assuming 43.1% counts as higher than 43%) and that counts for a lot given how little things move.

30. Biden polling higher than Trump on June 1: 70%

Buy to 80%. Not much is going to happen between now and then that can plausibly change this, and he’s substantially ahead in polls right now.

31. At least one new Supreme Court Justice: 20%

Hold rather than check actuarial tables, but check the tables. I don’t think this happens much before the election short of that. Right wing justices are not old enough to quit, left wing justices aren’t going anywhere by choice.

32. I vote Democrat for President: 80%

Buy to 90%. Scott explained this being so low on Tumblr, but I’m not buying it given his general outlook. He’s not going to vote for Amash. He’s essentially 0% to vote Trump. The ‘no vote at all’ isn’t 0%, but I think he cares too much for it to be very high, he believes in voting. Biden is the obviously correct choice for Scott given Scott’s preferences in outcomes, and not voting for him because of an accusation when he’s running against Donald f***** Trump? Yeah, I don’t buy it.

33. Boris still UK PM: 90%

Sell to 80%. Tenures aren’t that long and no one likes him.

34. No new state leaves EU: 90%

Hold, because while I do see a lot of ways for there to be a crisis, the chance that it will take less than a year to figure out how to actually leave seems pretty low.

35. UK, EU extend “transition” trade deal: 80%

Hold. Neither alternative, failing to extend or reaching a true deal, seems all that likely, so this seems like a reasonable estimate.

36. Kim Jong-Un alive and in power: 60%

Buy to 80%. Tenures aren’t that short in such systems, and he’s not that old. This seems super optimistic.

37. Dow is above 25,000: 70%
38. …above 30,000: 20%

I don’t think a 50% chance for the 25-30k range is reasonable. Dow was flirting with 30k before. In the 50% of scenarios where Trump wins re-election (presumably good for stocks) we also presumably have good Covid-19 situations most of the time (also good for stocks) and large cap stocks have overperformed throughout. There’s therefore a good chance of Dow 30,000 and a net gain on the year. Buy that to 30%. By contrast, what’s the chance it’s higher than today (it’s close to 25k now)? I’m going to say more like 60%. This rally seems suspicious, but the downside risk is bigger than the upside potential, so things are still probably a favorite to be net positive. There’s just a lot of variance. The more interesting question is Dow 20,000 or Dow 15,000, which I’m going to give maybe 30% and 10% to respectively?
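A quick sketch of where that 50% comes from: a pair of "above X" prices pins down the probability of each range between them. The helper name is hypothetical, and treating my "higher than today" 60% as "above 25,000" is an approximation, since the Dow is only close to 25k now.

```python
def bucket_probabilities(p_above: dict) -> dict:
    """Turn P(above threshold) prices into probabilities for each range.

    p_above maps a threshold to the probability of finishing above it.
    """
    levels = sorted(p_above)
    probs = [p_above[level] for level in levels]
    if any(lower < higher for lower, higher in zip(probs, probs[1:])):
        raise ValueError("P(above) must not increase with the threshold")
    # Pad with P(above -infinity) = 1 and P(above +infinity) = 0.
    edges = [1.0] + probs + [0.0]
    labels = (
        [f"below {levels[0]}"]
        + [f"{lo} to {hi}" for lo, hi in zip(levels, levels[1:])]
        + [f"above {levels[-1]}"]
    )
    return {label: round(hi - lo, 4) for label, hi, lo in zip(labels, edges, edges[1:])}


scott = bucket_probabilities({25_000: 0.70, 30_000: 0.20})
mine = bucket_probabilities({25_000: 0.60, 30_000: 0.30})  # approximate, see above

if __name__ == "__main__":
    print("Scott:", scott)
    print("Mine: ", mine)
```

The same helper applies to any ladder of binary prices, such as the Bitcoin pair below.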

However, given that options markets exist, I’m not going to trade at any prices that are worse than the implied prices from options, so don’t ask.

39. Bitcoin is above $5,000: 70%

40. …above $10,000: 20%

Bitcoin is trading at $8700. Being only 20% to be above $10,000 seems vaguely consistent with that price being fair, especially if we’re 70% to stay above $5,000, but the implied fat tail here doesn’t seem that fat, so it’s no longer clear that Scott should be going long Bitcoin. I’d likely sell the 5,000 binary call option down to 60% or so. I wouldn’t buy the above $10,000 option because I think you can just buy Bitcoins instead and that’s a better play.

41. I have bought a Surface Book 3 laptop: 60%

Scott knows Scott’s mind and is generally well-calibrated, so pass on this one.

42. Crew Dragon reaches orbit: 80%
43. Starship reaches orbit: 40%

I’m selling both of these on the principle that space travel is more risky than people would generally realize, but I have no domain knowledge so that’s a super weak opinion.

44. I do another Nootropics Survey this year: 70%
45. I do another SSC Survey this year: 90%
46. I start a Reader SSC Survey this year: 60%
47. I start a SSC Book Review Contest this year: 70%
48. I run another Adversarial Collaboration Contest this year: 10%
49. I publish [redacted]: 20%
50. I publish [redacted]: 50%
51. I publish [redacted]: 60%
52. I publish [redacted]: 80%
53. …conditional on being published, it gets at least 40,000 pageviews: 10%
54. I publish [redacted]: 60%
55. …conditional on being published, it gets at least 40,000 pageviews: 50%
56. More hits this year than last: 70%
57. Most hits ever this year: 20%
58. I finish Unsong revision this year: 40%
59. New co-blogger with more than 3 posts: 10%

From past estimates I’m going to say that Scott overestimates his big project chances, so I’m selling the Unsong revision. I’m selling the co-blogger because I don’t think that ever happens. The other stuff seems like I’m not in a position to evaluate.

60. No new long-term (1 month +) residents at group house by the end of the year: 70%
61. Koios has said his first clear comprehensible word: 50%

Obviously can’t evaluate anything redacted. Weakly buying on Koios speaking their first word because I expect that to happen scary fast in such a house reasonably often.

Pretty big buyer on no new long-term residents at group house, given current conditions. It doesn’t seem all that likely even if things were normal, and things are very not normal.

72. I’ve gotten at least one new patient to do a full wake therapy protocol: 60%
73. I have specific, set-in-motion plans to quit work / start my own business: 5%
74. I work the same schedule and locations I did before the coronavirus: 80%
75. I get a bonus for 2020: 20%

I’m confused how #74 can be this high, given the chances of continued lockdown and the general sense that everything changes. Probably we’re interpreting that one differently.

For #73, especially in light of the estimate on #75, Scott, you should start your own business as soon as things are normal again, if not sooner. Seriously. You’d be able to work less and make more money and have more control over who you see, so you could choose patients you find interesting and who you believe you can help. You would have zero shortage of clients. Not that I think Scott will do that.

79. I travel to Alaska this year: 60%

94. I travel outside the country at least once: 10%

Sell Alaska down to 30%. Again, this does not seem compatible with how the world looks. And given travel outside the country was already down to 10%, probably that’s still somewhat too high.

82. I go on at least three dates with someone I haven’t met yet: 20%

Based on other similar estimates not reflecting goings-on in the world, I’m guessing this is high, but I don’t know the baseline well enough to be sure.

86. I try one biohacking project per month x at least 5 of the last 6 months of 2020: 30%

87. I find at least one new supplement I take or expect to take regularly x 3 months: 20%.

88. Not eating meat at home: 40%
89. Weight below 200: 50%
90. Weight below 190: 10%

95. I get back into meditating seriously (at least ten minutes a day, five days a week) for at least a month: 10%

Going to pass on all these as basically calibration exercises for Scott, so I don’t know if he’s adjusted his calibrations properly.

96. At least ten tweets in 2020: 80%

Sell this a bit because last year we expected Twitter to beat out Facebook yet Facebook won, and I see a lot of inertia in such things. This assumes he has yet to Tweet in 2020.

97. I eat at/from Sliver more than any other restaurant in Q4 2020: 50%

Given the substantial chance that things have changed a lot or there are equal amounts of eating at all restaurants, I’ll sell this to 30%.

99. I do pushups and situps at least 3 days/week in average week of Q4 2020: 60%

Good luck! Not gonna jinx it.

100. I write the post scoring these predictions before 2/1/21: 70%

This is one of those self-fulfilling prophecy type of predictions. No bet.

I want to thank Scott once again for putting himself out there and doing these each year, no matter how late they come out. I certainly haven’t done the same and I’m sure others could pick mine apart if I put in the same level of effort that Scott puts into his.

Note on actual betting: Due to the logistical annoyances of betting plus the adverse selection effects, I’m not looking to actually wager on anything. It’s a thought experiment. But, if I were offered very different odds than the ones I’m showing here, and the logistics were acceptable, all things are possible.