Last year, I looked at Scott's forecasts for 2021 and compared them to the market forecasts. Today I went through those forecasts (and Zvi's*, a buy/hold/sell exercise done on Scott's estimates), added the resolutions, and calculated a Brier score and a log score.

Results were as follows:

| | Brier | Log |
|---|---|---|
| Scott | 0.20 | 1.24 |
| Zvi | 0.16 | 0.93 |
| Market | 0.14 | 0.90 |

So, in summary, the "market" was about as good as Zvi, and both were better than Scott (albeit on a pretty small sample of 19 questions). Lower is better for both the Brier score and the log score.

Full details can be found here

| Question | Scott | Zvi | Market | Result |
|---|---|---|---|---|
| Biden approval rating (as per 538) is greater than 50% | 80% | 80% | 61% | 0 |
| Court packing is clearly going to happen (new justices don’t have to be appointed by end of year) | 5% | 1% | 5% | 0 |
| Yang is New York mayor | 80% | 70% | 70% | 0 |
| Newsom recalled as CA governor | 5% | 5% | 7% | 0 |
| Tokyo Olympics happen on schedule | 70% | 80% | 77% | 1 |
| Major flare-up (significantly worse than anything in past 5 years) in Russia/Ukraine war | 32% | 15% | 16% | 0 |
| Netanyahu is still Israeli PM | 40% | 25% | 22% | 0 |
| Prospera has at least 1000 residents | 30% | 30% | 18% | 0 |
| GME >$100 (Currently $170) | 50% | 50% | 60% | 1 |
| Bitcoin above 100K | 40% | 23% | 23% | 0 |
| Ethereum above 5K | 50% | 30% | 11% | 0 |
| Ethereum above 0.05 BTC | 70% | 55% | 33% | 1 |
| Dow above 35K | 90% | 50% | 50% | 1 |
| …above 37.5K: | 70% | 20% | 20% | 0 |
| Unemployment above 5% | 40% | 50% | 37% | 0 |
| Starship reaches orbit | 60% | 60% | 50% | 0 |
| Greater than 66% of US population vaccinated against COVID | 50% | 60% | 77% | 1 |
| Vitamin D is generally recognized (eg NICE, UpToDate) as effective COVID treatment | 30% | 20% | 25% | 0 |
| US approves AstraZeneca vaccine | 20% | 20% | 37% | 0 |
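
As a rough sketch of how the Brier column can be reproduced from the table above: this assumes the percentages are read as probabilities and that each forecaster's score is the plain mean of squared errors over all 19 questions. The names `FORECASTS` and `brier_score` are illustrative only and are not taken from the linked spreadsheet.

```python
# Each row: (short label, Scott, Zvi, Market, result), copied from the table above.
# Probabilities are written as fractions; result is 1 if the event happened.
FORECASTS = [
    ("Biden approval > 50%",        0.80, 0.80, 0.61, 0),
    ("Court packing",               0.05, 0.01, 0.05, 0),
    ("Yang NYC mayor",              0.80, 0.70, 0.70, 0),
    ("Newsom recalled",             0.05, 0.05, 0.07, 0),
    ("Tokyo Olympics on schedule",  0.70, 0.80, 0.77, 1),
    ("Russia/Ukraine flare-up",     0.32, 0.15, 0.16, 0),
    ("Netanyahu still PM",          0.40, 0.25, 0.22, 0),
    ("Prospera >= 1000 residents",  0.30, 0.30, 0.18, 0),
    ("GME > $100",                  0.50, 0.50, 0.60, 1),
    ("Bitcoin above 100K",          0.40, 0.23, 0.23, 0),
    ("Ethereum above 5K",           0.50, 0.30, 0.11, 0),
    ("Ethereum above 0.05 BTC",     0.70, 0.55, 0.33, 1),
    ("Dow above 35K",               0.90, 0.50, 0.50, 1),
    ("Dow above 37.5K",             0.70, 0.20, 0.20, 0),
    ("Unemployment above 5%",       0.40, 0.50, 0.37, 0),
    ("Starship reaches orbit",      0.60, 0.60, 0.50, 0),
    ("> 66% of US vaccinated",      0.50, 0.60, 0.77, 1),
    ("Vitamin D recognized",        0.30, 0.20, 0.25, 0),
    ("US approves AstraZeneca",     0.20, 0.20, 0.37, 0),
]

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

outcomes = [row[4] for row in FORECASTS]
scores = {
    name: brier_score([row[col] for row in FORECASTS], outcomes)
    for col, name in enumerate(["Scott", "Zvi", "Market"], start=1)
}

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Under those assumptions this prints 0.20, 0.16 and 0.14 for Scott, Zvi and the market, which matches the Brier figures reported above.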

* I made a couple of assumptions when calculating Zvi's probabilities for things where he wasn't super explicit about his numbers. I will of course update these if asked.

Zvi (2y ago)

OK, so I am obviously biased but I'll look to see if I think this is fair.

First of all, I didn't look at market prices for a lot of the things (where I did, I mentioned it). If I had done this more I would have done considerably better. Instead, I was saying whether I would trade on Scott's markets based on my current knowledge level. Does that count as predicting that number when comparing to the market? That's up to you to decide. 

You could of course just say 'should have done the research.' You could also say 'I'm comparing your ability to predict to what a market would do, on arbitrary questions, so tough that you only had Scott's prediction' or something. Again, not my call?

Second of all, the procedure for deciding what I meant seems to not match the way I was making predictions. In general, it would be fair to say that 'buy to X%' is actually saying 'it's at least X%', so my 'fair' must be higher than that, and the reverse for selling.

But it's pretty bad to be doing this now, in hindsight; if we want to do Brier, we need to specify those numbers at that time.

So for e.g. Biden's approval, 80% was a dumb prediction and I should have sold it down somewhat. But Starlink I would strongly push back. Basic summary:

  1. Biden approval: Giving me 80% outright is a tad unfair, but I take the L on this. Dumb.
  2. Court packing: Meh.
  3. Yang: My fault for looking at the prediction market, ironically. Should have been lower.
  4. Newsom: Actually did make money selling this btw.
  5. Tokyo: Not convinced I was right to buy this but I got away with it. Probably went too high.
  6. Russia/Ukraine: I am confused why my 'maybe buy to 10' got interpreted as 15 here, whereas my 'sell down to X' in other places got interpreted as X and compared to a lower market. I think being lower than market was right here, although 15 was likely a better prediction than 10.
  7. Netanyahu: Getting punished for this one feels wrong - I basically said sell while I'm above market, that's not exactly a statement that I should be higher than market - except that my explanation was that I was going to be slightly higher than market by default. I could argue 25 is fair. Can't really judge but I feel like I'd slightly buy this again at the new market fair if we ran another Everett branch?
  8. Prospera: This is a 'trust Scott because I know nothing and there's no market' rather than anything else. In hindsight, people who get excited enough to write posts about X are a little too excited about X so I should have been moderately lower and sold a bit. Whoops. But note that if there had been a market I would have mostly defaulted to it.
  9. GME: Yeah, I still dunno what to think of this, no further comments at this time.
  10. Bitcoin: I noticed there was an easy arbitrage here, I think market was being dumb. Couldn't hedge the way this was scored, but notice that what I did was "Sell January 1 BTC 100k calls and buy spot BTC" and that trade seems like it does fine depending on the ratio, was definitely good. I think you gotta give me the market price or lower.
  11. Ether: My trade here is "Sell ETH January 1 5k calls and buy ETH at 2300" and that's... a very very good trade and damn I shoulda done that. Feels weird to penalize me on that trade. But then again, if you'd told me ETH calls were trading at 11%, yeah, probably would have bought some cause that's too low, so maybe I lose anyway? Again, it's weird. 
  12. Dow: Yeah, I think this is exactly right, I get market prices here. I'm not challenging the EMH on this one.
  13. Unemployment: Definitely taking the L. Bigger L in terms of my fair at the time but would have come down a bit if I'd seen market. I am surprised.
  14. Starship: I literally said 'no idea' and didn't trade, which is another way of saying market. 
  15. Vaccination: 62% of the US population is 'fully vaccinated' which is lower than 66%, and the Metaculus market currently predicts January 22, 2022. I think 77% was clearly too high and also it's not clear it happened. 
  16. Vitamin D: See my edit on 4/27, I did not predict 50%, that was me adjusting to a false understanding of Scott's prediction and therefore not selling this as far down as all that. E.g.

EDITED VERSION 4/27: I updated a lot on Scott being at 30% for this (e.g. 70% for this being recognized) in the original, and moved it to 50%. With Scott at 70% instead, we’re much closer, but I think I still want to nudge a little higher and buy this to 75%, instead of moving 30% to 50%. This is a sign of how much I’m reluctant to move a reasonable person’s odds in this type of exercise; if you’d asked me before seeing Scott’s number, I’d have said recognition is very unlikely, and put it at something like 85%-90%, and my true probability is still likely 80% or so.

I think when I say my 'true probability is 80% for not happening' you need to give me a 20% for happening.

  17. AstraZeneca: Probably was actually slightly lower having only seen Scott, but seeing market would have undone that. 20 seems fine.

The big adjustment is that I took a big knock for the 50% on Q16, and that's just a misread, should be 20%. 

I'll let Simon decide what to do with the rest. I also find it super weird to be punished vs. market for when I said 'this is the wrong price, do an arbitrage' in the correct direction, and made money even vs. market prices doing the trade, but hey.

So I'm a little worried we've used different sources for your forecasts, but to explain where we differ:

  1. We agree
  2. We agree
  3. We agree
  4. Happy to change your number, although your forecast was: "Depending on what counts as ‘recalled’ this is either at least 10%, or it’s damn near 0%. I don’t see how you get 5%. Once you get an election going, anything can happen. Weird one, I’d need more research."  Which I averaged to 5%. Happy to change to 1%?
  5. We agree
  6. "It’s definitely a thing that can happen but there isn’t that much time involved, and the timing doesn’t seem attractive for any reason. I’ll sell to at least 15% on reasonable priors. " I used the 15% from the place I linked to here?
  7. "I’m going to sell this down to 30% even though I have system 1 intuitions he’s not going anywhere. Math is math." is why I used 30%. Happy to change to 25%.
  8. Happy to flip you to the market if you'd rather
  9. We agree
  10. Fair enough, will give you the market prices for that
  11. I think you're between Scott and the market on that, which seems fair. (33% vs 50% and 11%). Let me know what you'd rather have
  12. We agree
  13. We agree
  14. Will switch you to market. (Where you didn't disagree with Scott I gave you Scott)
  15. Happy to change the resolution depending on how Scott resolves his
  16. Ah, I was looking at "Vitamin D is good and important, you should be taking it, but I’m skeptical that such sources will recognize this in the future if they haven’t done so by now. Conditional on (I haven’t checked) the sources that matter not having made this call yet, I’d sell it to 50%, while saying that I definitely would use it to treat Covid if I had the choice. "
  17. We agree

 

I think the major disagreement seems to be I've used your LW post when I should have used a different blog post. Would you mind linking me to the right one?

Zvi (2y ago)

https://thezvi.wordpress.com/2021/04/27/scott-alexander-2021-predictions-buy-sell-hold/ is the canonical version. Surprised the differences were this big. The struggle on knowing when to update all versions is real, especially now that there's 3x.  

Then beyond that your decisions seem fine.

And no need to apologize for doing the exercise; it's good to check things, as long as it's clear what's being done.

When/if I do predictions for 2022 I'll see what I can do about also including explicit fairs (and ideally, where I'd call BS on a market, and where I wouldn't). 

OK, so I am obviously biased but I'll look to see if I think this is fair.

 

Yeah, this is definitely my bad. I didn't ask you (or Scott) whether or not you were happy with me comparing your comments to market forecasts. I apologise. I also didn't intend to make this as normative as it sounds. (FWIW, in the past I have gone to bat for your forecasting skills, and given your forecast and a market forecast, most of the time I would expect to update away from the market and towards you.)

I'll let Simon decide what to do with the rest. I also find it super weird to be punished vs. market for when I said 'this is the wrong price, do an arbitrage' in the correct direction, and made money even vs. market prices doing the trade, but hey.

I do disagree that you should get "better than market" for some of the things where you would arbitrage Scott. If Scott was putting up $ on his forecasts then I would agree, but afaik his forecasts aren't there to be traded with.

Thanks for doing this!

gjm (2y ago)

I think the post should mention the fact that Zvi's forecasts were made after reading Scott's.

I'll add that note

It looks to me as though your computation of log scores in the Google Sheet is wrong, and it’s not just a sign error. The correct log score (Y log p + (1-Y) log (1-p), where Y is the outcome and p is the prediction) should be 0 for a perfect prediction (e.g. p=0 and the event didn’t happen) and should approach -infinity as the prediction becomes more and more confidently wrong. However, with the formula you used (-Y log (1-p) - (1-Y) log p), the score becomes infinitely large as we approach a perfect prediction, whereas at the other extreme it is just 0. This can’t be a proper scoring rule, because the guesser would be incentivized to always predict p=0 or p=1.
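
A minimal sketch of the point about properness, assuming natural logarithms and treating a larger score as better (as the comment above does). The function names, the grid of candidate reports, and the 0.7 "true" probability are illustrative only and are not taken from the Google Sheet.

```python
import math

def log_score(p, y):
    """Proper log score: y*log(p) + (1-y)*log(1-p). Never positive; 0 is perfect."""
    return y * math.log(p) + (1 - y) * math.log(1 - p)

def swapped_score(p, y):
    """The formula flagged as wrong above: -y*log(1-p) - (1-y)*log(p)."""
    return -y * math.log(1 - p) - (1 - y) * math.log(p)

def expected_score(score, p, q):
    """Expected score from reporting p when the event really has probability q."""
    return q * score(p, 1) + (1 - q) * score(p, 0)

TRUE_PROB = 0.7  # hypothetical true probability, for illustration only
GRID = [i / 100 for i in range(1, 100)]  # candidate reports 0.01 .. 0.99

# Which report maximises the expected score under each formula?
best_proper = max(GRID, key=lambda p: expected_score(log_score, p, TRUE_PROB))
best_swapped = max(GRID, key=lambda p: expected_score(swapped_score, p, TRUE_PROB))

print("best report under the proper log score:", best_proper)
print("best report under the swapped formula:", best_swapped)
```

On this grid the proper score is maximised by reporting the true 0.7, while the swapped formula is maximised at the edge of the grid, which is the incentive problem described above.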

Thanks for flagging, fixed

Woop great post and great thread!

If anyone can figure out how to format that table, I would appreciate it, thanks!

Zvi (2y ago)

I had been trying to format tables on LW for a while, then gave up and started using images.