steven0461

Comments

A framework for thinking about single predictions

I agree, of course, that a bad prediction can perform better than a good prediction by luck. That means if you were already sufficiently sure your prediction was good, you can continue to believe it was good after it performs badly. But your belief that the prediction was good then comes from your model of the sources of the competing predictions prior to observing the result (e.g. "PredictIt probably only predicted a higher Trump probability because Trump Trump Trump") instead of from the result itself. The result itself still reflects badly on your prediction. Your prediction may not have been worse, but it performed worse, and that is (perhaps insufficient) Bayesian evidence that it actually was worse. If Nate Silver is claiming something like "sure, our prediction of voter % performed badly compared to PredictIt's implicit prediction of voter %, but we already strongly believed it was good, and therefore still believe it was good, though with less confidence", then I'm fine with that. But that wasn't my impression.
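To make "Bayesian evidence" concrete, here's a toy calculation (the forecast distributions and the observed margin below are illustrative stand-ins, not 538's or PredictIt's actual numbers): treat each source's implied forecast of Biden's margin as a normal distribution, score both against the observed result, and the likelihood ratio is the evidence the result provides about which forecast was better.

```python
# Toy comparison of two vote-share forecasts (all numbers illustrative).
from scipy.stats import norm

observed_margin = 4.5                 # observed Biden margin, in points (approx.)
model_a = norm(loc=8.0, scale=3.0)    # stand-in for a poll-based forecast
model_b = norm(loc=4.0, scale=4.0)    # stand-in for a market-implied forecast

log_score_a = model_a.logpdf(observed_margin)
log_score_b = model_b.logpdf(observed_margin)
likelihood_ratio = model_b.pdf(observed_margin) / model_a.pdf(observed_margin)

print(f"log score A: {log_score_a:.3f}, log score B: {log_score_b:.3f}")
print(f"evidence favoring B over A: {likelihood_ratio:.1f} : 1")
```

With these made-up numbers the result favors the market-style forecast by only about 1.5 : 1, which is exactly the "perhaps insufficient" kind of evidence described above: real, but easy to outweigh with a strong prior.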

edit:

> Deviating from the naive view implicitly assumes that confidently predicting a narrow win was too hard to be plausible

I agree I'm making an assumption like "the difference in probability between a 6.5% average poll error and a 5.5% average poll error isn't huge", but I can't conceive of any reason to expect a sudden cliff there instead of a smooth bell curve.
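As a sanity check of the "smooth bell curve" point, here's a minimal sketch with an assumed error distribution (normal, mean 0, standard deviation 4 points; the exact numbers don't matter much for the qualitative conclusion):

```python
# How much less likely is a 6.5-point average poll error than a 5.5-point one?
# Assumes a normal error distribution with mean 0 and SD 4 (illustrative).
from scipy.stats import norm

poll_error = norm(loc=0.0, scale=4.0)
ratio = poll_error.pdf(6.5) / poll_error.pdf(5.5)
print(f"relative likelihood of 6.5 vs 5.5 points: {ratio:.2f}")  # ~0.69
```

The falloff is gradual under any bell-shaped error model; there's no cliff between 5.5 and 6.5.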

Did anybody calculate the Briers score for per-state election forecasts?

Yes, that looks like a crux. I guess I don't see the need to reason about calibration instead of directly about expected log score.
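To spell out what I mean by reasoning directly about expected log score (a toy sketch, with a made-up credence): if your true belief in an event is p, the expected log score of reporting q is p*log(q) + (1-p)*log(1-q), which is maximized by reporting q = p, so you can evaluate candidate forecasts against your beliefs without going through calibration at all.

```python
# Expected log score of reporting q when your true credence is p (toy numbers).
import numpy as np

def expected_log_score(p, q):
    return p * np.log(q) + (1 - p) * np.log(1 - q)

p = 0.89  # hypothetical true credence
for q in (0.65, 0.80, 0.89, 0.95):
    print(f"report {q:.2f}: expected log score {expected_log_score(p, q):.4f}")
# The maximum is at q = p = 0.89.
```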

Scoring 2020 U.S. Presidential Election Predictions

The most closely contested states went to Biden, so the vote share was more favorable to Trump than you'd expect from knowing only who won each state. PredictIt generally predicted more votes for Trump, so I think it comes out a lot better than 538 and the Economist.

Did anybody calculate the Briers score for per-state election forecasts?

Data points come in one by one, so it's only natural to ask how each data point affects our estimates of how well different models are doing, separately from how much we trust different models in advance. A lot of the arguments that were made by people who disagreed with Silver were Trump-specific, anyway, making the long-term record less relevant.

> It's like taking one of Scott Alexander's 90% bets that went wrong and asking, "do you admit that, if we only consider this particular bet, you would have done better assigning 60% instead?"

If we were observing the results of his bets one by one, and Scott said it was 90% likely and a lot of other people said it was 60% likely, and then it didn't happen, I would totally be happy to say that Scott's model took a hit.
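The arithmetic behind "took a hit" is just a likelihood ratio (a sketch, using the 90% and 60% figures from the example above):

```python
# One missed prediction as evidence between a 90% claim and a 60% claim.
p_miss_given_90 = 1 - 0.90   # 0.10
p_miss_given_60 = 1 - 0.60   # 0.40
likelihood_ratio = p_miss_given_60 / p_miss_given_90
print(f"a single miss favors the 60% model by {likelihood_ratio:.0f} : 1")  # 4 : 1
```

That's real evidence against the 90% claim, just not remotely enough to settle the question on its own, which is the whole point of treating single results this way.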

Did anybody calculate the Briers score for per-state election forecasts?

I agree that, if the only two things you consider are (a) the probabilities for a Biden win in 2020 (roughly 65% from betting markets and 89% from 538), and (b) the margin of the win in 2020, then betting markets are a clear winner.

My impression from Silver's writing online is that he hasn't admitted this, but maybe I'm wrong; I haven't seen him admit it, and his claim that "we did a good job" suggests he's unwilling to. Betting markets are also the clear winner if you look at Silver's predictions about how wrong the polls would be, which was always the main point of contention. The line he's taking is "we said the polls might be this wrong and that Biden could still win", but it's obviously worse to say that the polls might be that wrong than to say that the polls probably would be that wrong (in that direction), as the markets implicitly did.

Did anybody calculate the Briers score for per-state election forecasts?

Looking at states still throws away information. Trump lost by slightly over a 0.6% margin in the states that he'd have needed to win. The polls were off by slightly under a 6% margin. If those numbers are correct, I don't see how your conclusion about the relative predictive power of 538 and betting markets can be very different from what your conclusion would be if Trump had narrowly won. Obviously if something almost happens, that's normally going to favor a model that assigned 35% to it happening over a model that assigned 10% to it happening. Both Nate Silver and Metaculus users seem to me to be in denial about this.
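Here's a rough sketch of that claim (the setup is assumed, not either forecaster's actual model): give each forecast a normal distribution over the tipping-point margin with the same spread, pick the means so the implied Trump win probabilities are 10% and 35%, and compare the likelihood each assigns to the observed near-miss.

```python
# Why a near-miss favors the model that assigned the higher probability.
# Assumed setup: normal margin forecasts with a shared SD of 5 points.
from scipy.stats import norm

sigma = 5.0                           # assumed forecast spread, in points
mean_10 = -sigma * norm.ppf(0.10)     # mean implying P(Trump win) = 10%
mean_35 = -sigma * norm.ppf(0.35)     # mean implying P(Trump win) = 35%

observed = 0.6                        # approximate tipping-point margin, in points
ratio = norm(mean_35, sigma).pdf(observed) / norm(mean_10, sigma).pdf(observed)
print(f"the near-miss favors the 35% model by about {ratio:.1f} : 1")  # ~1.9 : 1
```

Under these assumptions the near-miss favors the 35% model by roughly 2 : 1; the exact factor depends on the assumed spread, but it illustrates why "almost happened" can't be neutral between the two models.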

A Parable of Four Riders

That's their mistake in the case of the fools, but is the claim that they're also making it in the case of the wise men?

MikkW's Shortform

I don't think there's any shortcut. We'll have to first become rational and honest, and then demonstrate that we're rational and honest by talking about many different uncertainties and disagreements in a rational and honest manner.

A Parable of Four Riders

Is the claim that the superiors are making the same mistake in judging the wise men that they're making in judging the fools?

What risks concern you which don't seem to have been seriously considered by the community?

On the other hand, to my knowledge, we haven't thought of any important new technological risks in the past few decades, which is evidence against there being many such risks left to discover.
