From Ezra Klein:

If Mitt Romney wins on election day, it doesn’t mean Silver’s model was wrong. After all, the model has been fluctuating between giving Romney a 25 percent and 40 percent chance of winning the election. That’s a pretty good chance! If you told me I had a 35 percent chance of winning a million dollars tomorrow, I’d be excited. And if I won the money, I wouldn’t turn around and tell you your information was wrong. I’d still have no evidence I’d ever had anything more than a 35 percent chance.

Okay, technically, winning the money would be very weak Bayesian evidence that the initial probability estimate was wrong. Still a very good quote.
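
To see how weak, here is a minimal sketch of that update in Python, with entirely made-up hypotheses and numbers (none of this comes from Klein or Silver): compare "the 35% estimate was about right" against one arbitrary alternative, "the true chance was more like 50%", starting from even odds between them.

```python
# Toy Bayesian update -- illustrative only; the hypotheses and numbers are invented.
# H1: the stated 35% estimate was about right.
# H2: the true chance of winning was actually more like 50%.
prior_h1, prior_h2 = 0.5, 0.5        # equally unsure beforehand
p_win_h1, p_win_h2 = 0.35, 0.50      # P(win | each hypothesis)

# Observation: you won. Update both hypotheses on that single data point.
joint_h1 = prior_h1 * p_win_h1
joint_h2 = prior_h2 * p_win_h2
posterior_h2 = joint_h2 / (joint_h1 + joint_h2)

print(f"P(estimate was too low | you won) = {posterior_h2:.3f}")  # ~0.588
```

One win moves you from 50% to roughly 59% confidence that the estimate was too low: genuine evidence, but not much.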


If you told me I had a 35 percent chance of winning a million dollars tomorrow, I’d try to sell you my chance for 349 thousand dollars.

I'd first look for a multi-millionaire to whom to make the offer.

I'd undercut you! Sub-linear utility and all that.

I'd try to find out what my chance of winning really was, before attempting to trade.

I think the bigger question is what exactly it means for the probability estimate to be wrong. The best I can figure is that "wrong" means failing whatever it's called where exactly x% of the predictions you're x% sure of come true. In that case, Romney winning is evidence that the estimate was more extreme than it should have been, and Obama winning is evidence that it was less extreme. Whether either outcome is evidence that the estimate was wrong depends on whether, beforehand, you thought it was more likely to be too extreme or not extreme enough.

Probability prediction quality is usually broken down into calibration and discrimination.
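
The calibration half, at least, can be checked mechanically once outcomes are known: bucket the forecasts by stated probability and compare each bucket's average forecast with the observed frequency. A small Python sketch with invented data (the numbers below are illustrative, not real forecasts):

```python
from collections import defaultdict

def calibration_table(forecasts, outcomes, n_bins=10):
    """Group forecasts by stated probability and compare each group's
    mean forecast with the fraction of events that actually happened."""
    bins = defaultdict(list)
    for p, happened in zip(forecasts, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, happened))
    for b in sorted(bins):
        pairs = bins[b]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(h for _, h in pairs) / len(pairs)
        print(f"forecast ~{mean_p:.2f}   observed {freq:.2f}   (n={len(pairs)})")

# Invented example: (stated probability, did the event happen?)
calibration_table(
    forecasts=[0.90, 0.85, 0.80, 0.35, 0.30, 0.20],
    outcomes=[1, 1, 0, 1, 0, 0],
)
```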

If you buy the Bayesian argument (e.g. in Jaynes) that there is a single correct Pr(A|I) where I is your state of information and A is any proposition in question, then p, an estimate of Pr(A|I), is wrong if and only if p != Pr(A|I). In practice, we virtually never know Pr(A|I), so we can't make this check. But as far as a conceptual understanding goes, that's it - if, as I said, you buy the argument.

In practice, we check the things gwern mentioned.

I am not seeing the rationality in Klein's analysis.

"If you told me I had a 35 percent chance of winning a million dollars tomorrow, I’d be excited." The difference is that presumably he [the speaker] could have a near infinite number of things happen to him. So picking one and giving it much greater odds than one could reasonably expect (given most contests that grant that kind of monetary rewards) does siginify a pretty unusual situation which he should be thrilled about.

However, for all intents and purposes, the winner of the election is a binary choice (say we have 999 out of 1,000 units of probability to distribute between the two candidates). I think it's a given that both candidates, having made it to the general election, are pretty excited to be there, since they have radically better odds than the rest of the eligible population.

But all that should be taken for granted by any adult with any familiarity with the system. Thus, a model that favors one candidate over the other at 60/40, let alone 75/25, is not any kind of good, exciting news for the other guy [Romney, in Silver's model]. Of course Silver isn't ruling out the possibility, and the stakes are high, certainly greater than a million dollars. But nonetheless, a 35% chance to win a million dollars when you didn't know the possibility existed is different from a 35% chance to win the presidency when you are one of only two candidates.

Edit: Just thought of a better way to phrase the above--Whether news of 35% odds is good & exciting or bad & dispiriting depends on one's priors. I would assume the challenger in a pretty divided country would have had 40-45% odds to begin with and wouldn't be excited to update downwards.

It's not a question of whether it's "good news", but of whether there's a plausible chance of it occurring (or rather, whether it's a "big-enough-feeling" probability).

From the quote, it sounds like it's a question of whether odds lower than one's prior should increase or decrease excitement when the stakes are high.

The quote seems to me to be about how, if you predict something is only about 35% likely, and that thing happens, that's not sufficient evidence to assume you predicted wrong or to throw out your methodology. The line about the million dollars looks like an example to back up the prior sentence, "that's a pretty good chance". Other than that example, the quote doesn't seem to be about excitement at all really.

Okay, that may be the intent of the argument. I'm not sure I agree with that either, though. Silver's model is presumably built from several factors. If in the end it gives a prediction that doesn't come true, then there are likely factors that were considered incorrectly or left out. The "70% odds" is basically saying "I'm 30% confident that the outcome of the model is wrong." If Obama ends up losing, that doesn't mean Silver knows nothing, but it is evidence that the model was flawed in some meaningful way, and he should now suspect as much. That is, we should update slightly toward 'Silver's model was off' and away from 'Silver's model was accurate', despite the fact that he had less than 100% confidence in it.

Remember that Silver is running a Monte Carlo-type model. In his case, what his 'odds' mean is that when he runs the simulation N times, Obama wins in 70% or so of the runs and Romney wins in 30% or so. So it's not "I'm 30% confident the outcome of the model is wrong"; it's "30% of the time, the model outputs a Romney victory."
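
As a toy illustration of what that kind of model does (this is not Silver's model; the safe-state totals and swing-state probabilities below are invented), you simulate the election many times and report the share of runs each candidate wins:

```python
import random

random.seed(0)

# Invented inputs, for illustration only -- not Silver's model or his numbers.
SAFE_DEM_EV = 217   # electoral votes assumed locked up for the Democrat
SAFE_REP_EV = 191   # electoral votes assumed locked up for the Republican
SWING = [           # (electoral votes, assumed P(Democrat carries the state))
    (29, 0.50), (18, 0.70), (15, 0.30), (13, 0.60), (11, 0.40),
    (10, 0.80), (9, 0.65), (9, 0.55), (6, 0.70), (6, 0.75), (4, 0.80),
]
assert SAFE_DEM_EV + SAFE_REP_EV + sum(v for v, _ in SWING) == 538

def democrat_wins_once():
    ev = SAFE_DEM_EV + sum(v for v, p in SWING if random.random() < p)
    return ev >= 270   # Electoral College majority

N = 100_000
wins = sum(democrat_wins_once() for _ in range(N))
print(f"Democrat wins {wins / N:.1%} of {N} simulated elections")
```

The big simplification here is that each state is simulated independently; Silver's actual model correlates states (via national swings and shared polling error), which matters a lot for the headline probability.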

Okay (though to me that sounds like he has many related models that differ based on certain variables he isn't certain about... maybe that is being pointlessly pedantic), but would you agree that a Romney victory would be stronger evidence that the model needs adjustment than that the model was reliable as is? If not, what if it was 99 to 1 instead of 60 to 40? Just trying to clarify my own thinking here.

I don't know much about the internals of his model, but I would say 'it depends.' I'm sure you can use his model to make predictions of the form 'given a Romney victory, what should the final electoral map look like?', etc., but I'm not sure if the public has that kind of access. Certainly questions like that can be used to probe the model after a Romney or Obama win. If either candidate wins 'in the wrong way' (i.e. carries the wrong states), it's obviously stronger evidence that the model is wrong than we could get from just Romney winning.

'given a Romney victory, what should the final electoral map look like?'

He sometimes presents such maps in blog posts, and generally has one with all of the most likely outcomes, but these maps are not always and automatically publicly accessible, so far as I know.

Sure. But it's not a huge amount of evidence; Silver's model predicts that in 10 such elections there would be 3 such "surprises".

Not being American, I don't know how important Ezra Klein is in politics/media, but he has a page on Wikipedia, so I guess he has nonzero importance.

With that assumption, this is more than just a quote. A quote is just something someone once said or wrote. But this is an example of someone with nonzero importance understanding what a mathematical concept means, and being able to explain it today in the context of a political discussion.

Relative to the usual low level of the sanity waterline, this is pretty awesome. (Yeah, I wish we instead lived in a world where quotes like this don't deserve attention, but only a "man, everybody knows that" reaction.)

If you told me I had a 35 percent chance of winning a million dollars tomorrow, I’d be excited.

If you told me I had a 1 percent chance of winning a million dollars tomorrow, I'd be excited.

If you told me I had a 1/3 chance of winning a million dollars tomorrow, I'd be excited and I'd spend at least 1/3 of the day working out plans for using that money, with a clear conscience.

(I can't find the Sequences post that needs to be linked here, the one about how a TV anchor should allocate time between writing about the two possible outcomes of an event that will be decided the next day.) ETA: that's it, thanks!

The fun idea for scoring guessing games is to wait till the result is known, and then give each contestant points according to the logarithm of the probability that they had previously given to the outcome that actually occurred.

(Hmm, all the scores are negative; some are just more negative than others. Perhaps the score keeper needs to give log n + log p points, i.e. log(np), if he wants the numbers to be human-friendly. (The point of log(np) is that if there are n outcomes you can give each a probability of 1/n and that scores exactly zero. You hope to do better and make a positive score, but you can do worse (if something you think really unlikely actually happens)))

The clever mathematics behind this is Gibbs' inequality. When you come up with your probability distribution q, your expected score (relevant if the game has many rounds) is Σᵢ pᵢ log qᵢ, where p is the true but unknown distribution. Gibbs' inequality tells you that this will always be at most Σᵢ pᵢ log pᵢ. In the short run you can get a higher score in a particular play of the game by combining overconfidence and good luck :-)

In the long run, your best hope for a high score is to do a good job of guessing p. Scoring log q sets up the game that way.
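
A short Python sketch of the scoring scheme (the event and the probabilities are made up; the point is just the shape of the score):

```python
import math

def raw_score(q_given_to_actual_outcome):
    """Log score: log of the probability the contestant assigned to what happened.
    Always <= 0, since probabilities are <= 1."""
    return math.log(q_given_to_actual_outcome)

def friendly_score(q_given_to_actual_outcome, n_outcomes):
    """Shifted score log(n * q): the know-nothing guess of 1/n for every outcome
    scores exactly zero; beating that uniform guess scores positive."""
    return math.log(n_outcomes * q_given_to_actual_outcome)

# A two-outcome event; outcome A occurs. Three contestants gave A these probabilities:
for q in (0.70, 0.50, 0.30):
    print(f"q = {q:.2f}  raw = {raw_score(q):+.3f}  friendly = {friendly_score(q, 2):+.3f}")
```

Either version is a proper scoring rule: over many rounds your expected score is maximized by reporting the distribution you actually believe, which is the Gibbs'-inequality point above.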

On the other end of the spectrum we have Elspeth Reeve coming to Nate Silver's defense while giving too much evidential weight to a Romney victory.

But Silver [takes a weighted average] because some pollsters have a better track record than others, and some have a clear partisan tilt, left or right. If his weighting is wrong, we'll know next week.

It's a little jarring, if not surprising, to see such a defense punctuated with such an off-base statement.

On a lighter note, Silver has publicly offered to bet Joe "It's A Tossup" Scarborough on the outcome of the election. I wish this sort of thing occurred more often.

The quote isn't talking about a Romney victory at all. If Silver's pollster weights are correct, then they will be negatively correlated with the pollsters' state-level absolute errors -- that is, more weight will correspond to smaller absolute errors. And since each swing state that a pollster surveys provides a datum, there will be quite a bit of data with which to estimate the correctness of Silver's weight scheme. In other words, if Silver's weighting is wrong, we'll know by next week.

(The partisan tilt thing is a distraction -- it's easy to correct so-called "house effects", leaving only whatever systematic bias affects the polls as a group.)
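
A sketch of the kind of after-the-fact check described above, in Python with invented numbers (these are not real pollster weights or errors): if the weighting scheme is doing its job, heavier-weighted pollsters should tend to show smaller absolute errors once the state results are in, i.e. the correlation should come out clearly negative.

```python
# Invented data: one entry per (pollster, state) survey.
weights = [0.90, 0.80, 0.70, 0.50, 0.40, 0.20]    # model's weight on the poll
abs_errors = [1.5, 2.0, 2.5, 3.0, 4.5, 6.0]        # |poll margin - actual margin|, in points

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Clearly negative supports the weighting; near zero or positive would be
# evidence that the weights were not tracking pollster quality.
print(f"correlation(weight, |error|) = {pearson(weights, abs_errors):+.2f}")
```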