Are calibration and rational decisions mutually exclusive? (Part one)

I don't get it.

I admit my math background is limited to upper-division undergraduate, and I admit I could have tried harder to make sense of the jargon, but after reading this a few times, I really just don't get what your point is, or even what kind of thing your point is supposed to be.

[-]Cyan16y30

The short short version of this part of the argument reads:

What Bayesians call calibration, frequentists call valid confidence coverage. Bayesian posterior probability intervals do not have valid confidence coverage in general; priors that can guarantee it do not exist.

[-][anonymous]16y30

Suppose the actual frequentist probability of an event is 90%. Your prior distribution for the frequentist probability of the event is uniform. Your Bayesian probability of the event will start at 50% and approach 90%; in the long run, the average will be less than 90%.
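A small calculation can make this concrete. The sketch below assumes independent trials and uses the closed form for the predictive probability under a uniform prior, (s + 1) / (n + 2) after s successes in n trials. The function name `expected_prediction` and the 1000-trial horizon are illustrative choices, not anything from the post.

```python
def expected_prediction(n, p_true):
    """Expected Bayesian probability of the event after n trials, uniform prior.

    After s successes in n trials the predictive probability is (s + 1) / (n + 2).
    Averaging over s ~ Binomial(n, p_true) gives (n * p_true + 1) / (n + 2).
    """
    return (n * p_true + 1) / (n + 2)


p_true = 0.9  # the "actual frequentist probability" from the comment above

# Expected stated probability before any data (n = 0) and after each trial.
predictions = [expected_prediction(n, p_true) for n in range(1001)]
running_average = sum(predictions) / len(predictions)

print("before any data:", predictions[0])
print("after 1000 trials:", predictions[1000])
print("average over the whole run:", running_average)
```

Under these assumptions every expected prediction sits below 0.9, so the average of the stated probabilities does too, even though the gap shrinks as data accumulate.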

If the post is getting at more than this, I understand as little as you do. My answer to the title question was "no, they can't be" going in, and if the post is trying to say something I haven't understood, then I hope to convince the author e's wrong through sheer disagreement.

[-]Cyan16y30

Try rephrasing your first paragraph when the quantity of interest is not a frequency but, say, Avogadro's number, and you're Jean Perrin trying to determine exactly what that number is.

A frequentist would take a probability model for the data you're generating and give you a confidence interval. A billion scientists repeat your experiments, getting their own data and their own intervals. Among those intervals, the proportion that contain the true value of Avogadro's number is equal to the confidence (up to sampling error).
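The repeated-experiment picture can be sketched as a simulation. Everything specific here is an assumption made for illustration: a normal measurement model with known standard deviation, a stand-in "true value" of 6.0, ten measurements per scientist, and 20,000 simulated scientists rather than a billion.

```python
import random
from statistics import NormalDist, fmean


def interval_covers(true_value, sigma, n, z, rng):
    """Simulate one scientist's experiment; report whether the interval contains the truth."""
    data = [rng.gauss(true_value, sigma) for _ in range(n)]
    center = fmean(data)
    half_width = z * sigma / n ** 0.5  # known-sigma z-interval
    return center - half_width <= true_value <= center + half_width


def estimate_coverage(true_value=6.0, sigma=1.0, n=10, confidence=0.95,
                      n_scientists=20_000, seed=0):
    """Fraction of simulated scientists whose interval contains the true value."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    hits = sum(interval_covers(true_value, sigma, n, z, rng)
               for _ in range(n_scientists))
    return hits / n_scientists


coverage = estimate_coverage()
print(f"empirical coverage: {coverage:.3f}")
```

The printed proportion should land close to the nominal 0.95, with the remaining gap being the sampling error mentioned above.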

A Bayesian would take the same probability model, plus a prior, and combine them using Bayes. Each scientist may have her own prior, and posterior calibration is only guaranteed if (i) all the priors, taken as a group, were calibrated, or (ii) everyone is using the matching prior, if it exists (matching priors are typically improper, so prior calibration cannot be calculated).
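For one simple case the coverage of a posterior interval can be worked out exactly. The sketch below assumes a normal model with known standard deviation and a normal prior on the unknown mean; in that model a flat prior is the matching prior, and the posterior interval then coincides with the usual confidence interval. The function name and the particular numbers are hypothetical.

```python
from statistics import NormalDist


def posterior_interval_coverage(true_value, prior_mean, prior_sd,
                                sigma=1.0, n=10, level=0.95):
    """Frequentist coverage of the central posterior interval at a fixed true value.

    Assumed model: n observations from Normal(true_value, sigma^2) with sigma
    known, and a Normal(prior_mean, prior_sd^2) prior on the unknown mean.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + level / 2)
    data_precision = n / sigma ** 2
    posterior_precision = 1 / prior_sd ** 2 + data_precision
    weight = data_precision / posterior_precision  # weight on the sample mean
    posterior_sd = posterior_precision ** -0.5
    # Over repeated experiments, (posterior mean - truth) ~ Normal(bias, spread^2).
    bias = (1 - weight) * (prior_mean - true_value)
    spread = weight * sigma / n ** 0.5
    upper = (z * posterior_sd - bias) / spread
    lower = (-z * posterior_sd - bias) / spread
    return std_normal.cdf(upper) - std_normal.cdf(lower)


for label, prior_sd in [("nearly flat prior", 1e6), ("informative prior", 1.0)]:
    for true_value in (0.0, 3.0):
        cov = posterior_interval_coverage(true_value, prior_mean=0.0, prior_sd=prior_sd)
        print(f"{label}, true value {true_value}: coverage {cov:.3f}")
```

With the nearly flat prior the coverage matches the nominal level wherever the true value lies. With the informative prior it is above nominal when the truth sits at the prior mean and below nominal when the truth is far from it, which is why calibration then depends on how the priors relate to the truth.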

[-]cousin_it16y50

Please provide an example where frequentists get exact answers and Bayesians get only approximations, all from the same data. This looks highly improbable to me. Or did you mean something else?

[-]Cyan16y00

No, this is more-or-less what I meant. I equivocate on "exact," because I regard the Bayesian answer as exactly what one actually wants, and perfect frequentist validity as a secondary consideration. To provide the example you requested, I'll have to go searching for one of the papers that set off this line of thought -- the bloody thing's not online, so it might take a while.

[-]Vladimir_Nesov16y30

Could you state your point with math? I don't understand what you are saying.

[-]Cyan16y00

You can find some of the math, and pointers into the literature, in this paper

[-]Venu14y20

I came to this post via a Google search (hence this late comment). The problem that Cyan's pointing out - the lack of calibration of Bayesian posteriors - is a real problem, and in fact something I'm facing in my own research currently. Upvoted for raising an important, and under-discussed, issue.

[-]PhilGoetz16y10

"The upshot is that we have good reason to think that Bayesian posterior intervals will not be perfectly calibrated in general."

This seems to be the main point of your post; and nothing in the post seems to be connected to it.

[-]Cyan16y10

The ideas of the post are: calibration seems to me to be equivalent to confidence coverage (second and third paragraphs); in general, Bayesian posterior intervals do not have valid confidence coverage (fourth paragraph). The sentence you quote above follows from these two ideas.

[-]PhilGoetz16y10

Okay, that helps. My problem is that, on re-reading, I still don't know what the 4th paragraph means.

"This similarity suggests an approach for specifying non-informative prior distributions"

Why would anybody want non-informative distributions?

"by and large, posterior intervals can at best produce only asymptotically valid confidence coverage."

I don't know what it means for a confidence interval to be asymptotically valid, or why posterior intervals have this effect. This seems like an important point that should be justified.

"if your model of the data-generating process contains more than one scalar parameter, you have to pick one 'interest parameter' and be satisfied with good confidence coverage for the marginal posterior intervals for that parameter alone"

You lost me entirely.

[-]Cyan16y00

"Why would anybody want non-informative distributions?"

To have a prior distribution to use when very little is known about the estimand. It's meant to somehow capture the notion of minimal prior knowledge contributing to the posterior distribution, so that the data drive the conclusions, not the prior.

"I don't know what it means for a confidence interval to be asymptotically valid."

The confidence coverage of a posterior interval is equal to the posterior probability mass of the interval plus a term which goes to zero as the amount of data increases without bound.
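One way to write this down, using notation that is assumed here rather than taken from the post: let C_n be a posterior interval with posterior probability mass 1 - α, built from n observations, and let θ_0 be the true parameter value.

```latex
\Pr_{\theta_0}\bigl(\theta_0 \in C_n\bigr) = (1 - \alpha) + r_n(\theta_0),
\qquad r_n(\theta_0) \to 0 \quad \text{as } n \to \infty .
```

The probability on the left is taken over repeated data sets with the parameter held fixed at θ_0. The interval is asymptotically valid because the remainder r_n vanishes in the limit; for any finite n it need not be zero.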

"if your model of the data-generating process contains more than one scalar parameter..."

E.g., a regression with more than one predictor. Each predictor has its own coefficient, so the model of the data-generating process contains more than one scalar parameter.
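As a sketch with assumed notation (the symbols are illustrative, not from the post), a regression with two predictors and its marginal posterior for one coefficient could be written as:

```latex
\begin{aligned}
y_i &= \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \\
p(\beta_1 \mid y) &= \int p(\beta_0, \beta_1, \beta_2, \sigma \mid y)\,
  d\beta_0 \, d\beta_2 \, d\sigma .
\end{aligned}
```

This model has four scalar parameters. Picking β_1 as the interest parameter means the other three are integrated out, and good confidence coverage is sought only for intervals taken from this marginal posterior.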

[-]Eliezer Yudkowsky16y10

Is this a standard frequentist idea? Is there a link to a longer explanation somewhere? Well-calibrated priors and well-calibrated likelihood ratios should result in well-calibrated posteriors.

[-]Cyan16y20

Valid confidence coverage is a standard frequentist idea. Wikipedia's article on the subject is a good introduction. I've added the link to the post.

The problem is exactly: how do you get a well-calibrated prior when you know very little about the question at hand? If your posterior is well-calibrated, your prior must have been as well. So, seek a prior that guarantees posterior calibration. This is the "matching prior" program I described above.

[-]PhilGoetz16y00

This sounds like Gibbs sampling or expectation maximization. Are Gibbs and/or EM considered Bayesian or frequentist? (And what's the difference between them?)

[-]Cyan16y-10

Gibbs sampling and EM aren't relevant to the ideas of this post.

Neither Gibbs sampling nor EM is intrinsically Bayesian or frequentist. EM is just a maximization algorithm useful for certain special cases; the maximized function could be a likelihood or a posterior density. Gibbs sampling is just an MCMC algorithm; usually the target distribution is a Bayesian posterior distribution, but it doesn't have to be.
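To illustrate the last point only (EM is not covered), here is a minimal Gibbs sampler whose target is not a posterior at all: a standard bivariate normal with a chosen correlation. The target, the function names, and the settings are all assumptions picked for the example.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, seed=0, burn_in=500):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    conditional_sd = (1 - rho ** 2) ** 0.5
    x, y = 0.0, 0.0
    samples = []
    for step in range(burn_in + n_samples):
        # Each full conditional is Normal(rho * other, 1 - rho^2).
        x = rng.gauss(rho * y, conditional_sd)
        y = rng.gauss(rho * x, conditional_sd)
        if step >= burn_in:
            samples.append((x, y))
    return samples


def sample_correlation(samples):
    """Pearson correlation of the (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    var_y = sum((y - mean_y) ** 2 for _, y in samples)
    return cov / (var_x * var_y) ** 0.5


draws = gibbs_bivariate_normal(rho=0.8, n_samples=20_000)
print(f"sample correlation: {sample_correlation(draws):.2f}")
```

The sampler only alternates draws from the two conditional distributions; nothing in it refers to a prior, a likelihood, or calibration.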

[-]PhilGoetz16y20

You said, "seek a prior that guarantees posterior calibration." That's what both EM and Gibbs sampling do, which is why I asked.

[-]Cyan16y-10

You and I have very different understandings of what EM and Gibbs sampling accomplish. Do you have references for your point of view?
