What confidence interval should one report?

4Gordon Seidoh Worley

3Dagon

1Jsevillamol

6gwern

I often find it easier to find the confidence interval for a result than to find a result that fits a given confidence interval. So rather than targeting a particular confidence level, I find it more natural to say what I want to say and then specify my level of confidence in it.

Depends on your audience, but generally I see these distinct sources of uncertainty split out in papers - report the modeled confidence (output of the model based on data), AND be clear about the likelihood of model error.

It's good to know that this is a widespread practice (do you have handy examples to see how others approach this issue?)

However, to clarify: my question is not whether those should be distinguished, but rather which confidence interval I should report, given that we are distinguishing between model prediction and model error.

One thing you can try if you have enough forecasts (which may be the case in technological forecasting) is empirical recalibration of CIs: your model-based CIs only report sampling error, without any kind of model error or other error, and so will be overconfident. You can check how often past outcomes actually fell inside the stated CIs and expand the CIs by an amount corresponding to how bad that turns out to be. A particularly relevant forecasting example of doing this is "Disentangling Bias and Variance in Election Polls", Shirani-Mehr et al 2018, where they observe that polling upsets like Brexit or Donald Trump are indeed surprising if you rely solely on sampling-error CIs (which turn out to be only half the width of the total-error CI that includes the systematic errors), but are normal overall.
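A minimal sketch of that recalibration idea, under some loud assumptions: the past forecasts are symmetric intervals given as lower and upper bounds, a single multiplicative widening factor is good enough, and the numbers below are made up purely for illustration. The function names are hypothetical, not taken from the paper.

```python
import math

def empirical_coverage(lowers, uppers, actuals):
    """Fraction of past outcomes that fell inside their stated interval."""
    hits = sum(lo <= a <= hi for lo, hi, a in zip(lowers, uppers, actuals))
    return hits / len(actuals)

def widening_factor(lowers, uppers, actuals, target=0.9):
    """Smallest factor k such that scaling every interval's half-width by k
    would have covered at least `target` of the past outcomes."""
    ratios = []
    for lo, hi, a in zip(lowers, uppers, actuals):
        center = (lo + hi) / 2
        half_width = (hi - lo) / 2
        # How many half-widths away from the center the outcome landed.
        ratios.append(abs(a - center) / half_width)
    ratios.sort()
    index = math.ceil(target * len(ratios)) - 1
    return ratios[index]

def widen(lo, hi, k):
    """Scale an interval around its center by the factor k."""
    center = (lo + hi) / 2
    half_width = (hi - lo) / 2
    return center - k * half_width, center + k * half_width

# Made-up history: ten nominal 90% intervals and the realized outcomes.
lowers = [0] * 10
uppers = [10] * 10
actuals = [5, 6, 3, 9, 12, 1, 14, 7, -3, 4]

observed = empirical_coverage(lowers, uppers, actuals)
k = widening_factor(lowers, uppers, actuals, target=0.9)
print("observed coverage of nominal 90% CIs:", observed)
print("widening factor:", k)
print("recalibrated version of a new interval (0, 10):", widen(0, 10, k))
```

With few past forecasts the factor itself is noisy, so this is better read as a rough sanity check on the model-based CIs than as a precise correction.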

TL;DR: How should one choose the width of the confidence intervals one reports?

I am working on a paper on statistical technological forecasting.

Our model produces confidence intervals for the date when a particular milestone will be achieved. I am now thinking about how to report the results.

One line of reasoning is that you want the results to be as informative as you can, so we would report 90% confidence intervals.

Another one is that you don't want to overstate your results, especially in a domain such as the one I am working on, where model error could affect as many as 1/3 of the models. So your confidence interval should be commensurate with the expected base rate of model error (e.g. 70% confidence intervals in this case).
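One hedged way to make that second line of reasoning concrete is the law of total probability. The symbols here are assumptions introduced for illustration: $c$ is the nominal coverage the model reports, $p$ is the probability that the model is adequate (taken as 2/3 from the figure above), and $q$ is the unknown coverage of the interval when the model is wrong.

```latex
\Pr(\text{milestone date in interval}) = p\,c + (1 - p)\,q, \qquad 0 \le q \le 1

% With c = 0.9 and p = 2/3:
\tfrac{2}{3}(0.9) = 0.60 \;\le\; \Pr(\text{milestone date in interval}) \;\le\; 0.60 + \tfrac{1}{3} \approx 0.93
```

Under these assumptions, a model-based 90% interval has a true coverage somewhere between 60% and 93%, depending on how badly the interval fails when the model is wrong. A figure like 70% sits inside that range but is not pinned down by the base rate of model error alone.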

I think I am not framing this problem correctly, so any thoughts would be appreciated.