You probably already know that you can incentivize honest reporting of probabilities using a proper scoring rule like the log score, but did you know that you can also incentivize honest reporting of confidence intervals?
To incentivize reporting of a 90% confidence interval, take the score $-S - 20 \cdot D$, where $S$ is the size of your confidence interval, and $D$ is the distance between the true value and the interval. $D$ is $0$ whenever the true value is in the interval.
This incentivizes not only giving an interval that contains the true value 90% of the time, but also splitting the remaining 10% equally between overestimates and underestimates.
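To see where the equal 5% tails come from, here is a sketch of the first-order conditions. It assumes the 90% score has the form $-S - 20 \cdot D$ on a linear scale, an interval $(L, U)$, a true value $T$, and a continuous belief distribution over $T$. The expected score is

$$\mathbb{E}[\text{score}] = -(U - L) - 20\,\mathbb{E}[\max(L - T, 0)] - 20\,\mathbb{E}[\max(T - U, 0)].$$

Differentiating with respect to each endpoint and setting the result to zero gives

$$\frac{\partial}{\partial U}\,\mathbb{E}[\text{score}] = -1 + 20 \Pr(T > U) = 0 \;\Rightarrow\; \Pr(T > U) = 0.05,$$

$$\frac{\partial}{\partial L}\,\mathbb{E}[\text{score}] = 1 - 20 \Pr(T < L) = 0 \;\Rightarrow\; \Pr(T < L) = 0.05.$$

So, under these assumptions, the best interval runs from your 5th percentile to your 95th percentile.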
To keep the lower bound of the interval important, I recommend measuring the interval size $S$ and the distance $D$ in log space. So if the true value is $T$ and the interval is $(L, U)$, then $S$ is $\log(U/L)$, and $D$ is $\log(L/T)$ when the true value falls below the interval and $\log(T/U)$ when it falls above. Of course, you need questions with positive answers to do this.
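To do a $P\%$ confidence interval, take the score $-S - \frac{200}{100 - P} \cdot D$, where $S$ is the size of the interval and $D$ is the distance between the true value and the interval.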
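The scoring rule can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `interval_score` and its parameters are made up for this example, and it assumes the penalty on the distance is $200 / (100 - P)$, which equals 20 for a 90% interval.

```python
import math


def interval_score(lower, upper, truth, confidence=90.0, log_space=True):
    """Score an interval against the true value; higher (closer to 0) is better.

    Hypothetical helper: score = -S - (200 / (100 - confidence)) * D, where
    S is the interval size and D is the distance from the true value to the
    interval (0 if the true value lies inside).
    """
    if not 0 < confidence < 100:
        raise ValueError("confidence must be strictly between 0 and 100")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if log_space:
        # Log-space scoring only makes sense for positive quantities.
        if min(lower, upper, truth) <= 0:
            raise ValueError("log-space scoring needs positive values")
        lower, upper, truth = math.log(lower), math.log(upper), math.log(truth)
    size = upper - lower
    # Distance is positive only when the truth lies outside the interval.
    distance = max(lower - truth, truth - upper, 0.0)
    penalty = 200.0 / (100.0 - confidence)
    return -size - penalty * distance


# Truth inside the interval: only the width of the interval is paid.
print(interval_score(10, 1000, 100))
# Truth ten times above the upper bound: the miss is penalized heavily.
print(interval_score(10, 100, 1000))
```

Passing `log_space=False` scores the interval on a linear scale instead, which is the only option when answers can be zero or negative.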
This can be used to make calibration training, using something like Wits and Wagers cards, more fun. I also think it could be turned into an app, if one could get a large list of questions with numerical answers.
In the discrete case, log scoring still works; it generalizes past the binary case.
That is, if $S$ is the set of possible outcomes of the test, Bob elicits from Alice a probability distribution $q(s)$ on $S$, then Alice takes the test and gets some outcome $s \in S$, then Bob rewards Alice with $\log q(s)$. (This number is unfortunately never positive; you can add a positive constant to it if you want.)
Alice's expected payoff according to her true probability distribution p(s) is
$$\sum_{s \in S} p(s) \log q(s)$$
also known as the (negative of the) cross entropy between p and q. And you can do a computation, e.g. with Lagrange multipliers, which will verify that for fixed p, the optimal value of q is q=p. I do this calculation in this blog post.
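As a quick numerical sanity check of the claim that honest reporting is optimal (not a substitute for the Lagrange multiplier calculation), the sketch below compares the expected log score of reporting $q = p$ against many randomly drawn alternatives. The example distribution and the helper names `expected_log_score` and `random_distribution` are illustrative assumptions.

```python
import math
import random


def expected_log_score(p, q):
    """Expected payoff sum_s p(s) * log q(s) when beliefs are p and the report is q."""
    return sum(ps * math.log(qs) for ps, qs in zip(p, q) if ps > 0)


def random_distribution(n, rng):
    """Draw a random probability distribution over n outcomes (all entries positive)."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
p = [0.5, 0.3, 0.2]  # Alice's true beliefs (made-up example)

honest = expected_log_score(p, p)
best_alternative = max(
    expected_log_score(p, random_distribution(len(p), rng)) for _ in range(10_000)
)

# Honest reporting should never score worse than any other report.
print(honest, best_alternative)
```

The random search can get close to the honest score when it happens to draw a $q$ near $p$, but it should never exceed it.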
A test isn't a good example to use because the outcome of the test is under Alice's control, so she can e.g. throw the test and predict this fact. This procedure is best used to elicit Alice's prediction of something which she cannot influence in any way.