Scoring rules are ways to score answers on a test, predictions, or any other performance.
Of special interest are proper scoring rules: rules under which the strategy that maximizes the expected score is to report your true beliefs about the question.
Forecasting rules and their flaws
The Log score (sometimes called surprisal) is a strictly proper scoring rule[1] used to evaluate how good forecasts were. A forecaster scored by the log score will, in expectation, obtain the best score by providing a predictive distribution that is equal to the data-generating distribution. The log score therefore incentivizes forecasters to report their true belief about the future.
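The properness claim can be checked numerically for a binary event. The sketch below is only an illustration: the function name `expected_log_score` and the 70% data-generating probability are assumptions made for the example, and the score is taken to be the negative natural logarithm of the probability assigned to the realized outcome (smaller is better).

```python
import math

def expected_log_score(true_p: float, reported_p: float) -> float:
    """Expected negatively oriented log score for a binary event.

    true_p is the actual probability of the event (hypothetical here);
    reported_p is the probability the forecaster writes down.
    """
    return -(true_p * math.log(reported_p)
             + (1 - true_p) * math.log(1 - reported_p))

true_p = 0.7  # assumed data-generating probability, for illustration only
candidates = [i / 100 for i in range(1, 100)]  # reports from 0.01 to 0.99

# The report with the smallest expected score is the best one.
best_report = min(candidates, key=lambda q: expected_log_score(true_p, q))
print(best_report)
```

Among the candidate reports, the expected score is smallest when the reported probability equals the true probability, so shading the forecast in either direction only makes the expected score worse.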
All Metaculus scores are types of log score[2].
The log score is usually computed as the negative logarithm of the predictive density evaluated at the observed value $y$, $\text{log score}(y) = -\log f(y)$, where $f$ is the predicted probability density function. Usually, the natural logarithm is used, but the log score remains strictly proper for any base used for the logarithm.
In the formulation presented above, the score is negatively oriented, meaning that smaller values are better. Sometimes the sign of the log score is inverted and it is simply given as the log predictive density. If this is the case, then larger values are better.
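For a continuous outcome, the two orientations can be illustrated with a small sketch. It assumes normal predictive distributions and the natural logarithm. The helper names `normal_pdf` and `log_score`, and the particular means and standard deviations, are made up for the example.

```python
import math

def normal_pdf(y: float, mean: float, sd: float) -> float:
    """Density of a normal distribution evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def log_score(density_at_observation: float,
              negatively_oriented: bool = True) -> float:
    """Log score from the predictive density at the observed value.

    Negatively oriented: -log f(y), smaller is better.
    Positively oriented:  log f(y), larger is better.
    """
    log_density = math.log(density_at_observation)
    return -log_density if negatively_oriented else log_density

observed = 1.0  # hypothetical observed value

# Two hypothetical forecasts centered on the observation:
# one concentrated, one spread out.
sharp_score = log_score(normal_pdf(observed, mean=1.0, sd=0.5))
vague_score = log_score(normal_pdf(observed, mean=1.0, sd=2.0))

print(sharp_score < vague_score)  # the concentrated forecast scores better
```

Both forecasts are centered on the observed value, but the concentrated one puts more density there and so receives the smaller (better) negatively oriented score. Passing `negatively_oriented=False` flips the sign, so the same forecast then receives the larger score.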
The log score is applicable to binary outcomes as well as discrete or continuous outcomes. In the case of binary outcomes, the formula above simplifies to
$\text{log score}(y) = -\log P(y)$,
where $P(y)$ is the probability assigned to the binary outcome $y$. If, for example, a forecaster assigned 70% probability that team A would win a soccer match, then the resulting log score would be $-\log 0.7 \approx 0.36$ if team A wins and $-\log 0.3 \approx 1.20$ if team A doesn't win.
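The soccer example can be reproduced with a few lines of Python. This is a minimal sketch that assumes the negatively oriented score with the natural logarithm; the function name `binary_log_score` is chosen here for illustration.

```python
import math

def binary_log_score(prob_of_event: float, event_occurred: bool) -> float:
    """Negatively oriented log score for a binary forecast (natural log).

    prob_of_event is the probability the forecaster assigned to the event.
    The score is -log of the probability assigned to whatever happened.
    """
    p_outcome = prob_of_event if event_occurred else 1 - prob_of_event
    return -math.log(p_outcome)

# Forecast: 70% that team A wins.
print(round(binary_log_score(0.7, True), 2))   # 0.36 (team A wins)
print(round(binary_log_score(0.7, False), 2))  # 1.2  (team A does not win)
```

The penalty is asymmetric: being wrong with 70% confidence costs far more than being right earns, and the score grows without bound as the probability assigned to the realized outcome approaches zero.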
Related Pages: Calibration, Forecasting & Prediction, Skill / Expertise Assessment, Prediction Markets