My question concerns the semantics of predictions about the future such as 'the probability of X winning the election is 70%'. There are programs, e.g., Good Judgment and IARPA ACE, aimed at excelling at this kind of prediction.
In the classical mathematical interpretation of probability we can say 'the probability of getting 8 heads in a row when flipping an unbiased coin is 1/256'. This statement can be derived mathematically from the assumption that the coin is unbiased, and it can be verified empirically by repeating the experiment many times and counting the fraction of successes. In Bayesian statistics things get slightly fuzzier, but we can still reason as follows. We model our knowledge as 'the radius of Saturn in km is a random variable with a normal distribution with mean 60.000 and variance 0.1'. Then we make several independent measurements, possibly subject to some inaccuracy, and update our prior distribution to a posterior with mean 59.300 and variance 0.01. This does not make sense in the previous interpretation, but we can still attach clear semantics to the sentence above by treating this random variable as the result of a measurement, which is a repeatable random event. If one does not agree with such a statement, they must question either the choice of the prior distribution or the mathematical derivation.
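To make the two interpretations above concrete, here is a minimal Python sketch. The first part derives 1/256 exactly and checks it by simulation. The second part performs a normal prior update, under the added assumption that each measurement has Gaussian noise with a known variance. The function names, the noise variance and the measurement values are hypothetical and chosen only for illustration; they are not meant to reproduce the Saturn figures quoted above.

```python
import random
from fractions import Fraction

# Frequentist reading: derive the probability, then check it empirically.
exact = Fraction(1, 2) ** 8  # eight independent fair flips

def simulate_eight_heads(trials, seed=0):
    """Fraction of trials in which 8 simulated fair flips all come up heads."""
    rng = random.Random(seed)
    hits = sum(
        all(rng.random() < 0.5 for _ in range(8))
        for _ in range(trials)
    )
    return hits / trials

# Bayesian reading: normal prior, normal measurement noise of known variance
# (the known noise variance is an assumption of this sketch).
def normal_update(prior_mean, prior_var, measurements, noise_var):
    """Return (posterior_mean, posterior_var) for a normal-normal model."""
    n = len(measurements)
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(measurements) / noise_var)
    return post_mean, post_var

if __name__ == "__main__":
    print("exact:", exact, "simulated:", simulate_eight_heads(100_000))
    # Hypothetical prior and measurements, in arbitrary units.
    print(normal_update(60.0, 0.1, [59.2, 59.4, 59.3], noise_var=0.03))
```

In both cases the number has an operational meaning: the simulated frequency approaches the derived value, and the posterior narrows as more measurements arrive.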
Now suppose that two forecasters were asked the questions below in 2018.
a) Will Donald Trump win the 2020 election?
b) Will the USD/EUR exchange rate drop below 0.8 in 2020?
c) Will the Sumatran orangutan become extinct by 2020?
d) Will humans land on Mars by 2040?
The first forecaster assigned the following probabilities to these events: 43%, 90%, 45%, 42%. The second gave 54%, 50%, 52%, 43%. We already know that events (a), (b) and (c) did not occur. First, how can we settle who has been the better forecaster so far? Second, their forecasts for event (d) differ slightly. What kind of argument can the first forecaster make to convince the other that 42% is the 'correct' answer? And what does this numerical value actually mean, given that landing on Mars is neither a repeatable random event nor a quantity we can try to measure, like the radius of Saturn? If one believes that 42% is a better estimate than 43%, how can this help in making any future decisions?
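For reference, one widely used way to score resolved forecasts is the Brier score, the mean squared difference between the stated probability and the 0/1 outcome (lower is better). The sketch below applies it to questions (a) to (c) only, since (d) is unresolved. Choosing the Brier score rather than another scoring rule is an assumption of this sketch, and the function name is hypothetical; with only three resolved questions the comparison is at best suggestive.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("need one outcome per forecast")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)

# Events (a), (b), (c) did not occur, so each outcome is 0.
outcomes = [0, 0, 0]
forecaster_1 = [0.43, 0.90, 0.45]
forecaster_2 = [0.54, 0.50, 0.52]

score_1 = brier_score(forecaster_1, outcomes)  # about 0.399
score_2 = brier_score(forecaster_2, outcomes)  # about 0.271

if __name__ == "__main__":
    print(round(score_1, 3), round(score_2, 3))
```

Under this rule the second forecaster scores better so far, mostly because of the first forecaster's confident 90% on event (b). This says nothing by itself about whether 42% or 43% is the better number for event (d).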