What is the semantics of assigning probabilities to future events?

Apr 01, 2021

Probability is in the mind. It's relative to the information you have.

In practical terms, you typically don't have good enough resolution to get individual percentage point precision, unless it's in a quantitative field with well understood processes.

crl826

Apr 03, 2021

First, how can we settle who has been a better forecaster so far?

For the 2 out of 3 things that didn't occur, the first forecaster thought it less likely that they would occur, while the second forecaster thought it more likely. So I think the first forecaster has a pretty easy case on this one.

I think the rest of your questions seem to assume that the percentages measure something in the real world. They are a measure of the predictor's confidence. A way to tell the world how seriously they think you should take their prediction.

What kind of argumentation can the first forecaster make to convince the other one that 42% is the 'correct' answer?

I don't think he can. He is technically a little less sure that humans will land on Mars than the second forecaster (or, if you prefer, a little more sure that they won't). And a 1% difference is functionally zero difference in this situation.

If they had vastly different levels of confidence, they could discuss the gaps in their optimism/pessimism, but at a 1% difference... that's just personal preference.

And what does this numerical value actually mean, as landing on Mars is not a repetitive random event nor it is a quantity which we can try measuring like the radius of Saturn?

To repeat myself: they are a measure of the predictor's confidence. A way to tell the world how seriously they think you should take their prediction.

If one believes the 42% is a better estimation than 43%, how can it help making any choices in the future?

Even if you had predictors with so many predictions that you could actually take a 1% difference seriously... I still don't know when that 1% would matter much.

Dmitriy Vasilyuk

Apr 02, 2021

I find this question really interesting. I think the core of the issue is the first part:

First, how can we settle who has been a better forecaster so far?

I think a good approach would be betting related. I believe different reasonable betting schemes are possible, which in some cases will give conflicting answers when ranking forecasters. Here's one reasonable setup:

Let A = probability the first forecaster, Alice, predicts for some event.
Let B = probability the second forecaster, Bob, assigns (suppose B > A wlog).
Define what's called an option: basically a promissory note to pay 1 point if the event happens, and nothing otherwise.
Alice will write and sell N such options to Bob for price P each, with N and P to be determined.
Alice's EV is positive if P > A (she expects to pay out A points/option on average).
Bob's EV is positive if P < B (he expects to be paid B points/option on average).

A specific scheme can then stipulate the way to determine N and P. After that comparing forecasters, after a number of events, would just translate to comparing points.

As a simple illustration (without claiming it's great), here's one possible scheme for P and N:

Alice and Bob split the difference and set P = 1/2 (A + B).
N = 1.

One drawback of that scheme is that it does not heavily punish a forecaster who erroneously assigns a probability of 0% or 100% to an event.
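The split-the-difference scheme above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `settle` and its arguments are hypothetical, and the forecaster with the lower probability is taken to be the option writer (the "B > A wlog" case).

```python
def settle(a: float, b: float, event_happened: bool, n: int = 1) -> tuple[float, float]:
    """Return (alice_points, bob_points) for one event.

    a, b: probabilities assigned by Alice and Bob.
    The forecaster with the lower probability writes n options
    and sells them to the other at P = (a + b) / 2.
    """
    price = (a + b) / 2
    payout = 1.0 if event_happened else 0.0
    # The writer receives the price up front and owes the payout.
    writer_points = n * (price - payout)
    buyer_points = -writer_points
    if a <= b:
        return writer_points, buyer_points  # Alice wrote the options
    return buyer_points, writer_points      # Bob wrote the options


# Alice says 42%, Bob says 43%, the event does not happen.
alice, bob = settle(0.42, 0.43, event_happened=False)

# A forecaster who says 0% against one who says 50%, and the event happens:
overconfident, other = settle(0.0, 0.5, event_happened=True)
```

In the last call the forecaster who said 0% loses less than one point, which is the bounded-punishment drawback mentioned above.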

A different structure of the whole setup would involve not two forecasters betting against each other, but each forecaster betting against some "cosmic bookie". I have some ideas how to make that work too.

And what does this numerical value actually mean, as landing on Mars is not a repetitive random event nor it is a quantity which we can try measuring like the radius of Saturn?

I don't see how we could assign some canonical meaning to this numerical value. For every forecaster there can always be a better one in principle, who takes into account more information, does more precise calculations, and happens to have better priors (until we reach the level of Laplace's demon, at which point probabilities might just degenerate into 0 or 1).

If that's true, then such a numerical value would seem to be just a subjective property specific to a given forecaster: it's whatever that forecaster assigns to the event and uses to estimate how many points (or whatever other metric she cares about) she will have in the future.

Michal

I like the idea of defining a betting game 'forecasters vs cosmic bookie'. Then saying 'the probability that people will land on Mars by 2040 is 42%' translates into the semantics 'I am willing to buy, for Y<42 cents, an option that would be worth $1 if we land on Mars by 2040 or $0 otherwise'.

To compare several forecasters we can consider a game in which each player is offered to buy some options of this kind. Suppose that for each x in {1, \dots, 99} each player is allowed to buy one option for x cents. If one believes that the probability of an event is 30... (read more)

Dmitriy Vasilyuk

That's perfect, I was thinking along the same lines, with a range of options available for sale, but didn't do the math and so didn't realize the necessity of dual options. And you are right of course, there's still quite a bit of arbitrariness left. In addition to varying the distribution of options there is, for example, freedom to choose what metric the forecasters are supposed to optimize. It doesn't have to be EV, in fact in real life it rarely should be EV, because that ignores risk aversion. Instead we could optimize some utility function that becomes flatter for larger gains, for example we could use Kelly betting.
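As a rough sketch of what Kelly betting would mean for the options discussed here: assume an option costs `price` and pays 1 point if the event happens, and the forecaster believes the event has probability `q`. Maximising expected log wealth then gives a stake of (q - price) / (1 - price) of the bankroll when q > price, and nothing otherwise. The function names below are hypothetical.

```python
import math


def kelly_fraction(q: float, price: float) -> float:
    """Fraction of bankroll to spend on options priced at `price`
    (each paying 1 if the event happens), given believed probability q."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    return max(0.0, (q - price) / (1.0 - price))


def expected_log_growth(q: float, price: float, f: float) -> float:
    """Expected log of wealth after staking fraction f (0 <= f < 1)."""
    wealth_if_event = 1.0 - f + f / price  # f / price options, each pays 1
    wealth_if_not = 1.0 - f
    return q * math.log(wealth_if_event) + (1.0 - q) * math.log(wealth_if_not)


# Believing 43% when the option costs 0.425 justifies only a small stake.
small_stake = kelly_fraction(0.43, 0.425)

# Believing 42% at the same price means not buying at all.
no_stake = kelly_fraction(0.42, 0.425)
```

Unlike pure EV maximisation, which would stake everything whenever q > price, this stake shrinks to zero as the believed edge shrinks.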

paladim

Jan 12, 2022

That's a very interesting question and it is unfortunate that it did not get more traction, because I think I could learn a lot by reading more answers. What follows is in no way a definitive answer; it is just my own take.

First, how can we settle who has been a better forecaster so far?

A naive answer would be to pick the forecaster with the lowest cross-entropy, the same way that, when training a binary classifier which outputs probabilities, we pick the model which minimises the cross-entropy. I say this answer is naive because, if we take the question at face value and want to pick the best forecaster, 4 predictions of disparate events are hardly enough information to make a conclusive decision. Relying on the model analogy again, we can rarely judge a model on a single prediction; we need an aggregate of those*.
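For concreteness, here is a minimal sketch of the cross-entropy comparison. The forecasts and outcomes below are made-up numbers for illustration, not the ones from the question, and the function name is hypothetical.

```python
import math


def cross_entropy(probs: list[float], outcomes: list[int]) -> float:
    """Mean binary cross-entropy (log loss); lower is better.

    probs: forecast probabilities that each event happens.
    outcomes: 1 if the event happened, 0 otherwise.
    """
    eps = 1e-12  # clip so a forecast of exactly 0 or 1 does not give log(0)
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


# Hypothetical data: three resolved events, two forecasters.
outcomes = [0, 0, 1]
first_forecaster = [0.2, 0.3, 0.6]
second_forecaster = [0.4, 0.5, 0.7]

score_first = cross_entropy(first_forecaster, outcomes)
score_second = cross_entropy(second_forecaster, outcomes)
better = "first" if score_first < score_second else "second"
```

With so few events the ranking can easily flip on the next resolved question, which is the reason for calling this answer naive.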

What kind of argumentation can the first forecaster make to convince the other one that 42% is the 'correct' answer?

Each forecaster will have built their model their own way. In a black box approach, forecaster A could ask B to use his model to predict a series of similar events for which we know the outcome, e.g. "predict the probability that man will land on the Moon by 1970", and use those results as an argument to convince B that the model is miscalibrated. The other alternative is to have forecaster A examine B's model and see which assumptions or parts of the model they disagree with.

And what does this numerical value actually mean, as landing on Mars is not a repetitive random event nor it is a quantity which we can try measuring like the radius of Saturn?

One school of thought, as stated by others, relates probabilities to bets. If I assign probability p to a future event, I am willing to sell, at the price of p dollars, a promise to pay $1 if the event takes place, and I am also willing to be the one buying the promise (see the Dutch book argument). I don't find this argument satisfying. While this view makes a lot of sense in prediction markets, I don't think other practitioners would back the output of every single model with a bet. Also, the utility of money is non-linear, which makes things more complicated.

In my day to day, I think of Bayesian stats as a formal theory (a powerful and useful one) which allows me to combine evidence and produces a score, a number quantifying how likely a proposition is to be true. On top of that, Cox's theorem tells me that if I accept some assumptions, any theory for scoring statements will satisfy the axioms of probability. In other words, the equations of this theory are not arbitrary; they have some grounding. Not everybody agrees on the assumptions made by Cox's theorem, and if we reject some of them, we can end up with something different, e.g. Dempster-Shafer theory.

If one believes the 42% is a better estimation than 43%, how can it help making any choices in the future?

If both forecasters are willing to back up their predictions with bets, there is an arbitrage opportunity for a third party. One can sell a promise to B for $0.43 and then buy a promise from A for $0.42. No matter what the outcome is, you will make a profit of $0.01.
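The arithmetic can be checked with a small sketch. Amounts are in cents to keep the numbers exact; the function name is hypothetical.

```python
def third_party_profit_cents(sold_to_b: int, bought_from_a: int, event_happened: bool) -> int:
    """Profit of a third party who sells a $1 promise to B and buys one from A."""
    payout = 100 if event_happened else 0
    from_sale = sold_to_b - payout          # receive B's price, owe B the payout
    from_purchase = payout - bought_from_a  # pay A's price, collect the payout from A
    return from_sale + from_purchase


profit_if_landing = third_party_profit_cents(43, 42, event_happened=True)
profit_if_no_landing = third_party_profit_cents(43, 42, event_happened=False)
```

The payout cancels out in both branches, so the profit is the price gap of one cent whichever way the event resolves.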

From a scoring perspective, a 1% difference in this range (close to 50%) doesn't mean much. If the range considered were closer to the extremes, 0% or 100%, this difference would be more significant. E.g. if one model assigns 0.1% and the other 1.1%, it might be worth examining the differences between the models that lead to this discrepancy.
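One way to see this, assuming the logarithmic score is the scoring rule in use: if the event happens, a forecaster who said p is penalised by -log(p). The sketch below compares how far apart the two penalties are in each range; the function name is hypothetical.

```python
import math


def log_penalty_gap(p_low: float, p_high: float) -> float:
    """Difference in log-score penalty, if the event happens,
    between a forecaster who said p_low and one who said p_high."""
    return -math.log(p_low) - (-math.log(p_high))


gap_near_half = log_penalty_gap(0.42, 0.43)      # the Mars example
gap_near_extreme = log_penalty_gap(0.001, 0.011)  # 0.1% versus 1.1%
```

The same one-percentage-point gap is a ratio of about 1.02 in the first case and 11 in the second, so the penalty gap is far larger near the extreme.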

*An example of a model that can be judged on a single prediction: say that I have a coin and a model predicts a very low probability of tails. We flip the coin and tails comes up. We are more inclined to think the model is wrong than to believe that we witnessed an extremely rare event.

Carlos Javier Gil Bellosta

Apr 02, 2021

This is a topic I have found myself thinking about a lot lately as well. I have found it useful to decompose a non-repeatable event (will X win the election?) into two parts: one consisting of a combination of repeatable events and a "specific residual".

Let's start with a coin toss. It is, in a very Heraclitean sense, a one time event which we can decompose as a throw of an ideal coin plus a tiny, negligible "specific residual".

Let's go back to the problems in question. You could decompose them into combinations of events for which we have historical frequencies (how many times an incumbent politician...? how many times an election during an economic crisis...? how do the probabilities of winning an election relate to the polls three months before...?), plus a conceivably larger "specific residual" given the particularities of the question.

This approach is more useful than vague considerations on "probability is in your head" or "it just relates to information". It is actually how predictors work: decomposing the question into subquestions on which frequency considerations are easier to elicit, recombining them, and adding an extra layer of uncertainty on top.
