Thank you for pointing this out!
I have a sense that log-odds are an underappreciated tool, and this makes me excited to experiment with them more - the "shared and distinct bits of evidence" framework also seems very natural.
On the other hand, if the Goddess of Bayesian evidence likes log odds so much, why did she make expected utility linear in probability? (I am genuinely confused about this)
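To make the appeal of log-odds concrete, here is a minimal sketch of a Bayesian update done additively in bits. The function names and the example likelihood ratios are my own illustrative assumptions, and the update is only valid if the pieces of evidence are independent given the hypothesis (the "distinct bits" case).

```python
import math

def prob_to_bits(p):
    """Convert a probability to log-odds measured in bits (base 2)."""
    return math.log2(p / (1 - p))

def bits_to_prob(bits):
    """Convert log-odds in bits back to a probability."""
    return 1 / (1 + 2 ** (-bits))

def update(prior_p, likelihood_ratios):
    """Add the bits of each (assumed independent) piece of evidence to the prior."""
    bits = prob_to_bits(prior_p)
    for lr in likelihood_ratios:
        bits += math.log2(lr)  # each likelihood ratio contributes log2(lr) bits
    return bits_to_prob(bits)

# Hypothetical example: a 50% prior and two observations,
# each 4 times likelier under the hypothesis (2 bits each).
posterior = update(0.5, [4.0, 4.0])
print(posterior)
```

In this representation evidence simply adds up, which is what makes counting shared and distinct bits feel natural.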
I had not realized, and this makes so much sense.
Paul Christiano has explored the framing of interactive proofs before, see for example this or this.
I think this is an exciting framing for AI safety, since it gets to the crux of one of the issues, as you point out in your question.
It's good to know that this is a widespread practice (do you have handy examples to see how others approach this issue?)
However, to clarify, my question is not whether those should be distinguished, but rather what confidence interval I should be reporting, given that we are making the distinction between model prediction and model error.
I do not understand prediction 86.
In other words, the difference between those "productively" engaged and those who are not is not always clear.
As context, prediction 84 says
While there is sufficient prosperity to provide basic necessities (secure housing and food, among others) without significant strain to the economy, old controversies persist regarding issues of responsibility and opportunity.
And prediction 85 says
The issue is complicated by the growing component of most employment's being concerned with the employee's own learning and skill acquisition.
What is Kurzweil talking about? Is this about whether we can tell when employees are doing useful work and when they are shirking?
Sorry for being dense, but how should we fill it?
By default I am going to add a third column with the prediction, is that how you want to receive the data?
Sure sign me up, happy to do up to 10 for now, plausibly more later depending on how hard it turns out to be
Brier scores are scoring three things:
Note that in Tetlock's research there is no hard cutoff from regular forecasters to superforecasters - he arbitrarily declared that the top 2% were superforecasters, and showed that 1) the top 2% of forecasters tended to remain in the top 2% between years and 2) that some of the techniques they used for thinking about forecasts could be shown in an RCT to improve the forecasting accuracy of most people.
Sadly I have not come across many definitions of heavy tailedness that are compatible with finite support, so I don't have any ready examples of action relevance AND finite support.
Another example involving a moment-centric definition:
Distributions which are heavy tailed in the sense of not having a finite moment generating function in a neighbourhood of zero heavily reward exploration over exploitation in multi armed bandit scenarios.
See for example an invocation of light tailedness to simplify an analysis at the beginning of this paper, implying that the analysis does not carry over directly to heavy tail scenarios (disclaimer, I have not read the whole thing).
The point you are making - that distributions with infinite support may be used to represent model error - is a valid one.
And in fact I am less confident about that point relative to others.
I still think that is a nice property to have, though I find it hard to pinpoint exactly what my intuition is here.
One plausible hypothesis is that I think it makes a lot of sense to talk about frequency of outliers in bounded contexts. For example, I expect that my beliefs about the world are heavy tailed - I am mostly ignorant about everything (eg, "is my flatmate brushing their teeth right now?"), but have some outlier strong beliefs about reality which drive my decision making (eg, "after I click submit this comment will be read by you").
Thus if we sample the confidence of my beliefs the emerging distribution seems to be heavy tailed in some sense, even though the distribution has finite support.
One could argue that this is because I am plotting my beliefs in a weird space, and if I plot them on a proper scale like the odds scale, which is unbounded, the problem dissolves. But since expected value is linear in probabilities, not odds, this seems a hard pill to swallow.
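A rough sketch of the tension, with made-up numbers: the belief values and the payoff of 100 below are purely illustrative assumptions. Moving from 99.9% to 99.9999% confidence is a large jump on the (unbounded) log-odds scale, but it barely changes the expected value of a bet, which depends linearly on the probability itself.

```python
import math

def logit(p):
    """Natural-log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))

def expected_value(p, payoff):
    """Expected value of a bet paying `payoff` with probability p, else 0."""
    return p * payoff

# Hypothetical beliefs: mostly near-ignorance, plus a couple of strong outliers.
beliefs = [0.5, 0.55, 0.6, 0.999, 0.999999]
log_odds = [logit(p) for p in beliefs]

# Compare the two strongest beliefs on each scale.
gain_in_log_odds = logit(0.999999) - logit(0.999)
gain_in_ev = expected_value(0.999999, 100) - expected_value(0.999, 100)

print(log_odds)
print(gain_in_log_odds, gain_in_ev)
```

So the outliers look dramatic in log-odds space while being almost indistinguishable in the space that expected value cares about.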
Another intuition is that if you focus on studying asymptotic tails you expose yourself to Pascal's mugging scenarios - but this may be a consideration which requires separate treatment (eg Pascal's mugging may require a patch from the decision-theoretic side of things anyway).
As a different point, I would not be surprised if allowing finite support requires significantly more complicated assumptions / mathematics, and ends up making the concept of heavy tails less useful. Infinities are useful to simplify unimportant details, as with complexity theory for example.
TL;DR: I agree that infinite support can be used to conceptualize model error. I however think there are examples of bounded contexts where we want to talk about dominating outliers - ie heavy tails.