Eric Neyman

I'm a third-year PhD student at Columbia. My academic interests lie in mechanism design and algorithms related to the acquisition of knowledge. I write a blog on stuff I'm interested in (such as math, philosophy, puzzles, statistics, and elections):


Pseudorandomness Contest



Hi! I just wanted to mention that I really appreciate this sequence. I've been having lots of related thoughts, and it's great to see a solid theoretical grounding for them. I find the notion that bargaining can happen across lots of different domains -- different people or subagents, different states of the world, maybe different epistemic states -- particularly useful. And this particular post presents the only argument for rejecting a VNM axiom I've ever found compelling. I think there's a decent chance that this sequence will become really foundational to my thinking.

Note that this is just the arithmetic mean of the probability distributions. Which is indeed what you want if you believe that P is right with probability 50% and Q is right with probability 50%, and I agree that this is what Scott does.

At the same time, I wonder -- is there some sort of frame on the problem that makes logarithmic pooling sensible? Perhaps (inspired by the earlier post on Nash bargaining) something like a "bargain" between the two hypotheses, where a hypothesis' "utility" for an outcome is the probability that the hypothesis assigns to it.

The aggregation method you suggest is called logarithmic pooling. Another way to phrase it is: take the geometric mean of the odds given by the probability distribution (or the arithmetic mean of the log-odds). There's a natural way to associate every proper scoring rule (for eliciting probability distributions) with an aggregation method, and logarithmic pooling is the aggregation method that gets associated with the log scoring rule (which Scott wrote about in an earlier post). (Here's a paper I wrote about this connection:
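To make the contrast with the arithmetic mean concrete, here is a minimal sketch of both pooling methods. The function names and the example distributions P and Q are made up for illustration; equal weights are assumed unless others are passed in.

```python
import math

def linear_pool(dists, weights=None):
    """Arithmetic mean of the distributions (a mixture)."""
    n = len(dists)
    weights = weights or [1.0 / n] * n
    return [sum(w * d[i] for w, d in zip(weights, dists))
            for i in range(len(dists[0]))]

def log_pool(dists, weights=None):
    """Weighted geometric mean of the probabilities, renormalized to sum to 1."""
    n = len(dists)
    weights = weights or [1.0 / n] * n
    unnormalized = [math.prod(d[i] ** w for w, d in zip(weights, dists))
                    for i in range(len(dists[0]))]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical forecasts for a binary event: [P(yes), P(no)]
P = [0.9, 0.1]   # odds 9:1
Q = [0.5, 0.5]   # odds 1:1

linear = linear_pool([P, Q])       # about [0.7, 0.3]
logarithmic = log_pool([P, Q])     # about [0.75, 0.25]

# For a binary event, the pooled odds equal the geometric mean of the odds:
# sqrt(9 * 1) = 3, i.e. a probability of 3 / (3 + 1) = 0.75.
pooled_odds = logarithmic[0] / logarithmic[1]
```

In this example the logarithmic pool lands at 75% while the arithmetic mean gives 70%, so the two methods genuinely disagree once the forecasts differ.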

I'm also excited to see where this sequence goes!

Thanks for the post! Quick question about your last equation: if each h is a distribution over a coarser partition of W (rather than W), then how are we drawing w from h for the inner geometric expectation?

How much should you shift things by? The geometric argmax will depend on the additive constant.
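Here is a small sketch of that dependence. The two options, their payoffs, and the shift of 100 are made-up numbers chosen only to show the flip; `geometric_expectation` and `geometric_argmax` are hypothetical helper names, not anything from the post.

```python
import math

def geometric_expectation(payoffs, probs):
    """exp of the expected log payoff; requires strictly positive payoffs."""
    return math.exp(sum(p * math.log(x) for x, p in zip(payoffs, probs)))

def geometric_argmax(options, probs, shift=0.0):
    """Name of the option with the largest geometric expectation after adding `shift`."""
    return max(
        options,
        key=lambda name: geometric_expectation(
            [x + shift for x in options[name]], probs
        ),
    )

# Two equally likely world states; payoffs are illustrative assumptions.
options = {"risky": [1.0, 16.0], "safe": [4.0, 5.0]}
probs = [0.5, 0.5]

# Unshifted: sqrt(1 * 16) = 4 versus sqrt(4 * 5) ~ 4.47, so "safe" wins.
choice_unshifted = geometric_argmax(options, probs)

# Shifted by 100: sqrt(101 * 116) ~ 108.2 versus sqrt(104 * 105) ~ 104.5, so "risky" wins.
choice_shifted = geometric_argmax(options, probs, shift=100.0)
```

As the shift grows, the geometric ranking approaches the arithmetic one (here "risky" has the higher arithmetic mean), so the choice of constant is doing real work.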

Thanks for the post -- I've been having thoughts in this general direction and found this post helpful. I'm somewhat drawn to geometric rationality because it gives more intuitive answers in thought experiments involving low probabilities of extreme outcomes, such as Pascal's mugging. I also agree with your claim that "humans are evolved to be naturally inclined towards geometric rationality over arithmetic rationality."

On the other hand, it seems like geometric rationality only makes sense in the context of natural features that cannot take on negative values. Most of the things I might want to maximize (e.g. utility) can be negative. Do you have thoughts on the extent to which we can salvage geometric rationality from this problem?

I wonder if the effect is stronger for people who don't have younger siblings. Maybe for people with younger siblings, part of the effect kicks in when they have a younger sibling (but they're generally too young to notice this), so the effect of becoming a parent is smaller.

"My probability is 30%, and I'm 50% sure that the butterfly probability is between 20% and 40%" carries useful information, for example. It tells people how confident I am in my probability.

I often talk about the "true probability" of something (e.g. AGI by 2040). When asked what I mean, I generally say something like "the probability I would have if I had perfect knowledge and unlimited computation" -- but that isn't quite right, because if I had truly perfect knowledge and unlimited computation I would be able to resolve the probability to either 0 or 1. Perfect knowledge and computation within reason, I guess? But that's kind of hand-wavy. What I've actually been meaning is the butterfly probability, and I'm glad this concept/post now exists for me to reference!

More generally I'd say it's useful to make intuitive concepts more precise, even if it's hard to actually use the definition, in the same way that I'm glad logical induction has been formalized despite being intractable. Also I'd say that this is an interesting concept, regardless of whether it's useful :)

The Bayesian persuasion framework requires that the set of possible world states be defined in advance -- and then the question becomes, given certain utility functions for the expert and decision-maker, what information about the world state should the expert commit to revealing?
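For concreteness, here is a minimal sketch of the simplest version of that question: two world states, an expert who always wants the decision-maker to act, and a decision-maker who acts only if their posterior that the state is "good" reaches some threshold. The function name, the prior of 0.3, and the threshold of 0.5 are illustrative assumptions, not taken from the post.

```python
def optimal_binary_signal(prior, threshold):
    """
    Two-state persuasion sketch (illustrative assumptions throughout).

    prior:     probability that the state is "good"
    threshold: the decision-maker acts iff posterior P(good) >= threshold

    The expert commits to recommending "act" whenever the state is good,
    and with probability q when it is bad. Returns (q, probability of acting).
    """
    if prior >= threshold:
        # The decision-maker already acts on the prior alone.
        return 1.0, 1.0
    # Choose q so that the posterior after "act" is exactly the threshold:
    # prior / (prior + (1 - prior) * q) = threshold
    q = prior * (1 - threshold) / (threshold * (1 - prior))
    act_probability = prior + (1 - prior) * q
    return q, act_probability

q, act_probability = optimal_binary_signal(prior=0.3, threshold=0.5)
# q is 3/7, and the decision-maker acts with probability about 0.6,
# even though the state is good only 30% of the time.
posterior_after_act = 0.3 / act_probability
```

Note that every message in this sketch is truthful in the sense that the decision-maker knows the committed signal structure and updates correctly; the expert only chooses how much to reveal.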

I think that Bayesian persuasion might not be the right framework here, because we get to choose the AI's reward function. Assume (as Bayesian persuasion does) that you've defined all possible world states.[1] Do you want to get the AI to reveal all the information -- i.e. which particular world state we're in -- rather than a convenient subset (that it has precommitted to)? That seems straightforward: just penalize it really heavily if it refuses to tell you the world state.

I think the much bigger challenge is getting the AI to tell you the world state truthfully -- but note that this is outside the scope of Bayesian persuasion, which assumes that the expert is constrained to the truth (and is deciding which parts of the truth they should commit to revealing).

  1. ^

    "World states" here need not mean the precise description of the world, atom by atom. If you only care about answering a particular question ("How much will Apple stock go up next week?"), then you could define the set of world states to correspond to relevant considerations (e.g. the ordered tuple of random variables (how many iPhones Apple sold last quarter, how much time people are spending on their Macs, ...)). Even so, I expect defining the set of possible world states to be practically impossible in most cases.
