This is a linkpost for https://docdro.id/PEehlsN

Untangling Infrabayesianism: A redistillation [PDF link; ~12k words + lots of math]

Let's say that I can understand neither the original IB sequence, nor your distillation. I don't have the prerequisites. (I mean, I know some linear algebra - that's hard to avoid - but I find topology loses me past "here's what an open set is" and I know nothing about measure theory.)

I think I understand what non-realizability is and why something like IB would solve it. Is all the heavy math actually necessary to understand how IB does so? I'm very tempted to think of IB as "instead of a single probability distribution over outcomes, you just keep a (convex[1]) set of probability distributions instead, and eliminate any that you see to be impossible, and choose according to the minimum of the expected value of the ones you have left". But I think this is wrong, just like "a quantum computer checks all the possible answers in parallel" is wrong (if that were right, a classical algorithm in P would directly translate into a quantum algorithm in NP, right? I still don't actually get quantum computation, either.) And I don't know why it's wrong or what it's missing.

[1] That just means that for any p and q in the set, and any λ in [0, 1], λp + (1 − λ)q is also in the set, right?

Is there anything better I can do to understand IB than first learn topology and measure theory (or other similarly sized fields) in a fully general way? And am I the only person who's repeatedly bounced off attempts to present IB, but for some reason still feels like maybe there's actually something there worth understanding?

I think what's really needed would be a short single page introduction. Sort of an elevator pitch. Alternatively a longer non-technical explanation for dummies, similar to Yudkowsky's posts in the sequences.

This would get people interested. One is unlikely to be motivated to dive into a 12k-word, math-heavy paper without any prior knowledge of what the theory promises to accomplish.

I can actually sort of write the elevator pitch myself. (If not, I probably wouldn't be interested.) If anything I say here is wrong, someone please correct me!

Non-realizability is the problem that *none* of the options a real-world Bayesian reasoner is considering is a *perfect* model of the world. (It actually information-theoretically can't be, if the reasoner is itself part of the world, since it would need a perfect self-model as part of its perfect world-model, which would mean it could take its own output as an input into its decision process, but then it could decide to do something else and boom, paradox.) One way to explain the sense in which the models of real-world reasoners are imperfect is that, rather than a knife-edge between bets they'll take and bets on which they'll take the other side, one might, say, be willing to take a bet that pays out 9:1 that it'll rain tomorrow, and a bet that pays out 1:3 if it *doesn't* rain tomorrow, but for anything in between, one wouldn't be willing to take either side of the bet. A lot of important properties of Bayesian reasoning depend on realizability, so this is a serious problem.

Infra-Bayesianism purports to solve this by replacing the single probability distribution maintained by an ideal Bayesian reasoner by a certain kind of set of probability distributions. As I understand it, this is done in a way that's "compatible with Bayesianism" in the sense that if there were only one probability distribution in your set, it would act like regular Bayesianism, but in general the thing that corresponds to a probability is instead the *minimum* of the probability across all the probability distributions in your set. This allows one to express things like "I'm at least 10% confident it'll rain tomorrow, and at least 75% confident it *won't* rain tomorrow, but if you ask me whether it's 15% or 20% likely to rain tomorrow, I just don't know."
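To make the rain numbers concrete, here is a minimal sketch of the "minimum over a set of distributions" picture described above. It is only the simplified picture from this comment, not the full infrabayesian machinery. The set is assumed to be every distribution with P(rain) between 10% and 25%, and the names `P_RAIN_EXTREMES`, `lower_probability`, and `worst_case_value` are made up for illustration.

```python
from fractions import Fraction

# Hypothetical set of distributions: every P(rain) in [1/10, 1/4].
# With only two outcomes, a convex set is an interval, and both quantities
# below are linear in P(rain), so checking the two endpoints is enough.
P_RAIN_EXTREMES = [Fraction(1, 10), Fraction(1, 4)]


def lower_probability(event_is_rain: bool) -> Fraction:
    """Minimum probability of the event across the whole set."""
    return min(p if event_is_rain else 1 - p for p in P_RAIN_EXTREMES)


def worst_case_value(win: int, stake: int, bet_on_rain: bool) -> Fraction:
    """Minimum expected profit of a bet that gains `win` if it comes
    true and loses `stake` otherwise, taken across the whole set."""
    values = []
    for p in P_RAIN_EXTREMES:
        p_win = p if bet_on_rain else 1 - p
        values.append(p_win * win - (1 - p_win) * stake)
    return min(values)


print(lower_probability(True))        # at least 10% confident of rain
print(lower_probability(False))       # at least 75% confident of no rain
print(worst_case_value(9, 1, True))   # 9:1 on rain: worst case breaks even
print(worst_case_value(1, 3, False))  # 1:3 on no rain: worst case breaks even
print(worst_case_value(4, 1, True))   # 4:1 on rain: worst case loses
print(worst_case_value(1, 4, False))  # other side of that bet: also loses
```

Under these assumptions the two lower probabilities (1/10 and 3/4) do not sum to 1, and a bet at in-between odds such as 4:1 has a negative worst-case value on both sides, which matches the "I wouldn't take either side" behavior described in the pitch.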

The case in which this seems most obviously useful to me is adversarial. Those offering bets should - if they're rational - be systematically better informed about the relevant topics. So I should (it seems to me) have a range of probabilities within which the fact that you're offering the bet is effectively telling me that you appear to be better informed than I am, and therefore I shouldn't bet. However, I believe Infra-Bayesianism is intended to more generally allow agents to just not have opinions about every possible question they could be asked, but only those about which they actually have some relevant information.

Consider the following example for the interval X = (0, 1) (which is homeomorphic to R). Suppose we wanted to assign measures to *all* of its subsets, and do so in accordance with the ordinary desiderata of sigma-additivity and m(X) = 1.

Now partition the interval into an uncountable family of countable sets X_i such that two numbers live in the same subset iff they differ by a rational number. (Make sure you fully understand this construction before continuing!)

What measure should we assign to any such X_i? We can quickly see that the X_i are all of equal cardinality (that of Aleph-null) and even have natural maps to each other (given by adding irrationals mod 1).

We can't assign them measure 0 - by sigma-additivity that gives us m(X) = 0. We can't assign them positive measure - again by sigma-additivity that gives us m(X) >> 1.

Thus we cannot assign such subsets any measure, so we must have been wrong from the start that 2^X was a reasonable sigma-algebra to pick as the foundation of our measure in X.

Though now that I think about it, if the difference is some **irrational** number then this seems to work, as any set would contain exactly one unique rational number. Now they each have the cardinality of R, and the family has the cardinality of Q. And then it all seems to work.

Does that seem right?

If you thought this was too hard or too technical or too weird, I recommend that you take a look at https://www.lesswrong.com/posts/Een2oqjZe6Gtx6hrj/an-elementary-introduction-to-infra-bayesianism , which is intended as a companion piece to mine for those with less in the way of mathematical chops, or who'd simply rather have a less technical overview.

[Epistemic status: improved redistillation of the infrabayesianism sequence.]

So you want to understand infrabayesianism, to hack to the center of that thorny wood and seek out and recover any treasures hidden there? You've come to a correct creature for a guide. If you want to journey there, make sure you've already got the necessary tools well in hand: some simple decision theory, the basics of topology and linear algebra, and a little measure theory - for that last, if you know how a Lebesgue integral is defined and why no reasonable σ-algebra can encompass the full power set, then you're already doing fine. If you find yourself struggling with such things, reach out to me on Discord or in PMs here and I'll see what we can do.

Infrabayesianism seems like exactly what we might need as alignment researchers: a way to discuss all of our usual decision-theoretic questions while also getting to account for uncertainty about the world, compensate for policy-dependent environments and adversarial selection, and even talk about UDT puzzles. It does this by fundamentally being a decision theory that has explicit reasonable machinery for handling Knightian uncertainty about the environment due to nonrealizable or nonlearnable hypotheses while still permitting nontrivial inference and planning.

Three major brambly hedges block the way between you and understanding: the prickly snagging of the frequently unclear, unintuitive, or just plain lacking notation used in the original infrabayesian sequence; thorny philosophical tangles up front and scattered throughout; and math and its accompanying density of concept and notation getting thicker as we go deeper in. Follow me, though, and we'll slip right through them with barely a scratch, and eat a couple of delicious berries from right off their vines. In fact, I can tell you up front that if you haven't read the original infrabayesianism sequence too closely and aren't that familiar with its notation... that's an active benefit, because we won't need most of it here. We won't be cleaving perfectly to its choices of notation or terminology, though I will eventually provide a dictionary between the two as a postscript.

(I *really* don't feel like trying to port over 28 PDF pages' worth of TeX and dense math writing here, let alone figure out how to divide it up into several posts, so rather than be redundant I've decided to link to the PDF instead. Please imagine that it was reproduced here in its entirety, and comment accordingly. This is a first-draft redistillation, of the kind that I might have been satisfied with as a minimum viable submission to the Infra-Bayes bounty. I may well edit, reorganize, revise, or remove parts of it as I see fit. If I feel like it, I might even turn this into an ArXiv submission or sequence of carefully-ordered posts of my own.)