Hi all,

I've written up some of my take-away thoughts from the MIRI workshop.



New Comment
8 comments, sorted by Click to highlight new comments since:

It's great that MIRI had some success with probabilistic reflection. But I'm still not sure if it matches the pattern of "waging war on an established result". Lob's theorem, like Russell's paradox, is one of those counterintuitive problems where the answer we "feel" must be right is provably wrong. When someone tells me such a problem can be solved with probabilities but not with certainties, I get suspicious. Why would we be so lucky when going against the grain?

Sorry for the unhelpful comment. I do really wish the best of luck to such efforts.

What's your current opinion on whether Sawin's proved statement that "any logically coherent, approximable probability distribution over statements in arithmetic which assigns probability 1 to true pi-1 statements will assign probability 0 to some true pi-2 statements" is a special case of Theorem 3.7 of Gaifman & Snir (1982)? Alex Mennen looked at it and thinks "probably not," but I'm not in a position to tell.

I agree with "probably not" from my reading so far, but I don't feel I've understood enough of the preceding sections in the paper to be confident. I would not be too surprised if some other theorem in that paper was relevant, either; I haven't gotten through enough to say, except that it's definitely exploring related questions.

I would not be too surprised if some other theorem in that paper was relevant, either; I haven't gotten through enough to say, except that it's definitely exploring related questions.

The statement Sawin proved refers to computable probability distributions over logical statements, and Gaifman & Snir only considered a class of probability distributions which are necessarily undefinable (by Tarski's theorem, since their probability distributions are certain about the truth-values of all statements in L0, which is expressive enough to implement Peano Arithmetic). So I actually would be fairly surprised if the paper ended up containing something relevant to it.


But I can't help but think that there's some more elegant way to formulate the "chance that an unknown consistent logical theory would contain statement A" than by generating a billion apparently-consistent logical theories and counting how many have A. Then again, maybe not. shrug

I do see another problem - it seems like the amount of resources used when generating and checking all those logical theories is more than you'd need to just prove A and remove any need for probability. After all, if you sample a thousand logical theories, and the logical probability of A is 0.7, then I think the Monte Carlo algorithm has just proved A 700 times, among all its other work.

EDIT: More thoughts. How would updating on a pool of trusted statements work here? E.g. what if we want to condition our logical probability on the fact that 1+1=2? The obvious method is just to start with "1+1=2" when randomly generating a new logical theory, and toss out anything inconsistent with it.

But this becomes really troublesome if you want to start with a complete set of axioms. E.g. what if you know how to do arithmetic, you just don't know the last digit of the trillionth prime number? The rules of arithmetic are statements like "for all numbers a and b, ab = (a-1)b+b", which should fit into the pool of statements defining a possible logical theory just fine. But if you start with a complete theory already in the pool, he space you're sampling is just a point, and checking whether the random logical theory contains A collapses to being a theorem-prover for a known theory.

So what do you do? The "know arithmetic, don't know all the consequences" situation should be right in logical probability's wheelhouse. Do you only allow arithmetic hierarchy level 0 statements to enter the pool, maybe? So you can demand that 1+1=2, but you can't demand that for all x, x+x=2x. But on the other hand, what if you want the probability given arithmetic of something really easy, like "does 2+2=4?", and can start with "1+1=2" in the pool, but not any of the heirarchy-level-1 statements that actually define addition. Then you end up assigning 2+2=4 some probability depressingly close to 50% just because you can't handle too much starting knowledge.

The monte-carlo algorithm is only one way of viewing the distribution. You can also imagine expanding the tree of possibilities in a more static way, computing the probabilities down each branch in a way that reflects the probabilistic generation.

A sufficiently clever implementation could save a lot of work.

Knowing arithmetic but not knowing the trillionth prime is fairly simple; this means that you are conditioning the probability distribution on the peano axioms, but your consistency check isn't going deep enough to get you the information about the trillionth prime. You'll get a "reasonable" distribution over the digits of the trillionth prime, IE, obeying any laws your consistency check does know, and giving simple theories high probability.

Presumably the consistency check will go deep enough to know that 2+2=4. :)

Oh, so the point of gradual failure as you go to limited computing power is that the theories you postulate get more and more inconsistent? I like it :D But I still feel like the part of the procedure that searches for your statement A among the logical theories still doesn't fail gracefully when resources are limited. If you don't have much time, it's just not going to find A by any neutral or theorem-proving sort of search. And if you try just tossing in A at the last minute and checking for consistency, a situation where your consistency-checker isn't very strong will just lead to your logical probability depending on the pattern with which you generate and add in A (presumably this would be set up to return P=0.5).

Or to put it another way, even if the search proves that "the trillionth prime ends in 0" has probability 0, it might not be able to use that to show that "the trillionth prime ends in 3" has any probability other than 0.5, if it's having to toss it in at the end and do a consistency check against already-added theorems. Which is to say, I think there are weaknesses to doing sampling rather than doing probabilistic logic.

For a shameless plug of what I think is a slightly clever implementation, I wrote a post :P (It's only slightly clever because it doesn't do pattern-recognition, it only does probabilistic logic - the question is why and how pattern recognition helps) I'm still thinking about whether this corresponds to your algorithm for some choice of the logical-theory-generator and certain resources.

Thanks for the pointer to the post. :) I stress again that the random sampling is only one way of computing the probability; it's equally possible to use more explicit probabilistic reasoning.