Logical Uncertainty as Probability

by gRR1 min read29th Apr 201222 comments

4

Logical InductionLogical Uncertainty
Personal Blog

This post is a long answer to this comment by cousin_it:

Logical uncertainty is weird because it doesn't exactly obey the rules of probability. You can't have a consistent probability assignment that says axioms are 100% true but the millionth digit of pi has a 50% chance of being odd.

I'd like to attempt to formally define logical uncertainty in terms of probability. Don't know if what results is in any way novel or useful, but.

Let X be a finite set of true statements of some formal system F extending propositional calculus, like Peano Arithmetic. X is supposed to represent a set of logical/mathematical beliefs of some finite reasoning agent.

Given any X, we can define its "Obvious Logical Closure" OLC(X), an infinite set of statements producible from X by applying the rules and axioms of propositional calculus. An important property of OLC(X) is that it is decidable: for any statement S it is possible to find out whether S is true (S∈OLC(X)), false ("~S"∈OLC(X)), or uncertain (neither).

We can now define the "conditional" probability P(*|X) as a function from {the statements of F} to [0,1] satisfying the axioms:

Axiom 1: Known true statements have probability 1:

    P(S|X)=1  iff  S∈OLC(X)

Axiom 2: The probability of a disjunction of mutually exclusive statements is equal to the sum of their probabilities:

    "~(A∧B)"∈OLC(X)  implies  P("A∨B"|X) = P(A|X) + P(B|X)

From these axioms we can get all the expected behavior of the probabilities:

    P("~S"|X) = 1 - P(S|X)

    P(S|X)=0  iff  "~S"∈OLC(X)

    0 < P(S|X) < 1  iff  S∉OLC(X) and "~S"∉OLC(X)

    "A=>B"∈OLC(X)  implies  P(A|X)≤P(B|X)

    "A<=>B"∈OLC(X)  implies  P(A|X)=P(B|X)

          etc.

This is still insufficient to calculate an actual probability value for any uncertain statement. Additional principles are required. For example, the Consistency Desideratum of Jaynes: "equivalent states of knowledge must be represented by the same probability values".

Definition: two statements A and B are indistinguishable relative to X iff there exists an isomorphism between OLC(X∪{A}) and OLC(X∪{B}), which is identity on X, and which maps A to B.
[Isomorphism here is a 1-1 function f preserving all logical operations:  f(A∨B)=f(A)∨f(B), f(~~A)=~~f(A), etc.]

Axiom 3: If A and B are indistinguishable relative to X, then  P(A|X) = P(B|X).

Proposition: Let X be the set of statements representing my current mathematical knowledge, translated into F.  Then the statements "millionth digit of PI is odd" and "millionth digit of PI is even" are indistinguishable relative to X.

Corollary:  P(millionth digit of PI is odd | my current mathematical knowledge) = 1/2.

 

4

22 comments, sorted by Highlighting new comments since Today at 12:29 AM
New Comment

I don't know much about logical uncertainty, but have ya'll seen Gaifman's paper on the subject?

Gaifman (2004), Reasoning with limited resources and assigning probabilities to arithmetical statements.

From taking a quick look at page 8 above (numbered 104 in the document), if gRR used X instead of OLC(X), it would result in something like the above article. The point is in having consistent probability distribution over held beliefs, not over some infinite closure of beliefs.

You're sidestepping the whole point of cousin_it, which is that your mathematical knowledge is certainly enough to determine whether the millionth digit of pi is odd or even. One of these two statements is a trivial consequence of the Peano axioms and some well-known representation of pi as an infinite series. It's just that its being a trivial consequence is witnessed by a very long (and also trivial) proof which you're not aware of, and you don't know which of the two statements is backed by such a long proof.

I don't think I'm sidestepping the issue. The point of cousin_it's comment, as I understood it, was that assigning probabilities to "logically uncertain" statements results in inconsistencies. What I tried to show is that for probabilistic assignments to be consistent, it is only necessary to be logically omniscient at propositional calculus, not at full-power PA. And this is an important difference, because propositional calculus is decidable.

Ah, well, if you're only closing under propositional tautologies, then you're not doing anything interesting. OLC(X) is for practical purposes the same as X (not just because it's decidable, as you say, but more importantly because it's so weak). So your suggestions boils down to assigning P=1 to axioms, P=0 to their negations, and trying to figure out non-trivial probabilities for everything else by constraining on propositional consistency. But propositional consistency is merely a very thin veneer over X.

Because propositional inference isn't going to be able to break down quantifiers and "peer" inside their clauses, any quantified statement that's not already related to X [in the sense of participating in X either as a member or as a clause of a Boolean member] is opaque to X. So if I write A = (exists Y)(Y=2) and B = (exists Z)(Z=2 and Z=3), you'll be forced to deduce P(A)=P(B) under your axioms, or pre-commit to include in your axioms A, not(B) and everything else in a smoothly growing (complexity-wise) list of arithmetical truths I can come up with. That doesn't seem very useful.

But propositional consistency is merely a very thin veneer over X.

That was my goal - to come up with a minimum necessary for consistency, but still sufficient to prove the 1/2 probability for digits of PI :) If you wish to make OLC stronger, you're free to do so, as long as it remains decidable. For example, you can define OLC(X) to be {everything provable from X by at most 10 steps of PA-power reasoning, followed by propositional calculus closure}.

In your scheme you have P=1/2 for anything nontrivial and its negation that's not already in X. It just so happens that this looks reasonable in case of the oddity of a digit of pi, but that's merely a coincidence (e.g. take A="a millionth digit of pi is 3" rather than "...odd").

No, a statement and its negation are distinguishable, unless indeed you maliciously hide them under quantifiers and throw away the intermediate proof steps.

I agree with what you're trying to do, but I don't think this proposed construction does it. There are a lot of really complicated statements of propositional calculus which turn out to be either tautologically true or tautologically false, and I'd like to be able to speak of uncertainty of those statements as well.

Constructions like this (or like fuzzy logic) don't capture the principle that I take to be self-evident when discussing bounded agents, that new deductions don't instantly propagate globally: if I've deduced A and also deduced (A implies B), I may not yet have deduced B. (All the more so when we make complicated examples.)

I don't think the construction actually requires instant propagation. It requires a certain calculation to be made when you wish to assign a probability to a particular statement, and this calculation is provably finite.

In your example, you are free to have X contain "A" and "A=>B", and not contain "B", as long as you don't assign a probability to B. When you wish to do so, you have to do the calculation, which will find that B∈OLC(X), and so will assign P(B)=1. Assigning any other value would indeed be inconsistent for any reasonable definition of probability, because if you know that A=>B, then you have to know that P(A)≤P(B), and then if P(A)=1, then P(B) must also be 1.

As written axiom 1 is just false:

P(S|X)=1 iff S∈OLC(X)

But given any finite set (X) of true statements in Peano arithmetic I can invent a statement in Peano arithmetic that isn't in that set but that you would instantly assign p=1.

Edit: And it's false in the other direction too! Humans aren't logically omniscient in the propositional calculus!

Anyway, I think logical uncertainty becomes demystified pretty fast when you start thinking about it in terms of computing time and computational complexity (though that isn't to say I have a solution for the problem of representing it Bayesian calculus).

But given any finite set (X) of true statements in Peano arithmetic I invent statement in Peano arithmetic that isn't in that set but that you would instantly assign p=1.

It is entirely possible for P(S|X) to be < 1, although P(S|X∪{S}) = 1. There's nothing inconsistent about updating on new evidence.

So then it isn't true that P(S|X) = 1 if and only if S∈OLC(X).

If X = { "2*2=4" }, then P("2*2=4 ∨ 0=1" | X) = 1, because "2*2=4 ∨ 0=1" ∈ OLC(X).
However, P("2*3=6" | X) < 1, although P("2*3=6" | X ∪ {"2*3=6"}) = 1.

What does this method give us for p(P = NP) or p(Goldbach's conjecture) or p(45897 * 69012 = 3167444764)?

The three axioms do not fix the probabilities uniquely for most of the statements. There are many different consistent probability assignments for any given X.

Let X be the set of statements representing my current mathematical knowledge, translated into F. Then the statements "millionth digit of PI is odd" and "millionth digit of PI is even" are indistinguishable relative to X.

It isn't obvious to me that this is true (and under which assumptions about X).

A natural number n can be even or odd: i.e. n%2=0 or n%2=1.

If X = {n is natural number} then you showed that we can use P(n%2=0|X) + P(n%2=1|X) = 1 and P(n%2=0|X) = P(n%2=1|X) together to get P(n%2=0|X) = 1/2.

The same logic works for the three statements n%3=0,n%3=1,n%3=2 to give us P(n%3=0|X) = P(n%3=1|X) = P(n%3=2|X) = 1/3.

But then the same logic also works for the two indistinguishable statements n%3=0,n%3=1 \/ n%3=2 to give us P(n%3=0|X) = P(n%3=1 \/ n%3=2) = 1/2.

But 1/2 = 1/3 is a contradiction, so we find that axiom 3 leads to inconsistencies.

[This comment is no longer endorsed by its author]Reply

n%3=0 is distinguishable from n%3=1∨n%3=2. If A="n%3=0", B="n%3=1", and C="n%3=2", then an isomorphism f that maps B∨C to A must satisfy f(B∨C) = f(B)∨f(C) = A, which is impossible.

I understand, what I wrote was wrong. What if we use n%3=0 and ~(n%3=0) though?

I can't say I see the problem. Resource limitation leads to uncertaintly about logical propositions, and that can be translated into probability in just the same way as any other form of uncertainty.