Walkthrough of "Definability of Truth in Probabilistic Logic"

[-]benkuhn12y60

I'm having a hard time showing that P maps equivalent statements to the same value, or that it's bounded. In particular, I'm having difficulty showing P(p && q) = P(q && p); it seems like none of the axioms lets you switch the order of two atoms. What's the trick I'm missing?

[-]Manfred12y10

How about if axiom 1 read "P(x) = P(x && y) + P(¬y && x)"?

Then we could say P(q && p) = P(q && p && ¬p) + P(p && q && p) = P(p && q && p) by axioms 1' and 3, and similar for P(p && q).

So P(q && p) = P(p && q && p) = P(p && q).

[-]benkuhn12y40

I think your proof secretly turns ¬¬p into p in the first step, which isn't legit. However, your modification does give equality on logically equivalent sentences, in the following way:

Suppose p <=> ~q. Then

P(p) = P(p && q) + P(~q && p) = P(~q && p) by 3

P(~q) = P(~q && p) + P(~p && ~q) = P(~q && p) by 3

Hence if p is equivalent to q and at least one starts with a ~ sign, then P is equal on them. Now suppose p <=> q with neither one starting with a ~ sign: we have

P(p) = P(~~p) = P(~~q) = P(q)

and we're done.
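
The steps can be spelled out one at a time. This is only a sketch, assuming the modified axiom from Manfred's comment, P(x) = P(x && y) + P(¬y && x) (written 1' below), together with axiom 3 (contradictions get probability 0). The name "Lemma" and the double-negation intermediate terms are a reading of the argument, not something stated in the paper:

```latex
% Lemma: if p <=> ~q, then P(p) = P(~q).
\begin{align*}
P(p)      &= P(p \wedge q) + P(\neg q \wedge p) = P(\neg q \wedge p)
          && \text{(1' with } x = p,\ y = q;\ p \wedge q \text{ is a contradiction)} \\
P(\neg q) &= P(\neg q \wedge p) + P(\neg p \wedge \neg q) = P(\neg q \wedge p)
          && \text{(1' with } x = \neg q,\ y = p;\ \neg p \wedge \neg q \text{ is a contradiction)}
\end{align*}
% Chain for p <=> q, applying the lemma three times:
\begin{align*}
P(p) &= P(\neg\neg p) && (p \Leftrightarrow \neg(\neg p)) \\
     &= P(\neg\neg q) && (\neg\neg p \Leftrightarrow \neg(\neg q)) \\
     &= P(q)          && (q \Leftrightarrow \neg(\neg q))
\end{align*}
```

Each line of the chain uses the lemma with a right-hand side that starts with a negation sign, so no step silently rewrites ¬¬p as p.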

[-]Manfred12y00

I think your proof secretly turns ¬¬p into p in the first step, which isn't legit.

It does, drat.

P(p) = P(~~p)

Ok. So to fix/complete my proof we need P(¬¬p && q && p) = P(p && q && p). Hm. Ok. So to prove they're equal we just add on another term using axiom 1 and then rule out the contradiction in such a way that we show they're both equal to the same thing.

[-]benkuhn12y00

Or, once you have equality for logically equivalent sentences, note that (p && q) <=> (q && p) and hence we have directly that P of the two sides are equal.

[-]nshepperd12y10

Yes, the axioms seem incomplete, or perhaps it was simply meant to be implied that "P(p) = P(q & p) + P(¬q & p)" also. Otherwise as far as I can tell there's no axiom that lets you relate any expression containing "P(p" to an expression containing "P(q", unless the arguments of P(·) are each a tautology or contradiction (which is unhelpful).

[-]benkuhn12y00

Well, a tautology can be made up of non-tautological things; we could conceivably have some sentence phi(p, q) that's a tautology if p <=> q, such that P(p) = f(P(phi(p,q))) = f(P(phi(q,p))) = P(q). I think this is what ygert is trying to do. I don't have much hope for this approach, though.

[-]ygert12y00

I'd think that the way you'd prove it is with the fact that (p && q)<=>(q && p) is a tautology. I don't have an exact proof at the moment; let me work on it.

[-]ygert12y00

After working on the problem, I am convinced we also need an order-swapped version of axiom 1. If we had that, we could prove that any pair of equivalent statements have the same value: the general case of the problem benkuhn posed.

(If A and B are equivalent, then P(A&~B)=P(B&~A)=0 as contradictions, and so by axiom 1:

P(A)=P(A&B)+P(A&~B)=P(A&B)+0=P(A&B)

P(B)=P(B&A)+P(B&~A)=P(B&A)+0=P(B&A)

So close. If only we could swap the orders, we'd have P(A)=P(B).)

[-]TheMajor12y20

I tried applying the proof of the theorem to the problem: P(q) = P(p) for equivalent statements p, q is clear from the claim P(q) = mu({M: M |= q}), so our desired result should be part of the proof of Theorem 1. However, it seems that our desired result is implicitly assumed in the statement "Axiom 1 implies". Set P(A|B) = P(B && A)/P(B). (Note the reversal of order: without being allowed to replace equivalent statements inside P, the mess is even greater if we pick the natural order. We also need P(B) =/= 0.) Writing psi for phi_j, we can write the claim as:

P(Ti && phi)/P(Ti) = [P(Ti && psi && phi)/P(Ti && psi)] * [P(Ti && psi)/P(Ti)] + [P(Ti && ~psi && phi)/P(Ti && ~psi)] * [P(Ti && ~psi)/P(Ti)]

(Here I ignored some annoying brackets; actually there's more of a mess, as it's not clear whether P((A && B) && C) = P(A && (B && C)).) Multiplying both sides by P(Ti) and simplifying now gives:

P(Ti && phi) = P(Ti && psi && phi) + P(Ti && ~psi && phi)

which does not follow from Axiom 1 as the added statements are in the middle of our statement. I am convinced that our claim (P(p) = P(q) for equivalent statements p,q) is required for the proof in the paper and should be added as an extra axiom.

On a side note: under the assumption above (equivalent statements get equal probabilities), axiom 3 is a consequence of axioms 1 and 2. This can be seen as follows. Let q be a contradiction, so ~q is a tautology. By axioms 1 and 2 we have P(q) = P(q && q) + P(q && ~q). But ~q is a tautology, so p && ~q and p are equivalent for every p. In particular we have P(q && ~q) = P(q). But q and (q && q) are also equivalent, so the above can be written as P(q) = P(q) + P(q), and hence P(q) = 0.
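
To make the first observation concrete (that P(p) = P(q) for equivalent p, q is immediate once P(phi) = mu({M: M |= phi})), here is a minimal Python sketch for a propositional toy language with two atoms. The tuple encoding of formulas, the helper names `holds` and `make_P`, and the particular weights are all assumptions of this sketch, not anything taken from the paper:

```python
from fractions import Fraction
from itertools import product

# Formulas as nested tuples (a hypothetical encoding chosen for this sketch):
#   "p"             atom
#   ("not", f)      negation
#   ("and", f, g)   conjunction

def holds(formula, model):
    """Evaluate a formula in a model (a dict mapping atom name -> bool)."""
    if isinstance(formula, str):
        return model[formula]
    if formula[0] == "not":
        return not holds(formula[1], model)
    if formula[0] == "and":
        return holds(formula[1], model) and holds(formula[2], model)
    raise ValueError(f"unknown connective: {formula[0]!r}")

def make_P(weighted_models):
    """Build P(phi) = mu({M : M |= phi}) from a list of (model, weight) pairs."""
    assert sum(w for _, w in weighted_models) == 1
    def P(formula):
        return sum((w for m, w in weighted_models if holds(formula, m)), Fraction(0))
    return P

# An arbitrary measure on the four truth assignments to p and q.
weights = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
models = [dict(zip("pq", bits)) for bits in product([True, False], repeat=2)]
P = make_P(list(zip(models, weights)))

p_and_q = ("and", "p", "q")
q_and_p = ("and", "q", "p")
```

Because `P` only looks at which models satisfy a formula, `P(p_and_q)` and `P(q_and_p)` cannot differ, and both the original and the order-swapped forms of axiom 1 hold. This shows only that a P defined from a measure is order-insensitive. It says nothing about whether axioms 1-3 as literally stated force that, which is the question in this thread.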

[-]So8res12y00

I, too, am now doubtful that axioms 1-3 are sufficient. I've updated the post accordingly.

[-]benkuhn12y00

Yeah, I couldn't find anything either.

As Manfred and I showed above, replacing axiom 1 with "P(x) = P(x && y) + P(¬y && x)" gives a sufficient condition, though.

[-]Shmi12y30

For if a language has access to its own truth predicate, it can express the liar's paradox: G ⇔ True('G').

You probably mean G ⇔ True('¬G')

[-]So8res12y20

Indeed I do. Fixed, thanks.

[-]Dan_Weinand12y20

Nitpick, the link in the first sentence reads "Definability of Truth in Probabilistic Locic" rather than logic.

[-]So8res12y20

Fixed, thanks.

[-]brahmaneya12y20

You have mentioned the weakened reflection principle as being the following: ∀φ∈L'. ∀a,b∈Q. a≤P(φ)≤b ⇒ P(a<P('φ')<b)=1

This seems to be a typo, it should be ∀φ∈L'. ∀a,b∈Q. a<P(φ)<b ⇒ P(a<P('φ')<b)=1

[-]So8res12y20

Right you are. Fixed, thanks.

[-]benkuhn12y10

I'm confused by a couple minor points here, also:

1. The paper asks for a "probability distribution over models of L". In fact, for many languages L, models of L form a proper class. Does this cause measure-theoretic difficulties? It seems like this might force mu to be zero on all sufficiently large models (otherwise you can do some sort of transfinite induction to get sets of unbounded measure) but I'm not very good at crazy set theory stuff.
2. At one point the authors state "We would like P(forall phi in L' )". I thought we were in a first-order language and therefore couldn't quantify over propositions?
3. It's not immediately clear to me that this actually constructs a measure on the set of theories: that is, if S is the set of all complete consistent theories, it's not clear to me that for the mu we construct by martingale, mu(S) = 1 (or even that mu(S) != 0). Mightn't additivity break when we take the limit and get a whole theory rather than just an incomplete bag of axioms?

[-]pengvado12y20

Can we instead do "probability distribution over equivalence classes of models of L", where equivalence is determined by agreement on the truth-values of all first order sentences? There's only 2^ℵ₀ of those, and the paper never depends on any distinction within such an equivalence class.

[-]benkuhn12y10

Yes, though we should just call it a "probability distribution over complete consistent theories" in that case (it's exactly the same).

[-]So8res12y20

1. There are definitely some probability distributions over proper classes that are useful (for example, a measure that assigns .5 to one model, .5 to another, and zero to the rest). No model would ever be forced to have measure 0, as you can always construct the measure that assigns 1 to that particular model and 0 to all the others. But as to whether or not there are other difficulties with defining a probability measure over a proper class, I have no idea. I, too, lack skill with crazy set theory.
2. You're referring to page 7? I believe that it means to say "we would like a P that obeys the axiom schema P(forall a, b in Q ... phi ...) for all phi in L". You're right, though, this is somewhat ambiguous.
3. I don't completely understand your question. Are you questioning whether T=UTi is actually complete and consistent? Compactness guarantees that it is consistent, and the enumeration of sentences guarantees it is complete.

[-]benkuhn12y00

1. I meant that (conjecturally) for every measure, there exists a cardinal kappa such that mu({M: |M| > kappa}) = 0. Anyway, I guess as you've demonstrated the set/class thing isn't a big problem, but it is something to watch out for.
2. Okay, that makes sense.
3. No, I was observing the following: mu is countably additive, and the set of theories is countable. Hence the measure of the total space is the sum of the measures of the theories, so the measures of the theories must sum to 1. Now it's clear that at every step i of the process, the sum of the measures of the (incomplete) theories so obtained is 1. But it's not immediately clear to me that this holds in the limit.

However, I just realized my mistake, which is that the set of theories isn't always countable (there are countably many sentences, but a theory is a subset of the sentences; for instance, consider the language with countably many unary relations and a constant symbol). In particular, I believe it's countable if and only if the sum is preserved in the limit, so we're fine.
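
A toy illustration of the point about limits, assuming the language mentioned above with countably many unary relations R_1, R_2, ... and a constant c, where the sentences R_i(c) are independent so every branch stays consistent. The even split and the names `split`, `stage`, `totals`, and `largest` are assumptions of this sketch. It is not the paper's martingale construction:

```python
from fractions import Fraction

def split(stage, share=Fraction(1, 2)):
    """Extend every partial theory by the next sentence or its negation.

    `stage` maps a tuple of truth values (one per sentence decided so far)
    to the mass of that partial theory; `share` is the fraction of mass
    sent to the branch where the new sentence is true.
    """
    nxt = {}
    for theory, mass in stage.items():
        nxt[theory + (True,)] = mass * share
        nxt[theory + (False,)] = mass * (1 - share)
    return nxt

stage = {(): Fraction(1)}   # the empty theory carries all the mass
totals, largest = [], []
for _ in range(10):
    stage = split(stage)
    totals.append(sum(stage.values()))   # total mass after this stage
    largest.append(max(stage.values()))  # mass of the heaviest partial theory
```

With this even split, the total mass is 1 at every finite stage while the mass of each individual branch halves each time. In the limit every single complete theory then has measure 0 even though the whole space has measure 1, which is consistent only because the limit theories are uncountable.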

[-]Quinn12y10

For a countable language L and theory T (say, with no finite models), I believe the standard interpretation of "space of all models" is "space of all models with the natural numbers as the underlying set". The latter is a set with cardinality continuum (it clearly can't be larger, but it also can't be smaller, as any non-identity permutation of the naturals gives a non-identity isomorphism between different models).

Moreover this space of models has a natural topology, with basic open sets {M: M models phi} for L-sentences phi, so it makes sense to talk about (Borel) probability measures on this space, and the measures of such sets. (I believe this topology is Polish, actually making the space Borel isomorphic to the real numbers.)

Note that by Löwenheim-Skolem, any model of T admits a countable elementary substructure, so to the extent that we only care about models up to some reasonable equivalence, countable models (hence ones isomorphic to models on the naturals) are enough to capture the relevant behavior. (In particular, as pengvado points out, the Christiano et al. paper only really cares about the complete theories realized by models, so models on the naturals suffice.)

[-]Bakkot12y00

Great post! For anyone reading this who isn't familiar with model theory, by the way, the bit about

sentence G ⇔ P('G')<1. Then

may not be obvious. That is, we want a sentence G which is true iff P('G') < 1 is true. The fact that you can do this is a consequence of the diagonal lemma, which says that for any reasonable predicate 'f' in a sufficiently powerful language, you can find a sentence G such that G is true iff f(G) is true. Hence, defining f(x) := P('x') < 1, the lemma gives us the existence of G such that G holds iff f(G) holds, i.e., iff P('G') < 1 as desired.
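
Stated a little more formally, as a sketch in standard notation: the theory T and the corner quotes for the code of a sentence are notation assumed here rather than taken from the post, and the usual formulation is about provable equivalence in T rather than truth:

```latex
% Diagonal lemma (schematic form). T is a theory able to represent its own
% syntax, and \ulcorner G \urcorner is the code (Goedel number) of G.
% \ulcorner and \urcorner require the amssymb package.
\[
  \text{for every formula } f(x) \text{ with one free variable there is a sentence } G
  \text{ such that } \quad T \vdash G \leftrightarrow f(\ulcorner G \urcorner).
\]
% Instance used in the post: take f(x) := P(x) < 1.
\[
  T \vdash G \leftrightarrow P(\ulcorner G \urcorner) < 1.
\]
```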

Mostly I bring this up because the diagonal lemma is among the most interesting results in early model theory. It has a simple statement and is how self-reference is constructed, which is what gives us the incompleteness theorems. If anyone is interested in getting into model theory, looking up the proof and background for the proof would be a great place to start.

[-]Manfred12y00

The empty set is tautological because P({}) = P({} and something) + P({} and not-something) = P(something) + P(not-something). Hm, but that's using axioms 1 and 2. Can we get it using just axiom 3 as the paper claims?

[-]benkuhn12y20

When you say that P({} and something) = P(something), you suppose the hypothesis (in addition to using several nontrivial consequences of coherence that So8res mentioned, like P mapping equivalent statements to the same thing).

More importantly, "{} and something" isn't a syntactically correct sentence. I don't think most authors consider the empty sentence syntactically correct either. (Marker, the textbook I used, doesn't.)

[-]Manfred12y00

Whoops, you're right.

[-]Sniffnoy12y00

This should be finitely additive probability measures, right? Just saying "probability measure" usually means countably additive.

[-]So8res12y20

I'll defer to the paper here, which states

Definition 1 (Coherence). We say that P is coherent if there is a probability measure over models of L such that P(phi)=mu({M:M models phi})

That said, I'm not aware of a reason why we should require P be backed by a finitely additive probability measure.
