Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Related to: Löb's Theorem for implicit reasoning in natural language: Löbian party invitations

tl;dr: Löb's Theorem is much easier to grok if you separate the parts of the proof that use the assumption ⊢ □C → C from the parts that don't.  The parts that don't use it can be extracted as a stand-alone result, which I hereby dub "Löb's Lemma".

Here's how it works.

Key properties of ⊢ and □

Here is some standard notation.  In short, ⊢ is our symbol for talking about what PA can prove, and □ is shorthand for PA's symbols for talking about what (a copy of) PA can prove.

  • "⊢ 1+1=2" means "Peano Arithmetic (PA) can prove that 1+1=2".  No parentheses are needed; the "⊢" applies to the whole line that follows it.  Also, ⊢ does not stand for an expression in PA; it's a symbol we use to talk about what PA can prove.
  • "□(1+1=2)" basically means the same thing.  More precisely, it stands for a numerical expression within PA that can be translated as saying "⊢ 1+1=2".  This translation is possible because of something called a Gödel numbering, which allows PA to talk about a (numerically encoded version of) itself.
  • "□C" is short for "□(C)" in cases where "C" is just a single character of text.
  • "X ⊢ Y" means "PA, along with X as an additional axiom/assumption, can prove Y". In this post we don't have any analogous notation for □.
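One way to keep the two symbols straight is to write the same fact at each level of quotation; a minimal illustration:

```latex
% The same fact at three levels of quotation:
\vdash 1+1=2          % we assert: PA proves 1+1=2
\Box(1+1=2)           % a sentence *of PA* asserting: PA proves 1+1=2
\vdash \Box(1+1=2)    % we assert: PA proves the sentence \Box(1+1=2)
```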

The proofs here will use the following standard properties of ⊢ and □ (source: Wikipedia), which respectively stand for provability and provability encoded within arithmetic.  

  1. (necessitation) From ⊢ A, conclude ⊢ □A.  Informally, this says that if A can be proven, then it can be proven that it can be proven (by just writing out and checking the proof within arithmetic).  Note that from X ⊢ A we cannot conclude X ⊢ □A, because □ still means "PA can prove", and not "PA+X can prove".  
  2. (internal necessitation)  ⊢ □A → □□A. If A is provable, then it is provable that it is provable (basically the same as the previous point).
  3. (box distributivity) ⊢ □(A → B) → (□A → □B). This rule allows one to apply modus ponens inside the provability operator: if it is provable that A implies B, and A is provable, then B is provable.
  4. (deduction theorem) From X ⊢ Y, conclude ⊢ X → Y: if assuming X is enough to prove Y, then it's possible to prove under no assumptions that X implies Y.
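To see how the four rules compose mechanically, here is a toy Python model of them, reading a sequent "X ⊢ Y" as a pair (assumptions, formula) with formulas as nested tuples.  This is an illustrative sketch (the function names are mine, not from the post): it can only build sequents via the stated rules plus assumption/weakening, so whatever it outputs is derivable.

```python
def box(a):          # the formula □a
    return ('box', a)

def imp(a, b):       # the formula a → b
    return ('imp', a, b)

def assume(x):       # X ⊢ X
    return (frozenset([x]), x)

def weaken(seq, x):  # from Γ ⊢ A, get Γ ∪ {X} ⊢ A
    gamma, a = seq
    return (gamma | {x}, a)

def necessitation(seq):
    # Rule 1: from ⊢ A (empty context only!), conclude ⊢ □A.
    gamma, a = seq
    assert gamma == frozenset(), "necessitation needs an empty context"
    return (frozenset(), box(a))

def internal_necessitation(a):
    # Rule 2: ⊢ □A → □□A, for any formula A.
    return (frozenset(), imp(box(a), box(box(a))))

def box_distributivity(a, b):
    # Rule 3: ⊢ □(A → B) → (□A → □B), for any formulas A, B.
    return (frozenset(), imp(box(imp(a, b)), imp(box(a), box(b))))

def deduction(seq):
    # Rule 4: from X ⊢ Y (exactly one extra assumption), conclude ⊢ X → Y.
    gamma, y = seq
    (x,) = gamma
    return (frozenset(), imp(x, y))

def modus_ponens(s_imp, s_a):
    # From Γ ⊢ A → B and Γ ⊢ A, conclude Γ ⊢ B.
    (g1, f), (g2, a) = s_imp, s_a
    assert g1 == g2 and f[0] == 'imp' and f[1] == a
    return (g1, f[2])

# Demo: derive ⊢ □A → □□□A in the style of the forward proof below
# (assume □A, derive under that assumption, then discharge it).
A = 'A'
s1 = assume(box(A))                                                    # □A ⊢ □A
s2 = modus_ponens(weaken(internal_necessitation(A), box(A)), s1)       # □A ⊢ □□A
s3 = modus_ponens(weaken(internal_necessitation(box(A)), box(A)), s2)  # □A ⊢ □□□A
theorem = deduction(s3)                                                # ⊢ □A → □□□A
assert theorem == (frozenset(), imp(box(A), box(box(box(A)))))
```

The empty-context check in `necessitation` is exactly the caveat in point 1: necessitation is a rule about ⊢ with no extra assumptions, which is why it cannot be applied under the □A hypothesis.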

Point 4 is helpful and pretty intuitive, but for whatever reason isn't used in the main Wikipedia article on Löb's Theorem.

Löb's Lemma

Claim: Assume A and B are any statements satisfying ⊢ A ↔ (□A → B).  Then ⊢ □A ↔ □B.

Intuition: By assumption, the sentence A is equivalent to saying "If this sentence is provable, then B".  Intuitively, A has very little content, except for the B part at the end, so it makes sense that □A boils down to nothing more than □B in terms of logical equivalence. 

Reminder: this does not use the assumption ⊢ □C → C from Löb's Theorem at all.

Proof:

Let's do the forward implication first:

  1. □A ⊢ □□A              by internal necessitation (⊢ □A → □□A).
  2. □A ⊢ □(□A → B)        using box distributivity on the assumption, with A and (□A → B).
  3. □A ⊢ □B               from 1 and 2 by box distributivity.
  4. ⊢ □A → □B             from 3 by the deduction theorem.

Now for the backwards implication, which isn't needed for Löb's Theorem, but is handy anyway:

  1. ⊢ B → (□A → B)        is a tautology.
  2. ⊢ □B → □(□A → B)      by box distributivity on 1.
  3. ⊢ □(□A → B) → □A      by box distributivity on the assumption.
  4. ⊢ □B → □A             by 2 and 3.
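For reference, both directions of the lemma fit into a single display, starting from the assumption ⊢ A ↔ (□A → B):

```latex
% Forward direction:
\Box A \vdash \Box\Box A                   % internal necessitation
\Box A \vdash \Box(\Box A \to B)           % necessitation + distributivity on the assumption
\Box A \vdash \Box B                       % distributivity
\vdash \Box A \to \Box B                   % deduction theorem
% Backward direction:
\vdash B \to (\Box A \to B)                % tautology
\vdash \Box B \to \Box(\Box A \to B)       % necessitation + distributivity
\vdash \Box(\Box A \to B) \to \Box A       % necessitation + distributivity on the assumption
\vdash \Box B \to \Box A                   % chaining the two lines above
```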

I like this result because both directions of the proof are fairly short, it doesn't use the assumption ⊢ □C → C at all, and the conclusion itself is also fairly intuitive.  The statement A just turns out to have no content except for B itself, from the perspective of writing proofs.

Löb's Theorem, now in just 6 lines

If you can remember Löb's Lemma, you can write a very straightforward proof of Löb's Theorem in just 6 lines:

Claim: If C is any sentence such that ⊢ □C → C, then ⊢ C.

Proof:

Let A be any sentence satisfying ⊢ A ↔ (□A → C), which exists by the existence of modal fixed points (or by the Diagonal Lemma).

  1. ⊢ □A ↔ □C      by Löb's Lemma.
  2. ⊢ □C → C       by assumption.
  3. ⊢ □A → C       by 1 and 2 combined.
  4. ⊢ A            by 3 and the defining property of A.
  5. ⊢ □A           by necessitation.
  6. ⊢ C            by 3 and 5.
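In display form, with A the modal fixed point satisfying ⊢ A ↔ (□A → C), the six lines are:

```latex
\vdash \Box A \leftrightarrow \Box C   % 1. L\"ob's Lemma
\vdash \Box C \to C                    % 2. assumption
\vdash \Box A \to C                    % 3. from 1 and 2
\vdash A                               % 4. from 3 and the fixed-point property of A
\vdash \Box A                          % 5. necessitation
\vdash C                               % 6. modus ponens on 3 and 5
```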

«mic drop»

17 comments

Why is it OK to use deduction theorem, though? In standard modal logics like K and S5 the deduction theorem doesn't hold (otherwise you could assume P, use necessitation to get []P, and then use deduction theorem to get P -> []P as a theorem). 

Well, the deduction theorem is a fact about PA (and propositional logic), so it's okay to use as long as ⊢ means "PA can prove".  

But you're right that it doesn't mix seamlessly with the (outer) necessitation rule.  Necessitation is a property of "⊢", but not generally a property of "X ⊢".  When PA can prove something, it can prove that it can prove it.  By contrast, if PA+X can prove Y, that does mean that PA can prove that PA+X can prove Y (because PA alone can work through proofs in a Gödel encoding), but it doesn't mean that PA+X can prove that PA can prove Y.  This can be seen by example, with a suitable choice of X and Y.

As for the case where you want ⊢ to refer to K or S5 instead of PA provability, those logics are still built on propositional logic, for which the deduction theorem does hold. So if you do the deduction only using propositional logic from theorems of the modal logic along with an additional assumption X, then the deduction theorem applies.  In particular, internal necessitation and box distributivity are both theorems of the logic for every A and B you stick into them (rather than meta-theorems about it, which is what necessitation is).  So the application of the deduction theorem here is still valid.

Still, the deduction theorem isn't safe to just use willy nilly along with the (outer) necessitation rule, so I've just added a caveat about that:

Note that from X ⊢ A we cannot conclude X ⊢ □A, because □ still means "PA can prove", and not "PA+X can prove".  

Thanks for calling this out.

It might be worth being explicit that ⊢ has lower precedence than the other operators (which in some sense are part of a different language). I, like maybe Gurkenglas, spent some time wondering why ⊢ □A → □□A wasn't just a special case of necessitation.

I'm confused by your use of the deduction theorem. It's only used in the forward implication argument, and seems unnecessary to me. (The linked wiki article doesn't mention it.) More precisely, it only seems necessary to move things left-to-right across a turnstile, because you've previously moved them right-to-left in a way that isn't obviously-to-me valid.

(I mean, it's obvious to me that if we can prove that A implies B, then assuming A lets us prove B. But it would also be obvious to me that if assuming A lets us prove B, we can prove that A implies B, yet you dedicated a theorem to allowing that. And I don't trust things that are obvious to me to be actually "true", let alone valid moves here.)

We can just make the whole argument without doing that and it seems fine to me? Looking at it in more depth:

(This does require that we can do logic with the things we've proved. E.g. if we have ⊢ A → B and ⊢ A, we can conclude ⊢ B. I think that's okay, but seems worth being explicit about.)

  1. □A ⊢ □□A by internal necessitation (⊢ □A → □□A).

This is just doing the thing I question.

  2. □A ⊢ □(□A → B) using box distributivity on the assumption, with A and (□A → B).

The assumption is ⊢ A ↔ (□A → B), i.e. ⊢ A → (□A → B) (and the converse). Weaken it and use necessitation to get ⊢ □(A → (□A → B)). Box distributivity is ⊢ □(A → B) → (□A → □B). From those two, we can get ⊢ □A → □(□A → B), which is step 2 with nothing to the left of the turnstile.

  3. □A ⊢ □B from 1 and 2 by box distributivity.

Box distributivity lets us go from □(□A → B) to (□□A → □B). So using 2, we have ⊢ □A → (□□A → □B). And internal necessitation gives us ⊢ □A → □□A, so combining those, ⊢ □A → □B.
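The whole deduction-theorem-free route can be written as a single chain from the assumption ⊢ A → (□A → B):

```latex
\vdash \Box(A \to (\Box A \to B))                       % necessitation on the assumption
\vdash \Box A \to \Box(\Box A \to B)                    % box distributivity
\vdash \Box(\Box A \to B) \to (\Box\Box A \to \Box B)   % box distributivity
\vdash \Box A \to (\Box\Box A \to \Box B)               % chaining the two lines above
\vdash \Box A \to \Box\Box A                            % internal necessitation
\vdash \Box A \to \Box B                                % propositional reasoning
```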

  4. ⊢ □A → □B from 3 by the deduction theorem.

No need for this, we already finished.

It's true that the deduction theorem is not needed, as in the Wikipedia proof.  I just like using the deduction theorem because I find it intuitive (assume □A, prove □B, then drop the assumption and conclude ⊢ □A → □B) and it removes the need for lots of parentheses everywhere.

I'll add a note about the meaning of X ⊢ Y so folks don't need to look it up, thanks for the feedback!

Thanks, that makes sense. And the added explanation helps a lot, I can see the argument for going from ⊢ □A → □□A to □A ⊢ □□A now:

  • PA proves □A → □□A;
  • So PA + □A certainly proves □A → □□A, since it can prove everything PA can;
  • Also, PA + □A proves □A;
  • So PA + □A proves □□A;
  • Which is what □A ⊢ □□A means.

Can you explain more formally what the difference is between ⊢ and □? I've looked in Wikipedia and in the Cartoon Guide to Löb's Theorem, but still can't get it.

I've now fleshed out the notation section to elaborate on this a bit.  Is it better now?

In short, ⊢ is our symbol for talking about what PA can prove, and □ is shorthand for PA's symbols for talking about what (a copy of) PA can prove.

  • "⊢ 1+1=2" means "Peano Arithmetic (PA) can prove that 1+1=2".  No parentheses are needed; the "⊢" applies to the whole line that follows it.  Also, ⊢ does not stand for an expression in PA; it's a symbol we use to talk about what PA can prove.
  • "□(1+1=2)" basically means the same thing.  More precisely, it stands for a numerical expression within PA that can be translated as saying "⊢ 1+1=2".  This translation is possible because of something called a Gödel numbering, which allows PA to talk about a (numerically encoded version of) itself.
  • "□C" is short for "□(C)" in cases where "C" is just a single character of text.
  • "X ⊢ Y" means "PA, along with X as an additional axiom/assumption, can prove Y". In this post we don't have any analogous notation for □.

Thank you, it's much more clear now.

A set of axioms A within a vanilla first-order-logic language does not necessarily allow for the ability to speak about what statements are provable from A within that language. Consider a sufficiently weak axiom scheme A. There's not enough strength in A to create encodings of statements, which is necessary to make formal statements in the language which could be interpreted in some standard model to mean "Statement X is provable". Seriously, try doing it. It should clear up the need to build some notion of provability into the syntax and semantics of a logic if you want to talk about provability within a language. Then read up to section 3 in this SEP article.

EDIT: That is assuming you know stuff about provability predicates already, and things like arithmetization. If you don't, then the above exercise is way too hard. In that case, I'd just say that A ⊢ B and A ⊢ □B represent "axioms A prove statement B" and "axioms A prove that B is provable in some set of axioms T". T is a "sufficiently strong" set of axioms, like Peano Arithmetic, which can do interesting things.

It seems important to clarify the difference between "From ⊢A, conclude ⊢□A" and "⊢□A→□□A". I don't feel like I get why we don't just set conclude := → and ⊢ := □.

Well, A → B is just short for ¬A ∨ B, i.e., "(not A) or B".  By contrast, A ⊢ B means that there exists a sequence of (very mechanical) applications of modus ponens, starting from the axioms of Peano Arithmetic (PA) with A appended, ending in B.  We tried hard to make the rules of ⊢ so that it would agree with → in a lot of cases (i.e., we tried to design ⊢ to make the deduction theorem true), but it took a lot of work in the design of Peano Arithmetic and can't be taken for granted.

For instance, consider the statement □(0=1) → (0=1).  If you believe Peano Arithmetic is consistent, then you believe that 0=1 is not provable, and therefore you also believe that □(0=1) → (0=1) is true.  But PA cannot prove □(0=1) → (0=1) (by Gödel's Theorem, or Löb's theorem with C = (0=1)), so we don't have ⊢ □(0=1) → (0=1).

To check, are you reading the second as (⊢ □A) → (⊢ □□A)? It's meant to be ⊢ (□A → □□A).

How would the semantics for that work out?

⊢ already is to □ as □ is to quoted □. If we redo the post one □ further in, we never need external necessitation. Not sure if that answers your question.


I think this factoring hides the computational content of Löb's theorem (or at least doesn't make it obvious).  Namely, that if you have a function of type □C → C, then Löb's theorem is just the fixpoint of this function.

Here's a one-line proof of Löb's theorem, which is basically the same as the construction of the Y combinator (h/t Neel Krishnaswami's blogpost from 2016):

where □f denotes applying internal necessitation to f, and .fwd (.bak) is the forward (resp. backward) direction of the point-surjection.
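As a playground for this fixpoint reading, here is a Python sketch of the well-known `loeb` function (Dan Piponi's "spreadsheet evaluation" interpretation of □(□C → C) → □C, closely related to the Y-combinator construction mentioned above). The accessor-passing style is my own adaptation for Python's strict evaluation, not code from the linked post.

```python
# "loeb": each cell is a function of an accessor into the not-yet-computed
# list of results; the whole sheet is evaluated as a lazy, memoized fixpoint.
# Reading: □ is "computed value", and each cell has type □C → C.
def loeb(cells):
    cache = {}
    def get(i):
        # Lazily evaluate cell i, memoizing the result.
        # (A cyclic dependency between cells would recurse forever.)
        if i not in cache:
            cache[i] = cells[i](get)
        return cache[i]
    return [get(i) for i in range(len(cells))]

# Usage: a tiny spreadsheet where later cells refer to other cells.
sheet = [
    lambda get: 1,                 # A1 = 1
    lambda get: 2,                 # A2 = 2
    lambda get: get(0) + get(1),   # A3 = A1 + A2
    lambda get: get(2) * 10,       # A4 = A3 * 10
]
print(loeb(sheet))  # → [1, 2, 3, 30]
```

The memoized `get` plays the role of the fixpoint: each cell is handed (an accessor to) the final evaluated sheet while that sheet is still being computed.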