Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is a linkpost for https://unstableontology.com/2023/08/16/a-proof-of-lobs-theorem-using-computability-theory/

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is a linkpost for https://unstableontology.com/2023/08/16/a-proof-of-lobs-theorem-using-computability-theory/

Mentioned in

A Proof of Löb's Theorem using Computability Theory

New Comment

Löb’s Theorem states that, if PA⊢□PA(P)→P, then PA⊢P. To explain the symbols here:

I'm not going to discuss the significance of Löb's theorem, since it has been discussed elsewhere; rather, I will prove it in a way that I find simpler and more intuitive than other available proofs.

## Translating Löb's theorem to be more like Gödel's second incompleteness theorem

First, let's compare Löb's theorem to Gödel's second incompleteness theorem. This theorem states that, if PA⊢¬□PA(⊥), then PA⊢⊥, where ⊥ is a PA statement that is trivially false (such as A∧¬A), and from which anything can be proven. A system is called inconsistent if it proves ⊥; this theorem can be re-stated as saying that if PA proves its own consistency, it is inconsistent.

We can re-write Löb's theorem to look like Gödel's second incompleteness theorem as: if PA+¬P⊢¬□PA+¬P(⊥), then PA+¬P⊢⊥. Here, PA+¬P is PA with an additional axiom that ¬P, and □PA+¬P expresses provability in this system. First I'll argue that this re-statement is equivalent to the original Löb's theorem statement.

Observe that PA⊢P if and only if PA+¬P⊢⊥; to go from the first to the second, we derive a contradiction from P and ¬P, and to go from the second to the first, we use the law of excluded middle in PA to derive P∨¬P, and observe that, since a contradiction follows from ¬P in PA, PA can prove P. Since all this reasoning can be done in PA, we have that □PA(P) and □PA+¬P(⊥) are equivalent PA statements. We immediately have that the conclusion of the modified statement equals the conclusion of the original statement.

Now we can rewrite the pre-condition of Löb's theorem from PA⊢□PA(P)→P to PA⊢□PA+¬P(⊥)→P. This is then equivalent to PA+¬P⊢¬□PA+¬P(⊥). In the forward direction, we simply derive ⊥ from P and ¬P. In the backward direction, we use the law of excluded middle in PA to derive P∨¬P, observe the statement is trivial in the P branch, and in the ¬P branch, we derive ¬□PA+¬P(⊥), which is stronger than □PA+¬P(⊥)→P.

So we have validly re-stated Löb's theorem, and the new statement is basically a statement that Gödel's second incompleteness theorem holds for PA+¬P.

## Proving Gödel's second incompleteness theorem using computability theory

The following proof of a general version of Gödel's second incompleteness theorem is essentially the same as Sebastian Oberhoff's in "Incompleteness Ex Machina". See also Scott Aaronson's proof of Gödel's first incompleteness theorem.

Let L be some first-order system that is at least as strong as PA (for example, PA+¬P). Since L is at least as strong as PA, it can express statements about Turing machines. Let Halts(M) be the PA statement that Turing machine M (represented by a number) halts. If this statement is true, then PA (and therefore L) can prove it; PA can expand out M's execution trace until its halting step. However, we have no guarantee that if the statement is false, then L can prove it false. In fact, L can't simultaneously prove this for all non-halting machines M while being consistent, or we could solve the halting problem by searching for proofs of Halts(M) and ¬Halts(M) in parallel.

That isn't enough for us, though; we're trying to show that L can't simultaneously be consistent and prove its own consistency, not that it isn't simultaneously complete and sound on halting statements.

Let's consider a machine Z(A) that searches over all L-proofs of ¬Halts(‘‘┌A┐(┌A┐)") (where ‘‘┌A┐(┌A┐)" is an encoding of a Turing machine that runs A on its own source code), and halts only when finding such a proof. Define a statement G to be ¬Halts(‘‘┌Z┐(┌Z┐)"), i.e. Z(Z) doesn't halt. If Z(Z) halts, then that means that L proves that Z(Z) doesn't halt; but, L can prove Z(Z) halts (since it in fact halts), so this would show L to be inconsistent.

Assuming L is consistent, G is therefore true. If L proves its own consistency, all this reasoning can be done in L, so L⊢G. But that means L⊢¬Halts(‘‘┌Z┐(┌Z┐)"), so Z(Z) finds a proof and halts. L therefore proves ¬G, but L also proves G, making it inconsistent. This is enough to show that, if L proves its own consistency, it is inconsistent.

## Wrapping up

Let’s now prove Löb’s theorem. We showed that Löb’s theorem can be re-written as, if PA+¬P⊢¬□PA+¬P(⊥), then PA+¬P⊢⊥. This states that, if PA+¬P proves its own consistency, it is inconsistent. Since PA+¬P is at least as strong as PA, we can set L=PA+¬P in the proof of Gödel’s second incompleteness theorem, and therefore prove this statement which we have shown to be equivalent to Löb’s theorem.

I consider this proof more intuitive than the usual proof of Löb’s theorem. By re-framing Lob's theorem as a variant of Gödel's second incompleteness theorem, and proving Gödel's second incompleteness theorem using computability theory, the proof is easy to understand without shuffling around a lot of math symbols (especially provability boxes).