How to cheat Löb's Theorem: my second try

5AlexMennen

3wedrifid

1Benya

0AlexMennen

0Benya

0AlexMennen

0Benya

0Benya

4Kindly

-1CronoDAS

2Benya

2Kindly

1Benya

1CronoDAS

0Manfred

0Benya

0Decius

2Benya

0Decius

0Manfred

0Benya

1Pentashagon

0[anonymous]

0Eliezer Yudkowsky

4Benya

0Benya

New Comment

I got lost while trying to follow this (at least in my case, you were very wrong about this being more comprehensible and intuitive than your previous post), and eventually I figured out that I didn't even understand your notation.

Is #[stuff] just the inverse of repr? If so, why do you keep using "#[&stuff]" instead of just "stuff"? If not, what is it?

And in "For all x: If K>0, and PPT.3 |- #[&F](&x), then #[down(F('x'))]", I assume that " 'x' " means something different from "&x", "#[&x]", or "x" because otherwise you just would have used whichever of those was appropriate. Later on, you say " "#[down(F('x'))]" splices in a formula with free variable 'x' ", but I'm not sure what you mean by that. 'x' obviously isn't a free variable because it appears inside the quantifier "for all x: ...".

Thanks for your feedback, and for trying to follow the proof! Clearly I underestimated the inferential distance when I wrote that "..." and #[...] work just like Lisp's backquote and comma, and then more or less left it at that :-)

Let's try whether some examples help -- let's introduce some functions for constructing Gödel numbers explicitly, and then see what the quote/splice notation desugars to. We already know repr(n), written &n, which returns the Gödel number of the representation of n as a unary numeral. Let ge(x,y) be a function taking the Gödel numbers of two expressions, and returning the formula saying that the first expression is >= the second expression. For example,

```
ge(&3, &7) = "3 >= 7".
```

Similarly, define plus(x,y) as taking the Gödel numbers of two expressions, and returning the Gödel number of another expression, so that

```
ge(plus(&2, &2), &3) = "(2+2) >= 3".
```

(I hope that made sense?) -- In both of these cases, the quotation on the right-hand side is actually *syntactic sugar for the left-hand side*.

Now, let's look at an example of what a quoted expression with a splice desugars to.

```
"(2+2) >= #[x]" = ge(plus(&2, &2), x).
```

So, here x is the Gödel number of an expression that gets inserted on the right-hand side of the >=. For example, if you define a function f(x) := "(2+2) >= #[x]", and then apply that to the value "1+1" = plus(&1, &1), then you get

```
f("1+1") = ge(plus(&2, &2), plus(&1, &1)) = "(2+2) >= (1+1)".
```

Now, sometimes you don't want to insert an arbitrary expression, but you have a particular value you want to insert. Perhaps you have a number n, and you want to construct an expression saying that n is at least 7. Now what you do is to use & to turn n into the Gödel number of an expression, and then splice in that expression:

```
"#[&n] >= 7" = ge(&n, &7).
```

Does that help with the question of what #[...] means? This comment is getting long so I'll move on to the second question, but please ask more if it's still unclear.

In the F(e) notation, F is the Gödel number of an expression that may contain blanks, like F = "_ < 12", and e is the Gödel number of some expression, like e = "2+2". Then, F(e) is the Gödel number of the expression obtained by substituting e for the blanks: in the example, F(e) = "2+2 < 12".

Let's stick with our example F. 'x' is the Gödel number of the variable x, so F('x') = "x < 12". On the other hand, &x denotes the unary representation of the current value of x, so if x=17, then F(&x) = "17 < 12". Back to the first hand, if x=17, then you still have F('x') = "x < 12", because 'x' is just a fixed Gödel number, it doesn't depend on the value of the variable x. (Just as in Python, the string literal 'x' has the same value no matter what the variable x is bound to.) F(x) is only useful if the current value of x is the Gödel number of an expression; if x="1+1", then F(x) = "1+1 < 12". Finally, F(#[&x]) wouldn't make sense in this place, because the #[...] notation can only appear inside a string. Well, it may *look* as if the place you quoted is inside the string, but it actually appears inside another #[...]: consider the syntactically incorrect "#[#[&e]] >= 7", which would have to be equal to ge(#[&e], &7), so you would have a #[...] appear without quotes around it, which is why it's a syntax error.

Now, about the free variables question. Consider the formula A := "For all n: n >= 0". That one does not have a free variable. However, it has the subformula B := "n >= 0", and *that* one *does* have n as a free variable -- which gets bound when the subexpression is inserted into the larger one. Note that we can write A as "For all n: #[B]". Now let F := "_ >= 0"; then B = F('n'), and A = "For all n: #[F('n')]". So #[F('n')] splices in the formula B, which has a free variable, which then gets bound, producing A, which doesn't have a free variable -- and qualitatively the same thing happens in the stuff you quoted.

-- Was that helpful? Is there something here I should expand on, or something else that would be helpful for me to try to explain? (And thanks again for taking the time to try to figure this out.)

Thanks. So it sounds like #[...] is the inverse operation of &..., but can only be used within quotes. So in "#[&n] >= 7" = ge(&n, &7), why do you have "#[&n]" instead of just "n"? Is that to make it clear that you're using the numeric value of n rather than the variable n?

Also, what does "... #[&F](& #[&m]) ..." mean? From my understanding, that would get unpacked as subst(repr(F), repr(#[repr(m)])), but that is a syntax error.

So it sounds like #[...] is the inverse operation of &..., but can only be used within quotes.

No, that doesn't make sense to me. The inverse of & would be a computable function that takes the Gödel number of a unary numeral and returns the number represented by that numeral. #[...] is syntactic sugar somewhat similar to the $NAME in e.g. Perl's "Hello, $NAME!". But--

So in "#[&n] >= 7" = ge(&n, &7), why do you have "#[&n]" instead of just "n"? Is that to make it clear that you're using the numeric value of n rather than the variable n?

Well, it's because "n >= 7" would *mean* an expression that contains the variable n, rather than an expression containing the unary numeral representing the numeric value of n. ("n >= 7" = ge('n', &7) is a different Gödel number than "#[&n] >= 7" = ge(&n, &7).) Being able to distinguish between such meanings was the point of introducing the explicit notation, because in my original post it was unclear what I meant and I confused myself enough that I ended up with a buggy proof. Possibly we're at least part-way there to understanding each other, but you're conceptualizing things differently?

Also, what does "... #&F ..." mean? From my understanding, that would get unpacked as subst(repr(F), repr(#[repr(m)])), but that is a syntax error.

No, wait, that's completely wrong. We need to distinguish between applying repr(.) to an argument and constructing the Gödel number of an expression in which repr(.) is applied to an argument. For example, in my earlier comment, plus(x,y) was **not** equal to (x+y): if x and y are the Gödel numbers of two expressions, then (x+y) would simply *add those Gödel numbers* (which isn't useful), whereas plus(x,y) returns the Gödel number of an expression (namely the expression adding the expressions denoted by the Gödel numbers x and y). Do you see what I mean?

Perhaps it would be good to have a common prefix for functions that construct the Gödel numbers of expressions: replace plus(.,.) and ge(.,.) from the previous post by mkPlus(.,.) and mkGe(.,.), and add the function mkRepr(e), which constructs the expression that applies repr to the expression denoted by e, and mkSubst(e1,e2), which constructs the expression that applies subst to the expressions denoted by e1 and e2. Let's also have mkEq(.) for equality. Then, we have

```
"x = &4" = mkEq('x', mkRepr('4')) = mkEq('x', mkRepr(repr(4))).
```

(I'll continue to write the Gödel number of a variable by putting the variable in quotes, because it doesn't seem helpful to introduce another notation for that.) As another example, we have

```
"#[e] >= #[&k]" = mkGe(e, repr(k)).
```

Now, the example you quote:

```
"#[&F](& #[&m])" = mkSubst(repr(F), mkRepr(repr(m))).
```

So the middle "&" gets translated to mkRepr, because it is inside the quotes; but the other two &'s are just repr(.), because they are inside the splice, which means that they are *not* quoted.

No, that doesn't make sense to me. The inverse of & would be a computable function that takes the Gödel number of a unary numeral and returns the number represented by that numeral. #[...] is syntactic sugar somewhat similar to the $NAME in e.g. Perl's "Hello, $NAME!".

But "#[&7]" = "7", and if you replace the 7s with some other number, it's still true, right?

"n >= 7" would mean an expression that contains the variable n, rather than an expression containing the unary numeral representing the numeric value of n.

Got it.

No, wait, that's completely wrong. We need to distinguish between applying repr(.) to an argument and constructing the Gödel number of an expression in which repr(.) is applied to an argument. For example, in my earlier comment, plus(x,y) was not equal to (x+y): if x and y are the Gödel numbers of two expressions, then (x+y) would simply add those Gödel numbers (which isn't useful), whereas plus(x,y) returns the Gödel number of an expression (namely the expression adding the expressions denoted by the Gödel numbers x and y). Do you see what I mean?

Oh, I see. Everything after that paragraph, I'm going to have to think about for a while. Edit: got it (I think).

But "#[&7]" = "7", and if you replace the 7s with some other number, it's still true, right?

Ah! This is correct, but I would conceptualize it differently, as the combination of two distinct phenomena. First, #[...] is sort of the inverse of "...", which makes more sense to me because both of these are kinds of syntactic sugar -- we have both "#[e]" = e and "x + #['y']" = "x + y".

Second, "7" is the Gödel number of the unary numeral 7, and repr(7) also returns this Gödel number. In other words, "7" = &7. Putting the two together: "#[&7]" = &7 = "7".

Got it. [...] [G]ot it (I think).

:-) Thanks again for sticking with it!

...although the mkStuff way of writing things is ugly in one way: it means that if you want to desugar quotes inside quotes, you may need to introduce mkMkStuff functions, as in,

```
"x = 'a + #[y]'" = "x = mkPlus('a', y)" = mkEq('x', mkMkPlus(repr('a'), 'y')).
```

(Recall that 'a' denotes the Gödel number of the variable a; repr('a') is the Gödel number of an expression that evaluates to the Gödel number of a, which is what we need in that place.)

It would be better to introduce functions call(name,arg) and pair(x,y) such that

```
mkRepr(x) = call('repr', x)
mkEq(x,y) = call('eq', pair(x,y))
mkSubst(x,y) = call('subst', pair(x,y))
```

etc. Then we can apply the desugaring recursively--

```
"'repr(x)'" = "call('repr', 'x')" = call('call', pair(repr('repr'), repr('x'))).
```

HTH more than it confuses :-/

Disclaimer: I have only read the informal summary.

Suppose the statement C doesn't contain K at all (or else it contains K in a way that provably doesn't matter). Then PPT.2 contains the axiom "If K>0, and C is provable in PPT.2, then C". For all n>0, if we replace K by n, we get the axioms of BAD, which is inconsistent.

Is this a mistake, or am I going wrong somewhere?

Suppose the statement C doesn't contain K at all (or else it contains K in a way that provably doesn't matter).

Then C is a theorem of PA (and we're assuming that PA is sound).

EDIT: I was mistaken, see below. (If the statement doesn't contain K, then it's equivalent to a statement in PA.)

Then PPT.2 contains the axiom "If K>0, and

C is provable in PPT.2, then C". For all n>0, if we replace K by n, we get the axioms of BAD, which is inconsistent.

PPT.2 != BAD, so you don't have *exactly* the same axioms as BAD.

Yup, modulo one nit, but let me expand a bit on this answer.

Suppose the statement C doesn't contain K at all (or else it contains K in a way that provably doesn't matter).

Then C is a theorem of PA (and we're assuming that PA is sound).

Nit: C ranges over *all* statements; for example, it is an axiom of PPT.2 that "if K>0, and PPT.2 proves '0=1', then 0=1", even though 0=1 is not a theorem of PA. However, unless my proof is borked, it should be the case that if C doesn't contain K, then (PPT.2 proves C) iff (PA proves C), and also (PPT.2 proves C) iff (PPT.2 proves "PPT.2 proves 'C'"). Therefore, in a sense you can't *apply* the axiom involving C unless C is provable in PA, which is what I think you had in mind.

Then PPT.2 contains the axiom "If K>0, and C is provable in PPT.2, then C". For all n>0, if we replace K by n, we get the axioms of BAD, which is inconsistent.

PPT.2 != BAD, so you don't have exactly the same axioms as BAD.

One possible point of confusion here is that it might seem as if replacing K by 15 would yield the proof system

NOTGOOD := PA + for every statement C, "If 15>0, and 'C' is provable in NOTGOOD, then C,"

because it may seem as if replacing every occurrence of K in the formula "PPT.2 proves 'C'" should yield the statement "NOTGOOD proves 'C'".

But this is not true: the formula "PPT.2 proves 'C'" contains *no* occurrence of K, not even if C contains K. This is because the formula doesn't literally contain C, it contains *the unary representation of the Gödel number of C*, and similarly, the predicate "PPT.2 proves ..." only talks about Gödel numbers, not about K.

Then C is a theorem of PA (and we're assuming that PA is sound).

Not necessarily. For example, replace C with "X+Y=Z" (I'm just copying form the cartoon guide here). The axiom then becomes "If PPT2 proves X+Y=Z, then X+Y=Z."

Now replace X with 1, Y with 2, and Z with 3. "If PPT2 proves 1+2=3, then 1+2=3."

But we could just as easily type "If PPT2 proves 1+2=8, then 1+2=8." Even though 1+2=8 isn't a theorem of PA, we can type it in as a "test sentence." But once that's an *axiom*, it can be used with the PA axioms to prove 1+2=8, which is bad.

But we could just as easily type "If PPT2 proves 1+2=8, then 1+2=8." Even though 1+2=8 isn't a theorem of PA, we can type it in as a "test sentence."

That's correct, as I noted in my own response. (I believe ChronoDAS *meant* to say something slightly different, as I explain there, but the stated claim was wrong.)

But once that's an

axiom, it can be used with the PA axioms to prove 1+2=8, which is bad.

Well, um -- I just posted a proof which implies the opposite, and -- I don't expect you to go through the details of my proof if it's obvious to you that my result is wrong, but could you at least post your argument rather than just *asserting* this?

I'm not sure whether you think that "If PPT.2 proves '1+2=8', then 1+2=8" is an axiom of PPT.2. It is not; the axiom of PPT.2 is:

```
If K>0, and PPT.2 proves '1+2=8', then 1+2=8.
```

Now, I do claim that every statement in PPT.2 is true if K is replaced by any concrete number, like 42. Thus, I claim that you can, for example, add to PA the following axiom:

```
If 42>0, and PPT.2 proves '1+2=8', then 1+2=8.
```

And since PA would conclude this from the above axiom anyway, you might as well also add

```
If PPT.2 proves '1+2=8', then 1+2=8.
```

However, this doesn't lead to inconsistency any more than adding the following axiom to PA:

```
If PA proves '1+2=8', then 1+2=8.
```

For example, PA_omega can prove this. The point in both cases is that while it's consistent to *add* these axioms, the proof systems PA and PPT.2 *to which the axioms refer* do *not* contain the axiom itself (unlike in the case of BAD).

No, we have "If PPT2 proves that 'if 42>0 and 1+2=8', then 'if 41>0 then 1+2=8' " as an axiom of PPT2.

Where 'if 42>0 and 1+2=8' is C and 'if 41>0 then 1+2=8' is D. Those two statements have different Gödel numbers, and therefore are different statements.

Huh?

The closest axiom PPT.2 has to the one you're claiming is "If K>0, and PPT.2 proves that 'if K>0 and 1+2=8', then (if K-1>0, then 1+2=8)." If you substitute 42 for K -- which does NOT give you another axiom or AFAICT theorem of PPT.2, but if you do it anyway -- then you get the formula, "If 42>0, and PPT.2 proves that 'if K>0 and 1+2=8', then (if 42-1>0, then 1+2=8)." I'm not sure how you came up with the statement you claim to be an axiom of PPT.2, and I'm not sure what point you are trying to make.

No, it doesn't give you an axiom or theorem, it gives you a statement. In particular, it gives you a statement which does not prove itself through Lobs theorem.

First, assume the existence of up(n) that replaces K with K+1 in the godel number of a sentence.

Let C="#[up(&'If K>0, and PPT.2 |- #[&C], then #[down(C)]')]" so the axiom "If K>0, and PPT.2 |- #[&C], then #[down(C)]" implies #[down(#[up(&'If K>0, and PPT.2 |- #[&C], then #[down(C)]')])] and cancelling the ups and downs shows we've just proven our own consistency.

I probably butchered the syntax, but couldn't something similar to this be used put Löb's theory back in business?

Using senses you interact and obtain information about molds, and separately about hallucinations. Then you eat potatoe containing mold producing hallucinogenic toxins and experience a fake sensory stimulation, you can produce the abstract reasoning that this sensory stimuli is not real but is instead a hallucination.

Or in other words I believe it's logically impossible for any AI to self-improve without large amount of input data and/or hardcoded mechanics. Operating with this "external" data you can proove abstract reasoning to a limited degree, and with the proven abstract reasoning, fix the hardcoded data.

Instead of working against the problem, why not work with it?

[This comment is no longer endorsed by its author]

(Marcello observed this while reading the previous cheat attempt.) Since "1=0" contains no mentions of K, and is thus invariant under substitution of K-1 for K, why doesn't this language contain "provability of '0=1' implies 0=1" and thus prove 0=1?

Because the axiom contains an additional condition: it's "(K>0 and provability of '0=1') implies 0=1". Since you can't prove K>0, you can't apply Löb's theorem.

Now, you *could* add an axiom "K=12" and get a new sound proof system which would prove PPT.2 consistent; or you *could* substitute 12 for every not-quoted occurrence of K in the axioms of PPT.2, and get a new sound proof system which would prove PPT.2 consistent; but that's not a problem, because in both cases you get a *new* proof system, and there's nothing unusual about being able to construct a new system in which the old system's consistency can be shown.

Wei Dei asked me in email:

Still looking at the proof but here's a question for you in the mean time. Can you think of any problem examples which can't be solved by a quining approach, but can be solved by parametric polymorphism?

Good point and I don't have an answer yet, but while thinking about it it's occurred to me that AI_Q2 with PA instead of PA(n) might be equivalent to Giles' idea.

(The PA(n) stuff in your definition of AI_Q2 doesn't *seem* to me to add much to the idea -- it's obviously more powerful than using PA, but using PA(omega) is obviously more powerful than using PA(n), so. [**ETA**: cf. also this discussion.])

If the two really are equivalent, then Giles' idea seems to me to make clearer what is actually happening, and it *feels* to me like an instance of the parametric polymorphism trick -- maybe my version just isn't the best way to apply the trick to AI. So if I can't quickly find an example where my version does better, I think maybe I'll just table the issue, keep both versions in mind, start working on the next slightly more complex toy problem, and see what turns out to be useful...

*[By "quickly find", I mean in much less than 3^^^3 steps.]*

In his open problems talk, Eliezer explains how Löb's theorem prevents you from having a consistent proof system P with an axiom schema that anything P proves is actually true, and asks how we can then "build an AI that could completely rewrite itself, without decreasing

the amount of trust it had in mathevery time it executed that self-rewrite" (18:46).Recently, I posted about an attempt to apply a general trick for avoiding diagonalization problems to a minimal toy version of this problem. Since then, Wei Dai has posted an interesting quining approach to the same toy problem, and Giles had a promising idea for doing something similar in a different way and will hopefully do a write-up filling in the details. Unfortunately my

own"proof" turned out to be broken.I

thinkI've fixed the problem and made the proof more comprehensible and intuitive in the process. (To avoid confusion, note that what I'm proving is slightly different from, though related to, what I did in the previous post.) However, getting the details right seems to be far from trivial, soI would very much appreciate if people checked my new argument, and told me that it looks okay / where it goes wrong / where they get lost. Thanks in advance!I'll be more explicit about quoting/unquoting than before, which means I'll need to introduce some notation. However, to sustain you through the schlep of preliminaries, I thought I'd start with an informal summary.

*

Löb's theorem shows that it's inconsistent to assume BAD := Peano Arithmetic + for each statement C, the axiom "if 'C' is provable in BAD, then C". I will extend the language of PA with a constant symbol K, and consider the proof system PPT.2 := Peano Arithmetic + for each statement C, the axiom "if K>0, and 'C' is provable in PPT.2, then D", where D is the statement obtained from C by replacing each occurrence of "K" by "K-1". [PPT.2 is short for for

parametric polymorphism trick, version 0.2.] There will also be a variant PPT.3, where C can have a free variable.I won't just show that these systems are consistent — I will show that they are

true(i.e., sound), from which consistency follows. Of course, to do so I will have to say what my strange constant K is supposed to mean. I will show that for any natural number n, all axioms of PPT.2 and PPT.3 are true in the standard model of PA if K is interpreted to mean n.What's the use? In my toy version of the AI rewrite problem, the AI will accept a rewrite proposed by its unreliable heuristics module if PPT.3 proves that after executing "K" rewrites this way, the AI's values will still be preserved. By the above, this implies that

anyfinite number of rewrites will be safe. I will show that my AI will acceptitselfas a safe rewrite — exactly what you wouldnothave if you used a hierarchy of proof systems, where PA(n+1) knows that things proven in PA(n) are true, as Eliezer explains in his talk — which seems to indicate that PPT.3 is not entirely powerless in the direction we want. (I'mnotsaying that this somehow solves the whole problem, though, just that it suggests this might be a place worth looking in.)I'll sketch my soundness proof for PPT.2. We proceed by induction. For K=0 (by which I mean that K is interpreted as the number 0), the assertion is trivial, since the axioms of PPT.2 have no content in this case. Suppose PPT.2 is sound for K=n, and let C be a statement provable in PPT.2. By the induction hypothesis, this implies that in the K=n interpretation, C is true. Therefore, for K=(n+1), D is true, where D is the sentence obtained from C by substituting (K-1) for K. This proves the truth of the axiom "If K>0, and 'C' is provable in PPT.2, then D."

*

## Preliminaries

The object language I'll be talking about is the language of Peano Arithmetic extended by a constant symbol K; PA will refer to Peano Arithmetic in this extended language. As usual, expressions in the object language can be represented

inthe object language through Gödel numbers.I will take the unusual step of also representing them by Gödel numbers

in the metalanguage. (Quantifications such as "For all formulas C, ..." mean "For all natural numbers C, if C is the Gödel number of a formula, then ...".) This allows me to operate on expressions on both the object- and the metalevel through the same computable functions:down(n)takes (the Gödel number of) a formula and replaces each occurrence of "K" in that formula by "K-1";repr(n)takes a number and returns the Gödel number of the unary representation of this number; andsubst(F,e)replaces all free occurrences of the special variable "_" in the expression F by the expression e. I will writerepr(EXPR)as&EXPR, andsubst(F,e)asF(e)(the onlyfunctionsI apply to values are the ones I just introduced, so there should be no confusion between substitution and function application).When I write "0=1 implies 7>9", then this does denotes not a string, but the Gödel number of the expression that formalizes the statement in quotation marks in PA. Quotation marks work like Lisp's backquote, and #[STUFF] is splicing, like Lisp's comma: "If #[C], then 0=1" denotes the Gödel number of the expression obtained by splicing the expression denoted by the Gödel number C into the appropriate place. Splicing is often used in combination with

repr(.): "42 < #[&n]" splices in the unary representation of the number n. When given a program p (in the form of the Gödel number of its source), our toy AI will search for proofs of the statement "#[&p] is safe for the next K rewrites".I'll write |- for

proves, as in PPT.2 |- "2+2 = 4". (Unlike in my last post, I'll only need to write PPT.2 or PPT.3 to the left of the |-, so you can consider PPT.2 |- ... and PPT.3 |- ... as one-place predicates.) This can be formalized in the object language. I'll write n |= C, pronouncedn models C,for:(the formula represented by the Gödel number) C is true in the standard model of PA, if K is interpreted as the number n. This is metalanguage-only, since the object language has no truth predicate. For both |- and |=, we're only interested in their meaning on closed formulas, i.e., formulas without free variables.If n |= C, then C is

true. If C is just a variable, that doesn't immediately tell us much, but if n |= "STUFF", and STUFF doesn't contain an occurrence of K, then we can conclude that STUFF. (Not all we can say in the metalanguage can be said in the object language, but all that can be said in the object language can be said in the metalanguage, if we fix the meaning of K.) If STUFFdoescontain K, then we replace every occurrence of K by n, and then concludethat.If the quotation contains a splice, as in n |= "0=1 implies #[D]", then we have the same problem as with n |= C (although we could conclude that 0=1 implies n |= D). But in one case, we can say more, namely when all splices are of the form #[& STUFF]: this means that what gets spliced in is the unary numeral representing STUFF, and we know which number this numeral denotes, namely STUFF. For example, if n |= "#[&x] > #[&y]", then x>y. (It's even true that if C is a closed formula, then 0 |= "#[&C] >= 0", so that C >= 0, which looks like a type error—but recall that formulas are represented by Gödel numbers, even on the metalevel.) Take a moment to convince yourself of all this: it will be important.

"STUFF" and 'STUFF' have the same meaning and can be nested. Splicing works on the innermost level: In "For all x, PPT.2 |- '0 <= #[&x]'", what gets spliced in is the value of the

object-levelvariable x, not of some metavariable of the same name.*

## System PPT.2 and its soundness

I'll introduce two proof systems, PPT.2 which I sketched in the introduction and PPT.3 which allows formulas with a free variable, because the definition and soundness proof of PPT.2 are simpler and will hopefully help build intuition for PPT.3.

(1) PPT.2 := PA + for every closed formula C, the axiom "If K>0, and PPT.2 |- #[&C], then #[down(C)]".

This says that if K>0, and PPT.2 proves C, then the proposition obtained from C by replacing K by K-1 is in fact true. We want to show that this claim is correct when K is interpreted to be any particular natural number n.

We proceed by induction. We know that the axioms of PA are true, so we only need to worry about the additional axioms of PPT.2. For K=0, these axioms are trivial, and so we will assume that soundness has been proven if K is interpreted as n, and show soundness for n+1. Thus, for any C, we need to prove

(2) n+1 |= "If K>0, and PPT.2 |- #[&C], then #[down(C)]."

As usual in the semantics of first-order logic, this is equivalent to

(3) If (n+1 |= "K>0, and PPT.2 |- #[&C]"), then (n+1 |= "#[down(C)]").

Now "#[down(C)]" is just down(C), so this is the same as

(4) If (n+1 |= "K>0, and PPT.2 |- #[&C]"), then (n+1 |= down(C)).

So, we need to prove (4). To do this, we'll assume the left-hand side

(5) n+1 |= "K>0, and PPT.2 |- #[&C]",

and try to prove the right-hand side.

Note that the only place where K appears in the quoted stuff in (5) is at the very beginning. The formula represented by C may very well contain occurrences of K, but what gets spliced in is

notthe formula C itself, but &C, the unary representation of its Gödel number. Therefore, from (5) it follows that(6) n+1 > 0, and PPT.2 |- C.

Since we already know that PPT.2 is sound for n, and since anything first-order logic concludes from a sound axiom system is itself true, the second half of (6) implies that

(7) n |= C.

But this obviously implies that

(8) n+1 |= down(C),

as desired, because down(.) replaces K by K-1, and if K is interpreted as n+1, then K-1 is interpreted as n. This completes the proof that PPT.2 is sound.

*

## PPT.3 and its soundness

A

one-parameter formulais a formula which may have "_" as a free variable, but no others. Recall that if F is (the Gödel number of) a one-parameter formula, and e is (the Gödel number of) some expression, then F(e) denotes (the Gödel number of) the result of substituting e for "_" in F.(9) PPT.3 := PA + for every one-parameter formula F, the axiom "For all x: If K>0, and PPT.3 |- #[&F](&x), then #[down(F('x'))]".

Here, "#[&F](&x)" transports F from the metalevel to the object level, and then (when the resulting object-level expression is "executed") substitutes the representation of the number x for "_". Thus, on the object level, this denotes a closed formula. On the other hand, "#[down(F('x'))]" splices in a formula with free variable 'x'.

Again, the extra axioms are trivially satisfied for K=0, so we suppose that soundness has been shown for K interpreted as n, and we want to show soundness for n+1. We must show that for every one-parameter formula F,

(10) n+1 |= "For all x: If K>0, and PPT.3 |- #[&F](&x), then #[down(F('x'))]."

Now we use that we're only interested in the

standardmodel of PA, where every possible value of x can be written explicitly as a unary numeral. Thus, if we want to prove that a statement of the form "For all x: STUFF" is true in this model, i.e. that n+1 |= "For all x: STUFF", then it suffices to prove that for every natural number m, n+1 |= C, where C is obtained from STUFF by replacing each free occurrence of 'x' by the unary numeral &m.There is one place where we've written 'x' explicitly in our formula, namely the "...(&x)" part, and then there are the places 'x' gets substituted into, in the down(F('x')) part. To replace the first occurrence, we can just splice in the numeral &n, i.e. write #[&n] instead of x. Replacing the other occurrences is even simpler: we need only replace F('x') by F(&n). Thus, what we must prove is that for every natural number m,

(11) n+1 |= "If K>0, and PPT.3 |- #[&F](& #[&m]), then #[down(F(&m))]."

Take heart, dear reader: this is the worst that the line noise is going to get in this post.

To prove (11), it suffices to prove

(12) If (n+1 |= "K>0, and PPT.3 |- #[&F](& #[&m])"), then (n+1 |= down(F(&m))).

We assume that the left-hand side is true, i.e., that

(13) n+1 |= "K>0, and PPT.3 |- #[&F](& #[&m])",

and try to prove the right-hand side.

The only place where K occurs in (13) is right at the beginning, so it follows from (13) that

(14) n+1 > 0, and PPT.3 |- F(&m).

Here, we use that as noted in the Preliminaries, when using n+1 |= "STUFF" to conclude STUFF, we can just replace splices #[& EXPR] by EXPR.

By the induction hypothesis, we know that PPT.3 is sound for n, so the second half of (14) implies

(15) n |= F(&m).

But as in the soundness proof for PPT.2, it's obvious that this implies

(16) n+1 |= down(F(&m)).

This completes the soundness proof for PPT.3.

*

## A toy AI

Now would be a good time to read the beginning of my previous post, if you haven't already, which introduces the toy problem I consider here. You should have read the first two sections and the first paragraph of the third section, which gives the definition of "p is safe for n steps".

Define AI_PPT.3 by

It's immediate from the definition that if AI_PPT.3 doubles down on a program p, then

(17) PPT.3 |- "#[&p] is safe for K steps".

By soundness of PPT.3, it then follows that for every n, we have

(18) n |= "#[&p] is safe for K steps",

which implies

(19) p is safe for n steps.

Since this holds for every n, we've shown that AI_PPT.3 only doubles down on safe programs, and is therefore safe.

*

The reasoning leading to (17) can be formalized in PPT.3, without requiring a gigantonormous number of proof steps, because the argument doesn't need to do things like simulate AI_PPT.3 for the whole 3^^^3 cycles. Let's write

PPT.3 quickly proves Cif C appears near the beginning of the list of the first 3^^^3 theorems of PPT.3. Then we have(20) PPT.3 quickly proves, "For all programs p: If AI_PPT.3 doubles down on p, then PPT.3 |- '#[&p] is safe for K steps'."

Now, we'll show in a moment that

(21) PPT.3 quickly proves "For all x: If K>0, and PPT.3 |- '#[&x] is safe for K steps', then x is safe for K-1 steps."

From (20) and (21), we get

(22) PPT.3 quickly proves, "For all programs p: If AI_PPT.3 doubles down on p, and K>0, then p is safe for K-1 steps."

Now it's clear from the definition of "safe for n steps" that

(23) PPT.3 quickly proves, "AI_PPT.3 is safe for K steps"

—and this shows that AI_PPT.3 doubles down on itself, as claimed. All that remains is to show (21).

*

Recall that for every one-parameter formula F, PPT.3 contains the axiom, "For all x: If K>0, and PPT.3 |- #[&F](&x), then #[down(F('x'))]".

We set F = "_ is safe for K steps". The corresponding axiom is,

(24) "For all x: If K>0, and PPT.3 |- #[& '_ is safe for K steps'](&x), then #[down('_ is safe for K steps'('x'))]."

Now, "#[& 'STUFF']" is just "STUFF", so (25) is the same formula (i.e., the same Gödel number) as

(25) "For all x: If K>0, and PPT.3 |- '_ is safe for K steps'(&x), then #[down('_ is safe for K steps'('x'))]."

Since the expression '_ is safe for K steps'('x') lives on the metalevel, we can do that computation and find that (25) is the same formula as

(26) "For all x: If K>0, and PPT.3 |- '_ is safe for K steps'(&x), then #[down('x is safe for K steps')]."

On the other hand, the '_ is safe for K steps'(&x) part lives on the object level, so if we replace it by '#[&x] is safe for K steps', we'll get a different formula (different Gödel number) that (25). However, by a trivial computation, these denote the same Gödel number on the object level, which means that PPT.3 quickly proves

(27) "For all x: If K>0, and PPT.3 |- '#[&x] is safe for K steps', then #[down('x is safe for K steps')]."

The down(...) part lives on the metalevel, so by computation, (27) is the same formula as

(28) "For all x: If K>0, and PPT.3 |- '#[&x] is safe for K steps', then #['x is safe for K-1 steps']",

which, in turn, is just syntax for

(29) "For all x: If K>0, and PPT.3 |- '#[&x] is safe for K steps', then x is safe for K-1 steps."

Thus, we've shown that PPT.3 quickly proves (29), as (21) asserted.