# Ω 2

Personal Blog
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

MIRI's paper on robust cooperation in the one-shot prisoner's dilemma (by Patrick LaVictoire and others) models agents which know their opponents' source code through Gödel-Löb provability logic, a modal logic for reasoning about provability in Peano Arithmetic. This framework of "modal agents" has turned out to be a really useful way of reasoning about these agents. Vladimir Slepnev recently proposed reformulating UDT with a halting oracle in terms of provability logic as well, and I agree that this is the right tool for the job. This post is a primer on provability logic and the results about it that we've been using in our work.

# The language

Gödel-Löb provability logic ("GL") is propositional logic plus a modal operator, , which we interpret as "provable". Formulas may contain variables which denote propositions; for example is a formula saying that if the proposition is true, then it's provable that the proposition implies itself. We usually write for the proposition "true" and for the proposition "false".

Some conventions that we've been using: Use small letters (e.g. , ) to refer to propostional variables; use capital letters (e.g. , ) to refer to particular formulas; use , as metavariables ranging over formulas. (The line between "concrete formula" and "metavariable standing in for a formula" isn't hard-and-fast.) When a formula contains propositional variables, list them in parantheses when talking about the formula: e.g., .

The meaning of a closed formula (i.e., a formula without propositional variables) is given by a recursive translation to a sentence of Peano arithmetic: If the modal formula is translated to the arithmetic formula , then is translated to the arithmetic formula stating "the formula is provable in PA". (Propositional connectives are translated in the obvious way, e.g. is translated to .)

In practice, instead of making this translation explicit, we just abbreviate " is provable in PA" as in formulas of arithmetic, and leave it to the context to disambiguate whether we're talking about a modal or an arithmetic formula. Because of this, we may sometimes write instead of to emphasize the Gödel quotations in the arithmetic version of the formula.

# Soundness and completeness

The axioms and inference rules of GL can be stated in the usual style of a modal logic, but in practice, two basic results about GL provide a much more convenient way to reason about it:

• The soundness theorem states that if a formula is provable in GL, then for any closed formulas in the language of arithmetic, is provable in PA.
• The completeness theorem is the exact converse: It states that if is a formula of provability logic, and if for any closed formulas in the language of arithmetic, is provable in PA, then is provable in GL.

In summary:

This means that we can argue that something is provable in GL by arguing that the analogous thing is provable in PA. For example, Löb's theorem states that if , then . This implies that the analogous statement holds in GL: Suppose that . Then by soundness, for all , and hence by Löb, . But by completess, this implies .

Similarly, implies , again because the analogous statement holds about PA.

As a matter of fact, both of these happen to be an inference rules of GL. But the point is that you don't need to memorize the axioms and inference rules of GL in order to use it; it's sufficient to know about its soundness and completeness.

# Substitution of equivalent formulas

Despite the above, there is one important result about GL that is not obvious from soundness and completeness: If GL proves two formulas and equivalent, then you can substitute for in any other formula, even inside boxes; in other words, in this case, GL will prove that is equivalent to , for every formula .

# Fixed points

The other really important results about GL are the existence and uniqueness of fixed points.

We say that a formula is modalized in if all occurrences of in are inside boxes. We say that is fully modalized if it is modalized in each . (We don't need the latter notion in the rest of this post, but it's frequently used in applications.)

Suppose that is modalized in . Then there is a formula which is a fixed point of in the sense that Moreover, there's a computer program that computes given ; and if is modalized in , then it is modalized in as well.

More generally, this holds for multiple parameters and multiple formulas , . In that case, we can find formulas such that for every , GL proves that is equivalent to

Moreover, it can be shown that these fixed points are unique---not in the sense that there is only one set of formulas satisfying the above condition, of course (if works, so does : see the rule about substitution of equivalent formulas), but in the sense that if is a solution and is also a solution, then , .

This is true not only about solutions that can be written in the language of provability logic, but for any solution in the language of arithmetic: Suppose that , , and are all closed formulas of arithmetic, such that for , PA proves and . Then , for .

# Decidability

Provability in GL turns out to be decidable, though it's PSPACE-complete, which sounds pretty bad, but perhaps the formulas we're interested in are mostly so short that it doesn't matter---I don't know. In the case of modal agents, Mihaly and Marcello wrote program which computes what two such agents do against each other, and it was efficient enough that we were able to use it to verify the examples in the paper. I don't know whether the techniques would generalize to the application to UDT.

If you want to go dig deeper, you may want to consult the Stanford Encyclopedia of Philosophy entry on provability logic or Per Lindström's excellent article "Provability logic---a short introduction" (Theoria, 62(1-2):19--61, 1996). Gated copy.

Some of the results claimed here aren't in Lindström, though they follow straight-forwardly from what is there; I'll later add the proofs in comments. In the meantime, some of the additional stuff can be found in the modal agents paper.

# Ω 2

New Comment

Thread for proofs of results claimed above (not limited to stuff not found in Lindström); contributions appreciated. Stuff not in Lindström includes the uniqueness of arithmetic fixed points in PA (found in the modal agents paper), and I think the version of the fixed point theorem with more than one (the latter should be provable by iterated application of the fixed point theorem for a single ).

Generalized fixed point theorem:

Suppose that are modal sentences such that is modalized in (possibly containing sentence letters other than ).

Then there exists in which no appears such that .

We will prove it by induction.

For the base step, we know by the fixed point theorem that there is such that

Now suppose that for we have such that .

By the second substitution theorem, . Therefore we have that .

If we iterate the replacements, we finally end up with .

Again by the fixed point theorem, there is such that .

But as before, by the second substitution theorem, .

Let stand for , and by combining the previous lines we find that .

By Goldfarb's lemma, we do not need to check the other direction, so and the proof is finished

An immediate consequence of the theorem is that for those fixed points and every , .

Indeed, since is closed under substitution, we can make the change for in the theorem to get that .

Since the righthand side is trivially a theorem of , we get the desired result.

One remark: the proof is wholly constructive. You can iterate the construction of fixed point following the procedure implied by the construction of the to compute fixed points.

Uniqueness of arithmetic fixed points:

Notation:

Let be a fixed point on of ; that is, .

Suppose is such that . Then by the first substitution theorem, for every formula . If , then , from which it follows that .

Conversely, if and are fixed points, then , so since is closed under substitution, . Since , it follows that .

(Taken from The Logic of Provability, by G. Boolos.)