# 2

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

MIRI's paper on robust cooperation in the one-shot prisoner's dilemma (by Patrick LaVictoire and others) models agents which know their opponents' source code through Gödel-Löb provability logic, a modal logic for reasoning about provability in Peano Arithmetic. This framework of "modal agents" has turned out to be a really useful way of reasoning about these agents. Vladimir Slepnev recently proposed reformulating UDT with a halting oracle in terms of provability logic as well, and I agree that this is the right tool for the job. This post is a primer on provability logic and the results about it that we've been using in our work.

# The language

Gödel-Löb provability logic ("GL") is propositional logic plus a modal operator, , which we interpret as "provable". Formulas may contain variables which denote propositions; for example is a formula saying that if the proposition is true, then it's provable that the proposition implies itself. We usually write for the proposition "true" and for the proposition "false".

Some conventions that we've been using: Use small letters (e.g. , ) to refer to propostional variables; use capital letters (e.g. , ) to refer to particular formulas; use , as metavariables ranging over formulas. (The line between "concrete formula" and "metavariable standing in for a formula" isn't hard-and-fast.) When a formula contains propositional variables, list them in parantheses when talking about the formula: e.g., .

The meaning of a closed formula (i.e., a formula without propositional variables) is given by a recursive translation to a sentence of Peano arithmetic: If the modal formula is translated to the arithmetic formula , then is translated to the arithmetic formula stating "the formula is provable in PA". (Propositional connectives are translated in the obvious way, e.g. is translated to .)

In practice, instead of making this translation explicit, we just abbreviate " is provable in PA" as in formulas of arithmetic, and leave it to the context to disambiguate whether we're talking about a modal or an arithmetic formula. Because of this, we may sometimes write instead of to emphasize the Gödel quotations in the arithmetic version of the formula.

# Soundness and completeness

The axioms and inference rules of GL can be stated in the usual style of a modal logic, but in practice, two basic results about GL provide a much more convenient way to reason about it:

• The soundness theorem states that if a formula is provable in GL, then for any closed formulas in the language of arithmetic, is provable in PA.
• The completeness theorem is the exact converse: It states that if is a formula of provability logic, and if for any closed formulas in the language of arithmetic, is provable in PA, then is provable in GL.

In summary:

This means that we can argue that something is provable in GL by arguing that the analogous thing is provable in PA. For example, Löb's theorem states that if , then . This implies that the analogous statement holds in GL: Suppose that . Then by soundness, for all , and hence by Löb, . But by completess, this implies .

Similarly, implies , again because the analogous statement holds about PA.

As a matter of fact, both of these happen to be an inference rules of GL. But the point is that you don't need to memorize the axioms and inference rules of GL in order to use it; it's sufficient to know about its soundness and completeness.

# Substitution of equivalent formulas

Despite the above, there is one important result about GL that is not obvious from soundness and completeness: If GL proves two formulas and equivalent, then you can substitute for in any other formula, even inside boxes; in other words, in this case, GL will prove that is equivalent to , for every formula .

# Fixed points

The other really important results about GL are the existence and uniqueness of fixed points.

We say that a formula is modalized in if all occurrences of in are inside boxes. We say that is fully modalized if it is modalized in each . (We don't need the latter notion in the rest of this post, but it's frequently used in applications.)

Suppose that is modalized in . Then there is a formula which is a fixed point of in the sense that Moreover, there's a computer program that computes given ; and if is modalized in , then it is modalized in as well.

More generally, this holds for multiple parameters and multiple formulas , . In that case, we can find formulas such that for every , GL proves that is equivalent to

Moreover, it can be shown that these fixed points are unique---not in the sense that there is only one set of formulas satisfying the above condition, of course (if works, so does : see the rule about substitution of equivalent formulas), but in the sense that if is a solution and is also a solution, then , .

This is true not only about solutions that can be written in the language of provability logic, but for any solution in the language of arithmetic: Suppose that , , and are all closed formulas of arithmetic, such that for , PA proves and . Then , for .

# Decidability

Provability in GL turns out to be decidable, though it's PSPACE-complete, which sounds pretty bad, but perhaps the formulas we're interested in are mostly so short that it doesn't matter---I don't know. In the case of modal agents, Mihaly and Marcello wrote program which computes what two such agents do against each other, and it was efficient enough that we were able to use it to verify the examples in the paper. I don't know whether the techniques would generalize to the application to UDT.

If you want to go dig deeper, you may want to consult the Stanford Encyclopedia of Philosophy entry on provability logic or Per Lindström's excellent article "Provability logic---a short introduction" (Theoria, 62(1-2):19--61, 1996). Gated copy.

Some of the results claimed here aren't in Lindström, though they follow straight-forwardly from what is there; I'll later add the proofs in comments. In the meantime, some of the additional stuff can be found in the modal agents paper.