# 5

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Summary: although we can find an assignment of probabilities to logical statements that satisfies an outer reflection principle, we have more difficulty satisfying both the outer reflection principle and an inner reflection principle. This post presents an impossibility result.

# Introduction

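In probabilistic logic, we construct a $\mathbb{P}$ that assigns a probability to each logical proposition (in, say, the language of set theory plus a $\mathbb{P}$ symbol) in a way that satisfies some coherence axioms. Additionally, we can construct a $\mathbb{P}$ that assigns probability 1 to $\mathbb{P}$'s coherence and satisfies an "outer" reflection principle*:

$$\forall \varphi,\ \forall a, b \in \mathbb{Q}: \quad a < \mathbb{P}(\varphi) < b \implies \mathbb{P}\big(a < \mathbb{P}(\ulcorner \varphi \urcorner) < b\big) = 1$$

Using the contrapositive, the outer reflection principle equivalently states

$$\forall \varphi,\ \forall a, b \in \mathbb{Q}: \quad \mathbb{P}\big(a \le \mathbb{P}(\ulcorner \varphi \urcorner) \le b\big) > 0 \implies a \le \mathbb{P}(\varphi) \le b$$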
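As a sketch of why the contrapositive form follows, assume the outer reflection principle has the form given in the original probabilistic logic paper, with rational bounds and $\ulcorner \varphi \urcorner$ denoting the quoted sentence. The auxiliary rationals $c, d$ below are introduced here purely for illustration:

```latex
% Assumed form of the outer reflection principle (c, d rational):
%   c < P(phi) < d  ==>  P( c < P("phi") < d ) = 1.
\begin{align*}
&\text{Suppose } \mathbb{P}(\varphi) \notin [a, b]\text{; say } \mathbb{P}(\varphi) < a
  \text{ (the case } \mathbb{P}(\varphi) > b \text{ is symmetric).} \\
&\text{Pick rationals } c, d \text{ with } c < \mathbb{P}(\varphi) < d < a. \\
&\text{Outer reflection gives }
  \mathbb{P}\big(c < \mathbb{P}(\ulcorner \varphi \urcorner) < d\big) = 1. \\
&\text{Since } d < a \text{, the sentences }
  c < \mathbb{P}(\ulcorner \varphi \urcorner) < d \text{ and }
  a \le \mathbb{P}(\ulcorner \varphi \urcorner) \le b
  \text{ are provably incompatible, so by coherence} \\
&\qquad \mathbb{P}\big(a \le \mathbb{P}(\ulcorner \varphi \urcorner) \le b\big) \le 1 - 1 = 0. \\
&\text{Taking the contrapositive: }
  \mathbb{P}\big(a \le \mathbb{P}(\ulcorner \varphi \urcorner) \le b\big) > 0
  \implies a \le \mathbb{P}(\varphi) \le b.
\end{align*}
```

This covers only one direction of the equivalence. The strict inequalities become non-strict because the argument needs room for a rational $d$ strictly between $\mathbb{P}(\varphi)$ and the interval $[a, b]$.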

Note that the outer reflection principle is not stated within $\mathbb{P}$; instead, it is stated in a meta-language that talks about $\mathbb{P}$. To push reasoning about the system into the system itself, we might want an "inner" reflection principle:

There exists a coherent $\mathbb{P}$ assigning probability 1 to its own coherence and satisfying this inner reflection principle (but not the outer reflection principle)**. However, it has been previously proven that no coherent $\mathbb{P}$ satisfies both the outer reflection principle and this inner reflection principle.

# Impossibility result

In this post, we will show a more general result. No coherent $\mathbb{P}$ can satisfy the outer reflection principle while assigning positive probability to its own coherence and the inner reflection principle: that is, there is no coherent $\mathbb{P}$ satisfying the outer reflection principle and

Note that this version of the inner reflection principle plus self-coherence is weaker than

Proof:

Suppose such a coherent and exist. Construct a fixed-point statement

Also define as shorthand

We know that due to the inner reflection principle.

Suppose . Then:

Due to implying which implies :

which contradicts $\mathbb{P}$'s coherence. Therefore, we must have .

Since , it follows that . Equivalently, . By the outer reflection principle, , or equivalently , which is a contradiction when combined with .

Q. E. D.

# Conclusion

This post developed out of some work with Paul and Benja to try to create a probabilistic logic that can reason about reflective oracles and thereby reason about itself in a useful way. The idea would be that it would (indirectly) assign nonzero probability to its own reflection principle, so that we could gain more reflective power through repeated conditioning. Due to the impossibility result in this post, this cannot work. It might still be possible to construct a system satisfying some weaker inner reflection principle.

* This result is stronger than the one stated in the original probabilistic logic paper (which only proves that some coherent $\mathbb{P}$ satisfies the outer reflection principle, not that it also assigns probability 1 to its own coherence); its proof might be shown in a future post.

** This result has not been written up either.