# 2

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

In a comment on my post on topological truth predicates, Paul suggests an approach that uses probabilistic oracle machines instead, in order to make this work more comprehensible to computer scientists. I like this idea!

Paul sketches a framework developed by him and Jessica Taylor, based on a conversation with Scott Aaronson; in this post, I propose a slight simplification of their framework. My version has an oracle , which takes the source code of a probabilistic oracle machine and a . If for every possible oracle , halts with probability one and outputs either or , then : (i) returns "true" if the probability that returns is ; (ii) returns "false" if it is ; (iii) randomly returns "true" or "false" if it is exactly . If is not of this form, the oracle always returns "false". I think this is sufficient for all the applications I have in mind---in particular, for defining a variant of AIXI that is able to reason about worlds containing other, equally powerful variant AIXIs (but many details remain to be checked).

To be clear, by "probabilistic oracle machine" I mean a probabilistic Turing machine with access to our oracle, which itself behaves probabilisticly. Let's write for the set of all rational probabilities, and let's consider an oracle to be a function , such that specifies the probability that the oracle will return "true" when invoked on the pair . (If the oracle is invoked multiple times on the same arguments, each of these is an independent random event.) The function plays a similar role in this framework to the function from the topological truth predicates post. Feedback on notation and terminology is encouraged---my exposure to theoretical CS is limited.

Let's define a query to be a probabilistic oracle machine such that for every oracle , halts with probability one and outputs either or . If is a query, let be the probability that outputs . Then our condition above can be re-stated as follows:

(i) If , then . (ii) If , then . (iii) If , then is arbitrary. (iv) For any which is not the Gödel number of a query, .

The last of these conditions is a bit of a devious trick; it gives the machine access to a fairly powerful halting oracle, which can e.g. check whether a probabilistic Turing machine halts with probability one. (Given such a machine , construct the machine which runs and outputs if halts; then if halts with probability , is a query and , but if halts with probability , is not a query and .) I think that I need this in my variant of AIXI in order to filter out "world models" which don't necessarily halt, and I think this will be enough to do so, but I'll leave working out the details to a later post.

At the end of this post, I'll prove that a with these properties exists. But first, let me sketch how it can be used.

Suppose that we have an agent which is trying to choose between two actions, and , and suppose that the agent has two probabilistic oracle machines, and , which represents the agent's model of what will happen in the world if it chooses action or action , respectively. (This doesn't mean that the agent has to have perfect knowledge about how the world works---since these are probabilistic machines, they can represent probability distributions over the laws of physics.) We could think of and as outputting information about things that happen in the world, which we can then plug into a computable utility function, but it will be easier to think of them as directly returning a utility.

Our goal, then, is for our agent to choose if the expectation of the utility returned by is greater than the expectation of the utility returned by .

Utilities are real numbers (and if we consider infinite-horizon problems with temporal discounting, rational utilities won't suffice), so we need a way for probabilistic machines in our framework to represent probability distributions over reals. Let's say that an machine "outputs the real number " if it outputs a sequence of nested closed intervals with endpoints in , converging to the single point . We'll assume that and output real numbers in with probability one, for every oracle . I'll write for the expected value of the number returned by , if is such an oracle machine (leaving the oracle implicit in the notation).

We want to find an oracle machine that will output if , output if , and output either or if the expectations are equal. Note that in our representation, we can write machines that compute e.g. the sum or difference of the numbers computed by two other machines; and note that is equivalent to . We have ; thus, if we could find a query such that returns with probability , we could simply call our oracle on , and take action if the oracle returns "true", action otherwise.

We can implement as follows. Set and , and initialize a run of . Then, for every : run until it returns its next ('th) interval, with lower bound and upper bound . Then, with probability , halt and output ; with probability , halt and output ; otherwise, continue looping.

Since with probability one, outputs a sequence of intervals that converge to a single point, halts with probability one and outputs either or , whatever oracle we run it on; i.e., it is a query. We can describe its behavior as sampling a random number from the distribution , and then outputting with probability and with probability . When looked at it this way, it's clear that when run on , this machine will output with probability , as desired.

Let's now proceed to the proof that an appropriate exists. This is analogous to the proof in the topological truth predicates post. (However, while we'll still need to use the product topology and the infinite-dimensional generalization of Kakutani’s fixed point theorem, this proof won't require knowledge of Grothendieck universes, set theory, or formal logic.)

Recall that an oracle is an arbitrary function . We say that an oracle reflects another oracle if for every and every query , we have if and if , and for any that is not the Gödel number of a query, we have . In interpretation, reflects if it gives correct information about the probability that a query returns when it's executed with oracle .

An oracle is called reflective if it reflects itself; reflective oracles give answers which are true about the oracle itself.

Our main goal in this post is just to show that reflective oracles exist. However, later I'll want to show that for every Nash equilibrium of a finite game, there is a reflective oracle which makes the players of that finite game play that Nash equilibrium, and in order to do so, it will be helpful to have a slightly stronger statement, which shows that there are reflective oracles satisfying certain constraints.

These constraints will take the form of a partial function , which specifies the values  should take on a certain set . We call a function of this type a partial oracle, and say that extends  if for all . We call a partial oracle reflective if for every oracle extending , there is an oracle which reflects and which also extends .

With these preliminaries, we can state our existence result for reflective oracles:

Theorem.

(i) There is a reflective oracle . ​ (ii) For every reflective partial oracle , there is a reflective oracle  which extends .

Proof. We begin by showing that the empty partial oracle (i.e., the partial oracle satisfying ) is reflective, since (i) then follows from (ii). Thus, let  be any oracle (since every oracle extends ), and define  as follows: For any and such is a query and , set ; for all other pairs , set . Then, clearly both reflects  and (trivially) extends . This finishes the proof that is a reflective partial oracle.

It remains to show (ii). For this, we use the infinite-dimensional generalization of Kakutani’s fixed point theorem:

Suppose that is a non-empty, convex and compact subset of a locally convex topological vector space. Suppose further that is a set-valued function such that is non-empty, convex and compact for all , and such that the graph of , is a closed set. Then  has a fixed point; that is, there is an  such that .

We let be the set of all oracles extending ; then , which is a locally convex topological vector space when endowed with the product topology (since this is true of any power of ). is clearly non-empty, convex, and closed, and it is a subset of  which, by Tychonoff’s theorem, is compact. Moreover, is countable, so it's sufficient to consider convergence of sequences in .

We choose to consist of all oracles  that reflect  and extend . By the assumption that  is reflective, is non-empty for every . Moreover, it is easy to see that is both closed and convex; hence, if we can also show that  has closed graph, then by the fixed point theorem, there is a  such that , which is exactly what we want to show.

Thus, assume that we have sequences and of oracles extending  such that for every , and suppose that these sequences converge to limits ; then we need to show that , i.e., that reflects . To see this, we must show that for every query and any , implies , and implies . (The condition that for any that is not the Gödel number of a query follows directly from the fact that this is true about all .)

Without loss of generality, assume that . Let . Since is a query, halts with probability one (and returns either 0 or 1); hence, there is some such that with probability , halts in at most steps. Despite the fact that is probabilistic and can make calls to an oracle, there are only finitely many possible execution traces of during the first timesteps, and thus only finitely many pairs on which might invoke its oracle during these timesteps.

Since converges to , there is an such that for all , for each of these finitely many pairs . Suppose that we model each call to the oracle as sampling a random number from a uniform distribution, and returning "true" if this number is greater than ; then the above result implies that with probability , and behave the same way during the first timesteps. Hence, with probability , and both halt during the first steps and return the same result. But this implies that , for all .

Since reflects , this implies , and since this holds for all and converges to , it follows that , as desired. □