Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Previously, I described a type of probabilistic oracle machines that I think is powerful enough to allow the implementation of a variant of AIXI that can reason about worlds containing other instances of this variant of AIXI. In this post, I describe how to implement a simplicity prior (a variant of Solomonoff induction) in this framework; there are some technical issues which will also come up when implementing AIXI that are easier to talk about in the context of Solomonoff-like induction.

Solomonoff induction is a prior probability distribution over infinite bit strings, {0,1}ω. We want to define a similar prior using probabilistic oracle machines with reflective oracles instead of Turing machines---and then "implement" that prior as a probabilistic oracle machine with that oracle.

The basic idea is simple. Say that a probabilistic oracle machine H is a hypothesis if, for every oracle O′:N×(Q∩[0,1])→[0,1], invoking H[O′]almost surely outputs an infinite bit string. Fix some programming language for probabilistic oracle machines, and randomly choose a program implementing a hypothesis H, where the probability of each program of length ℓ is proportional to 2−ℓ. Then run H[O], where O is our actual, reflective oracle, to obtain a sample bit string. This concludes the definition of our prior.

This also suggests how we could implement this prior, as a probabilistic oracle machine S that samples bit strings with these probabilities; it seems like all we need is a way to test whether or not a given program implements a hypothesis. I'll tackle that problem later in this post, but first let me point out a different problem: simply running the hypothesis H that we've sampled in the first step is not an ideal way to implement the prior.

Suppose that we want to determine the probability that the output of S starts with 0111010. Well, here's an idea: why don't we use our reflective oracle?

To do so, we need to write a query Q which returns 1 if the output of S has the right prefix, 0 otherwise. Easy, or so it seems; run S until it has outputted the first seven bits, check whether they're 0111010, return 0 or 1 accordingly. Then we can call our oracle on (┌Q┐,p), for any p∈Q∩[0,1], to check whether the probability of the prefix 0111010 is greater or smaller than p.

But a requirement for a machine Q to be a query is that it Q[O′] halts with probability one, for every oracle O′. In the implementation of S given above, any way we might find to check whether an arbitrary program H is a hypothesis will certainly involve a use of the reflective oracle, which means that if we run S on an arbitrary oracle O′, we might end up choosing a "hypothesis" which loops forever without producing any output. Thus, the machine described in the previous paragraph also has a positive probability of looping forever when run on this bad oracle, which means that it isn't a query.

To fix this, let's change the definition of S in such a way as to turn it into a hypothesis (i.e., into a machine that almost surely outputs an infinite bitstring, no matter what oracle it's run on). Then the Q described above will be a query, and we can use this method to compute the probabilities of arbitrary finite prefixes.

To do so, we need to make sure that once we sampled an H, we definitely output an infinite bitstring, no matter what our oracle is and whether or not H is actually a hypothesis. Of course, ifH is a hypothesis and we're running on a reflective oracle, then we want to output a sample from H[O] (with the correct distribution).

We can do this---with a simple trick. Let's start by letting Q be the machine which runs H until it outputs its first bit, and then returns that bit. If H is a hypothesis, then Q is a query.

So we call our oracle on the pair (┌Q┐,0.5), and throw a fair coin.

If the coin comes up heads and the oracle says "false" (the probability of Q returning 1 is smaller than 0.5), we output a zero.

If the coin lands heads and the oracle says "true", we output a one.

If the coin lands tails and the oracle says "false", we call our oracle on (┌Q┐,0.25) and repeat the process; if the coin lands tails and the oracle says "true", we call the oracle on (┌Q┐,0.75) and repeat.

This way, if we're running on a reflective oracle, we output a bit with the same distribution as the first bit outputted by H; but no matter what oracle we're running on, we definitely output a bit with probability one (because in each step of the binary search described above, we have a 50% probability of producing output).

This technique generalizes beyond the particular query we were talking about above:

Lemma (Protecting queries). There is a computable function f(┌Q┐) which takes the Gödel number of a probabilistic oracle machine and returns the Gödel number of a different probabilistic oracle machine, such that (i) f(┌Q┐) is a query, and (ii) if Q is also a query, and O is a reflective oracle, then f(┌Q┐)[O] has the same distribution as Q[O].

This gives us the first bit. Now we want to produce a second bit of output, whose probability distribution is the same as the distribution of the second bit of H---conditional on H having produced the first bit that we just produced.

Here's a simple way to do so. Suppose that the first bit was a 1. Let Q now be the following machine: Run H until it produces its first bit. If that bit is a 0, start over. If the bit is a 1, run it until it produces its second bit, and return that bit.

If H is a hypothesis which has a positive probability of producing a 1 as its first bit, then Q is a query. Assuming this is safe if we're running on a reflective oracle, since in this case the probabability that we choose 1 as our first bit is equal to the probability that H chooses 1 as its first bit.

Now we apply the above lemma about queries, with this new Q. This gives us an actual query, which we can just run since it is guaranteed to halt, and which has the correct distribution if we're running on a reflective oracle.

We can use exactly the same procedure for each of the later bits, leading to a lemma exactly analogous to the earlier one for queries:

Lemma (Protecting hypotheses). There is a computable function f(┌H┐) which takes the Gödel number of a probabilistic oracle machine and returns the Gödel number of a different probabilistic oracle machine, such that (i) f(┌H┐) is a hypothesis, and (ii) if H is also a hypothesis, and O is a reflective oracle, then f(┌H┐)[O] has the same distribution as H[O].

Given this result, the main missing piece is how to test whether a given machine ┌H┐ is a hypothesis. We need this test to give the correct result if we're running on a reflective oracle, and we also need it to halt no matter which oracle we're running on---in other words, we need it to be a query.

It turns out that this is quite straight-forward to implement, given our definition of reflective oracles as always returning "false" when passed anything other than a query. Given the source code ┌H┐ of a machine that may or may not be a hypothesis, we first implement the following machine, M; M has the property that it is a query if and only if H is a hypothesis.

M first chooses a natural number, n, according to any distribution that places positive probability on arbitrary large numbers.

Then, it runs H until H has outputted n bits.

If H halts before it has outputted n bits, then M loops forever. (If H goes into an infinite loop before it has outputted n bits, then M does of course also loop forever.)

If H does in fact output n bits, then M outputs 1 and halts.

Our test is now to call our oracle on the pair (┌M┐,0). Since this is only a single invocation of the oracle, it always halts. Moreover, if we're running on a reflective oracle, then there are two cases:

If H is a hypothesis, then M halts with probability one; whatever number M selects in the first step, H eventually outputs that many bits. In this case, M is a query that always outputs 1, implying that invoking the oracle on (┌M┐,0) will return "true".

If H is not a hypothesis, there is a positive probability that it will halt or loop forever after outputting only some finite number of bits, and there is a positive probability that M will choose a number larger than this. Hence, there is a positive probability that M loops forever, which implies that it is not a query, and hence that invoking the oracle on (┌M┐,0) will return "false".

We now have a query to test whether something is a hypothesis, and way of safely running a potential hypothesis, which will not go into infinite loops even if we're not running on a reflective oracle. Are we done?

Unfortunately not quite; there's still an annoying problem remaining, and so far I only have a slightly kludgy solution to it. The problem is that we would like our prior to, with probability one, choose a ┌H┐ that actually is a hypothesis, and to do so it can't use the obvious method of trying random machines until finds one for which the query returns true.

Why not? Well, our query consists of a single oracle invocation---and if we're not running on a reflective oracle, all instances of that oracle invocation might return "false", in which case we would loop forever. And that would mean that our sampler S would still not be a hypothesis, whose distribution we can probe with a reflective oracle.

My kludge is simple: Select only a single machine ┌H┐ at random. If our test tells us that this isn't a hypothesis, then output an infinite stream of zeros. If our test tells us it is a hypothesis, then output a single 1 before executing the protected version of H. Then S is a hypothesis, and we can use the tools detailed at the beginning of this post to figure out its distribution; and if we can find the probability of arbitrary finite prefixes of S's output, we can also compute the conditional probability, given that the first bit is a 1. This gives us the probability distribution we were initially looking for.

There's probably something more elegant, but I haven't yet been able to come up with an alternative that's both elegant and simple.

Previously, I described a type of probabilistic oracle machines that I think is powerful enough to allow the implementation of a variant of AIXI that can reason about worlds containing other instances of this variant of AIXI. In this post, I describe how to implement a simplicity prior (a variant of Solomonoff induction) in this framework; there are some technical issues which will also come up when implementing AIXI that are easier to talk about in the context of Solomonoff-like induction.

Solomonoff induction is a prior probability distribution over infinite bit strings, {0,1}ω. We want to define a similar prior using probabilistic oracle machines with reflective oracles instead of Turing machines---and then "implement" that prior

asa probabilistic oracle machine with that oracle.The basic idea is simple. Say that a probabilistic oracle machine H is a

hypothesisif, for every oracle O′:N×(Q∩[0,1])→[0,1], invoking H[O′] almost surely outputs an infinite bit string. Fix some programming language for probabilistic oracle machines, and randomly choose a program implementing a hypothesis H, where the probability of each program of length ℓ is proportional to 2−ℓ. Then run H[O], where O is our actual, reflective oracle, to obtain a sample bit string. This concludes the definition of our prior.This also suggests how we could implement this prior, as a probabilistic oracle machine S that samples bit strings with these probabilities; it seems like all we need is a way to test whether or not a given program implements a hypothesis. I'll tackle that problem later in this post, but first let me point out a different problem: simply

runningthe hypothesis H that we've sampled in the first step is not an ideal way to implement the prior.Suppose that we want to determine the probability that the output of S starts with 0111010. Well, here's an idea: why don't we use our reflective oracle?

To do so, we need to write a query Q which returns 1 if the output of S has the right prefix, 0 otherwise. Easy, or so it seems; run S until it has outputted the first seven bits, check whether they're 0111010, return 0 or 1 accordingly. Then we can call our oracle on (┌Q┐,p), for any p∈Q∩[0,1], to check whether the probability of the prefix 0111010 is greater or smaller than p.

But a requirement for a machine Q to be a query is that it Q[O′] halts with probability one, for

everyoracle O′. In the implementation of S given above, any way we might find to check whether an arbitrary program H is a hypothesis will certainly involve a use of the reflective oracle, which means that if we run S on an arbitrary oracle O′, we might end up choosing a "hypothesis" which loops forever without producing any output. Thus, the machine described in the previous paragraphalsohas a positive probability of looping forever when run on this bad oracle, which means that it isn't a query.To fix this, let's change the definition of S in such a way as to turn it into a hypothesis (i.e., into a machine that almost surely outputs an infinite bitstring, no matter what oracle it's run on). Then the Q described above will be a query, and we can use this method to compute the probabilities of arbitrary finite prefixes.

To do so, we need to make sure that once we sampled an H, we definitely output an infinite bitstring, no matter what our oracle is and whether or not H is actually a hypothesis. Of course,

ifH is a hypothesis and we're running on a reflective oracle, then we want to output a sample from H[O] (with the correct distribution).We can do this---with a simple trick. Let's start by letting Q be the machine which runs H until it outputs its first bit, and then returns that bit. If H is a hypothesis, then Q is a query.

So we call our oracle on the pair (┌Q┐,0.5), and throw a fair coin.

This way, if we're running on a reflective oracle, we output a bit with the same distribution as the first bit outputted by H; but no matter what oracle we're running on, we definitely output a bit with probability one (because in each step of the binary search described above, we have a 50% probability of producing output).

This technique generalizes beyond the particular query we were talking about above:

Lemma(Protecting queries).There is a computable function f(┌Q┐) which takes the Gödel number of a probabilistic oracle machine and returns the Gödel number of a different probabilistic oracle machine, such that (i) f(┌Q┐) is a query, and (ii) if Q is also a query, and O is a reflective oracle, then f(┌Q┐)[O] has the same distribution as Q[O].This gives us the

firstbit. Now we want to produce a second bit of output, whose probability distribution is the same as the distribution of the second bit of H---conditionalon H having produced the first bit that we just produced.Here's a simple way to do so. Suppose that the first bit was a 1. Let Q now be the following machine: Run H until it produces its first bit. If that bit is a 0, start over. If the bit is a 1, run it until it produces its second bit, and return that bit.

If H is a hypothesis which has a positive probability of producing a 1 as its first bit, then Q is a query. Assuming this is safe if we're running on a reflective oracle, since in this case the probabability that

wechoose 1 as our first bit is equal to the probability that H chooses 1 as its first bit.Now we apply the above lemma about queries, with this new Q. This gives us an actual query, which we can just run since it is guaranteed to halt, and which has the correct distribution if we're running on a reflective oracle.

We can use exactly the same procedure for each of the later bits, leading to a lemma exactly analogous to the earlier one for queries:

Lemma(Protecting hypotheses).There is a computable function f(┌H┐) which takes the Gödel number of a probabilistic oracle machine and returns the Gödel number of a different probabilistic oracle machine, such that (i) f(┌H┐) is a hypothesis, and (ii) if H is also a hypothesis, and O is a reflective oracle, then f(┌H┐)[O] has the same distribution as H[O].Given this result, the main missing piece is how to test whether a given machine ┌H┐ is a hypothesis. We need this test to give the correct result if we're running on a reflective oracle, and we also need it to halt no matter which oracle we're running on---in other words, we need it to be a query.

It turns out that this is quite straight-forward to implement, given our definition of reflective oracles as always returning "false" when passed anything other than a query. Given the source code ┌H┐ of a machine that may or may not be a hypothesis, we first implement the following machine, M; M has the property that it is a query

if and only if H is a hypothesis.Our test is now to call our oracle on the pair (┌M┐,0). Since this is only a single invocation of the oracle, it always halts. Moreover, if we're running on a reflective oracle, then there are two cases:

nota hypothesis, there is a positive probability that it will halt or loop forever after outputting only some finite number of bits, and there is a positive probability that M will choose a number larger than this. Hence, there is a positive probability that M loops forever, which implies that it is not a query, and hence that invoking the oracle on (┌M┐,0) will return "false".We now have a query to test whether something is a hypothesis, and way of safely running a potential hypothesis, which will not go into infinite loops even if we're not running on a reflective oracle. Are we done?

Unfortunately not quite; there's still an annoying problem remaining, and so far I only have a slightly kludgy solution to it. The problem is that we would like our prior to, with probability one, choose a ┌H┐ that actually

isa hypothesis, and to do so it can't use the obvious method of trying random machines until finds one for which the query returns true.Why not? Well, our query consists of a single oracle invocation---and if we're not running on a reflective oracle, all instances of that oracle invocation might return "false", in which case we would loop forever. And

thatwould mean that our sampler S would still not be a hypothesis, whose distribution we can probe with a reflective oracle.My kludge is simple: Select only a single machine ┌H┐ at random. If our test tells us that this

isn'ta hypothesis, then output an infinite stream of zeros. If our test tells us itisa hypothesis, then output a single 1 before executing the protected version of H. Then S is a hypothesis, and we can use the tools detailed at the beginning of this post to figure out its distribution; and if we can find the probability of arbitrary finite prefixes of S's output, we can also compute theconditionalprobability, given that the first bit is a 1. This gives us the probability distribution we were initially looking for.There's probably something more elegant, but I haven't yet been able to come up with an alternative that's both elegant and simple.