Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Single-bit reflective oracles are enough

4So8res

New Comment

Also, FYI, I tossed together reflective implementations of Solomonoff Induction and AIXI using Haskell, which you can find on the MIRI github. It's not very polished, but it typechecks.

The latest version on this forum of the reflective oracle formalism uses

multibitoracles. These oracles are defined on probabilistic oracle machines with advance-only output tapes, and answer queries of the form, "Does machine M, when run on this same oracle, produce output starting with the bitstring →x with probability at least p?" (If M sometimes goes into an infinite loop after outputting some prefix of →x, the oracle behaves as if M outputs additional bits according to some arbitrary but fixed probability distribution; see the above post for details.)The reason for using multibit oracles was that we wanted to define versions of Solomonoff induction and AIXI, and these need to deal with hypotheses (environments) that produce infinite amounts of output, not just single bits. However, when we recently wrote up the details of this, we realized that there was a simple way to define these in a way that

doesn'tneed more than a single-bit oracle! (We'll publish a draft papers with the details soon; in the meantime, the single-bit formalism we use can be obtained by taking the multibit oracles from the post linked above, and only querying them about single bits of output.)How does this work? Consider Solomonoff induction. The trick is to consider a hypothesis to be a probabilistic oracle machine which takes

all the bits outputted so farasinput, and outputs only the next bit. This way, we only need to apply the single-bit oracle in the most straightforward possible way in order to get the conditional probability of the next bit! (And if we can get the conditional probabilities, we can of course easily compute the unconditional probabilities of any finite bitstring.)The way we discovered this was by mulling the best way to go from reflective Solomonoff induction to reflective AIXI. With the above formalism, it's easy: An environment is a machine which takes as input the entire history so far (i.e., sequence of observations and actions), and a prefix of the bitstring representing the next observation, and which outputs the next bit of that observation. With this definition, again, we can easily compute the conditional probabilities.

Details to follow in the paper drafts, which I hope we'll have online within the next week.