Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Oracle machines instead of topological truth predicates

1paulfchristiano

0Benya_Fallenstein

1So8res

0Benya_Fallenstein

1jessicata

0Benya_Fallenstein

1jessicata

0Benya_Fallenstein

0Benya_Fallenstein

0Benya_Fallenstein

0jessicata

0cousin_it

2Benya_Fallenstein

New Comment

13 comments, sorted by Click to highlight new comments since: Today at 11:24 PM

Do we really need the constraint M[O'] halts with probability 1 for each O'? I remember the results going through anyway, if we just allow O(M) to be arbitrary with whatever probability M[O] doesn't halt. This actually doesn't even require Kakutani, we can just use Brouwer's.

The idea is that P(STATE[O] outputs 1) is a continuous function of P(O[X] outputs 1), P(STATE0[O] outputs 1), and P(STATE1[O] outputs 1), where X is the oracle query that STATE makes in the next time step (if any), STATE0 is the resulting state after the oracle returns 0, and STATE1 is the resulting state after the oracle returns 1. So given one value for O, we can define the new oracle f(O) using this update step. Because f(O)(STATE) depends only on O(X), O(STATE0), and O(STATE1), f is a continuous function. So we get a fixed point.

I guess one reason to dislike this version is that it can't be used in such a straightforward way as part of AIXI, without some further restriction, such as some upper bound on the computational complexity of the world (perhaps exponentially larger than the computational complexity of the agent).

If this application is the point of the restriction that M[O'] halts with probability 1 for all O', then it might be reasonable to more explicitly call that out. It seems like a simpler notion of reflective oracles might find use elsewhere. For example, it seems like most practicing computer scientists (at least in the US) don't much care about the difference between "infinite" and "doubly exponential." In some contexts (esp. when you have access to a halting oracle) the infinite is important because it is serving as an analogy for the computationally intractable. But I don't think that's the case here (because we are considering "doubly exponential relative to O," which is already intractable for a polynomial time agent relative to O).

Thanks for expanding on your construction! I hadn't thought of the recursive construction, that's really neat.

I'm not that worried about the application to AIXI: unless I'm missing something, we can just additionally give our machines access to a double halting oracle (for ordinary Turing machines), and recover the same power. I considered doing that and didn't go with it because the stuff in my post seemed slightly more elegant, but if there's a reason to prefer the other version, it seems fine to use it.

I'm not clear on what you mean by ""; I'm guessing you were sloppy in talking about rather than my or some other variant, but I may be missing something. The version seems to need Kakutani unless we want to allow some -tolerance: consider for arbitrary and for ; we want this to equal for all such that and to equal for all such that , which makes discontinuous if it's single-valued. My intuition is that we generally need Kakutani to get exact optimality rather than optimality (as a rule of thumb, not a hard-and-fast rule), because which choice is optimal jumps discontinuously.

In the version, continuity is not as immediate as if we could determine by looking at only three values of . However, we can use the same trick as in my post on simplicity priors: Abbreviate the probability that the oracle will say "greater" or "less" when called on as and , respectively; then, the probability that a reflective assigns to halting equals

.

This is clearly a continuous function of , since later calls to are multiplied with smaller and smaller weights. Thus, we can use this expression and the analogous one for in the definition of .

We can then check that a fixed point of this correspondence is a reflective oracle by showing inductively that for every , if the probability that halts in steps and outputs is greater than then and if the probability that it halts in steps and outputs is less than , then .

Tangential: I'm now thinking that it would be more elegant not to speak about halting; rather, use write-and-advance-only output tapes, and talk about the probability that the first bit outputs is a or a , with "never outputs anything" as the third possibility(whether due to looping or halting without output). Interested in whether or not this presentation would seem natural to theoretical computer scientists. (Although still interested in comments on the notation for the probabilities of the oracle's outputs.)

Typos & syntax complaints:

and let’s consider an oracle to be a function , such that specifies the probability that the oracle will return "true" when invoked on the pair

Confusing notation. In the paragraph above, had the type .

We want to find an oracle machine that will output if , output if , and output either or if the expectations are equal.

Should be "And output either or if the expectations are equal", presumably.

[Will edit this post as I find more complaints.]

Fixed the typo, thanks!

I considered describing the probabilities of the oracle returning "true" by a different function , but it seemed too pedantic to have a different letter. Maybe that's wrong, but it still feels too pedantic. If I do things that way I probably shouldn't be writing " returns 'true' if...", though...

This is nice! I actually don't remember exactly what formulation of this I discussed with Paul at the workshop, but I really like this version. I do think that it may be useful (at least from a pedagogical perspective) to look at what happens in the finite case, where we have a finite set of halting stochastic programs that may query the oracle to ask about the output probabilities of other programs in this set. This restricted version is powerful enough to express the liar's paradox and some multi-agent problems.

In this case, as long as we are assured that each program always halts, I believe the problem is computable. We can enumerate all possible calls that any program might give to the oracle. Then, we can write the desired 1-returning probabilities for each call as a set-valued function of the oracle's 1-returning probabilities for each call. This function will be in closed form using only addition, subtraction, multiplication, and a set-valued version of the unit step function. So, it will be possible to find the fixed point of this function using a combination of brute-force search and polynomial solving, similar to how it's possible to find Nash equilibria.

Thanks! I've been also thinking about making a finite version; my main motivation was that I've been hoping to take my still-incompletely-specified AIXI variant and find an analog variant of AIXI*tl*, the computable version of AIXI that considers only hypotheses which terminate in time and whose source length is .

I think that would in fact be a pretty reasonable finite set of programs to use in what you're suggesting, since "we want it to terminate in time " provides a motivation for why we would require programs to terminate surely, as opposed to almost surely. (Terminating surely seems to be necessary for the condition of "only finitely many possible calls to the oracle" to hold: there is an almost surely terminating stochastic oracle program which chooses a random pair in and calls the oracle on that.)

It seems like the finite, AIXI*tl* version should be able to implement any finite game with a rational payoff matrix (for sufficiently large and ), which is pretty neat. (Of course, even though this is computable, it does require the game being played to be much simpler than the computers the players are running on---which are spending basically all of their computation time on computing the oracle, rather than the agent code that calls that oracle; in any real life situation, it's pretty clear that there would be a better use that these agents could put their supercomputers to.)

I hadn't realized the point about the problem being expressible using just elementary operations and the set-valued step function---that's pretty neat! I don't know enough about polynomial solving to figure out the details of how this would work, but it's extremely plausible that a variation of a technique used for finding Nash equilibria would work. (In fact, I wonder whether the problem can somehow be transformed into the problem of finding a Nash equilibrium of an appropriate finite game?) I'd be interested in seeing some more details worked out. It seems like some proof that equilibria in the finite case can actually be found would be nice to have in an exposition aimed at computer scientists. (One brute-force proof idea that I'd considered before would be just to say that by the fixed point theorem, there is some solution, so if you try rational-valued solutions in a fine enough grid, you have to find values that are within any of being a solution, in some appropriate sense. I haven't worked out the details.)

I thought about it some more and realized that this is equivalent to finding a Nash equilibrium. What we do is, we create a player for every call . This player has 2 actions (call them action 0 and action 1). The player's utilities are set up so that they always get utility for choosing action 0. Also, given fixed actions for the other players, their utility for action 1 is set to be the probability that returns 1 given that is an oracle that returns the action of a different player corresponding to the call it receives. Now, the player's expected utility is for action 0, and for action 1 it will be 's probability of returning 1 where is an oracle that returns answers to calls under the same distribution as the other players' mixed strategies. So, it will be allowed to choose action 0 iff , action 1 iff , and will be allowed to mix iff , as desired. It should be noted that within a program, repeated calls to the oracle must return the same value, so they should be cached.

Going the other direction is easy: to find the Nash equilibrium for a game where each player has 2 actions, set up a program for each player that queries the oracle for the actions of each other player and then returns the action that maximizes expected utility. For games with more than 2 actions (say, each player has actions ), we can create a different set of programs for each player: program where returns 1 iff some maximum expected utility action's index is less that or equal to .

It seems to follow that the full problem is equivalent to a game with countably infinite players. I was thinking of how you would solve the full problem using some kind of oracle machine. It would at least be necessary to determine whether a program halts for every infinite input you give it, because this is necessary to detect queries. This interacts weirdly with the requirement that queries almost surely halt; it may require something analogous to for counting the number of inputs a program halts on? The problem of determining if a machine halts for every finite input is computable by a double oracle machine but not a regular oracle machine, but I don't know what that says about infinite inputs.

About determining whether something's a query: A formulation of the statement we want to test is, (A) "for every (infinite) input on the oracle tape, and every , there is a such that with probability , the program halts in steps."

I think this is equivalent to, (B) "For every , there is a such that, for any input on the oracle tape, the program halts within timesteps with probability ."

Reason why I think this is equivalent: Clearly (B) implies (A). Suppose that (B) is false; then there is a such that for all , there is an input on the oracle tape such that the program runs for steps with probability . Let be a function that returns such an oracle tape input for every .

Now, there either must be infinitely many such that starts with , or infinitely many such that starts with , or both. Write down such a bit; say, . Now, there must be infinitely many such that starts with , or such that starts with , or both. Iterate; this way, we obtain an infinitely long input for the oracle tape, which has the property that for every , there are arbitrarily large ---and in particular, there's a ---such that the program runs for steps with probability . In other words, for every , the probability that the program runs for steps when given this input on its oracle tape is . This shows that (A) is false.

Thus, it's sufficient to test whether a prospective query satisfies (B). But in timesteps, the machine can only read the first bits of its oracle tape and of its random input tape, so the statement "for any input on the oracle tape, the program halts within timesteps with probability " is primitive recursive. Thus, (B) is a statement, which we can test with a double halting oracle.

(warning: math above written quickly and not carefully checked)

In the other direction, suppose that is a primitive recursive predicate, and consider the following probabilistic Turing machine: First, randomly choose an (placing positive probability on every natural number). Then, search exhaustively for an such that is true. If you find one, halt and output . This machine is a query if and only if . Together with the parent comment, this shows that having an oracle that tells you whether or not something is a query is equivalent to having an oracle for statements (i.e., a double halting oracle).

Aagh... I'm conflicted. I really like the simplicity of having players correspond 1:1 to calls to the oracle, and their mixed strategies to the probability that the oracle returns "true", but the implication that the oracle must return the same answer if it's called twice on the same arguments during the execution of a single program interacts very badly with the intuitive picture of the applications I have in mind.

The intuitive picture is that the agents under consideration are living in a world whose laws of physics are probabilistic and allow for the construction of oracles---meaning, devices that invoke the underlying reflective oracle. An agent would be just a computer with access to such a device. Then, according to CDT, it's optimal to program these computers to do oracle calls that compare the expected utilities of taking the different actions available to the agent, and then taking an action that maximizes expected utility. (I.e., it's not that that's the only way you could program that computer---you could program it to use its oracle in many different ways, but this one is an optimal one according to causal reasoning.) The world, then, can be modelled to be an oracle machine program which simulates the local laws of physics and outputs what happens in the world.

But for this picture to be correct, it's really important for different calls to the oracle to be independent. Suppose that they aren't, and we have two players playing a game of Matching Pennies. One of these players chooses according the expected utility calculation described above, using a single oracle invocation to figure out which action is better. But the other player... *makes the same call to the oracle as the first player*, uses this to determine whether the first player chooses heads or tails, and does the opposite. Clearly, the second player wins every game, meaning that the algorithm run by the first player is no longer optimal.

So I think the version you outlined doesn't work for my main application, even though I really like the simplicity of it.

One possibility would be to have a player for every instance where the oracle is invoked in one of the possible execution histories of one of the programs under consideration. But then it wouldn't be guaranteed that the different players corresponding to the same pair will play the same mixed strategies. Of course, it can be shown that there are Nash equilibria that are symmetric in this way, but requiring such a result seems to take away from the simplicity of the connection to Nash equilibria, and it's not immediate that the algorithms for finding Nash equilibria can be adapted to finding this kind of symmetric equilibrium (although they almost certainly can).

Another possibility would be to try to formulate this as a game where players correspond to pairs , but where their pure strategies are numbers in the interval . In this case, if the players' payoffs are a continuous bounded function of the strategy profile, Kakutani's fixed point theorem will guarantee that there is a pure Nash equilibrium, and it should be possible to find this result in a standard textbook somewhere (economists need continuous strategy spaces for modelling things like auction bids). And there might be existing algorithms for finding Nash equilibria for such games. Judd, Renner & Schmedders (2012) looks promising, although I haven't really delved into it. (It also looks annoyingly complicated, though.)

For the payoffs, we could then re-use your idea: If player chooses strategy , their payoff is plus times the probability that returns , where behaves with probabilities according to the pure strategy profile under consideration.

Although, now I'm worried whether or not it's a problem that when we evaluate whether a player would do better by switching from to , we also switch the oracle to a different oracle that makes calls to return "true" with probability rather than with probability ? (In other words, when we consider replacing by , we're also potentially changing the probability that halts...)

Hmm...this seems like less of a problem to me. The thing we need for the Nash equilibrium equivalence is that, within an execution of a single program, equal calls to the oracle return equal results (or equivalently, you can't give the same call to the oracle twice). But we don't need to give the same call to the oracle twice in order to model the matching pennies game, because you just need one call (to determine whether your opponent plays heads with greater than 50% probability), and the two players are represented as different programs.

The thing we're basing the Nash equilibrium equivalence on is the question "given fixed return values of calls, what is the probability that this program returns 1?" in order to write in utility values for our normal form game. Since we're basing everything on this process, it follows that within an execution of a program, it only make sense to make a given call once. But nothing about this rules out having different programs reason about the same call, and having the oracle give them different answers.

For what it's worth, I feel that this post is about halfway between your original post and what an interested amateur with CS background would actually understand :-) Maybe you could try writing for an imaginary audience of ordinary programmers with no special knowledge of math? According to "Explainers Shoot High, Aim Low", that might be unexpectedly helpful.

Hm, I think it would be *expectedly* helpful to explain things more, but it would also take more time, and aiming much lower would take much more time. (That's part of why it's taking me so long to turn these things into research papers, although in that case the length limitations make the constraint even more difficult.) At the moment, I'm trying to get the things I'm working on out there *at all*, especially with something like this where I haven't yet worked through all the details of whether I can actually do my intended application based on it. After all, this doesn't need to be the last explanation of this topic, ever!

That said, I do appreciate the problem that something written in this style is harder to digest than something with more explanations. Maybe I'll try to shift in that direction, but I don't want that to come at the cost of not getting new interesting thoughts out at all (which I'm worried it might).

In a comment on my post on topological truth predicates, Paul suggests an approach that uses probabilistic oracle machines instead, in order to make this work more comprehensible to computer scientists. I like this idea!

Paul sketches a framework developed by him and Jessica Taylor, based on a conversation with Scott Aaronson; in this post, I propose a slight simplification of their framework. My version has an oracle O(┌M┐,p), which takes the source code of a probabilistic oracle machine M and a p∈Q∩[0,1]. If for every possible oracle O′, M[O′] halts with probability one and outputs either 0 or 1, then O(┌M┐,p): (i) returns "true" if the probability that M[O] returns 1 is >p; (ii) returns "false" if it is <p; (iii) randomly returns "true" or "false" if it is exactly p. If M is not of this form, the oracle always returns "false". I think this is sufficient for all the applications I have in mind---in particular, for defining a variant of AIXI that is able to reason about worlds containing other, equally powerful variant AIXIs (but many details remain to be checked).

To be clear, by "probabilistic oracle machine" I mean a probabilistic Turing machine with access to our oracle, which itself behaves probabilisticly. Let's write P:=Q∩[0,1] for the set of all rational probabilities, and let's consider an oracle to be a function O:N×P→[0,1], such that O(┌M┐,p) specifies the probability that the oracle will return "true" when invoked on the pair (┌M┐,p). (If the oracle is invoked multiple times on the same arguments, each of these is an independent random event.) The function O(┌M┐,p) plays a similar role in this framework to the function θ(┌φ┐,┌ψ┐) from the topological truth predicates post. Feedback on notation and terminology is encouraged---my exposure to theoretical CS is limited.

Let's define a

queryto be a probabilistic oracle machine M such that for every oracle O, M[O] halts with probability one and outputs either 0 or 1. If M is a query, let P(M,O) be the probability that M[O] outputs 1. Then our condition above can be re-stated as follows:(i) If P(M,O)>p, then O(┌M┐,p)=1. (ii) If P(M,O)<p, then O(┌M┐,p)=0. (iii) If P(M,O)=p, then O(┌M┐,p)∈[0,1] is arbitrary. (iv) For any n which is not the Gödel number of a query, O(n,p)=0.

The last of these conditions is a bit of a devious trick; it gives the machine access to a fairly powerful halting oracle, which can e.g. check whether a probabilistic Turing machine halts with probability one. (Given such a machine T, construct the machine T′ which runs T and outputs 1 if T halts; then if T halts with probability 1, T′ is a query and O(┌T′┐,0)=1, but if T halts with probability <1, T′ is not a query and O(┌T′┐,0)=0.) I think that I need this in my variant of AIXI in order to filter out "world models" which don't necessarily halt, and I think this will be enough to do so, but I'll leave working out the details to a later post.

At the end of this post, I'll prove that a O with these properties exists. But first, let me sketch how it can be used.

Suppose that we have an agent which is trying to choose between two actions, A and A′, and suppose that the agent has two probabilistic oracle machines, M and M′, which represents the agent's model of what will happen in the world if it chooses action A or action A′, respectively. (This doesn't mean that the agent has to have perfect knowledge about how the world works---since these are probabilistic machines, they can represent probability distributions over the laws of physics.) We could think of M and M′ as outputting information about things that happen in the world, which we can then plug into a computable utility function, but it will be easier to think of them as directly returning a utility.

Our goal, then, is for our agent to choose A if the expectation of the utility returned by M is greater than the expectation of the utility returned by M′.

Utilities are real numbers (and if we consider infinite-horizon problems with temporal discounting, rational utilities won't suffice), so we need a way for probabilistic machines in our framework to represent probability distributions over reals. Let's say that an machine "outputs the real number r∈[0,1]" if it outputs a sequence of nested closed intervals with endpoints in P=Q∩[0,1], converging to the single point r. We'll assume that M[O′] and M′[O′] output real numbers in [0,1] with probability one,

for every oracle O′:N×P→[0,1]. I'll write E[N] for the expected value of the number returned by N[O], if N is such an oracle machine (leaving the oracle O implicit in the notation).We want to find an oracle machine that will output A if E[M]>E[M′], output A′ if E[M]<E[M′], and output either A or A′ if the expectations are equal. Note that in our representation, we can write machines that compute e.g. the sum or difference of the numbers computed by two other machines; and note that E[M]>E[M′] is equivalent to E[(1+M−M′)/2]>0.5. We have E[(1+M−M′)/2]∈[0,1]; thus, if we could find a query N such that N[O] returns 1 with probability E[(1+M−M′)/2], we could simply call our oracle on O(┌N┐,0.5), and take action A if the oracle returns "true", action A′ otherwise.

We can implement N as follows. Set x0:=0 and y0:=1, and initialize a run of (1+M−M′)/2. Then, for every n∈N≥1: run (1+M−M′)/2 until it returns its next (n'th) interval, with lower bound xn and upper bound yn. Then, with probability (xn−xn−1)/(yn−1−xn−1), halt and output 1; with probability (yn−1−yn)/(yn−1−xn−1), halt and output 0; otherwise, continue looping.

Since with probability one, (1+M−M′)/2 outputs a sequence of intervals that converge to a single point, N halts with probability one and outputs either 0 or 1, whatever oracle we run it on; i.e., it is a query. We can describe its behavior as sampling a random number r∈[0,1] from the distribution (1+M−M′)/2, and then outputting 1 with probability r and 0 with probability (1−r). When looked at it this way, it's clear that when run on O, this machine will output 1 with probability E[(1+M−M′)/2], as desired.

Let's now proceed to the proof that an appropriate O exists. This is analogous to the proof in the topological truth predicates post. (However, while we'll still need to use the product topology and the infinite-dimensional generalization of Kakutani’s fixed point theorem, this proof won't require knowledge of Grothendieck universes, set theory, or formal logic.)

Recall that an

oracleis an arbitrary function O:N×P→[0,1]. We say that an oracle O′reflectsanother oracle O if for every p∈P and every query M, we have O′(┌M┐,p)=1 if P(M,O)>p and O′(┌M┐,p)=0 if P(M,O)<p, and for any n that is not the Gödel number of a query, we have O(n,p)=0. In interpretation, O′ reflects O if it gives correct information about the probability that a query returns 1 when it's executed with oracle O.An oracle is called

reflectiveif it reflects itself; reflective oracles give answers which are true about the oracle itself.Our main goal in this post is just to show that reflective oracles exist. However, later I'll want to show that for every Nash equilibrium of a finite game, there is a reflective oracle which makes the players of that finite game play that Nash equilibrium, and in order to do so, it will be helpful to have a slightly stronger statement, which shows that there are reflective oracles satisfying certain constraints.

These constraints will take the form of a partial function π:N×P↛[0,1], which specifies the values O should take on a certain set dom(π)⊆N×P. We call a function of this type a

partial oracle, and say that Oextendsπ if O(n,p)=π(n,p) for all (n,p)∈dom(π). We call a partial oracle πreflectiveif for every oracle O extending π, there is an oracle O′ which reflects O and which also extends π.With these preliminaries, we can state our existence result for reflective oracles:

Theorem.(i) There is a reflective oracle O:N×P→[0,1]. (ii) For every reflective partial oracle π:N×P↛[0,1], there is a reflective oracle O which extends π.

Proof.We begin by showing that the empty partial oracle π∅ (i.e., the partial oracle satisfying dom(π∅)=∅) is reflective, since (i) then follows from (ii). Thus, let O be any oracle (since every oracle extends π∅), and define O′:N×P→[0,1] as follows: For any M and p such M is a query and P(M,O)>p, set O′(┌M┐,p)=1; for all other pairs (n,p), set O(n,p)=0. Then, O′ clearly both reflects O and (trivially) extends π∅. This finishes the proof that π∅ is a reflective partial oracle.It remains to show (ii). For this, we use the infinite-dimensional generalization of Kakutani’s fixed point theorem:

We let A be the set of all oracles extending π; then A⊂RN×P, which is a locally convex topological vector space when endowed with the product topology (since this is true of any power of R). A is clearly non-empty, convex, and closed, and it is a subset of [0,1]N×P which, by Tychonoff’s theorem, is compact. Moreover, N×P is countable, so it's sufficient to consider convergence of sequences in RN×P.

We choose f(O) to consist of all oracles O′ that reflect O and extend π. By the assumption that π is reflective, f(O) is non-empty for every O∈A. Moreover, it is easy to see that f(O) is both closed and convex; hence, if we can also show that f has closed graph, then by the fixed point theorem, there is a O∈A such that O∈f(O), which is exactly what we want to show.

Thus, assume that we have sequences On and O′n of oracles extending π such that O′n∈f(On) for every n∈N, and suppose that these sequences converge to limits O,O′∈A; then we need to show that O′∈f(O), i.e., that O′ reflects O. To see this, we must show that for every query M and any p∈P, P(M,O)>p implies O′(┌M┐,p)=1, and P(M,O)<p implies O′(┌M┐,p)=0. (The condition that O′(n,p)=0 for any n that is not the Gödel number of a query follows directly from the fact that this is true about all O′n.)

Without loss of generality, assume that P(M,O)<p. Let ε:=(p−P(M,O))/2. Since M is a query, M[O] halts with probability one (and returns either 0 or 1); hence, there is some T∈N such that with probability >1−ε, M[O] halts in at most T steps. Despite the fact that M is probabilistic and can make calls to an oracle, there are only finitely many possible execution traces of M during the first T timesteps, and thus only finitely many pairs (┌M′┐,p′) on which M might invoke its oracle during these timesteps.

Since On converges to O, there is an n0 such that for all n≥n0, |On(┌M′┐,p′)−O(┌M′┐,p′)|<ε/T for each of these finitely many pairs (┌M′┐,p′). Suppose that we model each call to the oracle as sampling a random number from a uniform [0,1] distribution, and returning "true" if this number is greater than O(┌M′┐,p′); then the above result implies that with probability >1−ε, M[On] and M[O] behave the same way during the first T timesteps. Hence, with probability >1−2ε, M[On] and M[O] both halt during the first T steps and return the same result. But this implies that P(M,On)<P(M,O)+2ε=p, for all n≥n0.

Since O′n reflects On, this implies O′n(┌M┐,p)=0, and since this holds for all n≥n0 and O′n converges to O′, it follows that O′(┌M┐,p)=0, as desired. □