In this article cousin_it describes an algorithm, using a Turing Oracle, which gives a working model of UDT. This algorithm seems to work well in the case of a utility function which takes natural numbers or rationals as values but, as I will explain, the algorithm doesn't actually work for utilities given by real numbers; I will also propose an algorithm which doesn't use an oracle and which seems to wield the correct answer given sufficient computing time.

One of the hidden assumptions in the initial post is that comparison of utilities is decidable (at least decidable with a Turing oracle): in step 3 of the algorithm you are supposed to return the action that corresponds to the highest utility found on step 2 of the algorithm, but if comparison of utilities isn't actually decidable you won't be able (in the general case) to do that and as it happens that equality of real numbers isn't decidable (even using an oracle). Also there are no canonical forms for real numbers like there are for natural or rational numbers so you can have a situation where the order relationship between different real numbers depends on the value of A().

A simple modification of the original algorithm can deal with this problem:

  1. Enumerate proofs of statements of the form A()=a ⇒ U()≥u, where u is a rational number in canonical form, for t time steps.
  2. Return the action that corresponds to the highest utility found on step 2.

Why does this works? Because if the formal system used is sound, when the formal system proves that A()=a ⇒ U()≥u where a is the action which ends up actually being taken then U() must actually end up being at least u.

So for example, in the case of Newcomb's problem if t is sufficiently high it will eventually be proven that A()=1 ⇒ U()≥1 000 000 and so, if the formal system used is sound, the utility which ends up being received by the agent will have to be at least 1 000 000 which is of course only possible if the agent one boxes so that's what it will end up doing.

New to LessWrong?

New Comment
9 comments, sorted by Click to highlight new comments since: Today at 3:18 PM

Thanks, this looks mostly right to me. Non-oracle algorithms with time limits were actually the first formulation we came up with, and then the oracle setting was mostly motivated by trying to devise an algorithm that didn't have an arbitrary time limit. So it might be fruitful to try and remove the time limit from your idea too.

I wonder if there's some way to maximize utility "implicitly" without having to mention concrete values in the proofs, e.g. by trying to prove statements of the form "if one-boxing leads to some unknown utility U, then two-boxing leads to utility less than U". This exact form doesn't seem to work, but maybe something similar can be made to work? Or if it can't be made to work, maybe there's some nice impossibility result lurking around the corner? Thanks for taking the time to think about this.

(If optimal utility is irrational, there doesn't exist highest rational utility value u in valid statements of the form A()=a ⇒ U()≥u, in which case the time limitation must be used, and action won't necessarily be optimal.)

Rationals are dense in the reals so there are always a rational value between any two real numbers. So for example, if it can be proven in your formal system that A()=1 ⇒ U()=π and this happen to be the maximal utility attainable, there will be a rational number x greater than the utility that can be achieved by any other action and such that x≤π, so you will be able to prove that A()=1 ⇒ U()≥x and so the first action will end up being taken because by definition of x it is greater than the utility that can be obtained by taking any other action.

I see. Still, time limit is not the criterion you want to use here, it would be better to establish rational interval bounds for utility of each possible action, and once the intervals are disjoint, you act. This way, the action is always optimal if all utilities are different, while if you just use the time limit, it could arrive before different actions could be evaluated with enough precision.

That doesn't actually work. Take the Newcomb's problem. Suppose that your formal system quickly prove that A()≠1. Then it conclude that A()=1 ⇒ U()∈[0,0], on the other it end up concluding correctly that A()=2 ⇒ U()∈[1000,1000] so it end up two boxing. This is a possible behavior, even if the formal system used is sound, if one use rational intervals as you recommend. On the other hand, as I have shown, if you chose t sufficiently large the algorithm I recommend in my post will necessarily end up one boxing if the formal system used is sound. (using intervals was actually the first idea I had when coming to term with the problem detailed in the post, but it simply doesn't work.)

This is a possible behavior, even if the formal system used is sound, if one use rational intervals as you recommend.

Not if we use a Goedel statement/chicken rule failsafe like the one discussed in Slepnev's article you linked to.

On the other hand, as I have shown, if you chose t sufficiently large the algorithm I recommend in my post will necessarily end up one boxing if the formal system used is sound.

Edit: This part of my comment is wrong.

This is incorrect, as Zeno had shown more than 2000 years ago. It could be that your inference system generates an infinite sequence of statements of the form A()=1 ⇒ U()≥Si with sequence {Si} tending to, say, 100, but with all Si<100, so that A()=1 loses to A()=2 no matter how large the timeout is.

On the other hand, as I have shown, if you chose t sufficiently large the algorithm I recommend in my post will necessarily end up one boxing if the formal system used is sound.

This is incorrect, as Zeno had shown more than 2000 years ago. It could be that your inference system generates an infinite sequence of statements of the form A()=1 ⇒ U()≥Si with sequence {Si} tending to, say, 100, but with all Si<100, so that A()=1 loses to A()=2 no matter how large the timeout is.

That's why you enumerate all proofs of statement of the form A()=a ⇒ U()≥u (where u is rational number in canonical form). It's a well known fact that it is possible to enumerate all the provable statements in a given formal system without skipping any.

This is a possible behavior, even if the formal system used is sound, if one use rational intervals as you recommend.

Not if we use a Goedel statement/chicken rule failsafe like the one discussed in Slepnev's article you linked to.

There are some subtleties about doing this in the interval setting which made me doubt that it could be done, but after thinking about it some more I must admit that it is possible.

But I think that my algorithm for the non-oracle setting is still valuable.

But I think that my algorithm for the non-oracle setting is still valuable.

I think it's a useful trick not so much because it doesn't require oracles, but because it doesn't require a Goedel statement (chicken rule), even though it depends on choosing a time limit, while the algorithm with Goedel statement doesn't.

It's a well known fact that it is possible to enumerate all the provable statements in a given formal system without skipping any.

You're right, there is no problem here, as long as you enumerate everything.