The potential exploitability of infinite options

LESSWRONG
LW

The potential exploitability of infinite options — LessWrong

You would think that interacting with a fully known, deterministic system, when you have unbounded computing power should be easy, you just pick the series of actions that lead to the highest reward. In cases with finitely many end states, this is the case.

Suppose you meet a benevolent genie, who says "pick a number, any real number, and I will give you that many utils".

The genie is offering you infinitely many options, with no one option being best. So whatever option you pick, you will want to go even higher. Bounded utility agents have the same problem. If you have a maximum utility of 10, and the genie offers you any amount of utility strictly less than 2, then you are still faced with an infinite sequence of ever better moves. If the agent could also use their time for something else, then they can say 2-1/3^^^^3 and then conclude that its not worth their time to specify numbers even closer to 2.

Of course you can get infinite expected utility in the infinite case. Toss a coin so that you pick 2 with prob 1/2, you pick 4 with prob 1/4 and so on. However, as soon as the coin lands, you find that you have finite utility, given an option to flip again you always will. There is also the question of adding 1 to all utilities on the coin is good, because you are guaranteed 1 util beyond what you would've gotten otherwise, or indifferent because you expect infinite utils either way. In the first case, the probabilistic option doesn't fix the problem, agents still have an infinite list of ever better options. In the second case, the agent will be ambivalent about any finite amount of utility in many situations with unbounded potential utility.

Things can be made even worse for our agent. Suppose that we start in Purgatory, and consider being there as -1 util/day. Every day the genie offers the agent an option to go to Heaven, +10 util/day for the amount of time that they have already been in Purgatory. After this period is up, they would get Oblivion, at 0 util/day.

A naive rational agent would see that it could gain more utils by rejecting the genies offer for a bit longer, than by accepting. As such it would carry on rejecting the genie and staying in Purgatory forever. A timeless decision theorist might decide that there is no meaningful difference between the decisions faced on different days, and that "always accept" yields better results than "always reject". However an agent that can precommit itself to a strategy like "accept after N days" will get 9N utils. Given an ability to precommit yourself, the agent is placed back in the first situation. It has to pick a finite N, and the bigger it makes N, the more utils it gets.

If we assume that our agent is a Turing machine, with fixed Komelgorov complexity, then there is a maximum number that is outputted by Turing machines of that size. However "couldness" of X is based on the fact that in the logical counter factual where X is the best possible action, the agent does X. (according to Elizer) So by this definition, the agent "could" output a number larger than its busy beaver number, but from our view outside the system, we know that it won't. The agent can also follow through this argument to show it can't output anything larger than its busy beaver number, but as the busy beaver numbers aren't computable, it doesn't know how big this is.

Is there any decision theory that works well when faced with infinite choice?