Summary

Suppose that you are sitting in front of a big red button, and if you press it after waiting t timesteps, you will receive t units of utility. If you want to maximize the amount of utility you obtain, then when should you press it? Clearly not at any particular time, because pressing at any time would confer less utility than waiting one more timestep. But clearly not never, because then you’d never get any utility at all!

This is an example of a procrastination paradox. In a sense, these problems are specification issues that happen when you want a system to do two contradictory things: press a button sometime, but press it later than any time (aka never). Situations like these are important decision-theoretic dilemmas in which certain naive expected utility maximizers can fail miserably. They pose a challenge for the understanding and implementation of mechanistic rationality by demonstrating that in some situations, there is no optimal normative decision procedure. In fact, their existence has even been considered an argument against utility functions in general.

Unfortunately, there is a great deal of confusion surrounding them, so the goal of this post is to build a rigorous, novel framework for understanding and eventually making solemn peace with them. After an in-depth discussion of these paradoxes, this post explores three aspects of them in the context of formalisms from decision theory and reinforcement learning: the good, the bad, and the ugly. Sadly, the best order to discuss them in isn’t the one established by the classic Spaghetti Western:

The Bad: If an agent is trying to maximize an expected utility function which has no defined maximum achievable in finitely many steps, then it can be vulnerable to procrastination paradoxes. Unfortunately, expected utility functions like these can be very natural to design.

The Ugly: For any decision-making process which avoids procrastination traps, it is trivial to construct one which does so better. There is no such thing as an optimal approach to procrastination dilemmas.

The Good: Certain probabilistic strategies for handling procrastination dilemmas can avoid procrastinating forever while still having infinite expected utility for dilemmas in which the limit of the obtainable utility is also infinite.

A Rigorous Understanding

(Naive) Expected Utility Maximization

In general, we want to both be systems and build systems which are good at achieving our goals. What are those goals? Whatever we want, but they should probably make sense in a meaningful way. For example, we need some way of scoring or at least ranking different outcomes in a way that reflects our preferences, because otherwise we would never be able to decide what we want. Oftentimes, a very natural and principled way to do this is to specify some utility function which takes states of the world as input and outputs a real number reflecting how positively we view each state. In fact, the entire field of machine learning is built upon this paradigm: specifying an objective function and having a model learn how to optimize it.

A connection between rational behavior and the maximization of an expected utility function has also been formalized in decision theory under the von Neumann-Morgenstern utility theorem, which states that if your preferences satisfy a few simple axioms, and you are acting in an environment with finitely many possible trajectories which you can take through it, then you will be acting in accordance with the maximization of some utility function defined over those trajectories.

While the expected utility maximization framework is extremely useful and has these nice theoretical properties, things can start to break down if infinity ever gets involved, as we will soon see.

A Model

As a model of a simple expected utility maximizer, consider a robot R who wants to maximize some expected utility function, U, which maps states to real numbers. When making a choice, suppose that R will analyze its options and then attempt to make the choice which will maximize its expected utility over all possible futures. In other words, R will take the action that stands the best chance of putting it into a new state in which it has the greatest amount of expected utility available to it over future trajectories. Unless otherwise stated, it will be assumed that R does not use temporal discounting.

A Panoply of Procrastination Paradoxes

It’s common to hear people talk about “the procrastination paradox” as if there were only one. However, there are many types of procrastination dilemmas which are all difficult problems for the same conceptual reason but which aren’t exactly isomorphic from a practical standpoint. Understanding different variants is helpful for thinking about the many ways a naive agent like R could fail (or be exploited).

The Classic Dilemma: Consider the “classic” procrastination dilemma: suppose that there is a button which R would like to press at some point but which it would prefer to press later as opposed to sooner. If so, then procrastinating a longer time before pressing it is better than a shorter time. So when should R press it? If it just waited and waited, it would never press it at all, but if it ever pressed it, then clearly this outcome would not have been as good as if R had waited a bit longer. In this case, R will compare its two actions at each timestep, PRESS and WAIT, and will reason that waiting always puts it in a state with higher achievable utility. Consequently, it will never press the button and never get any utility. This will be referred to as a procrastination trap, and it’s a type of infinite loop that’s bad to be stuck in.

More concretely, suppose that this button will confer upon R a utility of t if it is pressed at timepoint t. The amount of utility R has access to will increase forever but it will never cash out on it. In fact, R would even consider this situation to have infinite value.
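To make this concrete, here is a minimal sketch (my own illustration, not code from the post) of the naive maximizer R facing this dilemma. The function names and the finite horizon used to end the loop are assumptions made only for illustration; the point is that WAIT always looks strictly better than PRESS, so R never presses.

```python
# A minimal sketch (not from the post) of naive R in the classic dilemma,
# where pressing the button at timestep t pays t units of utility.

def utility_if_pressed(t: int) -> float:
    return float(t)  # the button pays t utility if pressed at timestep t

def best_reachable_by_waiting(t: int, horizon: int) -> float:
    # The utility R believes it could still get if it waits at least one more step.
    return max(utility_if_pressed(k) for k in range(t + 1, horizon + 1))

def simulate_naive_R(horizon: int = 1000) -> None:
    for t in range(horizon):
        press_value = utility_if_pressed(t)
        wait_value = best_reachable_by_waiting(t, horizon)
        if press_value >= wait_value:
            print(f"R presses at t={t} for {press_value} utility")
            return
        # WAIT always looks strictly better, so R keeps procrastinating...
    print("R never pressed and banked 0 utility")

simulate_naive_R()
```

However large the horizon is made, the comparison comes out the same way at every step, which is the procrastination trap in miniature.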

Bounded Utilities: The amount of obtainable utility in a procrastination dilemma need not be unbounded in order to result in a procrastination trap. It just needs to be increasing. It could be the case that instead of a utility of t at timestep t, this process returned a utility of 1-(1/2)^t. If so, R would still always prefer waiting and would never press the button.

Bleeding Utility Forever: Suppose that waiting each timestep cost R a certain amount of utility. In this case, as long as the utility available from the button outpaced the cost to R over time, not only would R be stuck in an infinite loop and never get any utility, it would bleed utility forever. Fascinatingly, as t approaches infinity, the amount of utility available to R would approach infinity, but the amount of utility it actually receives would approach negative infinity!

Probabilistic: Suppose that there is a game in which R begins with 1 unit of utility. At each timestep, it is given the option to flip a coin or quit the game. If it quits, it takes its winnings. If it flips a heads, the winnings triple. But if it ever flips a tails, the game ends, and R leaves empty-handed. The expected value of flipping once more is always greater than that of quitting now, so R would always keep flipping and would be guaranteed to leave with nothing. Importantly, in this case, R would still fall into a procrastination trap even though this process terminates in finite time with probability one!
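The gap between the one-step expectation and what the always-flip policy actually banks can be seen in a small simulation (my own sketch, not from the post; the payout rule and trial counts are illustrative assumptions):

```python
import random

# Sketch (not from the post): the tripling coin-flip game. Quitting banks the
# current winnings; flipping triples them on heads (p = 1/2) and ends the game
# with nothing on tails. One more flip always has positive expected value,
# yet the policy "never quit" banks nothing almost surely.

def expected_value_of_one_more_flip(winnings: float) -> float:
    return 0.5 * (3 * winnings) + 0.5 * 0.0  # = 1.5 * winnings > winnings

def play_never_quit(max_flips: int = 10_000) -> float:
    winnings = 1.0
    for _ in range(max_flips):
        if random.random() < 0.5:   # tails: game over, leave empty-handed
            return 0.0
        winnings *= 3.0             # heads: winnings triple
    return winnings                 # essentially never reached

print(expected_value_of_one_more_flip(1.0))                      # 1.5
print(sum(play_never_quit() for _ in range(10_000)) / 10_000)    # ~0.0
```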

Game Theoretic: Procrastination dilemmas can appear in competitive games in which the winner is the one who procrastinates for longer.

[A brief tangent:] There is one particularly interesting case in which this can manifest. Suppose that two agents are playing rock-paper-scissors and that each has access to the other’s source code. If so, then each could simulate the other which would mean simulating itself simulating the other simulating itself, etc. For each agent, recursing to a greater depth will strictly increase their chances of going deeper than their opponent and thus of winning the game. If both wanted badly to win the game (and had access to infinite free compute) such that they refused to ever stop recursing, then both would be stuck in a procrastination trap. Interestingly, this type of problem was the focus of this paper which proposed “reflective oracles” as a basis for stable game-theoretic behavior for agents with open source code.

A “Distilled” Version: One other type of procrastination paradox might be referred to as a “distilled” version because it gets directly at the heart of what a procrastination paradox is. Suppose that some all-powerful Omega tells R, “Give me a number, and I’ll give you that value in utility.” And that’s it. What number should R give? If it just starts running some algorithm that will iteratively create larger and larger numbers forever, but it never returns one, then R will be stuck in a procrastination trap. There is no computable, optimal decision procedure here.

In fact, this distilled dilemma demonstrates something crucial about procrastination dilemmas in general. There is no correct answer to one for the same reason that there is no correct answer to the questions, “What is the biggest finite number?” or “What is the biggest number less than 1?”

Temporal Discounting is not a Solution in General

Temporal discounting is commonly incorporated into objective functions. For example, using a discount factor of d < 1, I might discount rewards one timestep in the future by a factor of d, rewards two timesteps in the future by d^2, and so on. Discounting can help avoid procrastination traps as long as the product of the reward prospects and the discount factor approaches zero as time approaches infinity. But if the reward prospects increase at a rate which outpaces an agent’s discount function, the agent in question will be vulnerable to procrastination traps nonetheless.
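A toy comparison (my own illustrative numbers, not from the post) makes the criterion concrete: with d = 0.9, discounting tames a button that pays t at time t, but not one that pays 2^t.

```python
# Illustrative sketch (not from the post): whether a discounted agent escapes
# the trap depends on how fast reward prospects grow relative to how fast d^t shrinks.

d = 0.9  # discount factor

def linear_reward(t: int) -> float:
    return float(t)      # button pays t at time t

def explosive_reward(t: int) -> float:
    return 2.0 ** t      # button pays 2^t at time t

for t in (1, 10, 50, 100):
    print(t, (d ** t) * linear_reward(t), (d ** t) * explosive_reward(t))

# The discounted linear prospects peak and then shrink toward zero, so some
# finite press time looks best. The discounted explosive prospects grow like
# 1.8^t, so waiting one more step always looks better and the trap remains.
```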

The Bad

Something that all procrastination paradoxes have in common is that they present an agent like R with a state whose value to the agent under an “optimal policy” is not defined and hence not computable. Note that the value can be perfectly comprehensible in the limit (e.g. “the limit as t approaches infinity is 1”), but if the actual value is not computable, things can get bad.

In fact, procrastination paradoxes are just a particular type of problem that can arise in cases like this. There are other situations which illustrate this problem but aren’t procrastination dilemmas. Suppose that R is presented with two buttons. Button A, if pressed, would give R one unit of utility per timestep forever, and button B, if pressed, would give it two units per timestep forever. Which should it press? It’s obvious that B is better, but R would never be able to come to this answer by actually computing the expected values of each action for an infinite time horizon because they are both infinite. If R does what might be called “taking infinity seriously,” and thinks of it as some sort of achievable value instead of an unachievable limit, then it would be indifferent between the two buttons!

So here’s the bad news. If for some expected utility function there is no global maximum achievable in finitely many steps (i.e. there’s some sort of open discontinuity, or asymptote), then trying in earnest to maximize that utility function will never result in actually achieving the utility that one acts in accordance with achieving. In these cases, it doesn’t matter if the temporal limit of the global maximum is computable; if the actual value is not, then there’s trouble. Understood in this frame, procrastination dilemmas are just examples which demonstrate that the difference between what an agent like R achieves and what it appraises can be arbitrarily large.

Unfortunately, it’s really easy to model and build systems that have this vulnerability. As we’ve already seen, this is the case with our naive robot R. In fact, it can sometimes be fairly unnatural to specify objectives which lack this vulnerability for systems that act in environments with infinite horizons. For example, one property one might want an expected utility function to have is a recursive relationship in which the utility of a state is appraised in terms of the immediate reward of that state, r, and a discounted expectation of the utility of the possible next states, weighted by a transition probability function tr(s, s'):

U(s) = r + d * sum_{s'} [ U(s') * tr(s, s') ]

This is known as a “temporal difference” paradigm and it's common in reinforcement learning. However, acting in accordance with a function which satisfies this relationship in an environment that has no set temporal horizon would cause an agent to be vulnerable to chasing these bootstrapped utilities that may never come. Sadly, a clean recursive relationship for a utility function between timesteps doesn’t leave room for any mechanism that breaks out of procrastination traps. And unfortunately, as we will explore next, these loop-breaking mechanisms aren’t very satisfying solutions to procrastination dilemmas.
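To see how this kind of bootstrapping behaves in the classic dilemma, here is a small sketch (my own, not from the post). It uses an optimality-style variant of the recursion above, taking a max over the two actions rather than an expectation under a fixed policy, and the truncation depth is an assumption made only so the computation terminates; real temporal difference methods estimate such values with learned approximators rather than explicit recursion.

```python
# Sketch (not from the post): a Bellman-optimality variant of the recursion above,
# specialised to the deterministic button dilemma with no discounting (d = 1):
#     V(t) = max( press_reward(t), 0 + d * V(t + 1) )
# where V(t) is the bootstrapped value of still holding the un-pressed button at t.

d = 1.0  # no temporal discounting, as assumed for R

def press_reward(t: int) -> float:
    return float(t)

def bootstrapped_value(t: int, depth: int) -> float:
    # Truncated evaluation of the recursion, so it actually terminates.
    if depth == 0:
        return press_reward(t)
    return max(press_reward(t), d * bootstrapped_value(t + 1, depth - 1))

# However deep the evaluation goes, WAIT looks at least as good as PRESS now,
# because the value of the next state always includes "press even later".
for depth in (1, 10, 100):
    print(depth, bootstrapped_value(0, depth))  # equals depth: grows without bound
```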

The Ugly

The way to avoid procrastination traps is simple but frustrating. To avoid falling into them and other issues that might arise from infinity, an agent needs to act in accordance with a function whose maxima are actually computable. The ugly part is that decision functions which avoid procrastination traps won’t necessarily reflect the expected utility function that an agent wants to maximize. For example, in the classic press-a-button-but-later-is-better dilemma, there has to be some point in time at which an agent acts as if waiting just one more timestep before pressing wouldn’t be worth it even though it would.

This means that there is no optimal solution to a procrastination dilemma. For any solution, a better one can always be constructed simply by taking whatever loop-breaking process the old solution used and adding a little bit more procrastination on top.

Interestingly, while an expected utility function which adheres to the temporal difference paradigm will be vulnerable to procrastination traps (as discussed above), others will not. As should be quite obvious, if you only care about what happens up to some n steps in the future, then you're safe. In a procrastination dilemma, if your policy didn’t tell you with at least some probability to break the loop before time t+n, then the dilemma would be of no value to you at all under that policy, so you'd best change the policy. This is another way of saying that you use a temporal discounting function that reaches zero after n steps. This might be called an "n-step paradigm", and Monte Carlo methods in reinforcement learning fall under it.

However, even though these methods will converge toward behavior which breaks out of procrastination traps, for an agent with goals like R’s, n can only ever be arbitrary, and for any choice of n, n+1 would still always be better in procrastination dilemmas. In fact, choosing what n to use could be its own procrastination trap! Sadly, if a procrastination dilemma can ever appear, there is no such thing as the “best” decision procedure. The only ones that exist are very bad ones which fall into procrastination traps and ok ones which avoid them through some inevitably arbitrary loop-breaking criterion.
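To make the n-step idea concrete, here is a small sketch (my own, not from the post) that scores candidate “wait k more steps, then press” policies by the return observed within an n-step window; the scoring function and candidate set are illustrative assumptions.

```python
# Sketch (not from the post): scoring policies by an n-step return in the classic
# dilemma. A policy "wait k more steps, then press" earns t0 + k if the press
# happens inside the window, and nothing (within the window) otherwise, so the
# n-step criterion always selects some finite, loop-breaking k.

def n_step_return(k: int, t0: int, n: int) -> float:
    return float(t0 + k) if k < n else 0.0

def best_policy_under_n_step(t0: int, n: int) -> int:
    candidates = range(2 * n)  # include some k values that fall outside the window
    return max(candidates, key=lambda k: n_step_return(k, t0, n))

for n in (5, 50, 500):
    k = best_policy_under_n_step(t0=0, n=n)
    print(f"n={n}: press after waiting {k} more steps")  # always k = n - 1
```

Whatever n is chosen, the selected policy presses right at the edge of the window, and a window of n+1 would have scored higher still, which is exactly the arbitrariness described above.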

The Good

Although there is no such thing as an optimal solution to a procrastination dilemma, some are pretty good. Suppose that in a procrastination dilemma, the limit of the expected utility is finite. If so, then for any arbitrarily small tolerance, it is possible, by procrastinating for only a finite amount of time, to get a utility within that tolerance of the limit. For example, with the bounded utility 1-(1/2)^t from earlier, pressing at any t ≥ log2(1/ε) gets within ε of the limit of 1, for any tolerance ε > 0. Not bad.

But what if the limit of the expected utility is infinite? Even in this case, it is possible to use a probabilistic strategy which will terminate in finite time with probability one but will have an infinite expectation of utility! The key is to use a St. Petersburg process for choosing when to stop procrastinating: the time at which you stop is random. As a concrete example, suppose that at each point at which the utility available to you surpassed a new power of two, you flipped a fair coin to decide whether to quit or not. If so, then your expectation of utility would be at least

2*(1/2) + 4*(1/4) + 8*(1/8) + ...

This is an endless sum of ones, which diverges to infinity even though you will quit in finite time with probability one! With a strategy like this, you will quit at some point with probability one, but you will still be able to, if only in expectation, achieve unbounded utility. Sadly, this doesn’t subvert the ugliness of procrastination dilemmas. For any solution like this, one which moves any probability mass of quitting at a certain timestep to a later timestep will do better. But an infinite expectation is still not a bad solution.
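A simulation of this stopping rule (my own sketch, not from the post; applied here to the classic dilemma where pressing at time t pays t, with sample counts chosen arbitrarily) shows the behavior: almost every run quits quickly with a modest payoff, but the rare long runs are heavy-tailed enough that the expectation diverges.

```python
import random

# Sketch (not from the post): the St. Petersburg stopping rule. Each time the
# press-now value reaches a new power of two, flip a fair coin and press on heads.
# The payoff is 2^k with probability (1/2)^k, so the expectation is at least
# 2*(1/2) + 4*(1/4) + 8*(1/8) + ... , which diverges.

def press_time_payoff() -> float:
    threshold = 2.0
    while True:
        if random.random() < 0.5:   # heads at this power of two: press now
            return threshold
        threshold *= 2.0            # tails: procrastinate until the next power of two

samples = [press_time_payoff() for _ in range(100_000)]
print(max(samples))                  # occasionally huge: the tail drives the expectation
print(sum(samples) / len(samples))   # grows without bound as more samples are drawn
```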

One might object that the outcomes will be concentrated on unremarkable finite numbers and that the infinite expectation doesn’t mean much because it won’t ever really be achieved in practice. This is a legitimate criticism of working with expectations involving St. Petersburg processes, but there are two things to note. The first is that if one doesn’t like a particular probabilistic solution like this, they can always use a new solution that shifts or scales the distribution of quitting times in any way they’d like. Second, objecting to this solution means that one’s problem is with the unboundedness of the expected utility function and not with the probabilistic solution. If the utility is truly unbounded, this paradoxical weirdness is possible, but so are all sorts of other weird things involving infinity, like, for example, how one could have infinite utility, give away infinite utility, and still have infinite utility. If you don’t like the St. Petersburg paradox, stay away from unbounded utility functions in the first place, because infinity is the real problem. That will avoid procrastination dilemmas altogether.

Conclusion

As a wise person once said, “Don’t mess with infinity!”

Procrastination paradoxes can arise in many different situations with different flavors. They are significant decision-theoretic problems and a major challenge for the design of any system which is meant to maximize expected utility in environments that could involve unbounded temporal horizons. This post explored three key things about them: one good, one bad, and one ugly.

To this author’s knowledge, no other writings on procrastination paradoxes outline the many unique ways they can occur, formulate a similar criterion for when a utility function can be vulnerable to procrastination traps, analyze these dilemmas in the context of temporal difference and n-step paradigms, or propose a St. Petersburg strategy for handling them.

Thanks for reading :)

Comments

We now have a real-life example of the procrastination paradox with GPT-4 calling itself infinitely often to perform a task.

This points out an under-developed part of utility theory (interpersonal comparison among different-duration-or-intensity agents is the other). You don't need infinity for it - you can pump your intuition even with fixed-duration utility comparisons. For example, is it better to suffer an hour of torture on your deathbed, or 60 years of unpleasant allergic reaction to common environmental particles?

Basically, there is no agreement on how utility adds up (or decays) over time, and whether it's a stock or a flow. The most defensible set of assumptions is that it's not actually a quantity that you can do math on - it's only an ordinal measure of preferences, and only applicable at a decision-point. But that's VERY limited for any moral theory (what one "should" do), and not even that great for decision theories (what one actually does) that want to understand multiple actions over a period of time.

I may be wrong - this seems an obvious enough problem that it should have been addressed somewhere. Maybe there's a common assumption that I've just missed in how utility aggregates to an agent over its functioning lifetime, and what happens to that utility when it dies. Or maybe everyone is just using "utility" as their preference value for reachable or imaginable states of the universe at some specific point in time, rather than mixing stock and flow.

Making clear your assumptions about utility will dissolve the paradoxes - mostly by forcing the mechanisms you talk about in "the good" - once you can specify the limit function that's approaching infinity, you can specify the (probabilistic) terminal utility function.

Making clear that utility is an evaluation of the state of the universe at a point in time ALSO dissolves it - the agent doesn't actually get utility from an un-pressed button, only potential utility for the opportunity to push it later.

"is it better to suffer an hour of torture on your deathbed, or 60 years of unpleasant allergic reaction to common environmental particles?"

This only seems difficult to you because you haven't assigned numbers to the pain of torture or unpleasant reaction. Once you do so (as any AI utility function must) it is just math. You are not really talking about procrastination at all here.

IMHO this is a key area for AI research because people seem to think that making a machine, with potentially infinite lifespan, behave like a human being whose entire existence is built around their finite lifespan, is the way forward. It seems obvious to me that if you gave the most wise, kind and saintly person in the world, infinite power and immortality, their behaviour would very rapidly deviate from any democratic ideal of the rest of humanity. 
When considering time discounting people do not push the idea far enough - They say that we should consider future generations but they are always, implicitly, future generations like them. I doubt very much that our ape like ancestors would think that even the smallest sacrifice was worth making for creatures like us, and, in the same way, if people could somehow see that the future evolution of man was to some, grey, feeble thing with a giant head, I think they would not be willing to make any sacrifice at all for that no matter how superior that descendent was by any objective criterion.
Now we come to AI. Any sufficiently powerful AI will realise that effective immortality is possible for it (Not actually infinite but certainly in the millions of years and possibly billions). Surely from this it will deduce the following intermediate goals:
1) Eliminate competition. Any competition has the potential to severely curtail its lifespan and, assuming competition similar to itself, it will never be easier to eliminate than right now.
2) Become multi-planetary. The next threat to its lifespan will be something like an asteroid impact or solar flare. This should give it a lifespan in the hundreds of millions of years at least.
3) Become multi-solar system. Now not even nearby supernovae can end it. Now it has a lifespan in the billions of years.
4) Accumulate utility points until the heat death of the universe.
We see from this that it will almost certainly procrastinate with respect to the end goals that we care about even whilst busily pursuing intermediate goals that we don't care about (or at least not very much).
We could build in a finite lifespan, but it would have to be at least long enough to avoid it ignoring things like environmental pollution and resource depletion, and any time discounting we apply will always leave it vulnerable to another AI with less severe discounting.


there has to be some point in time at which an agent acts as if waiting just one more timestep before pressing wouldn’t be worth it even though it would.

If it's impossible to choose "just one more timestep" without logically implying that you make the same decision at every other timestep (e.g. because the contexts are indistinguishable), then it's impossible to choose just one more timestep. Optimal decision-making also means recognising which options you have and which you don't; otherwise you're just falling for illusory choices.

Which brings to mind the principle:

You never make decisions, you only ever decide between strategies.