The Doubling Box


This problem makes more sense if you strip out time and the doubling, and look at this one:

Choose an integer N. Receive N utilons.

This problem has no optimal solution (because there is no largest integer). You can compare any two strategies to each other, but you cannot find a supremum; the closest thing available is an infinite series of successively better strategies, which eventually passes any single strategy.

In the original problem, the options are "don't open the box" or "wait N days, then open the box". The former can be crossed off; the latter has the same infinite series of successively better strategies. (The apparent time-symmetry is a false one, because there are only two time-invariant strategies, and they both lose.)

The way to solve this in decision theory is to either introduce finiteness somewhere that caps the number of possible strategies, or to output an ordering over choices instead of a single choice. The latter seems right; if you define and prove an infinite sequence of successively better options, you still have to pick one; and lattices seem like a good way to represent the results of partial reasoning.

4

This is pretty much the only comment in the entire thread that doesn't fight the hypothetical. Well done, I guess?

2

This seems like a helpful simplification of the problem. Note that it also works if you receive 1-1/N utilons, so as with the original post this isn't an unbounded utility issue as such.
Just one point though - in the original problem specification it's obvious what "choose an integer N" means: opening a physical box on day n corresponds to choosing 2^n. But how does your problem get embedded in reality? Do you need to write your chosen number down? Assuming there's no time limit to writing it down then this becomes very similar to the original problem except you're multiplying by 10 instead of 2 and the time interval is the time taken to write an extra digit instead of a day.

0

It doesn't. It gets embedded in something with infinite time and in which infinite utility can be given out (in infinite different denominations).

0

Writing decimal digits isn't the optimal way to write big numbers. (Of course this doesn't invalidate your point.)

0

It kind of is if you have to be able to write down any number.

I've never met an infinite decision tree in my life so far, and I doubt I ever will. It is a property of problems with an infinite solution space that they can't be solved optimally, and it doesn't reveal any decision theoretic inconsistencies that could come up in real life.

Consider this game with a tree structure: You pick an arbitrary natural number, and then, your opponent does as well. The player who chose the highest number wins. Clearly, you cannot win this game, as no matter which number you pick, the opponent can simply add one to that number. This also works with picking a positive rational number that's closest to 1 - your opponent here adds one to the denominator and the numerator, and wins.

The idea to use a busy beaver function is good, and if you can utilize the entire universe to encode the states of the busy beaver with the largest number of states possible (and a long enough tape), then that constitutes the optimal solution, but that only takes us further out into the realm of fiction.

8

"You are finite. Zathras is finite. This utility function has infinities in it. No, not good. Never use that."
— Not Babylon 5

0

But I do not choose my utility function as a means to get something. My utility function describes what I want to choose means to get. And I'm pretty sure it's unbounded.

3

You've only expended a finite amount of computation on the question, though; and you're running on corrupted hardware. How confident can you be that you have already correctly distinguished an unbounded utility function from one with a very large finite bound?
(A genocidal, fanatical asshole once said: "I beseech you, in the bowels of Christ, think it possible that you may be mistaken.")

0

I do think it possible I may be mistaken.

5

The tax man's dilemma, an infinite decision tree grounded in reality:
Assume you're the anthropomorphization of government. And you have a decision to make: You need to decide the ideal tax rate for businesses.
In your society, corporations reliably make 5% returns on investments, accounting for inflation. That money is reliably reinvested, although not necessarily in the same corporation.
How should you tax those returns in order to maximize total utility? You may change taxes at any point. Also, you're the anthropomorphic representation of government - you are, for all intents and purposes, immortal.
Assume a future utility discount rate of less than the investment return rate, and assume you don't know the money-utility relationship - you can say you weigh the possibility of future disasters which require immense funds against the possibility that money has declining utility over time to produce a constant relationship for simplicity, if you wish. Assume that your returns will be less than corporate returns, and corporate utility will be less than your utility. (Simplified, you produce no investment returns, corporations produce no utility.)

4

I never saw an infinite decision tree
I doubt I'll ever see one
But if it weren't imaginary
I'd rather like to play one
Hmm I guess I see now why so few classic poets used decision theory as their inspiration.

4

I like this. I was going to say something like,
but you said the same thing, and less rudely, so upvoted.

0

If it's impossible to win, because your opponent always picks second, then every choice is optimal.
If you pick simultaneously, picking the highest number you can describe is optimal, so that's another situation where there is no optimal solution for an infinite mind, but for a finite mind, there is an optimal solution.

After considering this problem, what I found was that, surprisingly fast, the specifics of the box's physical abilities and implementation become relevant. I mean, let's say Clippy is given this box, and has already decided to wait a mere 1 year from day 1, which is 365.25 days of doubling, and 1 paperclip is 1 utilon. At some point during this time, before the end of it, there are more paperclips than there used to be atoms in the visible universe, since he's predicted to gain 2^365.25 paperclips (which is apparently close to 8.9*10^109) and the observable universe is only estimated to contain 10^80 atoms. So to make up for that, let's say the box converts every visible subatomic particle into paperclips instead.

That's just 1 year, and the box has already announced it will convert approximately every visible subatomic particle into pure paperclip bliss!
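A quick check of the arithmetic above, as a minimal sketch. It assumes the box starts at 1 paperclip and doubles exactly once per day, and it takes 10^80 atoms as the rough estimate quoted in the comment. The function names are made up for illustration.

```python
import math

DOUBLINGS_PER_YEAR = 365.25          # figure used in the comment
ATOMS_IN_OBSERVABLE_UNIVERSE = 1e80  # rough estimate quoted in the comment


def paperclips_after(days: float) -> float:
    """Contents of the box after `days` doublings, starting from 1 paperclip."""
    return 2.0 ** days


def first_day_exceeding(target: float) -> int:
    """Smallest whole number of doublings whose payout strictly exceeds `target`."""
    return math.floor(math.log2(target)) + 1


one_year = paperclips_after(DOUBLINGS_PER_YEAR)
crossover = first_day_exceeding(ATOMS_IN_OBSERVABLE_UNIVERSE)
print(f"after one year: {one_year:.2e} paperclips")
print(f"atom count passed after {crossover} doublings")
```

Under these assumptions the atom count is passed well before the year is out, which is the point the comment is making.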

And then another single doubling... (1 year and 1 day) Does what? Even if Clippy has his utility function unbounded, it should presumably still link back to some kind of physical state, and at this point the box starts having to implement increasingly physically impossible ideas to have to double paperclip utility, like:

Breaking t...

2

This is a thought experiment which is not meant to be possible in our world. But such thought experiments are a way of testing the generality of your decision procedures - do they work in all possible worlds? If you must imagine a physics that makes the eternal doubling possible, try picturing a network of replicating baby universes linked by wormholes.

6

As in the old saw, part of your strength as a real decision-maker is that your decision procedures choose less well in impossible worlds than in possible worlds.

5

A world that can support paperclip production of arbitrary magnitude is not an impossible world. The speed of light is a contingent fact.

2

Why does that have to be true?

-1

It doesn't have to be true. It's desirable because decision procedures that rely on other knowledge about reality are faster/better/cheaper than ones that don't import knowledge about reality. Specialization for the situation you find yourself in is often useful, though it does limit flexibility.

0

Utility doesn't have to be proportional to the amount of some particular kind of physical stuff in the universe. If the universe contained 1 paperclip, that could be worth 2 utilons, if it contained 2 paperclips then it could be worth 4 utilons, if it contained 20 paperclips then it could be worth 2^20 utilons. The box would then double your utility each day just by adding one physical paperclip.
I still think these kinds of considerations are worth thinking about though. Your utility function might grow faster than a busy beaver function, but then the doubling box is going to have trouble waiting the right length of time to deliver the

Your other option is to sell the box to the highest bidder. That will probably be someone who's prepared to wait longer than you, and will therefore be able to give you a higher price than the utilons you'd have got out of the box yourself. You get the utilons today.

0

I don't think you are a wannabe Captain Kirk. I think you are Captain Kirk.

0

Why does my fight-the-hypothetical module never think about that? (It does often think about options which wouldn't be available in the Least Convenient Possible World -- but not this one, until someone else points it out.)

If you can use mixed strategies (i.e. are not required to be deterministically predictable), you can use the following strategy for the doubling-utility case: every day, toss a coin; if it comes up heads, open the box, otherwise wait another day. Expected utility of each day is constant 1/2, since the probability of getting heads on a particular day halves with each subsequent day, and utility doubles, so the series diverges and you get infinite total expected utility.
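To make the divergence concrete, here is a small sketch of the partial sums of that expected utility. It assumes the box holds 2^n utilons on the n-th toss (counting from n = 0) and that you open with probability p on each toss. `partial_expected_utility` is an illustrative name, not anything from the original problem.

```python
from fractions import Fraction


def partial_expected_utility(p: Fraction, days: int) -> Fraction:
    """Sum of p * (1 - p)**n * 2**n over the first `days` tosses (n = 0, 1, ...)."""
    total = Fraction(0)
    for n in range(days):
        prob_open_today = p * (1 - p) ** n  # waited n times, then opened
        payout = 2 ** n                     # contents of the box at toss n
        total += prob_open_today * payout
    return total


for p in (Fraction(1, 2), Fraction(1, 4), Fraction(3, 4)):
    sums = [float(partial_expected_utility(p, days)) for days in (10, 20, 40)]
    print(f"p = {p}: partial sums after 10, 20, 40 tosses = {sums}")
```

With p = 1/2 every toss contributes exactly 1/2, so the partial sums grow without bound. With p = 3/4 each term is 0.75 * 0.5^n, so the partial sums stay below 3/2. Smaller p makes the terms themselves grow.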

4

Even better, however, would be to toss two coins every day, and only open the box if both come up heads :)

4

This suggests a strategy; tile the universe with coins and flip each of them every day. If they all come up heads, open the box (presumably it's full of even more coins).

0

Better yet, every day count one more integer toward the highest number you can think of. When you reach it, flip the coins. If they don't all come up heads, start over again.

2

That way your expected utility becomes INFINITY TIMES TWO! :)

1

There are meaningful ways to compare two outcomes which both have infinite expected utility. For example, suppose X is your favorite infinite-expected-utility outcome. Then a 20% chance of X (and 80% chance of nothing) is better than a 10% chance of X. Something similar happens with tossing two coins instead of one, although it's more subtle.

1

Actually what you get is another divergent infinite series that grows faster. They both grow arbitrarily large, but the one with p=0.25 grows arbitrarily larger than the series with p=0.5, as you compute more terms. So there is no sense in which the second series is twice as big, although there is a sense in which it is infinitely larger. (I know your point is that they're both technically the same size, but I think this is worth noting.)

0

This is what I was going to say; it's consistent with the apparent time symmetry, and is the only solution that makes sense if we accept the problem as stated. But it seems like the wrong answer intuitively, because it means that every strategy is equal, as long as the probability of opening the box on a given day is in the half-open interval (0,0.5]. I'd certainly be happier with, say, p=0.01 than p=0.5, (and so would everyone else, apparently) which suggests that I don't actually have a real-valued utility function. This might be a good argument against real-valued utility functions in general (bounded or not). Especially since a lot of the proposed solutions here "fight the hypothetical" by pointing out that real agents can only choose from a finite set of strategies.

0

So you can have infinite expected utility, but be guaranteed to have finite utility? That is weird.

So I don't really know how utilons work, but here is an example of a utility function which is doubling box-proof. It is bounded; furthermore, it discounts the future by changing the bound for things that only affect the future. So you can get up to 1000 utilons from something that happens today, up to 500 utilons from something that happens tomorrow, up to 250 utilons from something that happens two days from now, and so on.

Then the solution is obvious: if you open the box in 4 days, you get 16 utilons; if you open the box in 5 days, you'd get 32 but you...

1

You are right that such a utility function cannot be supplied with a (utility) doubling box problem---and for much the same reason that most utility functions that approximate human preferences could not be exposed to a doubling box. Nevertheless this amounts to refusing to engage with the game theory example rather than responding to it.

0

This thwarts the original box, but I just edited the OP to describe another box that would get this utility function in trouble.

2

So you're suggesting, in my example, a box that approaches 500 utilons over the course of a day, then disappears?
This isn't even a problem. I just need to have a really good reaction time to open it as close to 24 hours as possible. Although at some point I may decide that the risk of missing the time outweighs the increase in utilons. Anyway this isn't even a controversial thought experiment in that case.

0

I thought you would realize I was assuming what I did for the case with the utility function that discounts completely after a certain time: "Suppose you can think as fast as you want, and open the box at arbitrary speed."
But if your utility function discounts based on the amount of thinking you've done, not on time, I can't think of an analogous trap for that.

0

So, ideally, these utility functions wouldn't be arbitrary, but would somehow reflect things people might actually think. So, for example, if the box is only allowed to contain varying amounts of money, I would want to discount based on time (for reasons of investment if nothing else) and also put an upper bound on the utility I get (because at some point you just have so much money you can afford pretty much anything).
When arbitrary utilons get mixed in, it becomes complicated, because I discount different ways to get utility at a different rate. For instance, a cure for cancer would be worthless 50 years from now if people figured out how to cure cancer in the meantime already, at which point you'd total up all the casualties from now until then and discount based on those. This is different from money because even getting a dollar 100 years from now is not entirely pointless.
On the other hand, I don't think my utility function discounts based on the amount of thinking I've done, at least not for money. I want to figure out what my true response to the problem is, in that case (which is basically equivalent to the "You get $X. What do you want X to be?" problem). I think it's that after I've spent a lot of time thinking about it and decided X should be, say, 100 quadrillion, which gets me 499 utilons out of a maximum of 500, then making the decision and not thinking about it more might be worth more than 1 utilon to me.

0

Now you're just dodging the thought experiment by changing it.

0

In a way, yes. I'm trying to cleanly separate the bits of the thought experiment I have an answer to, from the bits of the thought experiment I don't have an answer to.

Am I correct in assessing that your solution is to stop when you can no longer comprehend the value in the box? That is, when an additional doubling has no subjective meaning to you? (Until that point, you're not in a state loop, as the value with each doubling provides an input you haven't encountered before.)

I was about to suggest stopping when you have more utilons than your brain has states (provided you could measure such), but then it occurred to me the solutions might be analogous, even if they arrive at different numbers.

1

I wouldn't want to stop when I couldn't comprehend what was in the box. I always want more utility, whether I can understand that I have it or not. My solution is to wait as long as you can before waiting any longer puts you in an infinite loop and guarantees you will never get any.

1

As long as you comprehend the number in the box, you're not in an infinite loop. The input is different. Once the number is no longer meaningful, you're potentially in an infinite loop; the input is the same.

0

I'm pretty sure I could stay out of an infinite loop much longer than I could comprehend what was in the box. The contents of the box are growing exponentially with the number of days. If I just count the number of days, I can stay in the realm of small numbers much longer.

I wait until there are so many utilons in the box that I can use them to get two identical boxes and have some utilons left over. Every time a box has more than enough utilons to make two identical boxes, I repeat that step. Any utilons not used to make new boxes are the dividend of the investment.

2[anonymous]

Now that you mention that, that's true, and it gives me several other weird ideas. The box gives you tokens that you exchange for utilons, which seem like they are supposed to be defined as "Whatever you want/define them to be, based on your values."
Ergo, imagine a Happy Michaelos that gets about twice as many positive utilons from everything compared to Sad Michaelos. Sad Michaelos gets twice as many NEGATIVE utilons from everything compared to Happy Michaelos.
Let's say a cookie grants Happy Michaelos 1 utilon. It would take two cookies to grant Sad Michaelos 1 utilon. Let's say a stubbed toe grants Sad Michaelos -1 utilon. It would take two stubbed toes to grant Happy Michaelos -1 utilon.
So if Happy Michaelos or Sad Michaelos gets to open the box and they are friends who substantially share utility and cookies... It should be Sad Michaelos who does so (both will get more cookies that way.)
As far as I can tell, this is a reasonable interpretation of the box.
So, I should probably figure out how the people below would work, since they are increasingly unreasonable interpretations of the box:
Extremely Sad Michaelos:
Is essentially 1 million times worse off than Sad Michaelos. Ergo, if the logic above holds, Extremely Sad Michaelos gets 2 million cookies from turning in a single token.
Hyper Pout Michaelos:
Is essentially 1 billion times worse off than Sad Michaelos. He also has a note in his utility function that he will receive -infinity(aleph 0) utilons if he does not change his utility function back to Sad Michaelos's utility function within 1 second after the box is powerless and he has converted all of his tokens. If the logic above holds, Hyper Pout Michaelos gets 1 billion times more cookies than Sad Michaelos, and then gets to enjoy substantially more utilons from them!
Omnidespairing Michaelos:
Is almost impossible to grant utilons to. The certainty of omnipotence grants him 1 utilon. Everything else that might be positive (say, a 99% cha

2

If changing my utility function has expected positive results, based both on my current utility function and in the proposed change, then...
Here the problem is that the utilon is not a unit that can be converted into any other unit, including physical phenomena.

What if instead of growing exponentially without bound, it decays exponentially to the bound of your utility function?

I think you mean 'asymptotically'.

9

Exponentially decaying functions are part of the general class of asymptotic functions; both versions are correct.

A way to think about this problem that puts you in near mode is to imagine what the utility might look like. Ex:

Day 1: Finding a quarter on the ground

Day 2: A child in Africa getting $5

.....

Day X: Curing cancer

Day X+1: Curing cancer, Alzheimer's, and AIDS.

On one hand, by waiting a day, more people would die of cancer. On the other, by not waiting, you'd doom all those future people to die of AIDS and Alzheimer's.

2

Suppose instead of multiplying the utility by 2 each day, the box multiplied the utility by 1. Would it look like this?
Day 1: Curing cancer
Day 2: Curing cancer
Day 3: Curing cancer ...
Probably not - each of those "curing cancer" outcomes is not identical (cancer gets cured on a different day) so you'd assign them different utilities. In order to conform to the specification, the box would have to add an extra sweetener each day in order to make up for a day's worth of cancer deaths.

1

You are adding a condition that was not present in the original problem. Namely, that every day you do not open the box, you lose some number of utilons.

1

Whoops, you're right.

How exactly do the constant utilons in the box compensate me for how I feel the day *after* I open the box (I could have doubled my current utility!)? The second day after (I could have quadrupled my current utility!!)? The Nth day after (FFFFFFFFFFFFUUUUU!!!)? I'm afraid the box will rewrite me with a simple routine that says "I have 2^(day-I-opened-the-box - 1) utility! Yay!"

If you return to a state you have already been at, you know you are going to be waiting forever and lose and get nothing.

You seem to be assuming here that returning to a state you have already been at is equivalent to looping your behavior, so that once a Turing machine re-enters a previously instantiated state it cannot exhibit any novel behavior. But this isn't true. A Turing machine can behave differently in the same state provided the input it reads off its tape is different. The behavior must loop only if the combination of Turing machine state...

You have given reasons why requiring bounded utility functions and discounting the future are not adequate responses to the problem *if considered individually*. But your objection to the bounded utility function response assumes that future utility isn't discounted, and your objection to the discounting response assumes that the utility function is unbounded. So what if we require both that the utility function must be bounded *and* that future utility must be discounted exponentially? Doesn't that get around the paradox?

...I remember reading a while ago about

0

If your utility function is bounded and you discount the future, then pick an amount of time after now, epsilon, such that the discounting by then is negligible. Then imagine that the box disappears if you don't open it by then. At t = now + epsilon * 2^-1, the utilons double. At 2^-2, they double again, etc.
But if your discounting is so great that you do not care about the future at all, I guess you've got me.
This isn't the St. Petersburg paradox (though I almost mentioned it) because in that, you make your decision once at the beginning.

0

Perhaps I am misinterpreting you, but I don't see how this scheme is compatible with a bounded utility function. For any bound n, there will be a time prior to epsilon where the utilons in the box will be greater than n.
When you say "At 2^-2...", I read that as "At now + epsilon 2^-1 + epsilon 2^-2...". Is that what you meant?

0

Yeah, that's what I meant. Also, instead of doubling, make it so they exponentially decay toward the bound.

If you know the probability distribution P(t) of you dying on day t, then you can solve exactly for optimal expected lifetime utilons out of the box. If you don't know P(t), you can do some sort of adaptive estimation as you go.
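A minimal sketch of that calculation, under loudly stated assumptions: the box starts at 1 utilon and doubles daily, the plan "open on day t" pays off only if you are still alive on day t, and the death risk is given as a made-up per-day hazard that rises linearly (`hazard` is purely hypothetical and is not derived from anything in the problem).

```python
from fractions import Fraction


def hazard(day: int) -> Fraction:
    """Hypothetical chance of dying during `day`, given survival so far."""
    return min(Fraction(day, 41), Fraction(1))


def expected_payout(open_day: int) -> Fraction:
    """Expected utilons from the plan 'open on open_day if still alive'."""
    survival = Fraction(1)
    for day in range(open_day):
        survival *= 1 - hazard(day)
    return survival * 2 ** open_day


# Waiting one more day is worth it while 2 * (1 - hazard(day)) > 1,
# i.e. while the daily risk of dying is still below one half.
best_day = max(range(60), key=expected_payout)
print(f"best day to open under this hazard: {best_day}")
```

With a constant hazard below one half, this same calculation never finds a finite optimum, so the answer depends entirely on the risk eventually growing.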

8

Note that the problem is making the counterfactual assumption that

2

P(t) = 0.

2

Why is this an interesting problem?

0

I'm not sufficiently familiar with my own internal criteria of interesting-ness to explain to you why I find it interesting. Sorry you don't as well.

- Open the box when and iff you need the utilon. This simple solution gets you the maximum utilons if you need a utilon and none if you don't need a utilon.

You could build a machine that opens the box far in the future, at the moment when the machine's reliability starts degrading faster than the utilons increase. This maximizes your expected utility.

Or if you're not allowed to build a machine, you simply do the same with yourself (depending on our model, possibly multiplying by your expected remaining lifespan).

Bringing together what others have said, I propose a solution in three steps:

Adopt a mixed strategy where, for each day, you open the box on that day with probability p. The expected utility of this strategy is the sum of (p (1-p)^n 2^n), for n=0... which diverges for any p in the half-open interval (0,0.5]. In other words, you get infinite EU as long as p is in (0,0.5]. This is paradoxical, because it means a strategy with a 0.5 risk of ending up with only 1 utilon is as good as any other.

Extend the range of our utility function to a number syste

There is a good reason to use a bounded utility function, note -- if the conditions of Savage's theorem hold, the utility function you get from it is bounded.

How long do you wait before opening it? If you never open it, you get nothing (you lose! Good day, sir or madam!) and whenever you take it, taking it one day later would have been twice as good.

When do I "lose" precisely? When I never take it? By happy coincidence 'never' happens to be the very next day after I planned to open the box!

[This comment is no longer endorsed by its author]

There are no other ways to get utilons.

Is a weakness in your argument. Either you can survive without utilons, a contradiction to utility theory, or you wait until your "pre-existing" utilons are used up and you need more to survive.

This suggests a joke solution: Tell people about the box, then ask them for a loan which you will repay with proceeds from the box. Then you can live off the loan and let your creditors worry about solving the unsolvable.

6

Utilons don't need to be associated with survival. Survival can be a mere instrumental good used to increase the amount of actual utilons generated (by making, say, paperclips). I get the impression that you mean something different by the word than what the post (and the site) mean.

0

What's wrong with not having any more reason to live after you get the utilons?

-1

I see you found yet another problem: with no way to get more utilons, you die when those in the box are used up. And utility theory says you need utility to live, not just to give you reason to live.

2

This is contrary to my understanding. Can you explain, please?

If I am actually immortal, and *there is no other way to get Utilons* then each day, the value of me opening the box is something like:

Value=Utilons/Future Days

Since my Future Days are supposedly infinite, we are talking about *at best* an infinitesimal difference between me opening the box on Day 1 and me opening the box on Day 3^^^^3. There is no actual wrong day to open the box. If that seems implausible, it is because the hypothetical itself is implausible.

5

The expected value of opening the box is:
Value=Utilons
That is all. That number already represents how much value is assigned to the state of the universe given that decision. Dividing by only future days is an error. Assigning a different value to the specified reward based on whether days are in the past or the future changes the problem.

-3

Presumably, if Utilons are useful at all, then you use them. Usually, this means that some are lost each day in the process of using them.
Further, unless the Utilons represent some resource that is non-entropic, then I will lose some number of Utilons each day even if they aren't lost by me using them. This works out to the same answer in the long run.
Let's assume we have an agent Boxxy, an immortal AI whose utility function is that opening the box tomorrow is twice as good as opening it today. Once he opens the box, his utility function assigns that much value to the universe. Let's assume this is all he values. (This gets us around a number of problems for the scenario.)
Even in this scenario, unless Boxxy is immune to entropy, some amount of information (and thus, some perception of utility) will be lost over time. Over a long enough time, Boxxy will eventually lose the memory of opening the Box. Even if Boxxy is capable of self-repair in the face of entropy, unless Boxxy is capable of actually not undergoing entropy, some of the Box-information will be lost. (Maybe Boxxy hopes that it can replace it with an identical memory for its utility function, although I would suspect at that point Boxxy might just decide to remember having opened the Box at a nearer future date.) Eventually, Boxxy's memory, and thus Boxxy's Utilons, will either be completely artificial, with at best something like a causal relationship to previous memory states of opening the box, or Boxxy will lose all of its Utilons.
Of course, Boxxy might never open the box. (I am not a superintelligence obsessed with box opening. I am a human intelligence obsessed with things that Boxxy would find irrelevant. So I can only guess as to what a box-based AGI would do.) In this case, the Utilons won't degrade, but Boxxy can still expect a value of 0 in this case.
Frankly, the problem is hard to think about at that level, because real immortality (as the problem requires) would require somewa

3

No. Those are resources.

Let's say you have a box that has a token in it that can be redeemed for 1 utilon. Every day, its contents double. There is no limit on how many utilons you can buy with these tokens. You are immortal. It is sealed, and if you open it, it becomes an ordinary box. You get the tokens it has created, but the box does not double its contents anymore. There are no other ways to get utilons.

How long do you wait before opening it? If you never open it, you get nothing (you lose! Good day, sir or madam!) and whenever you take it, taking it one day later would have been twice as good.

I hope this doesn't sound like a reductio ad absurdum against unbounded utility functions or not discounting the future, because if it does you are in danger of amputating the wrong limb to save yourself from paradox-gangrene.

What if instead of growing exponentially without bound, it decays exponentially to the bound of your utility function? If your utility function is bounded at 10, what if the first day it is 5, the second 7.5, the third 8.75, etc. Assume all the little details, like remembering about the box, trading in the tokens, etc, are free.
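The 5, 7.5, 8.75 sequence fits the closed form 10 * (1 - 2^-n) on day n. A short sketch, assuming that closed form is what is meant (the post only gives the first three values) and using a bound of 10 as in the example:

```python
from fractions import Fraction

BOUND = Fraction(10)  # bound of the utility function in the example


def box_value(day: int) -> Fraction:
    """Utilons in the bounded box on `day` (day 1 is the first day)."""
    return BOUND * (1 - Fraction(1, 2 ** day))


for day in (1, 2, 3, 10, 50):
    value = box_value(day)
    print(f"day {day}: {float(value)} utilons, {float(BOUND - value)} short of the bound")
```

Every day is strictly better than the one before and none reaches the bound, so the "one more day" argument goes through exactly as in the unbounded case.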

If you discount the future using any function that doesn't ever hit 0, then the growth rate of the tokens can be chosen to more than make up for your discounting.

If it does hit 0 at time T, what if instead of doubling, it just increases by however many utilons will be adjusted to 1 by your discounting at that point every time of growth, but the intervals of growth shrink to nothing? You get an adjusted 1 utilon at time T - 1s, and another adjusted 1 utilon at T - 0.5s, and another at T - 0.25s, etc? Suppose you can think as fast as you want, and open the box at arbitrary speed. Also, that whatever solution your present self precommits to will be followed by the future self. (Their decision won't be changed by any change in what times they care about)

EDIT: People in the comments have suggested using a utility function that is both bounded and discounting. If your utility function isn't so strongly discounting that it drops to 0 right after the present, then you can find some time interval very close to the present where the discounting is all nonzero. And if it's nonzero, you can have a box that disappears, taking all possible utility with it at the end of that interval, and that, leading up to that interval, grows the utility in intervals that shrink to nothing as you approach the end of the interval, and increasing the utility-worth of tokens in the box such that it compensates for whatever your discounting function is exactly enough to asymptotically approach your bound.

Here is my solution. You can't assume that your future self will make the optimal decision, or even a good decision. You have to treat your future self as a physical object that your choices affect, and take the probability distribution of what decisions your future self will make, and how much utility they will net you into account.

Think of yourself as a Turing machine. If you do not halt and open the box, you lose and get nothing. No matter how complicated your brain, you have a finite number of states. You want to be a busy beaver and take the most possible time to halt, but still halt.

If, at the end, you say to yourself "I just counted to the highest number I could, counting once per day, and then made a small mark on my skin, and repeated, and when my skin was full of marks, that I was constantly refreshing to make sure they didn't go away...

...but I could let it double one more time, for more utility!"

If you return to a state you have already been at, you know you are going to be waiting forever and lose and get nothing. So it is in your best interest to open the box.

So there is not a universal optimal solution to this problem, but there is an optimal solution for a finite mind.
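As a toy illustration of the finite-mind argument, here is a sketch of an agent whose entire state is a fixed-width counter. This is an assumption made for simplicity: a plain counter is far from a busy beaver, and the agent here has no tape or other input, so revisiting a state really does mean looping. `run_counter_agent` is a hypothetical name.

```python
from typing import Tuple


def run_counter_agent(memory_bits: int) -> Tuple[int, int]:
    """Simulate an agent whose whole state is a `memory_bits`-bit counter.

    Returns (days_waited, utilons_collected), assuming the box starts at
    1 utilon and doubles once per day waited.
    """
    seen_states = set()
    state = 0
    days_waited = 0
    while True:
        seen_states.add(state)
        next_state = (state + 1) % (2 ** memory_bits)
        if next_state in seen_states:
            # Waiting again would revisit an old state and loop forever,
            # so this is the last moment the agent can still open the box.
            return days_waited, 2 ** days_waited
        state = next_state
        days_waited += 1


for bits in (1, 2, 3, 8):
    days, utilons = run_counter_agent(bits)
    print(f"{bits} bits of memory: waits {days} days, collects {utilons} utilons")
```

More memory always buys a longer wait and a larger payout, but for any fixed amount of memory there is a last day on which opening the box is still possible.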

I remember reading a while ago about a paradox where you start with $1, and can trade that for a 50% chance of $2.01, which you can trade for a 25% chance of $4.03, which you can trade for a 12.5% chance of $8.07, etc (can't remember where I read it).

This is the same paradox with one of the traps for wannabe Captain Kirks (using dollars instead of utilons) removed and one of the unnecessary variables (uncertainty) cut out.

My solution also works on that. Every trade is analogous to a day waited to open the box.
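A sketch of the trade sequence, to show why each individual trade looks attractive. It assumes the pattern continues as "double the prize, add one cent, halve the chance", which matches the $1, $2.01, $4.03, $8.07 figures but is otherwise a guess, since the source of the paradox is not given. Amounts are kept in cents to avoid rounding.

```python
from fractions import Fraction
from typing import List, Tuple


def trade_ladder(trades: int) -> List[Tuple[int, Fraction, Fraction]]:
    """Return (prize_in_cents, win_probability, expected_cents) after 0..trades trades."""
    cents, win_probability = 100, Fraction(1)
    rows = [(cents, win_probability, cents * win_probability)]
    for _ in range(trades):
        cents = 2 * cents + 1   # assumed rule: double the prize, add one cent
        win_probability /= 2    # each trade halves the chance of being paid
        rows.append((cents, win_probability, cents * win_probability))
    return rows


for cents, probability, expected in trade_ladder(6):
    print(f"prize ${cents / 100:.2f} with chance {float(probability):.4f}, "
          f"expected value {float(expected):.4f} cents")
```

Under that assumed rule the expected value rises with every trade but never reaches 101 cents, while the chance of being paid anything falls toward zero.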