# The self-fooling problem.

1 min read · 10th Oct 2011 · 45 comments

I formulated a little problem. Care to solve it?

You are given the following information:

Your task is to hide a coin in your house (or any familiar finite environment).
After you've hidden the coin, your memory will be erased and restored to the state it was in just before you received this information.
Then you will be told about the task (i.e., that you have hidden a coin) and asked to try to find the coin.

If you find it you'll lose, but you will be convinced that if you find it you win.

So now you're faced with finding an optimal strategy to minimize the probability of finding the coin within a finite time-frame.
Bear in mind that any chain of reasoning leading up to a decision of location can be generated by you while trying to find the coin.

You might come to the conclusion that there can't exist an optimal strategy other than randomizing. But if you randomize, then you have the risk of placing the coin at a location where it can be easily found, like on a table or on the floor. You could eliminate those risky locations by excluding them as alternatives in your randomization process, but that would mean including a chain of reasoning!

But if you randomize, then you have the risk of placing the coin at a location where it can be easily found, like on a table or on the floor. You could eliminate those risky locations by excluding them as alternatives in your randomization process, but that would mean including a chain of reasoning!

That doesn't undo the gain from removing the obvious places. As an example, look at the probability-density graphics in http://thevirtuosi.blogspot.com/2011/10/linear-theory-of-battleship.html . Imagine you smooth them out to remove the 'obvious' places (initially high probability). Then you randomly pick, with that probability, each square. How does your other self undo this process?

(You also don't formulate it right. If the memory-wiped self is told that finding the coin wins, and also that a former self placed the coin for him to find, wouldn't he infer that his former self would be trying to help him win and so would begin his search at a Schelling point? He wouldn't even consider trying to beat an adversarial strategy like 'randomize' because he doesn't think there's an adversary!)

If the memory-wiped self is told that finding the coin wins, and also that a former self placed the coin for him to find, wouldn't he infer that his former self would be trying to help him win [...]?

The FINDER should hear the same story as the HIDER, with the only change being that "win = find a coin". The finder should also know that the hider received the false information that "win = not find a coin". The finder should also know that the hider was told that the finder will receive the false information that "win = find a coin", et cetera. (It seems like an infinite chain of information, but both hider and finder only need to receive a finite pattern describing it.)

Now we have a meta-problem: After such explanations, wouldn't both hider and finder suspect that they are being given false information? If yes, should they both expect a 50% probability of being lied to? Or is there some asymmetry, for example that rewarding "not find a coin" could make more sense to an outside spectator (who makes the rules) than rewarding "find a coin"? A perfectly rational spectator should prevent hider and finder from guessing which one of them is being lied to. But if the chance of both being right or wrong is exactly 50%, why should they even try?

All those are complications that needn't arise with a slightly different formulation. Just imagine that we're talking about someone who for fun decides to put such a challenge to their future self.

Anyway, we can postulate a person who hides something as best as they can, then they erase their own memory, then they decide to locate what they previously hid. Both hider version and searcher version try to do the best they can, because that's the maximum amount of fun for both of them. (The searcher will reintegrate the hider portion of their memories afterwards)

I recall something like this coming up a few times in fiction - someone erases their memory of something because they need not to know it for a time; then, later, must re-discover it.

Thank you.

Go to the bank. Obtain several thousand coins of a denomination equal to the one you are looking for. Mix the important coin in with the rest of the money (WITHOUT first making any effort to remember exact imperfections from this particular coin). Scatter piles of coins in various likely and unlikely places around your house.

They gave you a needle. Make the house a haystack.

In the words of Chesterton's Father Brown:

"Where does a wise man hide a leaf? In the forest. But what does he do if there is no forest?"

"Well, well," cried Flambeau irritably, "what does he do?"

"He grows a forest to hide it in."

Go to the bank. Obtain several thousand coins of a denomination equal to the one you are looking for.

This was the first approach I thought of. Make a haystack for the needle.

This will however allow you to find the haystack.

"Where is the coin? Right here, in the pile of coins."

Your coin finding self could beat your coin hiding self by scooping up all the coins,
bringing them before a judge and arbiter and saying "here it is."

You will likely have found the coin, even if you can not identify it.

If a judge can easily identify the coin, and you have found and presented the coin to the judge,
then it would be reasonable to say the coin is found.

Otherwise, the answer is lost to both the judge and the coin hider, much as if you ground it to a fine dust.

Save yourself the trouble and grind the coin into dust.

Destroy the answer. Make retrieval impossible.

(This was probably intended to be forbidden in the formulation of the problem, but it is not.)

Your coin finding self could beat your coin hiding self by scooping up all the coins, bringing them before a judge and arbiter and saying "here it is."

In that case, lock away the coin and hide the key among thousands of identical keys. Surely a wheelbarrow full of keys would not constitute a coin.

And if identifying the locked chamber (or container, or whatever) counts as finding the coin, then lock a bunch of them after hiding the coin in a randomly selected one.

Trivial solutions for the problem (the coin hider wins over the coin finder):

• Lose the needle, not in a haystack, but in a large stack of needles.
• Grind the coin into dust. Destroy its physical basis.

The first makes the coin lose its uniqueness; the second makes the coin physically irretrievable.

No reason to play by implied rules, rules that only a human would think to follow anyway.

You should randomize over all available hiding spots so that the probability of putting the coin in some spot is proportional to the effort needed to search that spot. This way future you won't have any preferred place to start. I think this strategy maximizes the expected effort spent by future you, assuming he knows your strategy and acts optimally.
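The "no preferred place to start" claim can be checked numerically on a toy house. The sketch below is only an illustration under assumptions that are not in the comment: four made-up spots with made-up effort units, a searcher who opens spots one at a time, and effort counted up to and including the spot holding the coin. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def expected_effort(costs, probs, order):
    """Expected effort until the coin is found when spots are searched in `order`."""
    spent = 0
    total = Fraction(0)
    for spot in order:
        spent += costs[spot]           # effort used once this spot has been searched
        total += probs[spot] * spent   # the coin is here with probability probs[spot]
    return total

def best_response_effort(costs, probs):
    """Smallest expected effort over every search order (brute force, small inputs only)."""
    spots = range(len(costs))
    return min(expected_effort(costs, probs, order) for order in permutations(spots))

# Assumed effort units for: table, floor, drawer, plumbing.
costs = [1, 1, 1, 10]
uniform = [Fraction(1, len(costs))] * len(costs)
proportional = [Fraction(c, sum(costs)) for c in costs]

uniform_effort = best_response_effort(costs, uniform)
proportional_effort = best_response_effort(costs, proportional)

# Collect the expected effort of every possible search order under the proportional rule.
distinct_efforts = {
    expected_effort(costs, proportional, order)
    for order in permutations(range(len(costs)))
}
print(uniform_effort, proportional_effort, len(distinct_efforts))
```

In this toy case a searcher who knows the strategy expects 19/4 units of effort against a uniform pick but 136/13 (about 10.5) against the effort-proportional pick, and all 24 search orders give the same expected effort under the proportional rule. That matches the indifference claim; it does not by itself show that the rule is the best possible one.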

You randomize it at the Nash equilibrium. The more obvious the place, the less probability you give to hiding it there.

The Nash equilibrium is based on the idea that your opponent will know your strategy, and fight specifically against it, and it will still be the best strategy. In this case, if you hide it using that probability distribution, no matter how you look for the coin later, you'll be exactly as likely to find it.
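One way to test a candidate hiding distribution against an opponent who knows it is to brute-force that opponent's best response. The sketch below does this for the original objective (probability of finding the coin within a fixed amount of time). The spots, search costs, budgets, and function name are all assumptions made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

def best_find_probability(costs, probs, budget):
    """Highest chance of finding the coin over all sets of spots searchable within `budget`."""
    spots = range(len(costs))
    best = Fraction(0)
    for size in range(len(costs) + 1):
        for subset in combinations(spots, size):
            if sum(costs[s] for s in subset) <= budget:
                found = sum((probs[s] for s in subset), Fraction(0))
                best = max(best, found)
    return best

# Assumed search costs for: table, floor, drawer, plumbing.
costs = [1, 1, 1, 10]
uniform = [Fraction(1, len(costs))] * len(costs)
proportional = [Fraction(c, sum(costs)) for c in costs]

for budget in (3, 10):
    print(budget,
          best_find_probability(costs, uniform, budget),
          best_find_probability(costs, proportional, budget))
```

With a budget of 3 the informed searcher finds a uniformly hidden coin with probability 3/4 but a cost-proportionally hidden one with probability only 3/13. With a budget of 10 the comparison flips (3/4 versus 10/13), because the searcher can now afford the one expensive spot. So which distribution equalizes the searcher's options depends on the objective and the time limit, and the cost-proportional rule should not be read as the equilibrium for every version of the game.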

Are you allowed to use someone else's brain? If so, you could ask them to hide it.

Easy - just build a seed AI whose only goal is to prevent your future self from finding the coin, and let it FOOM.

Surely nothing can go wrong.

In short: Solve the self-fooling problem by replacing it with a self-FOOMing problem.

It's only a one-bit change (if we keep to lowercase)...

Yep. Just a bit of change in how you hide a bit of change.

Yet another bit of change.

God, this made me do a stereotypical chuckle and head backtilt. You win.

If interpreted as it's probably intended, this is basically about maximizing the amount of information lost in the memory-wipe stage. A good strategy might be to choose some large volume that can't easily be scanned -- your lawn if you have a house, the cracks between the floorboards, et cetera -- and use some indexing system to pick random coordinates. Better yet, pick several such volumes and pick randomly between them. Add some random debris to throw off metal detection. Move furniture around to create red herrings. Hire a friend to rearrange your stuff for a day and let them do the hiding. If you live in an environment conducive to it and the reward's big enough, move to a different but internally identical space -- another apartment in the same building, say -- and match the old space's internal configuration as best you're able, with the target left in the old one. There are a lot of options, but most of the reasonably good ones already constitute a near-guarantee of success.

A more creative strategy might be to set up consequences for retrieval which exceed the reward you'd be getting: for example, to hide the coin in your compost pile and then mine it with explosives, with the mined region clearly marked and a deactivation timer set to go off after the deadline has expired. The threat can always be made credible because you have essentially perfect information about the personality making it.

[-][anonymous]10y 7

You also need some better time constraints. If you give me a full day (and a large enough incentive), I can find anything in my home, especially because the annoyingly difficult to reach places (like the inside of the stereo) can't be used to hide the coin without leaving obvious marks (disturbed dust, ...).

If you gave me this task, I'd probably cheat and put the coin into my pile of change.

Well, knowing that, if I was hiding a coin in your house, I'd make sure to disturb the dust and leave marks on the screws in as many places as possible. And I assume you would do the same if it was you hiding the coin.

[-][anonymous]10y 0

Sure, but that would be way too much work. It would take you almost as much time to hide the coin as it would take me to find it. That violates the virtue of laziness.

This reminds me of Berry's Paradox: the most arbitrary hiding place you can think of is by definition not very arbitrary.

Here's a deterministic solution that does at least as well as hiding the coin randomly (I think?). Take the expected amount of time t it would take to find the coin by random search. Write down all the deterministic coin-hiding algorithms you can think of on a piece of paper as fast as you can, starting with the most obvious. Continue until t time units have elapsed, and then use the last algorithm you thought of.

This does assume we're counting the time it takes your future self to compute your hiding algorithm towards the time it takes him/her to find the coin.

[-][anonymous]10y 3

If I extend the finite familiar environment to shops nearby my house, I thought of an interesting idea:

Generate a list of the third-party coin-counting machines nearest my house, like one of those Coinstar machines you see at a grocery store. Roll a die and select the corresponding element of the list. Deposit the coin along with other coins. If I have time, do this at multiple Coinstar machines.

The store manager probably doesn't have the keys. Even if I think of this exact idea, I still have to either break the law and smash the machine (and then it's still a needle-in-a-haystack problem, except the cops are hunting for me) or socially engineer one or more Coinstar employees (who aren't even in the grocery store, I have no talent at this, and then it's still a needle-in-a-haystack problem), and that's assuming I picked the RIGHT grocery store.

Even if I think "I would have put it into a randomly chosen third-party coin-counting machine," I won't have any easy way of confirming the theory, or even of being sure WHICH machine. If I spend hours and hours trying to check one, it could turn out the coin wasn't even at that grocery store. So it takes me less than an hour to generate this heuristic, but DAYS to solve, if it's even possible for me to do it at all.

I can still apply this reasoning to my house if my backyard is allowed.

Roll dice to determine where I actually dig a deep hole. Put the coin in that hole. Disturb my entire yard so that I can't tell the difference. I can dig a single hole and disturb my yard in a much shorter time than I can actually dig out my whole yard.

If I can't leave the house, randomly select a toilet, sink drain, or tub drain and flush the coin down with water. Disturb all plumbing covers. The coin is probably SOMEWHERE in my house's plumbing. (Plumbing pipes often have catches for that sort of thing.) I have no idea where. It takes a long time and a lot of effort to disassemble all of the plumbing in my house.

If I have even more time, do MULTIPLE ideas. If my entire backyard has been disturbed and all of my plumbing covers removed, I don't know whether I flushed it or buried it.

If the coin doesn't have to be intact when hidden, but I do have to return all pieces, split the coin into multiple pieces and do multiple ideas.

[-][anonymous]10y 2

But if you randomize, then you have the risk of placing the coin at a location where it can be easily found, like on a table or on the floor.

That's not really a problem--you could just randomize over the set of not-obviously-visible locations. You could even rearrange the furniture and objects in your house such that the rearranged version has more hiding spots than the current arrangement of furniture.

Then you will be told about the task (i.e that you have hidden a coin), and asked to try to find the coin. If you find it you'll lose, but you will be convinced that if you find it you win.

How does this work? If I know that there's evidence that will convince my future self that finding the coin will make me win, then I update on this, and don't believe you when you say that I'll lose if I find the coin. Vice-versa after the memory wipe.

Tell the person in both states that if he finds the coin, an arbitrary dog is going to die and the subject will receive \$100. Then, just before the subject starts to hide the coin, show them a cute puppy. The subject will try to hide the coin very well, and then later, without the memory of the cute puppy, will try to find the coin. The incentives should work out; adjust the animal (child?) and dollar amount to suit the subject.

I've continued asking people about this problem offline, and I'm finding that they are not very good at assuming the least convenient possible world, but they are very good at spotting loopholes in the problem as formulated. Another classic:

"I would put the coin in a sealed envelope. On the envelope I would record the events prior to my memory being wiped. Then I would leave it in plain sight on my nightstand."

Build a box that is so painful to open as to be impossible for any human. Place the coin in that box. Put the box anywhere you please.

Fan of Saw, are we?

Anyway, turn the problem around: instead of saying you win if you find the coin, rather say you meet the high-handed enemy if you don't find the coin.

Alternatively: build a box too painful for a human to open. Leave it on the kitchen table. Hide the coin somewhere else. Laugh at the misfortunes of your future self and/or worst enemy.

Hide the coin somewhere else

Hmmm... But don't the rules (especially as clarified here) specify that your future self will be able to replicate any chain of reasoning you use? I doubt a trick like this would work, for the same reason any ordinary hiding place wouldn't work: Your future self would retrace your cognitive steps.

Never seen it actually...

Warning: not for the faint of heart, overly empathetic, pregnant, those who may become pregnant, or anyone with phobias, anxiety disorders or stress disorders of any kind.

That said, I thought it was pretty good. (The first one, anyway- the sequels devolved into gorn.) In case it hasn't become clear from context, the connection is that the movies are famous for portraying very similar situations to what you describe - the key that unlocks the bomb collar is hidden somewhere in a pit filled with hypodermic needles, for instance.

I would hide it in a place that is not too hard to find but really hard to reach so that my future self will stop and think about why I put it there. I (in the future) will then hopefully realize that if the task really was as told then I would have "hidden" the coin at the easiest possible place. Therefore, something about the task is wrong. As the coin is hard to reach, my past self probably doesn't want me to find it. This doesn't make sense unless one of us (past or future self) has been lied to. Having the same mental faculties as I have now, I trust that my future self will make the right decision, which may depend on the specific circumstances.

This approach is quite risky as it assumes that my future self will in fact follow the same line of reasoning I just did, but then again that is kind of the point of this task.

If it's allowed I could of course also just leave a note explicitly saying the above.

You could put it in a safe with a randomized combination.

There are places where the coin could be easily put but from where it's very hard to retrieve it. Flush it down the toilet, throw it into a narrow ventilation shaft... Such options would be better excluded, to leave us with the essence of the problem.

I posed the problem to people at work today, and someone suggested swallowing it, because it takes more than 24 hours to pass through the human digestive system.

[-][anonymous]10y 0

You could eliminate those risky locations by excluding them as alternatives in your randomization process, but that would mean including a chain of reasoning!

Do I know that while I was trying to hide the coin I thought I would lose if I found it later?

Edit: Gwern said it first.

(You also don't formulate it right. If the memory-wiped self is told that finding the coin wins, and also that a former self placed the coin for him to find, wouldn't he infer that his former self would be trying to help him win and so would begin his search at a Schelling point? He wouldn't even consider trying to beat an adversarial strategy like 'randomize' because he doesn't think there's an adversary!)

I think I would write down a list of places where the coin, if I had hidden it there, would be hard to find even if I knew where to start looking (my house has a fair number of these), assign numbers to each, and then roll a die and hide it in the one with the corresponding number.

I'd probably assign individual numbers to options like "hidden amongst the belongings in any drawer in the house which I do not use regularly," and determine which one to use by number assignment and random number generator if I rolled that option.
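The numbered list with nested options amounts to a two-stage random draw. A minimal sketch follows, with an entirely made-up list of places and a hypothetical helper name. It uses `random.SystemRandom`, which draws from operating-system entropy, as a stand-in for the physical die so that the pick cannot be replayed from a known seed.

```python
import random

def pick_hiding_spot(options, rng):
    """Pick one top-level option uniformly; if it is a group, pick uniformly inside it."""
    choice = rng.choice(list(options))
    members = options[choice]
    if members:
        return f"{choice}: {rng.choice(members)}"
    return choice

# Hypothetical list; an empty list marks a single place, a non-empty one marks a group.
options = {
    "behind the bookshelf": [],
    "inside the piano": [],
    "rarely used drawer": ["hall cabinet", "guest room desk", "garage workbench"],
}

rng = random.SystemRandom()  # OS entropy rather than a reproducible seed
spot = pick_hiding_spot(options, rng)
print(spot)
```

Note that a uniform draw over the top-level list gives each individual drawer a smaller share than each standalone place, which is what assigning a single number to the whole "drawer" option implies.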

As some comments have pointed out, there are some loopholes in the original formulation, and I will do my best to close these or accept the fact that they're not closeable (which would be interesting in its own right).

Let's try a simpler formulation. Basically, what is being asked here is this: given two intelligences A and B, where A and B are identical (perfect copies), can A have a strategy that minimizes the probability of B finding the coin?

Further: Any chain of reasoning leading to a constrained set of available locations followed by randomization could be used by B to constrain the set of locations to search. Is it therefore possible to beat complete randomization?

Further: Any chain of reasoning leading to a constrained set of available locations followed by randomization could be used by B to constrain the set of locations to search. Is it therefore possible to beat complete randomization?

Yes. You need to weight locations according to the time it takes to search them and then make a random selection from that weighted set; that'll give you longer search times on average than an unweighted random pick from a large set where most of the elements take a trivially small time to search. I could take a stab at proving that mathematically, if you're comfortable with some abstraction.
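A sketch of such an argument, under modelling assumptions that are not spelled out in the thread: spots are indexed $i = 1, \dots, n$, searching spot $i$ costs $c_i > 0$, the searcher opens spots one at a time in an order $\sigma$ of their choosing, and effort is counted up to and including the spot that holds the coin. Write $C = \sum_i c_i$ and hide the coin in spot $i$ with probability $p_i = c_i / C$. The expected effort is then

$$
E_\sigma \;=\; \sum_{k=1}^{n} p_{\sigma(k)} \sum_{j \le k} c_{\sigma(j)}
\;=\; \frac{1}{C} \sum_{j \le k} c_{\sigma(j)}\, c_{\sigma(k)}
\;=\; \frac{C^2 + \sum_{i} c_i^2}{2C},
$$

which does not depend on $\sigma$, so the searcher gains nothing from choosing one order over another. Against an unweighted pick ($p_i = 1/n$) the searcher's best order is cheapest first. With the costs sorted in ascending order and prefix sums $S_k = \sum_{j \le k} c_j$, both $c_k$ and $S_k$ are nondecreasing, so Chebyshev's sum inequality gives

$$
\frac{1}{C} \sum_{k=1}^{n} c_k S_k \;\ge\; \frac{1}{n} \sum_{k=1}^{n} S_k ,
$$

where the left side is the expected effort under the weighted pick and the right side is the expected effort under the unweighted one. This covers only the searcher's indifference and the comparison with a uniform pick; it is not a proof that the weighted pick is the best possible hiding rule.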

You can beat even that by cleverly exploiting features of the setup, as I and muflax did in our responses to the OP, but that's admittedly not quite in keeping with the spirit of the problem.

I see your point. A reduction of easily searched places will indeed make it more difficult for B to find the coin, even though B will have a smaller space to search. The question that remains is: given a mathematical description of the search/hide space, what probability distribution over locations (randomization process) will minimize the probability of B finding the coin?

And a way to reconcile this with the intuition that the future you will be able to use any chains of reasoning you use now is to recognize that while you and your future self may both use a chain of reasoning that partitions your house into two regions, you will choose the larger region while your future self will search the smaller region first.