Wishful thinking - believing things that make you happy - may be a result of adapting an old cognitive mechanism to new content.

Obvious, well-known stuff

The world is a complicated place.  When we first arrive, we don't understand it at all; we can't even recognize objects or move our arms and legs reliably.  Gradually, we make sense of it by building categories of perceptions and objects and events and feelings that resemble each other.  Then, instead of processing every detail of a new situation, we just have to decide which category it's closest to, and what we do with things in that category.  Most, possibly all, categories can be built using unsupervised learning, just by noting statistical regularities and clustering.
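
(For concreteness, here is a minimal sketch of the sort of unsupervised clustering I mean, assuming nothing but a pile of unlabeled feature vectors; the k-means-style algorithm and the toy data are just an illustration, not a claim about how the brain does it.)

    # Illustrative only: cluster unlabeled "perceptions" (feature vectors) by
    # similarity, with no teacher saying what the categories are.
    import random

    def kmeans(points, k, iters=20):
        """Toy k-means: return k centroids found by repeated assign/average."""
        centroids = random.sample(points, k)
        for _ in range(iters):
            clusters = [[] for _ in range(k)]
            for p in points:
                dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
                clusters[dists.index(min(dists))].append(p)
            for i, cluster in enumerate(clusters):
                if cluster:  # keep the old centroid if a cluster goes empty
                    centroids[i] = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
        return centroids

    # Two fuzzy clusters of 2-D "perceptions"; the algorithm recovers them unlabeled.
    points = [(random.gauss(0, 0.5), random.gauss(0, 0.5)) for _ in range(50)] \
           + [(random.gauss(5, 0.5), random.gauss(5, 0.5)) for _ in range(50)]
    print(kmeans(points, k=2))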

If we want to be more than finite-state automata, we also need to learn how to notice which things and events might be useful or dangerous, and make predictions, and form plans.  There are logic-based ways of doing this, and there are also statistical methods.  There's good evidence that the human dopaminergic system uses one of these statistical methods, temporal difference learning (TD).  TD is a backchaining method: first it learns which state or action G(n-1) usually comes just before reaching a goal G(n), then which G(n-2) usually comes just before G(n-1), and so on.  Many other learning methods use backchaining, including backpropagation, bucket brigade, and spreading activation.  These learning methods need a label or signal, during or after some series of events, saying whether the result was good or bad.
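
(A minimal sketch of tabular TD(0) on a toy chain of states, where only the last state is rewarded, shows the backchaining character: the value estimate for the state just before the goal converges first, then the one before that, and so on. The environment and parameters are invented for illustration.)

    # Illustrative TD(0): states 0..4 form a chain, and only reaching state 4
    # (the goal) is rewarded. Value estimates propagate backward from the goal:
    # state 3 learns first, then state 2, and so on.
    ALPHA, GAMMA, N_STATES = 0.1, 0.9, 5
    value = [0.0] * N_STATES

    for episode in range(200):
        state = 0
        while state != N_STATES - 1:
            next_state = state + 1            # the only available move: step forward
            reward = 1.0 if next_state == N_STATES - 1 else 0.0
            # TD(0) update: nudge V(s) toward reward + discounted V(s')
            value[state] += ALPHA * (reward + GAMMA * value[next_state] - value[state])
            state = next_state

    print([round(v, 2) for v in value])       # values fall off with distance from the goal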

I don't know why we have consciousness, and I don't know what determines which kinds of learning require conscious attention.  For those that do, the signals produce some variety of pleasure or pain.  We learn to pay attention to things associated with pleasure or pain, and for planning, we may use TD to build something analogous to a Markov process (sorry, I found no good link; and Wikipedia's entry on Markov chain is not what you want) where, given a sequence of the previous n states or actions (A1, A2, ... An), the probability of taking action A is proportional to the expected (pleasure - pain) for the sequence (A1, ... An, A).  In short, we learn to do things that make us feel better.
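
(Here is a toy sketch of that selection rule, with an invented history window and an invented table of expected net values; since a probability can't be negative, the sketch clamps negative expectations to zero.)

    # Illustrative only: choose the next action with probability proportional to
    # the expected (pleasure - pain) of appending it to the recent history.
    # The history window, candidate actions, and value table are all invented.
    import random

    def choose_action(history, candidates, expected_net_value):
        """history: tuple of recent states/actions; expected_net_value maps (history, a) -> float."""
        # Probabilities can't be negative, so clamp negative expectations to zero.
        weights = [max(expected_net_value.get((history, a), 0.0), 0.0) for a in candidates]
        if sum(weights) == 0:
            return random.choice(candidates)  # nothing looks good: pick at random
        return random.choices(candidates, weights=weights, k=1)[0]

    expected_net_value = {(("wake", "smell_food"), "eat"): 0.8,
                          (("wake", "smell_food"), "sleep"): 0.1}
    print(choose_action(("wake", "smell_food"), ["eat", "sleep"], expected_net_value))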

Less-obvious stuff

Here's a key point which is overlooked (or specifically denied) by most AI architectures:  Believing is an action.  Building an inference chain is not just like constructing a plan; it's the same thing, probably done by the same algorithm.  Constructing a plan includes inferential steps, and inference often inserts action steps to make observations and reduce our uncertainty.

Actions, including the "believe" action, have preconditions.  When building a plan, you need to find actions that achieve those preconditions.  You don't need to look for things that defeat them.  With actions, this isn't much of a problem, because actions are pretty reliable.  If you put a rock in the fire, you don't need to weigh the evidence for and against the proposition that the rock is now in the fire.  If you put a stick in a termite mound, it may or may not come out covered in termites.  You don't need to compute the odds that the stick was inserted correctly, or the expected number of termites; you pull it out and look at the stick.  If you find something that causes it not to be covered in termites, such as using the wrong sort of stick, the cause is probably simple enough that you can add it to your preconditions for next time.
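
(A toy backward-chaining planner makes the one-sidedness explicit: to achieve a goal, find any operator whose effect matches it, then recursively achieve that operator's preconditions. No step in the loop looks for evidence against anything. The operators and facts below are invented for the example.)

    # Illustrative backward-chaining planner: to achieve a goal, find an operator
    # whose effect matches it, then recursively achieve that operator's
    # preconditions. Nothing here ever looks for evidence *against* a step.
    OPERATORS = {
        "rock_is_hot":  {"preconds": ["rock_in_fire"],           "action": "wait"},
        "rock_in_fire": {"preconds": ["have_rock", "have_fire"], "action": "put rock in fire"},
        "have_rock":    {"preconds": [],                         "action": "pick up rock"},
        "have_fire":    {"preconds": [],                         "action": "light fire"},
    }

    def plan_for(goal, known_facts, steps=None):
        """Return a list of actions achieving `goal`, or None if no operator applies."""
        steps = [] if steps is None else steps
        if goal in known_facts:
            return steps
        op = OPERATORS.get(goal)
        if op is None:
            return None
        for precond in op["preconds"]:
            if plan_for(precond, known_facts, steps) is None:
                return None
        steps.append(op["action"])
        return steps

    print(plan_for("rock_is_hot", known_facts=set()))
    # -> ['pick up rock', 'light fire', 'put rock in fire', 'wait']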

You don't need to consider all the ways that your actions could be thwarted until you start doing adversarial planning, which can't happen until you've already started incorporating belief actions into your planning.  (A tiger needs to consider which ways a wildebeest might run to avoid it, but probably doesn't need to model the wildebeest's beliefs and use min-max - at least, not to any significant depth.  Some mammals do some adversarial planning and modelling of belief states; I wouldn't be surprised if squirrels avoid burying their nuts when other squirrels are looking.  But the domains and actors are simpler, so the process shouldn't break down as often as it does in humans.)

When we evolved the ability to make extensive use of belief actions, we probably took our existing plan-construction mechanism, and added belief actions.  But an inference is a lot less certain than an action.  You're allowed to insert a "believe" act into your plan if you can find just one thing, belief or action, that plausibly satisfies its preconditions.  You're not required to spend any time looking for things that refute that belief.  Your mind doesn't know that beliefs are fundamentally different from actions: the truth-values of the propositions describing the expected effects of your possible actions are strongly and causally correlated with whether you execute those actions, while the truth-values of your possible belief-actions are not, and can be made true or false by many other factors.
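
(Reusing the toy planner above, a "believe" act is just another operator: its precondition is a single supporting consideration, and nothing in the search asks whether there is evidence against it. The propositions are invented.)

    # Same algorithm, but the operators are "believe" acts. Each belief's
    # precondition is a single supporting consideration; there is no step that
    # searches for refuting evidence.
    BELIEF_OPERATORS = {
        "believe(deal_is_profitable)": {
            "preconds": ["believe(partner_is_honest)"],   # one support suffices
            "action":   "adopt belief: the deal is profitable",
        },
        "believe(partner_is_honest)": {
            "preconds": ["partner_smiled_at_me"],         # flimsy, but it matches
            "action":   "adopt belief: my partner is honest",
        },
    }

    def believe_chain(goal, observations, steps=None):
        steps = [] if steps is None else steps
        if goal in observations:
            return steps
        op = BELIEF_OPERATORS.get(goal)
        if op is None:
            return None
        for precond in op["preconds"]:
            if believe_chain(precond, observations, steps) is None:
                return None                               # never: "what argues against this?"
        steps.append(op["action"])
        return steps

    print(believe_chain("believe(deal_is_profitable)",
                        observations={"partner_smiled_at_me"}))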

You can string a long series of actions together into a plan.  If an action fails, you'll usually notice, and you can stop and retry or replan.  Similarly, you can string a long series of belief actions together, even if the probability of each one is only a little above .5, and your planning algorithm won't complain, because stringing a long series of actions together has worked pretty well in your evolutionary past.  But you don't usually get immediate feedback after believing something that tells you whether believing "succeeded" (deposited something in your mind that successfully matches the real world); so it doesn't work.
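
(A quick back-of-the-envelope calculation shows how badly this scales for beliefs: if each link is independent and only slightly better than a coin flip, the chain as a whole is almost certainly wrong, yet a planner that only checks preconditions never notices. The numbers below are just for illustration.)

    # Probability that a chain of independent beliefs holds end-to-end, when each
    # individual belief is only a bit better than chance. Compare actions, whose
    # per-step reliability is typically far closer to 1.
    for p in (0.55, 0.7, 0.99):
        for n in (5, 10, 20):
            print(f"per-step p = {p}:  {n:2d}-step chain holds with p = {p ** n:.3f}")
    # At p = 0.55 a 10-step chain is right about 0.3% of the time;
    # at p = 0.99 it still holds about 90% of the time.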

The old way of backchaining, by just trying to satisfy preconditions, doesn't work well with our new mental content.  But we haven't evolved anything better yet.  If we had, chess would seem easy.

Summary

Wishful thinking is a state-space-reduction heuristic.  Your ancestors' minds searched for actions that would enable actions that would make them feel good.  Your mind, therefore, searches for beliefs that will enable beliefs that will make you feel good.  It doesn't search for beliefs that will refute them.

(A forward-chaining planner wouldn't suffer this bias.  It probably wouldn't get anything done, either, as its search space would be vast.)

Comments

The final goal of a plan is a belief, i.e. the belief that state X currently holds. In your representation, this might appear as "X", but semantically it's always "believe(state(X))".

If that means what I think it does, I disagree. If you employ enough sense of intentionality to call something a "goal", then a self-referencing intelligence can refer to the difference between X obtaining and it believing X obtains, and choose not to wirehead itself into a useless stupor. This is what JGWeissman was getting at in Maximise Expected Utility, not Expected Perception of Utility.

I stated it poorly. Guess I better rewrite it. In the meantime, see my reply to Yvain below.

... time passes ...

I didn't rewrite it. I deleted it. That whole paragraph about believe(state(X)) contributed nothing to the argument. And, as you noted, it was wrong.

With that paragraph deleted, it was difficult for me (just reading it now) to make the inference connecting your argument to wishful thinking. You might want to spell it out.

I don't think it's because I deleted that paragraph. I think it was just unclear. I rewrote the second half.

Much improved, and accordingly upvoted.

I read this article after you deleted that paragraph, but I had basically the same objection reading "between the lines" of the rest of what you said.

Obviously, any animal that did something like this all the time would die. It's possible that doing it to a limited degree might really happen. Is there a way to test your hypothesis?

What does the "something like this" in your sentence refer to?

Replacing a belief about what actually obtains, i.e. food, with a belief that the actions it is already taking (sitting in place) will obtain it food.

"When we evolved the ability to make extensive use of belief actions, we probably took our existing plan-construction mechanism, and added belief actions."

This is an intriguing and plausible idea. Do you have any proposed mechanism for how we could test this?

I don't know how to test it in humans. I developed a variant of SNePS with SNActor that used a single inference engine both to direct inference and to make plans, about 24 years ago. But it was poor at identifying the right actions and propositions to think about (and was running on a 66MHz CPU with maybe 8MB of RAM), so it did neither inference nor plan construction well, and I couldn't conclude anything from it. It was never worked into the main SNePS code branch, and the code is lost now, unless I have it on an old hard drive.

Definitely makes some sense.

But I didn't understand what you meant in the paragraph starting with "What stops us from just saying..." What does stop us from just saying this, and how come some desires successfully result in action and others result in wishful thinking? Can you predict when wishful thinking would be more likely to occur?

On a similar note, if "the final goal of a plan is a belief", would you expect me to be indifferent between saving the world and taking a pill that caused me to believe that the world was saved, or is that confusing levels?

The algorithm you use to build your plan won't let you believe a step in the plan is successful until you can satisfy its preconditions. The problem is that "satisfy its preconditions" can be done in a one-sided, non-Bayesian manner, which doesn't work as well for inference as for action.

Re. the pill - that's a good question. To avoid taking the pill, you'd need to have a representation that distinguishes between causing X and causing believes(X), from the viewpoint of an outside observer. What I said in the post needs to be revised or clarified to account for this.

Your goal is X, a truth in the external world. When constructing a plan, you operate in a belief space representing the external world's viewpoint. You predict a plan will be successful if, in simulating it, you find it leads to the assertion X within that simulated belief space; not if it leads to finding believes(you, X) there. believes(you, X) in that simulated belief space maps to X in your "root" belief space (which I'll call your mind); X in the simulated belief space maps to X being true in the external world.

Successfully executing that plan would result in finding X in your mind. To an external observer, the X in your mind means believes(you, X), not X. That's because, to that observer, your mind is a belief space, just like the belief space you use when simulating a plan.

To represent the pill-taking action this way:

A = action(eat(me, pill)), precondition(A, have(me, pill)), consequence(A, goal).

is not right, because that represents that you believe that eating the pill makes X true in the external world.

At first, it appears that it would also be wrong to represent it as

consequence(A, believes(me, goal))

because it appears that eating the pill would cause you to add believes(me, X) instead of X to your knowledge base, whereas you actually will add X.

However! Your inference engine is not the world. The representation in your mind, "consequence(A, believes(me, goal))", is not what the world actually uses to compute the results if you eat the pill. It's easy to forget this, because so often we write simulated worlds where we use one and the same rule set both for our agents to reason with, and also for the simulator to compute the next world state. So it's fine to use this representation.
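
(To make the distinction concrete, here is a toy sketch in which the agent's own rule for the pill only claims to produce a belief, while the world's transition function, which the agent never reasons with, is what actually plants X in the agent's head. The names and structure are invented for the example.)

    # Illustrative only: the agent's model of the pill vs. what the world does.
    agent_beliefs = set()

    # The agent's own rule for the pill claims only to produce a belief, so
    # simulating it never makes X true in the simulated external world.
    agent_model = {"eat_pill": {"adds_to_world": set(), "adds_to_my_beliefs": {"X"}}}

    def world_execute(action):
        """The world's transition function -- not the rule set the agent reasons with."""
        if action == "eat_pill":
            agent_beliefs.add("X")   # the world really does plant X in the agent's head

    def plan_achieves_goal(action, goal="X"):
        # Plan evaluation checks the simulated *external* world, not the simulated mind.
        return goal in agent_model[action]["adds_to_world"]

    print(plan_achieves_goal("eat_pill"))   # False: the pill plan is rejected
    world_execute("eat_pill")
    print(agent_beliefs)                    # {'X'}: executing it would plant the belief anyway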

"Supervised learning" means sensory inputs are presented and paired with indications of the desired associated motor outputs. Just using a reward signal is usually unsupervised: reinforcement learning.

http://en.wikipedia.org/wiki/Supervised_learning

The term "supervised learning" doesn't have to do just with things for which there are motor outputs. If you want to train a system to recognize numbers, and you provide it with 100,000 photographs of handwritten numbers, and each photo is labelled with the number it pictures, that's supervised learning.

The reward signal is like a label. You need an oracle that provides the proper reward signal. Therefore, supervised learning.
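
(For concreteness, the structural comparison looks something like this; whether the bare reward in the second case counts as a "label" is exactly the terminological question here. The data are made up.)

    # Illustrative data shapes only. Supervised learning consumes (input, correct output)
    # pairs; reinforcement learning consumes (state, action, reward) triples. Whether a
    # bare reward counts as a "label" is the terminological question in this thread.
    supervised_example    = {"input": "photo_of_digit_0042.png", "correct_output": 7}
    reinforcement_example = {"state": "s_17", "action": "a_3", "reward": -1.0}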

You should treat "motor outputs" as a synonym for "actuator signals" in the above comment if it is causing confusion.

Your definition of supervised learning doesn't seem to be the conventional one. Supervised learning is normally contrasted with reinforcement learning:

"Reinforcement learning differs from the supervised learning problem in that correct input/output pairs are never presented, nor sub-optimal actions explicitly corrected."

As I tried to explain in the post, a complete system that uses some function to generate its own reward signal is unsupervised. If you don't know how that reward signal is generated, and are just looking at the learning done with it, you're looking at a supervised system, which is part of a more-mysterious unsupervised system.

'Unsupervised' is sexier, and people are motivated to bend the term to cover whatever they're working on. But for the purposes of this post, it doesn't matter one bit which term you use.

This all sounds very strange to me. If there is a supervisor - but all they do is use a carrot and a stick - then I think that would generally be classified as reinforcement learning. Supervised learning is where the learner gets given the correct outputs - or is told the right answers.

http://en.wikipedia.org/wiki/Supervised_learning

http://en.wikipedia.org/wiki/Unsupervised_learning

http://en.wikipedia.org/wiki/Semi-supervised_learning

I'm saying that applying carrot/stick is equivalent to saying yes/no.

I deleted the whole paragraph about supervised/unsupervised, since it contributed nothing and was obviously a distraction.