Positive Bias: Look Into the Dark




Eliezer Yudkowsky

I am teaching a class, and I write upon the blackboard three numbers:  2-4-6.  "I am thinking of a rule," I say, "which governs sequences of three numbers.  The sequence 2-4-6, as it so happens, obeys this rule.  Each of you will find, on your desk, a pile of index cards.  Write down a sequence of three numbers on a card, and I'll mark it 'Yes' for fitting the rule, or 'No' for not fitting the rule.  Then you can write down another set of three numbers and ask whether it fits again, and so on.  When you're confident that you know the rule, write down the rule on a card.  You can test as many triplets as you like."

Here's the record of one student's guesses:

4, 6, 2       No
4, 6, 8       Yes
10, 12, 14    Yes

At this point the student wrote down his guess at the rule.  What do you think the rule is?  Would you have wanted to test another triplet, and if so, what would it be?  Take a moment to think before continuing.

The challenge above is based on a classic experiment due to Peter Wason, the 2-4-6 task.  Although subjects given this task typically expressed high confidence in their guesses, only 21% of the subjects successfully guessed the experimenter's real rule, and replications since then have continued to show success rates of around 20%.

The study was called "On the failure to eliminate hypotheses in a conceptual task" (Quarterly Journal of Experimental Psychology, 12: 129-140, 1960).  Subjects who attempt the 2-4-6 task usually try to generate positive examples, rather than negative examples—they apply the hypothetical rule to generate a representative instance, and see if it is labeled "Yes".

Thus, someone who forms the hypothesis "numbers increasing by two" will test the triplet 8-10-12, hear that it fits, and confidently announce the rule.  Someone who forms the hypothesis X-2X-3X will test the triplet 3-6-9, discover that it fits, and then announce that rule.

In every case the actual rule is the same: the three numbers must be in ascending order.

But to discover this, you would have to generate triplets that shouldn't fit, such as 20-23-26, and see if they are labeled "No".  Which people tend not to do, in this experiment.  In some cases, subjects devise, "test", and announce rules far more complicated than the actual answer.
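The failure mode can be made concrete in a few lines of code.  This is a toy sketch (mine, not from the experiment): the hidden rule and the "increasing by two" hypothesis are both encoded as predicates, and the point is that every positive test agrees with the hypothesis, while a single negative test exposes it.

```python
# Toy sketch of the 2-4-6 task (illustrative; names are my own).
# The experimenter's hidden rule: the three numbers are in ascending order.
def fits_rule(triplet):
    a, b, c = triplet
    return a < b < c

# A subject's hypothesis: "numbers increasing by two."
def hypothesis(triplet):
    a, b, c = triplet
    return b - a == 2 and c - b == 2

# Positive testing: generate triplets the hypothesis says SHOULD fit.
# Every one comes back "Yes", so the hypothesis looks confirmed.
positive_tests = [(8, 10, 12), (100, 102, 104)]
assert all(fits_rule(t) for t in positive_tests)

# Negative testing: generate a triplet the hypothesis says should NOT fit.
# (20, 23, 26) breaks "increasing by two", yet the experimenter still
# labels it "Yes" -- which is what reveals that the hypothesis is wrong.
negative_test = (20, 23, 26)
assert not hypothesis(negative_test)   # hypothesis predicts "No"
assert fits_rule(negative_test)        # experimenter answers "Yes"
```

No number of positive tests could have distinguished the two predicates; only a triplet on which they disagree carries any information.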

This cognitive phenomenon is usually lumped in with "confirmation bias".  However, it seems to me that the phenomenon of trying to test positive rather than negative examples ought to be distinguished from the phenomenon of trying to preserve the belief you started with.  "Positive bias" is sometimes used as a synonym for "confirmation bias", and fits this particular flaw much better.

It once seemed that phlogiston theory could explain a flame going out in an enclosed box (the air became saturated with phlogiston and no more could be released), but phlogiston theory could just as well have explained the flame not going out.  To notice this, you have to search for negative examples instead of positive examples, look into zero instead of one; which goes against the grain of what experiment has shown to be human instinct.

For by instinct, we human beings only live in half the world.

One may be lectured on positive bias for days, and yet overlook it in-the-moment.  Positive bias is not something we do as a matter of logic, or even as a matter of emotional attachment.  The 2-4-6 task is "cold", logical, not affectively "hot".  And yet the mistake is sub-verbal, on the level of imagery, of instinctive reactions.  Because the problem doesn't arise from following a deliberate rule that says "Only think about positive examples", it can't be solved just by knowing verbally that "We ought to think about both positive and negative examples."  Which example automatically pops into your head?  You have to learn, wordlessly, to zag instead of zig.  You have to learn to flinch toward the zero, instead of away from it.

I have been writing for quite some time now on the notion that the strength of a hypothesis is what it can't explain, not what it can—if you are equally good at explaining any outcome, you have zero knowledge.  So to spot an explanation that isn't helpful, it's not enough to think of what it does explain very well—you also have to search for results it couldn't explain, and this is the true strength of the theory.

So I said all this, and then yesterday, I challenged the usefulness of "emergence" as a concept.  One commenter cited superconductivity and ferromagnetism as examples of emergence.  I replied that non-superconductivity and non-ferromagnetism were also examples of emergence, which was the problem.  But far be it from me to criticize the commenter!  Despite having read extensively on "confirmation bias", I didn't spot the "gotcha" in the 2-4-6 task the first time I read about it.  It's a subverbal blink-reaction that has to be retrained.  I'm still working on it myself.

So much of a rationalist's skill is below the level of words.  It makes for challenging work in trying to convey the Art through blog posts.  People will agree with you, but then, in the next sentence, do something subdeliberative that goes in the opposite direction.  Not that I'm complaining!  A major reason I'm posting here is to observe what my words haven't conveyed.

Are you searching for positive examples of positive bias right now, or sparing a fraction of your search on what positive bias should lead you to not see?  Did you look toward light or darkness?