I Was Not Almost Wrong But I Was Almost Right: Close-Call Counterfactuals and Bias
Abstract: "Close-call counterfactuals", claims of what could have almost happened but didn't, can be used to either defend a belief or to attack it. People have a tendency to reject counterfactuals as improbable when those counterfactuals threaten a belief (the "I was not almost wrong" defense), but to embrace counterfactuals that support a belief (the "I was almost right" defense). This behavior is the strongest in people who score high on a test for need for closure and simplicity. Exploring counterfactual worlds can be used to reduce overconfidence, but it can also lead to logically incoherent answers, especially in people who score low on a test for need for closure and simplicity.
”I was not almost wrong”
Dr. Zany, the Nefarious Scientist, has a theory which he intends to use to achieve his goal of world domination. ”As you know, I have long been a student of human nature”, he tells his assistant, AS-01. (Dr. Zany has always wanted to have an intelligent robot as his assistant. Unfortunately, for some reason all the robots he has built have only been interested in eradicating the color blue from the universe. And blue is his favorite color. So for now, he has resorted to just hiring a human assistant and referring to her with a robot-like name.)
”During my studies, I have discovered the following. Whenever my archnemesis, Captain Anvil, shows up at a scene, the media will very quickly show up to make a report about it, and they prefer to send the report live. While this is going on, the whole city – including the police forces! – will be captivated by the report about Captain Anvil, and neglect to pay attention to anything else. This happened once, and a bank was robbed on the other side of the city while nobody was paying any attention. Thus, I know how to commit the perfect crime – I simply need to create a diversion that attracts Captain Anvil, and then nobody will notice me. History tells us that this is the inevitable outcome of Captain Anvil showing up!”
But to Dr. Zany's annoyance, AS-01 is always doubting him. Dr. Zany has often considered turning her into a brain-in-a-vat as punishment, but she makes the best tuna sandwiches Dr. Zany has ever tasted. He's forced to tolerate her impudence, or he'll lose that culinary pleasure.
”But Dr. Zany”, AS-01 says. ”Suppose that some TV reporter had happened to be on her way to where Captain Anvil was, and on her route she saw the bank robbery. Then part of the media attention would have been diverted, and the police would have heard about the robbery. That might happen to you, too!”
Dr. Zany's favorite belief is now being threatened. It might not be inevitable that Captain Anvil showing up will actually let criminals elsewhere act unhindered! AS-01 has presented a plausible-sounding counterfactual, ”if a TV reporter had seen the robbery, then the city's attention would have been diverted to the other crime scene”. Although the historical record does not show that Dr. Zany's theory would have been wrong, the counterfactual suggests that he might be almost wrong.
There are now three tactics that Dr. Zany can use to defend his belief (warrantedly or not):
1. Challenge the mutability of the antecedent. Since AS-01's counterfactual is of the form ”if A, then B”, Dr. Zany could question the plausibility of A.
”Baloney!” exclaims Dr. Zany. ”No TV reporter could ever have wandered past, let alone seen the robbery!”
That seems a little hard to believe, however.
2. Challenge the causal principles linking the antecedent to the consequent. Dr. Zany is not logically required to accept the ”then” in ”if A, then B”. There are always unstated background assumptions that he can question.
”Humbug!” shouts Dr. Zany. ”Yes, a reporter could have seen the robbery and alerted the media, but given the choice between covering such a minor incident and continuing to report on Captain Anvil, they would not have cared about the bank robbery!”
3. Concede the counterfactual, but insist that it does not matter for the overall theory.
”Inconceivable!” yelps Dr. Zany. ”Even if the city's attention would have been diverted to the robbery, the robbers would have escaped by then! So Captain Anvil's presence would have allowed them to succeed regardless!”
Empirical work suggests that it's not only Dr. Zany who wants to stick to his beliefs. Let us for a moment turn our attention away from supervillains, and look at professional historians and analysts of world politics. In order to make sense of something as complicated as world history, experts resort to various simplifying strategies. For instance, one explanatory schema is called neorealist balancing. Neorealist balancing claims that ”when one state threatens to become too powerful, other states coalesce against it, thereby preserving the balance of power”. Among other things, it implies that Hitler's failure was predetermined by a fundamental law of world politics.
Tetlock (1998, 1999, 2001) surveyed a number of experts on history and international affairs. He surveyed the experts on their commitment to such theories, and then posed them counterfactuals that conflicted with some of those theories. For instance, counterfactuals that conflicted with neorealist balancing were "If Goering had continued to concentrate Luftwaffe attacks on British airbases and radar stations, Germany would have won the Battle of Britain" and "If the German military had played more effectively on the widespread resentment of local populations toward the Stalinist regime, the Soviet Union would have collapsed". The experts were then asked to indicate the extent to which they agreed with the antecedent, the causal link, and the claim that the counterfactual being true would have substantially changed world history.
As might have been expected, experts who subscribed to a certain theory were skeptical about counterfactuals threatening the theory, and employed all three defenses more than experts who were less committed. Denying the possibility of the antecedent was done the least frequently, while questioning the overall impact of the consequence was the most common defense.
By itself, this might not be a sign of bias – the experts might have been skeptical of a counterfactual because they had an irrational commitment to the theory, but they might also have acquired a rational commitment to the theory because they were skeptical of counterfactuals challenging it. Maybe neorealist balancing is true, and the experts subscribing to it are right to defend it. What's more telling is that Tetlock also measured each expert's need for closure. It turned out that if an expert – like Dr. Zany – had a high need for closure, then they were also more likely to employ defenses questioning the validity of a counterfactual.
Theoretically, high need-for-closure individuals are characterized by two tendencies: urgency which inclines them to 'seize' quickly on readily available explanations and to dismiss alternatives and permanence which inclines them to 'freeze' on these explanations and persist with them even in the face of formidable counterevidence. In the current context, high need-for-closure individuals were hypothesized to prefer simple explanations that portray the past as inevitable, to defend these explanations tenaciously when confronted by dissonant close-call counterfactuals that imply events could have unfolded otherwise, to express confidence in conditional forecasts that extend these explanations into the future, and to defend disconfirmed forecasts from refutation by invoking second-order counterfactuals that imply that the predicted events almost happened. (Tetlock, 1998)
If two people draw different conclusions from the same information, then at least one of them is wrong. Tetlock is careful to note that the data doesn't reveal whether it's the people with a high or a low need for closure who are closer to the truth, but we can probably presume that at least some of them were being exceedingly defensive.
This gives us reason to be worried. If some past occurrence seems to fit perfectly into our pet theory, have we considered the case that we might be almost wrong? And if we have, are we exhibiting an excess need for closure by rushing to its defense, or are we being excessively flexible by unnecessarily admitting that something might have gone differently? We should only admit to being almost wrong if we really were almost wrong, after all. Is the cognitive style we happen to have the one that's the most correlated with getting the right answers?
”I was almost right.”
Having defended his theory against AS-01's criticism, Dr. Zany puts the theory into use by starting a fire in a tar factory, diverting Captain Anvil. While the media is preoccupied with reporting the story, Dr. Zany tries to steal the bridge connecting Example City to the continent. Unfortunately, a City Police patrol boat happens to see this, alerting the police forces (as well as Captain Anvil) to the site. Dr. Zany is forced to withdraw.
”Damn that unanticipated patrol boat!”, Dr. Zany swears. ”If only it had not appeared, my plan would have worked perfectly!” AS-01 wisely says nothing, and avoids being turned into a brain-in-a-vat.
Tetlock (1998, 1999) surveyed a number of experts and asked them to make predictions about world politics. Afterwards, when it was clear whether or not the predictions had turned out to be true, he surveyed them again. It turned out that like Dr. Zany, most of the mistaken experts had not seriously updated their beliefs:
Not surprisingly, experts who got it right credited their accuracy to their sound reading of the 'basic forces' at play in the situation. Across issue domains they assigned average ratings between 6.5 and 7.6 on a 9-point scale where 9 indicates maximum confidence. Perhaps more surprisingly, experts who got it wrong were almost as likely to believe that their reading of the political situation was fundamentally sound. They assigned average ratings from 6.3 to 7.1, across domain (Tetlock, 1998)
Many of the experts defended their reading of the situation by saying that they were ”almost right”. For instance, experts who predicted in 1988 that the Communist Party of the Soviet Union would grow increasingly authoritarian during the next five years were prone to claiming that the hardliner coup of 1991 had almost succeeded, and if that had happened, their prediction would have become true. Similarly, observers of South Africa who in 1988-1989 expected white minority rule to continue or to become increasingly oppressive were likely to believe that were it not for two exceptional individuals – de Klerk and Mandela – in key leadership roles, South Africa could easily have gone the other way.
In total, Tetlock (1999) identified five logically defensible strategies for defending one's forecasts, all of which were employed by at least some of the experts. Again, it was the experts who scored the highest on a need for closure who tended to employ such defenses the most:
- The antecedent (the A in the ”if A, then B”) was never adequately satisfied. Experts might insist ”if we had properly implemented deterrence or reassurance, we could have averted war” or ”if real shock therapy had been practiced, we could have averted the nasty bout of hyperinflation”.
- Although the specified antecedent was satisfied, something unexpected happened, severing the normal link of cause and effect. Experts might declare that rapid privatization in state industries would have led to the predicted surge in economic growth, but only if the government had pursued prudent monetary policies.
- Although the predicted outcome did not occur, it ”almost occurred” and would have, if not for some inherently unpredictable outside shock.
- Although the predicted outcome has not yet occurred, it eventually will and we just need to be more patient (hardline communists may yet prevail in Moscow, the EU might still fall apart).
- Although the relevant conditions were satisfied and the predicted outcome never came close to occurring and never will, this should not be held against the framework that inspired the forecast. Forecasts are inherently unreliable and politics is hard to predict: just because the framework failed once doesn't mean that it's wrong.
Again, Tetlock is careful to note that although it's tempting to dismiss all such maneuvering as ”transparently defensive post hocery”, it would be wrong to automatically interpret it as bias. Each defense is a potentially valid objection, and might have been the right one to make, in some cases.
But there are also signs of bias. Tetlock (1999) makes a number of observations from his data, noting – among other things – that the stronger the original confidence in a claim, the more likely an expert is to employ various defenses. That would suggest that big threats to an expert's claims of expertise activate many defenses. He also notes that the experts who'd made failed predictions and employed strong defenses tended not to update their confidence, while the experts who'd made failed predictions but didn't employ strong defenses did update.
Again, some of the experts were probably right to defend themselves, but some of them were probably biased and only trying to protect their reputations. We should ourselves be alert when we catch ourselves using one of those techniques to defend our predictions.
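The link between defenses and updating can be made concrete with a small Bayesian sketch. This is only an illustration, not something taken from Tetlock's papers: the function name and all the numbers are hypothetical. The idea is that an ”I was almost right” defense amounts to claiming that a failed forecast was fairly likely even if one's theory is sound, and the more strongly one claims that, the less a failure should lower one's confidence.

```python
def posterior_after_failure(prior: float,
                            p_fail_if_sound: float,
                            p_fail_if_unsound: float) -> float:
    """P(theory is sound | forecast failed), computed with Bayes' rule."""
    numerator = prior * p_fail_if_sound
    evidence = numerator + (1 - prior) * p_fail_if_unsound
    return numerator / evidence


# Hypothetical numbers, chosen only for illustration.
prior = 0.8  # confidence in the theory before the forecast failed

# Without a defense: a sound theory would rarely have produced this failure.
no_defense = posterior_after_failure(prior, p_fail_if_sound=0.2,
                                     p_fail_if_unsound=0.7)

# With an "almost right" defense: failure is claimed to be likely
# even under a sound theory (an unpredictable outside shock).
with_defense = posterior_after_failure(prior, p_fail_if_sound=0.6,
                                       p_fail_if_unsound=0.7)

print(f"no defense:   {no_defense:.2f}")
print(f"with defense: {with_defense:.2f}")
```

Whether the defense is justified depends entirely on whether the higher failure probability is honest. The arithmetic cannot tell us that; it only shows why a strong defense and a small update go together.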
Exploring counterfactual worlds: a possible debiasing technique.
”Although my plan failed this time, I was almost right! The next time, I'll be prepared for any patrol boats!”, Dr. Zany mutters to himself, back in the safety of his laboratory.
”Yes, it was an unlikely coincidence indeed”, AS-01 agrees. ”Say, I know that such coincidences are terribly unlikely, but I started wondering – what other coincidence might have caused your plan to fail? Are there any others that we should take into account before the next try?”
”Hmm....”, Dr. Zany responds, thoughtfully.
Tetlock & Lebow (2001) found that experts became less convinced of the inevitability of a scenario when they were explicitly instructed to consider various events that might have led to a different outcome. In two studies, experts were told to consider the Cuban Missile Crisis and, for each day of the crisis, estimate the subjective probability that the crisis would end either peacefully or violently. When experts were told to consider various provided counterfactuals suggesting a different outcome, they thought that a violent outcome remained a possibility for longer than the experts who weren't given such counterfactuals to consider. The same happened when the experts weren't given ready-made counterfactuals, but were told to generate alternative scenarios of their own, at an increasingly fine resolution.
The other group (n = 34) was asked to consider (1) how the set of more violent endings of the Cuban missile crisis could be disaggregated into subsets in which violence remained localized or spread outside the Caribbean, (2) in turn differentiated into subsets in which violence claimed fewer or more than 100 casualties, and (3) for the higher casualty scenario, still more differentiated into a conflict either limited to conventional weaponry or extending to nuclear. (Tetlock & Lebow, 2001)
Again, the experts who generated counterfactual scenarios became less confident of their predictions. The experts with a low need for closure adjusted their opinions considerably more than the ones with a high need for closure.
However, this technique has its dangers as well. More fine-grained scenarios offer an opportunity to tell more detailed stories, and humans give disproportionate weight to detailed stories. Unpacking the various scenarios leads us to give too much weight to the individual subscenarios. You might remember the example of ”the USA and Soviet Union suspending relations” being considered less probable than ”the Soviet Union invades Poland, and the USA and Soviet Union suspend relations”, even though the second scenario is a subset of the first. People with a low need for closure seem to be especially susceptible to this, while people with a high need for closure tend to produce more logically coherent answers. This might be considered an advantage of the high need for closure – an unwillingness to engage in extended wild goose chases, and thus to assign minor scenarios a disproportionately high probability.
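The two kinds of incoherence mentioned here can be checked mechanically. The following is a minimal sketch with hypothetical helper names and made-up probability judgments (none of the numbers come from the studies). It relies on two basic rules: a more specific scenario cannot be more probable than the scenario that contains it, and mutually exclusive subscenarios cannot add up to more than the whole they were unpacked from.

```python
def conjunction_violation(p_whole: float, p_part: float) -> bool:
    """True if a more specific scenario is rated above the scenario containing it."""
    return p_part > p_whole


def unpacking_excess(p_whole: float, p_subscenarios: list[float]) -> float:
    """How far mutually exclusive subscenarios overshoot the whole (0.0 if coherent)."""
    return max(0.0, sum(p_subscenarios) - p_whole)


# Hypothetical judgments for the suspended-relations example.
p_suspend = 0.05             # "the USA and Soviet Union suspend relations"
p_invade_and_suspend = 0.10  # "...invades Poland, and relations are suspended"
print(conjunction_violation(p_suspend, p_invade_and_suspend))

# Hypothetical judgments for an unpacked "violent ending" scenario.
p_violent = 0.20
p_unpacked = [0.15, 0.15]  # violence stays localized / spreads further
print(f"{unpacking_excess(p_violent, p_unpacked):.2f}")
```

A judge whose answers trigger either check has not necessarily misjudged the overall scenario, but at least one of the numbers has to change before the set of answers can all be true at once.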
Tetlock, P.E. (1998) Close-Call Counterfactuals and Belief-System Defenses: I Was Not Almost Wrong But I Was Almost Right. Journal of Personality and Social Psychology, Vol. 75, No. 3, 639-652. http://faculty.haas.berkeley.edu/tetlock/Vita/Philip%20Tetlock/Phil%20Tetlock/1994-1998/1998%20Close-Call%20Counterfactuals%20and%20Belief-System%20Defenses.pdf
Tetlock, P.E. (1999) Theory-Driven Reasoning About Plausible Pasts and Probable Futures in World Politics: Are We Prisoners of Our Preconceptions? American Journal of Political Science, Vol. 43, No. 2, 335-366. http://www.uky.edu/AS/PoliSci/Peffley/pdf/Tetlock%201999%20AJPS%20Theory-driven%20World%20Politics.pdf
Tetlock, P.E. & Lebow, R.N. (2001) Poking Counterfactual Holes in Covering Laws: Cognitive Styles and Historical Reasoning. American Political Science Review, Vol. 95, No. 4. http://faculty.haas.berkeley.edu/tetlock/vita/philip%20tetlock/phil%20tetlock/1999-2000/2001%20poking%20counterfactual%20holes%20in%20covering%20laws....pdf