Today's post, Availability, was originally published on 06 September 2007. A summary (taken from the LW wiki):
Availability bias is a tendency to estimate the probability of an event based on whatever evidence about that event pops into your mind, without taking into account the ways in which some pieces of evidence are more memorable than others, or some pieces of evidence are easier to come by than others. This bias directly consists in considering a mismatched data set that leads to a distorted model, and biased estimate.
Discuss the post here (rather than in the comments to the original post).
This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Absurdity Heuristic, Absurdity Bias, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.
Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.
Lichtenstein et aliōrum research subjects were 1) college students and 2) members of a chapter of the League of Women Voters. Students thought that accidents are 1.62 times more likely than diseases, and league members thought they were 11.6 times more likely (geometric mean). Sadly, no standard deviation was given. The true value is 15.4. Note that only 57% and 79% of students and league members respectively got the direction right, which further biased the geometric average down.
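To see why wrong-direction answers drag a geometric mean down, here is a minimal sketch in Python. The response values are hypothetical and are not taken from the paper; each number stands for one respondent's judged ratio, where a value above 1 means the respondent picked the correct cause as more frequent and a value below 1 means they picked the wrong one. The function name geometric_mean is just an illustrative helper.

```python
import math

def geometric_mean(ratios):
    """Geometric mean, computed as exp of the average log-ratio."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical judged ratios (NOT data from Lichtenstein et al.).
right_direction = [10.0, 20.0, 5.0]  # ratio > 1: correct cause judged more frequent
wrong_direction = [0.5]              # ratio < 1: wrong cause judged more frequent

gm_right_only = geometric_mean(right_direction)
gm_all = geometric_mean(right_direction + wrong_direction)

print(f"right-direction answers only: {gm_right_only:.2f}")
print(f"all answers pooled:           {gm_all:.2f}")
```

Because a wrong-direction answer contributes a negative log-ratio, a single such respondent roughly halves the pooled estimate in this toy example, even though the other three respondents are unchanged.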
There were some messed up answers. For example, students thought that tornadoes killed more people than asthma, when in fact asthma kills 20x more people than tornadoes. All accidents are about as likely as stomach cancer (well, 1.19x more likely), but they were judged to be 29 times more likely. Pairs like these represent a minority, and subjects were generally only bad at guessing which cause of death was more frequent when the ratio was less than 2:1. These are the graphs from the paper.
The following excerpt is from Judged Frequency Of Lethal Events by Lichtenstein, Slovic, Fischhoff, Layman and Combs.
There were more instructions about relative likelihoods and scales. And there was a glossary to help the people understand some categories.
Note that there was nothing about “old age” anywhere. There is no such thing as “death by old age,” but I’ll risk generalizing from my own example to say that some people think there is. And even those who know there isn’t might think, despite the instructions, “Oh, darnit, I forgot that old people count, too.”
I wish I’d tested myself BEFORE reading the correct answer. As near as I could tell, I would’ve been correct about homicide vs. suicide, but wrong about diseases vs. accidents (“Old people count, too!” facepalm). I wouldn’t even bother guessing the relative frequency. I didn’t have a clue.
When I need to know the number of square feet in an acre, or the world population, it takes me seconds to get from the question to the answer. For this question, I dutifully spent ~20 minutes googling the CDC website. It wasn’t some heroic effort, but it’s not something I, or most other people, would casually expend on every question that starts with, “Huh, I wonder….” (we should, but we don’t).
As for what I found: I dare you, click on my link and see Table 9 (http://www.cdc.gov/NCHS/data/nvsr/nvsr58/nvsr58_19.pdf). Did you? If you did, you would’ve seen that Zubon2 was right in this comment. Accidents win by quite a margin in the 15-44 demographic. I couldn’t find 1978 data, but I’d expect it to be similar (Lichtenstein et al.’s tables are no help because they pool all age groups).
I spent the last two hours looking at these tables. Ask me anything! … I won’t be able to answer. Unless I have the CDC tables in front of me, I might not even do much better on Lichtenstein et aliōrum questionnaire than a typical subject (well, at least, I know tornadoes have frequency; measles doesn’t—I’ll get that question right). I suppose that people who haven’t looked at the CDC table are getting all of their information from fragmented reports like “Drive safely! Traffic accidents is the leading cause of death among teenagers who !” or “Buy our drug! is the leading cause of death in over 55!” or “5-star exhaust pipe crash safety rating!” Humans aren’t good at integrating these fragments.
Memory is a bad guide to probability estimates. But what’s the alternative? Should we carry tables around with us?
Personally, I hope that someday data that is already out there in the public domain will be made easily accessible. I hope that finding the relative frequencies of measles-related deaths and tornado-related deaths will be as quick as finding the number of square feet in an acre or the world population, and that political squabble will focus on whether or not certain data should be in the public domain (“You can’t force hospitals to put their data online! That violates the patients’ right to privacy!” “Well, but….”)