May 28, 2011
Strategic Reliabilism is an epistemological framework that, unlike other contemporary academic theories, is grounded in psychology and seeks to give genuine advice on how to form beliefs. The framework was first laid out by Michael Bishop and J.D. Trout in their book Epistemology and the Psychology of Human Judgment. Although regular readers here won’t necessarily find a lot of new material, Bishop and Trout provide a clear description of many of the working assumptions and goals of this community. In contrast to standard epistemology, which seeks to explain what constitutes a justified belief, Strategic Reliabilism is meant to explain excellent reasoning. In particular, reasoning is excellent to the extent it reliably and efficiently produces truths about significant matters. When combined with the Aristotelian principle that good reasoning tends to produce good outcomes in the long run (i.e. rationalists should win), empirical findings about good reasoning gain prescriptive power. Rather than getting bogged down in definitional debates, epistemology really is about being less wrong.
The book is an easily read 150 pages, and I highly recommend you find a copy, but a chapter-by-chapter summary is below. As I said, you might not find a lot of new ideas in this book, but it went a long way in clarifying how I think about this topic. For instance, even though it can seem trivial to be told to focus on significant problems, these basic issues deserve a little extra thought.
If you enjoy podcasts, check out lukeprog’s interview with Michael Bishop. This article provides another overview of Strategic Reliabilism, addressing objections raised since the publication of the book.
Epistemology as a discipline needs to start offering practical advice. Defective epistemologies can compromise one’s ability to act in all areas, but there is little social condemnation of weak reasoning. Prescriptive epistemology might be called “critical thinking”, but this field is divorced from contemporary epistemology. This book is driven by a vision of what epistemology could be, although, of course, it is only a modest first step in that direction.
Standard Analytic Epistemology (SAE) is primarily concerned with an account of knowledge and epistemic justification. This program assumes any account of justification must not radically alter our existing judgments, though this commitment to stasis is often not explicitly stated. SAE tends to proceed unproductively via tests against intuition and might provide some useful advice, but its goals and methods are beyond repair.
In the authors’ view, epistemology is a branch of philosophy of science and should start from Ameliorative Psychology, which encompasses parts of cognitive science, the heuristics and biases program, statistics, and artificial intelligence. This field makes recommendations about how to reason based on empirical findings. Rather than being concerned with an account of knowledge or warrant, the authors’ approach is to provide an account of reasoning excellence.
A healthy epistemological tradition will have theoretical, practical, and social components. Theory and practice should mutually inform one another. Communication of practical results to the wider public would be part of its social role. Since the authors argue epistemology is a science, but nevertheless normative, one might worry the approach is circular. This concern assumes the normative must come all at once, though. Instead, one can rely on the Aristotelian Principle as an empirical hook to accumulate justification. The Aristotelian Principle says that in the long run, good reasoning tends to lead to better outcomes than poor reasoning. Why accept the Principle? Without it, epistemology wouldn’t be useful. If bad reasoning leads to better outcomes and if there are many types of bad reasoning, how could we figure out which bad reasoning to use, except by good reasoning? We have reason to think useful epistemology is possible since we live in a stable enough environment where quality has a chance to make a difference.
Judgments are an essential part of life. Choices of whether to release a prisoner on parole, admit someone to medical school, or offer a loan are too important to be “close enough”. Only the best reasoning strategies available to us are satisfactory. Statistical prediction rules are robustly successful in these and many other high-stakes areas. In 136 studies comparing proper linear models to expert judgment, 64 clearly favored the SPR, 64 showed statistically equivalent accuracy, and 8 favored the expert 1. SPRs built explicitly to mimic experts’ judgments are more reliable than the expert, suggesting some errors are due to making exceptions to one’s own rules.
Improper linear models with unit or even random weights on standardized variables do surprisingly well. Qualitative human judgment can always be used as an input to an SPR or used to select variables and the direction of their effect. The flat maximum principle says that as long as the signs on the coefficients are correct, all linear models perform approximately the same. This principle applies when the problem is difficult and the inputs are reasonably predictive and redundant. Summing together inputs can be viewed as exploiting Condorcet’s jury theorem. Linear models tend to work when inputs interact monotonically, which appears to be the case in most social situations.
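As a toy illustration of the flat maximum principle (my own sketch, not from the book; the cues, coefficients, and noise levels are all made up), the following compares a unit-weight sum of standardized cues against fitted least-squares weights on synthetic, redundant predictors:

```python
# Hypothetical sketch: with redundant, positively predictive cues, a
# unit-weight "improper" linear model tracks the criterion almost as
# well as fitted least-squares weights. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
base = rng.normal(size=n)                   # shared factor makes cues redundant
x = np.column_stack([base + rng.normal(size=n) for _ in range(3)])
y = x @ np.array([0.5, 0.3, 0.2]) + rng.normal(size=n)  # noisy criterion

z = (x - x.mean(axis=0)) / x.std(axis=0)    # standardize each cue

unit_score = z.sum(axis=1)                  # improper model: unit weights
beta, *_ = np.linalg.lstsq(z, y - y.mean(), rcond=None)
fit_score = z @ beta                        # proper model: fitted weights

corr_unit = np.corrcoef(unit_score, y)[0, 1]
corr_fit = np.corrcoef(fit_score, y)[0, 1]
print(corr_unit, corr_fit)                  # nearly identical correlations
```

In runs like this, the unit-weight model gives up very little correlation relative to the fitted model; once the signs are right, the exact weights matter little.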
All this is not to say SPRs are especially good, but that humans are especially bad predictors. We pick up false patterns and are unable to consider even moderate amounts of information at once. Resistance to SPR findings typically comes from a belief in epistemic exceptionalism. There is an impulse to tweak the conclusions of an SPR, which leads to worse results. It is surprising to find out we do so badly that random linear models can do better, yet another manifestation of overconfidence. Counterintuitively, experts are best suited to deviate from SPRs grounded in theory because they have a better understanding of when the SPR will apply 2.
Ameliorative Psychology offers a number of useful recommendations, but its normative assumptions are rarely stated explicitly. The authors identify three factors underlying the quality of a reasoning strategy: reliability across a wide range of problems, tractability, and applicability to significant problems. Strategies need to be robustly reliable to survive changes in environments. Cheaper and easier strategies allow one to “purchase” more truths. Simple strategies like SPRs have tended to be more successful as well, possibly by avoiding overfitting, but an easy, low-quality rule is better than a high-quality one that is never used. Finally, the world is full of useless correlations, so the trick is to find important ones.
Cost-benefit relations have diminishing marginal returns. By considering possible cost-benefit curves, startup costs, and marginal expected reliability, the possible ways to improve reasoning fall into exactly four categories. Three ways consist of changing strategies, and can be seen in the following matrix:
| | Same (or lower) cost | Higher cost |
| --- | --- | --- |
| Greater benefit | (1) Adopt more reliable, cheaper strategy. | (2) Adopt more reliable, but more expensive strategy. |
| Same (or less) benefit | (3) Adopt less reliable, but cheaper strategy. | |
The first is always worth adopting. The second, where more expensive strategies are adopted, is worth it if opportunity costs are not too high. The third is worth it for fairly insignificant matters. The fourth way is to reallocate resources among existing strategies, especially towards more significant issues.
Strategic Reliabilism is the epistemological theory underlying Ameliorative Psychology which says, “epistemic excellence involves the efficient allocation of cognitive resources to robustly reliable reasoning strategies applied to significant problems”.
Reasoning strategies can be characterized by four elements: the range of objects the rule makes judgments about, the inputs used to make the judgment, the formula combining the inputs, and the target or goal of the prediction. Rules would be perfectly reliable if their ranges could be defined as precisely the cases where they are accurate, but such a rule isn’t usable if detecting when it applies is itself infeasible. Rules are more or less accurate in different environments, so to evaluate a strategy’s reliability, one must consider its expected range of use by a subject in an environment.
Rules are robustly reliable if they make consistent, accurate predictions over a wide range. Being consistent involves being reliable on all natural subsets of a range, not just a few. Valuing robustness is important because low-reliability rules will be filtered out more quickly and robust rules are easier to implement (since their wide scope means you need fewer of them) and safer for general use.
Since we are limited creatures, any practical epistemology must consider resource allocation. Cost-benefit analysis has been criticized for attempting to compare the incomparable. However, even a flawed cost-benefit analysis forces us to slow down and reflect on what we really value.
Epistemic benefits are reflected in a problem’s significance for a person. If outcomes are mostly the same (good or bad), errors aren’t costly. For tractability, the authors propose to measure epistemic benefits in terms of reliability.
Cognitive resources may not be easily transferred across tasks, so cost accounting must acknowledge how scarce time, memory, and attention are as well as how they interact. Time is one easily measurable proxy for overall costs.
When thinking about the ways people reason badly, it may be easier to cultivate new habits than to revise how we reason. Better to stick with simple, automatic actions when possible than to rely on discipline.
If all you want are true beliefs, life is easy. Spend all your time sitting outside counting blimps, and you will be perfectly accurate almost all the time. Excellent reasoners must reason well about significant matters, not just arbitrary ones. Significance in general will be difficult to judge, since significance varies depending on the particular situation. Perhaps we can pick out features significant matters tend to share. For instance, not all reasoning about causality is significant, but most significant matters involve causal reasoning, so this is a skill worth improving.
The difficulty in creating an account of significance is that it can’t be too strong or too weak. We shouldn’t be able to say almost anything is significant, but we need room for substantial interpersonal differences. The authors’ view is that the significance of a problem for a person is the strength of the objective reasons that person has for devoting resources to solving that problem. Epistemology must acknowledge other normative domains, and the authors assume there are objective reasons for action, i.e. the reasons hold whether or not the person in question recognizes them or thinks them legitimate. At a minimum, individuals have moral and prudential reasons for action. Not all reasons are tied to consequences; some reasons might be tied to duties. Knowing certain basic truths might be intrinsically valuable, so there could be purely epistemic reasons.
It’s not hard to find some reason to solve a problem, so the main question is the strength of those reasons. Even “lost causes” might be significant, especially if one accepts duty-based reasons. Some problems might be negatively significant, where one has reasons not to spend time reasoning about it. For instance, “philosophy grad student disease” (constant monitoring of how smart you are relative to your peers) is negatively significant. Since our reasons are ultimately tied to human well-being, so is epistemic significance.
One problem with this account is that people may not know the relevant reasons. Any theory incorporating significance must deal with people lacking a good sense of it; this is simply a fact of life. Part of the problem of allocating resources involves spending time determining significance to guide further allocation.
Modern versions of Standard Analytic Epistemology include foundationalism, coherentism, reliabilism, and contextualism. Most proponents of SAE agree that naturalized epistemology can’t work. The authors’ approach is naturalistic because it begins with a descriptive core and works from there. The standard objection is that a descriptive theory can’t yield prescriptions. However, SAE also has a descriptive theory at its core and is less likely to overcome the is-ought gap, so Strategic Reliabilism is superior to any existing theory of SAE.
Since SAE theories of justification are tested against philosophers’ considered judgments, there is an implicit stasis criterion. If we were magically granted the best SAE theory, it would essentially be a descriptive theory of these opinions. After all, epistemic judgments vary considerably within and across cultures, so it is slightly odd to focus on the intuitions of high-SES Westerners.
If SAE works from a descriptive core, how are normative consequences extracted? The authors do not contend this is impossible for theories of SAE, but the prospects aren’t good. Many criticisms of naturalism by SAE proponents apply to their own theories. In the end, everyone has to bridge the is-ought gap. Philosophers are essentially experts in their own opinions, while Ameliorative Psychologists have documented success at helping people and institutions reason better. By the Aristotelian Principle, this success is what gives Strategic Reliabilism a chance at normativity.
Strategic Reliabilism is not a theory of justification, but if it were cast in that light, it would be more worthy of belief than any available theory. If it recommends justified beliefs, no other theories are necessary. If it occasionally recommends unjustified beliefs, what would that mean? Such a belief would be the result of excellent reasoning, which on average produces true beliefs about significant matters and hence better outcomes, but it wouldn’t be deemed justified by a bunch of philosophers. Would proponents of SAE have the holders of this belief adopt less reliable strategies or think about less significant problems? What would justification really buy us?
The Heuristics and Biases program revealed many systematic flaws in human reasoning. Unlike Ameliorative Psychology, philosophers have paid attention to the HB program. Some have been critical of the interpretation of the findings, arguing subjects’ performances on tasks are justified under different norms than those applied by the experimenter. These reject-the-norm arguments can be made on empirical grounds, for instance claiming that subjects understand the problem differently than the experimenter intended. Reject-the-norm arguments can also be made on conceptual grounds.
One such conceptual argument is given by Gerd Gigerenzer. He begins by noting that from a frequentist point of view, it doesn’t make sense to assign probabilities to one-time events, as subjects are often asked to do. Hence subjects’ answers can’t be judged as errors since they are valid under a possible interpretation. Kahneman and Tversky argued with Gigerenzer in this narrow normative framework over whether subjects violated the laws of probability, but from a Strategic Reliabilist perspective, the fundamental issue is that people can make serious mistakes, whether or not these count as “errors”. Ironically, Gigerenzer understands perfectly well that subjects reason poorly, even if he won’t call it an error.
Another argument comes from L.J. Cohen, who argues normal adults cannot be empirically shown to be irrational. Ordinary human reasoning sets its own standards, so any flaws must be performance errors, not flaws in reasoning competence. This performance-competence distinction could be analogous to linguistics, where we might think everyone is competent at their native language, even if that isn’t perfectly reflected in performance. Then, nothing could be an error unless its author, under ideal conditions, would agree it was an error. Cohen is surely right that there is a distinction between performance and competence, but language is the wrong comparison. We wouldn’t treat everyone’s baseline capabilities at painting, swimming, or math as the measure of excellence, so we should be able to recognize differences in reasoning competences. This is supported by the considerable correlation between scores on typical reasoning tasks, which are also correlated with SAT scores, although somewhat curiously, not math education 3. Even though Cohen seems to be arguing for epistemic relativism, he must actually be arguing psychologists and others are wrong.
From the perspective of Strategic Reliabilism, the quality of a reasoning strategy depends on its expected costs and benefits relative to its competitors. Hence, a demonstration of reasoning excellence involves evidence that a person chose the best strategy available, which conceptual reject-the-norm arguments ignore; these arguments can only succeed by changing the subject, redefining “rationality” and “error” in ways that aren’t useful.
Practical advice can only extend as far as the empirical data allow, but the Strategic Reliabilist claim that reasoning strategies are better to the extent they are cheaper, more robustly reliable, and address more significant issues tells us what evidence is missing when we want to offer guidance.
In diagnostic problems, subjects have a difficult time directly employing Bayes’ Rule. However, if probabilities are recast as natural frequencies, subjects perform much better. Even though both rely on Bayes’ Rule as a mathematical identity, as reasoning strategies Bayes’ Rule and frequency formats are very different. The start-up costs to reliably use the former are too high for many.
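To make the contrast concrete, here is a minimal sketch of the same diagnostic inference done both ways. The numbers (1% base rate, 80% sensitivity, 9.6% false-positive rate) are the illustrative figures common in this literature, not an example taken from the book:

```python
# Probability format: apply Bayes' Rule directly to conditional probabilities.
prior = 0.01          # base rate of the disease
sensitivity = 0.8     # P(positive | disease)
false_pos = 0.096     # P(positive | no disease)

posterior = (sensitivity * prior) / (
    sensitivity * prior + false_pos * (1 - prior))

# Natural frequency format: track whole-number counts in 1000 people.
n = 1000
sick = n * prior                       # 10 people have the disease
sick_pos = sick * sensitivity          # 8 of them test positive
healthy_pos = (n - sick) * false_pos   # ~95 healthy people also test positive

posterior_freq = sick_pos / (sick_pos + healthy_pos)

print(round(posterior, 3), round(posterior_freq, 3))  # both ≈ 0.078
```

The two computations are mathematically identical, but the second asks only for a count and a comparison of counts, which is the kind of start-up cost most people can actually pay.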
Overconfidence is a pervasive feature of reasoning. Monetary incentives and simple declarations to reduce bias have no effect. In controlled environments, calibration exercises can eliminate overconfidence. For individuals, the most feasible method is to consider the opposite. An effective form of this strategy involves the simple rule “Stop and consider why your judgment might be wrong”. Applying this to every facet of our lives might be too expensive, either because it requires too much discipline or makes us neurotic. Both are valid concerns from a Strategic Reliabilist view, but the strategy is likely worth employing for significant problems.
Compelling narratives are often accepted as causal explanations. Though they can go awry, controlled experiments provide the best way of understanding causal relationships. Considering what might happen for a control even if there isn’t one is a first step in addressing these biases. Acknowledging that a control might be impossible would lead us to accept fewer causal claims. Narratives come too easily, especially for rare or unique events.
It is not clear how well good reasoning can be taught, but there is hope. One group of researchers surprised themselves when they found formal discipline has an effect, though they admit we know very little about reasoning, how to teach it, or how much improvement instruction can produce 4.
Psychology profitably divorced itself from philosophy in the mid-19th century. Philosophers have been particularly neglectful of developments in the other field, but both disciplines could benefit from increased interaction.
The authors propose three projects that would aid the development of a strong, mature epistemology. The first is to acquire a wide range of new heuristics people can feasibly employ. Second, to guide the first project, a stronger account of human well-being is needed to highlight significant areas. Third, social institutions should be developed keeping in mind that much of our reasoning is ecological.
Philosophy might be about self-knowledge, but that knowledge is unlikely to come from introspection. Epistemologists might become theoreticians describing an applied science, but the overall discipline will be stronger for it.
Grove and Meehl (1996). “Comparative Efficiency of Informal (Subjective, Impressionistic) and Formal (Mechanical, Algorithmic) Prediction Procedures: The Clinical–Statistical Controversy”. Psychology, Public Policy, and Law 2: 293–323. ↩
Lehman, Lempert, and Nisbett (1988). “The effects of graduate training on reasoning: Formal discipline and thinking about everyday-life events”. American Psychologist 43:6, 431–442. ↩