David Chapman criticizes "pop Bayesianism" as just common-sense rationality dressed up as intimidating math:
Bayesianism boils down to “don’t be so sure of your beliefs; be less sure when you see contradictory evidence.”
Now that is just common sense. Why does anyone need to be told this? And how does [Bayes'] formula help?
The leaders of the movement presumably do understand probability. But I’m wondering whether they simply use Bayes’ formula to intimidate lesser minds into accepting “don’t be so sure of your beliefs.” (In which case, Bayesianism is not about Bayes’ Rule, after all.)
I don’t think I’d approve of that. “Don’t be so sure” is a valuable lesson, but I’d rather teach it in a way people can understand, rather than by invoking a Holy Mystery.
What does Bayes's formula have to teach us about how to do epistemology, beyond obvious things like "never be absolutely certain; update your credences when you see new evidence"?
I list below some of the specific things that I learned from Bayesianism. Some of these are examples of mistakes I'd made that Bayesianism corrected. Others are things that I just hadn't thought about explicitly before encountering Bayesianism, but which now seem important to me.
I'm interested in hearing what other people here would put on their own lists of things Bayesianism taught them. (Different people would make different lists, depending on how they had already thought about epistemology when they first encountered "pop Bayesianism".)
I'm interested especially in those lessons that you think followed more-or-less directly from taking Bayesianism seriously as a normative epistemology (plus maybe the idea of making decisions based on expected utility). The LW memeplex contains many other valuable lessons (e.g., avoid the mind-projection fallacy, be mindful of inferential gaps, the MW interpretation of QM has a lot going for it, decision theory should take into account "logical causation", etc.). However, these seem further afield or more speculative than what I think of as "bare-bones Bayesianism".
So, without further ado, here are some things that Bayesianism taught me.
- Banish talk like "There is absolutely no evidence for that belief". P(E | H) > P(E) if and only if P(H | E) > P(H). The fact that there are myths about Zeus is evidence that Zeus exists. Zeus's existing would make it more likely for myths about him to arise, so the arising of myths about him must make it more likely that he exists. A related mistake I made was to be impressed by the cleverness of the aphorism "The plural of 'anecdote' is not 'data'." There may be a helpful distinction between scientific evidence and Bayesian evidence. But anecdotal evidence is evidence, and it ought to sway my beliefs.
- Banish talk like "I don't know anything about that". See the post "I don't know."
- Banish talk of "thresholds of belief". Probabilities go up or down, but there is no magic threshold beyond which they change qualitatively into "knowledge". I used to make the mistake of saying things like, "I'm not absolutely certain that atheism is true, but it is my working hypothesis. I'm confident enough to act as though it's true." I assign a certain probability to atheism, which is less than 1.0. I ought to act as though I am just that confident, and no more. I should never just assume that I am in the possible world that I think is most likely, even if I think that that possible world is overwhelmingly likely. (However, perhaps I could be so confident that my behavior would not be practically discernible from absolute confidence.)
- Absence of evidence is evidence of absence. P(H | E) > P(H) if and only if P(H | ~E) < P(H). Absence of evidence may be very weak evidence of absence, but it is evidence nonetheless. (However, you may not be entitled to a particular kind of evidence.)
- Many bits of "common sense" rationality can be precisely stated and easily proved within the austere framework of Bayesian probability. As noted by Jaynes in Probability Theory: The Logic of Science, "[P]robability theory as extended logic reproduces many aspects of human mental activity, sometimes in surprising and even disturbing detail." While these things might be "common knowledge", the fact that they are readily deducible from a few simple premises is significant. Here are some examples:
- It is possible for the opinions of different people to diverge after they rationally update on the same evidence. Jaynes discusses this phenomenon in Section 5.3 of PT:TLoS.
- Popper's falsification criterion, and other Popperian principles of "good explanation", such as that good explanations should be "hard to vary", follow from Bayes's formula. Eliezer discusses this in An Intuitive Explanation of Bayes' Theorem and A Technical Explanation of Technical Explanation.
- Occam's razor. This can be formalized using Solomonoff induction. (However, perhaps this shouldn't be on my list, because Solomonoff induction goes beyond just Bayes's formula. It also has several problems.)
- You cannot expect that future evidence will sway you in a particular direction. "For every expectation of evidence, there is an equal and opposite expectation of counterevidence."
- Abandon all the meta-epistemological intuitions about the concept of knowledge on which Gettier-style paradoxes rely. Keep track of how confident your beliefs are when you update on the evidence. Keep track of the extent to which other people's beliefs are good evidence for what they believe. Don't worry about whether, in addition, these beliefs qualify as "knowledge".
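Three of the quantitative claims in the list above — that P(E | H) > P(E) if and only if P(H | E) > P(H), that absence of evidence is evidence of absence, and that you cannot expect evidence to sway you in a particular direction — can be checked numerically. A minimal sketch, with made-up numbers for the Zeus example:

```python
# Toy numbers for the Zeus example; every probability here is invented
# purely for illustration. H = "Zeus exists", E = "myths about Zeus arise".
p_h = 0.01               # prior P(H)
p_e_given_h = 0.99       # myths are very likely if Zeus exists
p_e_given_not_h = 0.50   # myths are fairly likely even if he doesn't

p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)  # law of total probability
p_h_given_e = p_e_given_h * p_h / p_e                  # Bayes' rule
p_h_given_not_e = (1 - p_e_given_h) * p_h / (1 - p_e)

# "There is no evidence for that" is banished: P(E|H) > P(E) iff P(H|E) > P(H).
assert (p_e_given_h > p_e) == (p_h_given_e > p_h)

# Absence of evidence is evidence of absence: P(H|E) > P(H) iff P(H|~E) < P(H).
assert (p_h_given_e > p_h) == (p_h_given_not_e < p_h)

# Conservation of expected evidence: the prior equals the expected posterior.
expected_posterior = p_h_given_e * p_e + p_h_given_not_e * (1 - p_e)
assert abs(expected_posterior - p_h) < 1e-12
```

All three identities hold for any consistent choice of the three input probabilities; the specific values only determine how large the updates are.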
What items would you put on your list?
See also Yvain's reaction to David Chapman's criticisms.
ETA: My wording here is potentially misleading. See this comment thread.
The (related) way I would expand this is "if you know what you will believe in the future, then you ought to believe that now."
Quoting myself from Yvain's blog:
Anecdotal evidence is filtered evidence. People often cite the anecdote that supports their belief, while not remembering or not mentioning events that contradict them. You can find people saying anecdotes on any side of a debate, and I see no reason the people who are right would cite anecdotes more.
Of course, if you witness an anecdote with your own eyes, that is not filtered, and you should adjust your beliefs accordingly.
Unless you too selectively (mis)remember things.
Or selectively expose yourself to situations.
I think the value of anecdotes often doesn't lie so much in changing probabilities of belief but in illustrating what a belief actually is about.
That, and existence/possibility proofs, and, in the very early phases of investigation, providing a direction for inquiry.
Right, the existence of the anecdote is the evidence, not the occurrence of the events that it alleges.
It is true that, if a hypothesis has reached the point of being seriously debated, then there are probably anecdotes being offered in support of it. (... assuming that we're talking about the kinds of hypotheses that would ever have an anecdote offered in support of them.) Therefore, learning of the existence of anecdotes probably won't move much probability around among the hypotheses being seriously debated.
However, hypothesis space is vast. Many hypotheses have never even been brought up for debate. The overwhelming majority should never come to our attention at all.
In particular, hypothesis space contains hypotheses for which no anecdote has ever been offered. If you learned that a particular hypothesis H were true, you would increase your probability that H was among those hypotheses that are supported by anecdotes. (Right? The alternative is that which hypotheses get anecdotes is determined by mechanisms that have absolutely no correlation, or even negative correlation, with the truth.) Therefore, the existence of an anecdote is evidence for the hypothesis that the anecdote alleges is true.
A typical situation is that there's a contentious issue, and some anecdotes reach your attention that support one of the competing hypotheses.
You have three ways to respond: (1) ignore the anecdotes entirely; (2) update your credence by the tiny amount the evidence actually warrants; or (3) update by far more than the evidence warrants.
In almost every situation you're likely to encounter, the real danger is 3. Well-known biases are at work pulling you towards 3. These biases are often known to work even when you're aware of them and trying to counteract them. Moreover, the harm from reaching 3 is typically far greater than the harm from reaching 1. This is because the correct added amount of credence in 2 is very tiny, particularly because you're already likely to know that the competing hypotheses for this issue are all likely to have anecdotes going for them. In real-life situations, you don't usually hear anecdotes supporting an incredibly unlikely-seeming hypothesis which you'd otherwise be inclined to think as capable of nurturing no anecdotes at all. So forgoing t... (read more)
This is the problem. I know, as an epistemic matter of fact, that anecdotes are evidence. I could try to ignore this knowledge, with the goal of counteracting the biases to which you refer. That is, I could try to suppress the Bayesian update or to undo it after it has happened. I could try to push my credence back to where it was "manually". However, as you point out, counteracting biases in this way doesn't work.
Far better, it seems to me, to habituate myself to the fact that updates can be minuscule. Credence is quantitative, not qualitative, and so can change by arbitrarily small amounts. "Update Yourself Incrementally". Granting that someone has evidence for their claims can be an arbitrarily small concession. Updating on the evidence doesn't need to move my credences by even a subjectively discernible amount. Nonetheless, I am obliged to acknowledge that the anecdote would move the credences of an ideal Bayesian agent by some nonzero amount.
It is interesting that you think of this as typical, or at least typical enough to be exclusionary of non-contentious issues. I avoid discussions about politics and possibly other contentious issues, and when I think of people providing anecdotes I usually think of them in support of neutral issues, like the efficacy of understudied nutritional supplements. If someone tells you, "I ate dinner at Joe's Crab Shack and I had intense gastrointestinal distress," I wouldn't think it's necessarily justified to ignore it on the basis that it's anecdotal. If you have 3 more friends who all report the same thing to you, you should rightly become very suspicious of the sanitation at Joe's Crab Shack. I think the fact that you are talking about contentious issues specifically is an important and interesting point of clarification.
I don't think it is. If Zeus really had eaten the homework, I wouldn't expect it to be reported in those terms. Some stories are evidence against their own truth -- if the truth were as the story says, that story would not have been told, or not in that way. (Fictionally, there's a Father Brown story hinging on that.)
And even if it theoretically pointed in the right direction, it is so weak as to be worthless. To say, "ah, but P(A|B)>P(A)!" is not to any practical point. It is like saying that a white wall is evidence for all crows being black. A white wall is also evidence, in that sense, for all crows being magenta, for the moon being made of green cheese, for every sparrow falling being observed by God, and for no sparrow falling being observed by God. Calling this "evidence" is like picking up from the sidewalk, not even pennies, but bottle tops.
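The bottle-top point can be made quantitative: what matters is not whether an observation is technically evidence but the likelihood ratio it carries. A sketch with invented numbers for the crow case:

```python
# Hempel's-paradox-style numbers, all invented for illustration.
# H = "all crows are black". We sample a random non-black object
# and observe that it is not a crow (it's a white wall).
n_nonblack = 10**9                # non-black objects in the world
nonblack_crows_if_not_h = 10**5   # under ~H, this many crows are non-black

p_obs_given_h = 1.0               # under H, every non-black object is a non-crow
p_obs_given_not_h = (n_nonblack - nonblack_crows_if_not_h) / n_nonblack

likelihood_ratio = p_obs_given_h / p_obs_given_not_h  # ~1.0001: barely above 1

prior = 0.5
posterior_odds = (prior / (1 - prior)) * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)
print(posterior)  # ~0.500025: the wall moved us by bottle-top amounts
```

So the white wall is evidence, but the likelihood ratio is so close to 1 that the update is far below anything a human could track, which is the practical sense in which it is "worthless".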
"Absence of evidence isn't evidence of absence" is such a ubiquitous cached thought in rationalist communities (that I've been involved with) that its antithesis was probably the most important thing I learned from Bayesianism.
I am confused. I always thought that the "Bayes" in Bayesianism refers to the Bayesian Probability Model. Bayes' rule is a powerful theorem, but it is just one theorem, and is not what Bayesianism is all about. I understand that the video being criticized was specifically talking about Bayes' rule, but I do not think that is what Bayesianism is about at all. The Bayesian probability model basically says that probability is a degree of belief (as opposed to other models that only really work with possible worlds or repeatable experiments). I always thought the main thesis of Bayesianism was "The best language to talk about uncertainty is probability theory," which agrees perfectly with the interpretation that the name comes from the Bayesian probability model, and has nothing to do with Bayes' rule. Am I using the word differently than everyone else?
I didn't get a lot out of Bayes at the first CFAR workshop, when the class involved mentally calculating odds ratios. It's hard for me to abstractly move numbers around in my head. But the second workshop I volunteered at used a Bayes-in-everyday-life method where you drew (or visualized) a square, and drew a vertical line to divide it according to the base rates of X versus not-X, and then drew a horizontal line to divide each of the slices according to how likely you were to see evidence H in the world where X was true, and the world where not-X was true. Then you could basically see whether the evidence had a big impact on your belief, just by looking at the relative size of the various rectangles. I have a strong ability to visualize, so this is helpful.
I visualize this square with some frequency when I notice an empirical claim about thing X presented with evidence H. Other than that, I query myself "what's the base rate of this?" a lot, or ask myself the question "is H actually more likely in the world where X is true versus false? Not really? Okay, it's not strong evidence."
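For the record, the square method amounts to computing four joint probabilities and comparing two of them. A minimal sketch in code, using this comment's notation (X is the claim, H the evidence) and invented numbers:

```python
# The "Bayes square": a unit square cut vertically by the base rate of X,
# then each slice cut horizontally by how likely evidence H is in that
# world. All numbers below are invented for illustration.
p_x = 0.3              # vertical line: base rate of X
p_h_given_x = 0.8      # horizontal cut of the X slice
p_h_given_not_x = 0.2  # horizontal cut of the not-X slice

# Areas of the four rectangles; together they tile the whole square.
x_and_h = p_x * p_h_given_x
x_and_not_h = p_x * (1 - p_h_given_x)
not_x_and_h = (1 - p_x) * p_h_given_not_x
not_x_and_not_h = (1 - p_x) * (1 - p_h_given_not_x)
assert abs(x_and_h + x_and_not_h + not_x_and_h + not_x_and_not_h - 1) < 1e-12

# Having seen H, only the two H-containing rectangles remain relevant;
# the posterior is their relative size.
p_x_given_h = x_and_h / (x_and_h + not_x_and_h)
print(p_x_given_h)  # 0.24 / (0.24 + 0.14) ≈ 0.632
```

The "is H actually more likely in the X world?" question is just asking whether the horizontal cuts differ: if p_h_given_x ≈ p_h_given_not_x, the two H rectangles have nearly the same relative sizes as the original slices, and the posterior barely moves.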
Maybe this wasn't your intent, but framing this post as a rebuttal of Chapman doesn't seem right to me. His main point isn't "Bayesianism isn't useful"--more like "the Less Wrong memeplex has an unjustified fetish for Bayes' Rule" which still seems pretty true.
While this is true mathematically, I'm not sure it's useful for people. Complex mental models have overhead, and if something is unlikely enough then you can do better to stop thinking about it. Maybe someone broke into my office and when I get there on Monday I won't be able to work. This is unlikely, but I could look up the robbery statistics for Cambridge and see that this does happen. Mathematically, I should be considering this in making plans for tomorrow, but practically it's a waste of time thinking about it.
(There's also the issue that we're not good at thinking about small probabilities. It's very hard to keep unlikely possibilities from taking on undue weight except by just not thinking about them.)
Well ... you can have an expected direction, just not if you account for magnitudes.
For example if I'm estimating the bias on a weighted die, and so far I've seen 2/10 rolls give 6's, if I roll again I expect most of the time to get a non-6 and revise down my estimate of the probability of a 6; however on the occasions when I do roll a 6 I will revise up my estimate by a larger amount.
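This can be made exact with a small conjugate model; the Beta(1, 1) prior below is my assumption, chosen so the arithmetic comes out clean:

```python
from fractions import Fraction

# Beta(1, 1) prior on the die's probability of rolling a 6; after seeing
# 2 sixes in 10 rolls the posterior is Beta(3, 9) (an illustrative model).
sixes, others = 3, 9                   # posterior pseudo-counts
p6 = Fraction(sixes, sixes + others)   # current estimate: 1/4

# Next roll: most of the time (prob 3/4) we see a non-6 and revise down a
# little; occasionally (prob 1/4) we see a 6 and revise up by more.
est_if_six = Fraction(sixes + 1, sixes + others + 1)  # 4/13, up by 3/52
est_if_other = Fraction(sixes, sixes + others + 1)    # 3/13, down by 1/52

expected_change = p6 * (est_if_six - p6) + (1 - p6) * (est_if_other - p6)
assert expected_change == 0  # no expected direction once magnitudes are weighed
```

The most probable direction of revision is down, but the rarer upward revision is exactly large enough to cancel it in expectation, which is the conservation-of-expected-evidence point.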
Sometimes it's useful to have this distinction.
So to summarise in pop Bayesian terms, akin to "don’t be so sure of your beliefs; be less sure when you see contradictory evidence." :
Yes - though that idea can usefully be generalised to conservation of evidence.
I'll add the Bayesian definition of evidence and an awareness of selection effects to the list.
Belongs in Main, methinks.
I suppose we all came across Bayesianism from different points of view - my list is quite a bit different.
For me the biggest one is that the degree to which I should believe in something is basically determined entirely by the evidence, and IS NOT A MATTER OF CHOICE or personal belief. If I believe something with degree of probability X, and see event Y happen that is evidence for it, then the degree of probability Z with which I should then believe is a mathematical matter, and not a "matter of opinion."
The prior seems to be a get-out clause here, but... (read more)
Reading this clarified something for me. In particular, "Banish talk like "There is absolutely no evidence for that belief".
OK, I can see that mathematically there can be very small amounts of evidence for some propositions (e.g. the existence of the deity Thor.) However in practice there is a limit to how small evidence can be for me to make any practical use of it. If we assign certainties to our beliefs on a scale of 0 to 100, then what can I realistically do with a bit of evidence that moves me from 87 to 87.01? or 86.99? I don't think ... (read more)
We should unpack "banish talk of X" to mean that we should avoid assessments/analysis that would naturally be expressed in such surface terms.
Since most of us don't do deep thinking unless we use some notation or words, "banish talk of" is a good heuristic for such training, if you can notice yourself (or others can catch you) doing it.
The selection biases in anecdotes make them nearly useless for updating. A more correct version would be that you can update on the first anecdote, less on a similar second one, even less on a third, and so on. Once you have ten or so anecdotes pointing in the same directio... (read more)
One thing it apparently taught Jaynes:... (read more)
Did you learn these lessons exclusively by exposing yourself to the Bayesian ideas floating on LessWrong, or would you credit these insights at least partly to "external" sources? You mention Jaynes's book. Has reading this book taught you some of the lessons you list above? Is there other material you'd recommend for folks interested in having more Bayesian Aha! moments?
I'd just like to note that Bayes' Rule is one of the first and simplest theorems you prove in introductory statistics classes after you have written down the preliminary definitions/axioms of probability. It's literally taught and expected that you are comfortable using it after the first or second week of classes.
Honestly, I feel like if Eliezer had left out any mention of the math of Bayes' Theorem from the sequences, I would be no worse off. The seven statements you wrote seem fairly self-evident by themselves. I don't feel like I need to read that P(A|B) > P(A) or whatever to internalize them. (But perhaps certain people are highly mathematical thinkers for whom the formal epistemology really helps?)
Lately I kind of feel like rationality essentially comes down to two things:
Recognizing that as a rule you are better off believing the truth, i.e. abiding by t
It's a bit like learning thermodynamics. It may seem self-evident that things have temperatures, that you can't get energy from nowhere, and that the more you put things together, the more they fall apart, but the science of thermodynamics puts these intuitively plausible things on a solid foundation (being respectively the zeroth, first, and second laws of thermodynamics). That foundation is itself built on lower-level physics. If you do not know why perpetual motion machines are ruled out, but just have an unexplained intuition that they can't work, you will not have a solid ground for judging someone's claim to have invented one.
The Bayesian process of updating beliefs from evidence by Bayes theorem is the foundation that underlies all of these "obvious" statements, and enables one to see why they are true.
Yes, who knows how many other 'obvious' statements you might believe otherwise, such as "Falsification is a different type of process from confirmation."
I saw Yvain describe this experience. My experience was actually kind of the opposite. When I read the sequences, they seemed extremely well written, but obvious. I thought that my enjoyment of them was the enjoyment of reading what I already knew, but expressed better than I could express it, plus the cool results from the heuristics-and-biases research program. It was only in retrospect that I noticed how much they had clarified my thinking about basic epistemology.
For me, reading the first chapter of Probability Theory by Jaynes showed me that what thus far had only been a vague intuition of mine (that neither what Yvain calls Aristotelianism nor what Yvain calls Anton-Wilsonism were the full story) actually had a rigorous quantitative form that can be derived mathematically from a few entirely reasonable desiderata, which did put it on a much more solid ground in my mind.
If you are not going to do an actual data analysis, then I don't think there is much point in thinking about Bayes' rule. You could just reason as follows: "here are my prior beliefs. ooh, here is some new information. i will now adjust my beliefs, by trying to weigh the old and new data based on how reliable and generalizable i think the information is." If you want to call epistemology that involves attaching probabilities to beliefs, and updating those probabilities when new information is available, 'bayesian', that's fine. But, unless you h... (read more)
...Common? Maybe in successful circles of academia.
What a bizarre question. I find it difficult to believe that this person has any experience with the average/median citizen.
As a matter of fact, there are thresholds below which the extra processing cost does not pay off (or, in the case of the human head, below which it is extremely implausible that the processing will even be performed correctly).
Probabilistic reasoning is, in general, computationally expensive. In many situations, a very large number of combinations of uncertain parameters has to be processed, the cross correlations must be accounted for, et cete... (read more)
Occam's razor should be on your list. Not in the "Solomonoff had the right definition of complexity" sense, but in the sense that any proper probability distribution has to integrate to 1, and so for any definition of complexity satisfying a few common sense axioms the limit of your prior probability has to go to zero as the complexity goes to infinity.
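A minimal sketch of that argument: pick any proper prior over an infinite hypothesis space and the probability must vanish as the complexity index grows, since at most 1/ε hypotheses can each receive prior ≥ ε:

```python
# Sketch: a proper prior over a countably infinite hypothesis space must
# send probability to 0 as complexity grows, because the total must be 1.
# Here "complexity" is just an index k = 0, 1, 2, ..., and the prior is an
# illustrative geometric distribution; any normalized choice behaves the same.
def prior(k):
    return 0.5 ** (k + 1)  # sums to 1 over k = 0, 1, 2, ...

total = sum(prior(k) for k in range(200))
assert abs(total - 1) < 1e-12  # (numerically) normalizes to 1
assert prior(100) < 1e-30      # mass at high complexity is vanishing
```

The geometric form is only for illustration; the general point is that for any ε > 0, only finitely many hypotheses can have prior above ε, so "complex hypotheses start out improbable" falls out of normalization alone.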
I think you've oversimplified the phrasing of 6 (not your fault, though; more the fault of the English language). Although your expected value for your future estimate of P(H) should be the same ... (read more)
There might be a normative rule to that effect, but probabilities in your brain can't change by infinitesimal increments. Bayes as applied by cognitively limited agents like humans has to have some granularity.
Why not? Because it is not useful? Because those problems have been solved?
In the interests of full disclosure :-)
I think of Bayesianism as a philosophy in statistics, specifically one which is opposed to frequentism. The crucial difference is the answer to the question "What does 'probability' mean?"
There is also Bayesian analysis in statistics which is a toolbox of techniques and approaches (see e.g. Gelman) but which is not an "-ism".
I do not apply the label of "Bayesianism" to all the Kahneman/Tversky kind of stuff, I tend to think of it a mind bugs (in the programming sense) and mind hacks.
I do ... (read more)
Only provided you have looked, and looked in the right place.
Well, at some point the upper bound of consequences for being wrong multiplied by the likelihood that you expect to be wrong is so tiny that it's worth less than the mental overhead of keeping track of the level of certainty. Like, I'm confident enough that physics works (to the level of everyday phenomena), that keeping track of what might happen if I'm wrong about that doesn't add enough value to be worthwhile.
So it's about a protocol for language instead?
I'd just like to point out that even #1 of the OP's "lessons" is far more problematic than they make it seem. Consider the statement:
"The fact that there are myths about Zeus is evidence that Zeus exists. Zeus's existing would make it more likely for myths about him to arise, so the arising of myths about him must make it more likely that he exists." (supposedly an argument of the form P(E | H) > P(E)).
So first, "Zeus's existing would make it more likely for myths about him to arise" - more likely than what? Than "a pr... (read more)
The post on whether you're "entitled" to evidence has always annoyed me a bit... what does "entitled" even mean? If the person you're talking to isn't updating on the evidence you're giving them for some bad reason, what can you really do?
In I don't know, Eliezer isn't arguing that you shouldn't say it, but that you shouldn't think it:... (read more)
Point by point take on this:
The evidence can be so weak, and/or evidence for so many other things besides what it is claimed to support, as to be impossible to process qualitatively. If there's "no evidence", the effect size is usually pretty small, much smaller than the filtering that the anecdotes pass through, much smaller than can be inferred qualitatively, etc.
2... (read more)
That doesn't look useful to me.
By the same token, my mentioning here the name of the monster Ygafalkufeoinencfhncfc is evidence that it exists. Funnily enough, the same reasoning provides evidence for the monster Grrapoeiruvnenrcancaef and a VERY large number of other, ahem, creatures.
No it doesn't. Most of the creatures in that class have, in fact, not been mentioned by you or anyone else.