David Chapman criticizes "pop Bayesianism" as just common-sense rationality dressed up as intimidating math[1]:

Bayesianism boils down to “don’t be so sure of your beliefs; be less sure when you see contradictory evidence.”

Now that is just common sense. Why does anyone need to be told this? And how does [Bayes'] formula help?

[...]

The leaders of the movement presumably do understand probability. But I’m wondering whether they simply use Bayes’ formula to intimidate lesser minds into accepting “don’t be so sure of your beliefs.” (In which case, Bayesianism is not about Bayes’ Rule, after all.)

I don’t think I’d approve of that. “Don’t be so sure” is a valuable lesson, but I’d rather teach it in a way people can understand, rather than by invoking a Holy Mystery.

What does Bayes's formula have to teach us about how to do epistemology, beyond obvious things like "never be absolutely certain; update your credences when you see new evidence"?

I list below some of the specific things that I learned from Bayesianism. Some of these are examples of mistakes I'd made that Bayesianism corrected. Others are things that I just hadn't thought about explicitly before encountering Bayesianism, but which now seem important to me.

I'm interested in hearing what other people here would put on their own lists of things Bayesianism taught them. (Different people would make different lists, depending on how they had already thought about epistemology when they first encountered "pop Bayesianism".)

I'm interested especially in those lessons that you think followed more-or-less directly from taking Bayesianism seriously as a normative epistemology (plus maybe the idea of making decisions based on expected utility). The LW memeplex contains many other valuable lessons (e.g., avoid the mind-projection fallacy, be mindful of inferential gaps, the MW interpretation of QM has a lot going for it, decision theory should take into account "logical causation", etc.). However, these seem further afield or more speculative than what I think of as "bare-bones Bayesianism".

So, without further ado, here are some things that Bayesianism taught me.

  1. Banish talk like "There is absolutely no evidence for that belief". P(E | H) > P(E) if and only if P(H | E) > P(H). (A small numerical check of this equivalence, together with items 4 and 6, appears after this list.) The fact that there are myths about Zeus is evidence that Zeus exists. Zeus's existing would make it more likely for myths about him to arise, so the arising of myths about him must make it more likely that he exists. A related mistake I made was to be impressed by the cleverness of the aphorism "The plural of 'anecdote' is not 'data'." There may be a helpful distinction between scientific evidence and Bayesian evidence. But anecdotal evidence is evidence, and it ought to sway my beliefs.
  2. Banish talk like "I don't know anything about that". See the post "I don't know."
  3. Banish talk of "thresholds of belief". Probabilities go up or down, but there is no magic threshold beyond which they change qualitatively into "knowledge". I used to make the mistake of saying things like, "I'm not absolutely certain that atheism is true, but it is my working hypothesis. I'm confident enough to act as though it's true." I assign a certain probability to atheism, which is less than 1.0. I ought to act as though I am just that confident, and no more. I should never just assume that I am in the possible world that I think is most likely, even if I think that that possible world is overwhelmingly likely. (However, perhaps I could be so confident that my behavior would not be practically discernible from absolute confidence.)
  4. Absence of evidence is evidence of absence. P(H | E) > P(H) if and only if P(H | ~E) < P(H). Absence of evidence may be very weak evidence of absence, but it is evidence nonetheless. (However, you may not be entitled to a particular kind of evidence.)
  5. Many bits of "common sense" rationality can be precisely stated and easily proved within the austere framework of Bayesian probability.  As noted by Jaynes in Probability Theory: The Logic of Science, "[P]robability theory as extended logic reproduces many aspects of human mental activity, sometimes in surprising and even disturbing detail." While these things might be "common knowledge", the fact that they are readily deducible from a few simple premises is significant. Here are some examples:
    • It is possible for the opinions of different people to diverge after they rationally update on the same evidence. Jaynes discusses this phenomenon in Section 5.3 of PT:TLoS.
    • Popper's falsification criterion, and other Popperian principles of "good explanation", such as that good explanations should be "hard to vary", follow from Bayes's formula. Eliezer discusses this in An Intuitive Explanation of Bayes' Theorem and A Technical Explanation of Technical Explanation.
    • Occam's razor. This can be formalized using Solomonoff induction. (However, perhaps this shouldn't be on my list, because Solomonoff induction goes beyond just Bayes's formula. It also has several problems.)
  6. You cannot expect[2] that future evidence will sway you in a particular direction. "For every expectation of evidence, there is an equal and opposite expectation of counterevidence."
  7. Abandon all the meta-epistemological intuitions about the concept of knowledge on which Gettier-style paradoxes rely. Keep track of how confident your beliefs are when you update on the evidence. Keep track of the extent to which other people's beliefs are good evidence for what they believe. Don't worry about whether, in addition, these beliefs qualify as "knowledge".
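
For concreteness, here is a minimal numerical check of item 1's equivalence, together with items 4 and 6, using toy numbers of my own (any prior strictly between 0 and 1, with P(E | H) > P(E | ~H), gives the same qualitative picture):

    # Toy numbers for a quick check of items 1, 4, and 6.
    from math import isclose

    p_h = 0.1                # prior probability of the hypothesis
    p_e_given_h = 0.8        # probability of the evidence if H is true
    p_e_given_not_h = 0.3    # probability of the evidence if H is false

    p_e = p_h * p_e_given_h + (1 - p_h) * p_e_given_not_h       # = 0.35
    p_h_given_e = p_h * p_e_given_h / p_e                       # ~ 0.229
    p_h_given_not_e = p_h * (1 - p_e_given_h) / (1 - p_e)       # ~ 0.031

    # Item 1: E confirms H exactly when H makes E more likely than it would otherwise be.
    assert (p_e_given_h > p_e) == (p_h_given_e > p_h)

    # Item 4: if seeing E would raise P(H), then failing to see E must lower it.
    assert (p_h_given_e > p_h) == (p_h_given_not_e < p_h)

    # Item 6: conservation of expected evidence -- the expected posterior equals the prior.
    assert isclose(p_e * p_h_given_e + (1 - p_e) * p_h_given_not_e, p_h)

Item 1 falls straight out of the product rule; items 4 and 6 both fall out of writing P(H) as P(E)P(H | E) + P(~E)P(H | ~E).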

What items would you put on your list?

ETA: ChrisHallquist's post Bayesianism for Humans lists other "directly applicable corollaries to Bayesianism".


[1]  See also Yvain's reaction to David Chapman's criticisms.

[2]  ETA: My wording here is potentially misleading.  See this comment thread.

A related mistake I made was to be impressed by the cleverness of the aphorism "The plural of 'anecdote' is not 'data'." There may be a helpful distinction between scientific evidence and Bayesian evidence. But anecdotal evidence is evidence, and it ought to sway my beliefs.

Anecdotal evidence is filtered evidence. People often cite the anecdote that supports their belief, while not remembering or not mentioning events that contradict it. You can find people telling anecdotes on any side of a debate, and I see no reason the people who are right would cite anecdotes more.

Of course, if you witness an anecdote with your own eyes, that is not filtered, and you should adjust your beliefs accordingly.

Of course, if you witness an anecdote with your own eyes, that is not filtered

Unless you too selectively (mis)remember things.

Unless you too selectively (mis)remember things.

Or selectively expose yourself to situations.

If I can always expose myself to situations in which I anecdotally experience success, isn't that Winning?

If I can always expose myself to situations in which I anecdotally experience success, isn't that Winning?

Yes. What it isn't is an unbiased scientific study. The anecdotal experience of situations which are selected to provide success is highly filtered evidence.

I think the value of anecdotes often lies not so much in changing probabilities of belief as in illustrating what a belief actually is about.

That, and existence/possibility proofs, and, in the very early phases of investigation, providing a direction for inquiry.

Anecdotal evidence is filtered evidence.

Right, the existence of the anecdote is the evidence, not the occurrence of the events that it alleges.

You can find people telling anecdotes on any side of a debate, and I see no reason the people who are right would cite anecdotes more.

It is true that, if a hypothesis has reached the point of being seriously debated, then there are probably anecdotes being offered in support of it. (... assuming that we're talking about the kinds of hypotheses that would ever have an anecdote offered in support of it.) Therefore, the learning of the existence of anecdotes probably won't move much probability around among the hypotheses being seriously debated.

However, hypothesis space is vast. Many hypotheses have never even been brought up for debate. The overwhelming majority should never come to our attention at all.

In particular, hypothesis space contains hypotheses for which no anecdote has ever been offered. If you learned that a particular hypothesis H were true, you would increase your probability that H was among those hypotheses that are supported by anecdotes. (Right? The alternative is that which hypotheses get anecdotes is determined by mechanisms that have absolutely no correlation, or even negative correlation, with the truth.) Therefore, the existence of an anecdote is evidence for the hypothesis that the anecdote alleges is true.

A typical situation is that there's a contentious issue, and some anecdotes reach your attention that support one of the competing hypotheses.

You have three ways to respond:

  1. You can under-update your belief in the hypothesis, ignoring the anecdotes completely
  2. You can update by precisely the measure warranted by the existence of these anecdotes and the fact that they reached you.
  3. You can over-update by adding too much credence to the hypothesis.

In almost every situation you're likely to encounter, the real danger is 3. Well-known biases are at work pulling you towards 3. These biases are often known to work even when you're aware of them and trying to counteract them. Moreover, the harm from reaching 3 is typically far greater than the harm from reaching 1. This is because the correct added amount of credence in 2 is very tiny, particularly because you're already likely to know that the competing hypotheses for this issue are all likely to have anecdotes going for them. In real-life situations, you don't usually hear anecdotes supporting an incredibly unlikely-seeming hypothesis which you'd otherwise have thought incapable of generating any anecdotes at all. So forgoing that tiny amount of credence is not nearly as bad as choosing 3 and updating, typically, by a large amount.

The saying "The plural of anecdotes is not data" exists to steer you away from 3. It works to counteract the very strong biases pulling you towards 3. Its danger, you are saying, is that it pulls you towards 1 rather than the correct 2. That may be pedantically correct, but is a very poor reason to criticize the saying. Even with its help, you're almost always very likely to over-update - all it's doing is lessening the blow.

Perhaps this as an example of "things Bayesianism has taught you" that are harming your epistemic rationality?

A similar thing I noticed is disdain towards "correlation does not imply causation" from enlightened Bayesians. It is counter-productive.

These biases are often known to work even when you're aware of them and trying to counteract them.

This is the problem. I know, as an epistemic matter of fact, that anecdotes are evidence. I could try to ignore this knowledge, with the goal of counteracting the biases to which you refer. That is, I could try to suppress the Bayesian update or to undo it after it has happened. I could try to push my credence back to where it was "manually". However, as you point out, counteracting biases in this way doesn't work.

Far better, it seems to me, to habituate myself to the fact that updates can be minuscule. Credence is quantitative, not qualitative, and so can change by arbitrarily small amounts. "Update Yourself Incrementally". Granting that someone has evidence for their claims can be an arbitrarily small concession. Updating on the evidence doesn't need to move my credences by even a subjectively discernible amount. Nonetheless, I am obliged to acknowledge that the anecdote would move the credences of an ideal Bayesian agent by some nonzero amount.

...updates can be minuscule ... Updating on the evidence doesn't need to move my credences by even a subjectively discernible amount. Nonetheless, I am obliged to acknowledge that the anecdote would move the credences of an ideal Bayesian agent by some nonzero amount.

So, let's talk about measurement and detection.

Presumably you don't calculate your believed probabilities to the n-th significant digit, so I don't understand the idea of a "miniscule" update. If it has no discernible consequences then as far as I am concerned it did not happen.

Let's take an example. I believe that my probability of being struck by lightning is very low to the extent that I don't worry about it and don't take any special precautions during thunderstorms. Here is an anecdote which relates how a guy was struck by lightning while sitting in his office inside a building. You're saying I should update my beliefs, but what does that mean?

I have no numeric estimate of P(me being struck by lightning) so there's no number I can adjust by 0.0000001. I am not going to do anything differently. My estimate of my chances to be electrocuted by Zeus' bolt is still "very very low". So where is that "miniscule update" that you think I should make and how do I detect it?

P.S. If you want to update on each piece of evidence, surely by now you must fully believe that product X is certain to enlarge your penis?

A typical situation is that there's a contentious issue, and some anecdotes reach your attention that support one of the competing hypotheses.

It is interesting that you think of this as typical, or at least typical enough to be exclusionary of non-contentious issues. I avoid discussions about politics and possibly other contentious issues, and when I think of people providing anecdotes I usually think of them in support of neutral issues, like the efficacy of understudied nutritional supplements. If someone tells you, "I ate dinner at Joe's Crab Shack and I had intense gastrointestinal distress," I wouldn't think it's necessarily justified to ignore it on the basis that it's anecdotal. If you have 3 more friends who all report the same thing to you, you should rightly become very suspicious of the sanitation at Joe's Crab Shack. I think the fact that you are talking about contentious issues specifically is an important and interesting point of clarification.

Thanks for that comment! Eliezer often says people should be more sensitive to evidence, but an awful lot of real-life evidence is in fact much weaker, noisier, and easier to misinterpret than it seems. And it's not enough to just keep in mind a bunch of Bayesian mantras - you need to be aware of survivor bias, publication bias, Simpson's paradox and many other non-obvious traps, otherwise you silently go wrong and don't even know it. In a world where most published medical results fail to replicate, how much should we trust our own conclusions?

Would it be more honest to recommend people to just never update at all? But then everyone will stick to their favorite theories forever... Maybe an even better recommendation would be to watch out for "motivated cognition", try to be more skeptical of all theories including your favorites.

The alternative is that which hypotheses get anecdotes is determined by mechanisms that have absolutely no correlation, or even negative correlation, with the truth.

Doesn't look implausible to me. Here's an alternative hypothesis: the existence of anecdotes is a function of which beliefs are least supported by strong data because such beliefs need anecdotes for justification.

In general, I think anecdotes are way too filtered and too biased as an information source to be considered serious evidence. In particular, there's a real danger of treating a lot of biased anecdotes as conclusive data and that danger, seems to me, outweighs the miniscule usefulness of anecdotes.

In general, I think anecdotes are way too filtered and too biased as an information source to be considered serious evidence.

We may agree. It depends on what work the word "serious" is doing in the quoted sentence.

In this context "serious" = "I'm willing to pay attention to it".

I would raise a hypothesis to consideration because someone was arguing for it, but I don't think anecdotes are good evidence in that I would have similar confidence in a hypothesis supported by an anecdote, and a hypothesis that is flatly stated with no justification. The evidence to raise it to consideration comes from the fact that someone took the time to advocate it.

This is more of a heuristic than a rule, because there are anecdotes that are strong evidence ("I ran experiments on this last year and they didn't fit"), but when dealing with murkier issues, they don't count for much.

The evidence to raise it to consideration comes from the fact that someone took the time to advocate it, not the anecdote.

Yes, it may be that the mere fact that a hypothesis is advocated screens off whether that hypothesis is also supported by an anecdote. But I suspect that the existence of anecdotes still moves a little probability mass around, even among just those hypotheses that are being advocated.

I mean, if someone advocated for a hypothesis, and they couldn't even offer an anecdote in support of it, that would be pretty deadly to their credibility. So, unless I am certain that every advocated hypothesis has supporting anecdotes (which I am not), I must concede that anecdotes are evidence, howsoever weak, over and above mere advocacy.

Here's a situation where an anecdote should reduce our confidence in a belief:

  • A person's beliefs are usually well-supported.
  • When he offers supporting evidence, he usually offers the strongest evidence he knows about.

If this person were to offer an anecdote, it should reduce our confidence in his proposition, because it makes it unlikely he knows of stronger supporting evidence.

I don't know how applicable this is to actual people.

I don't think this is necessarily valid, because people also know that anecdotes can be highly persuasive. So for many people, if you have an anecdote it will make sense to say so, since most people argue not to reach the truth but to persuade.

I agree that it is at least hypothetically possible that the offering of an anecdote should reduce our credence in what the anecdote claims.

... For example, if you told me that you once met a powerful demon who works to stop anyone from ever telling anecdotes about him (regardless of whether the anecdotes are true or false), then I would decrease my credence in the existence of such a demon.

Anecdotal evidence is filtered evidence.

Still evidence.

After accounting for the filtering, which way does it point? If you're left with a delta log-odds of zero, it's "evidence" only in the sense that if you have no apples you have "some" apples.

Yes, "Daaad, Zeus the Greek god ate my homework!" isn't strong evidence, certainly.

But the way it points (in relation to P(Zeus exists)) is clear. I agree with your second sentence, but I'm not sure I understand your first one.

Yes, "Daaad, Zeus the Greek god ate my homework!" isn't strong evidence, certainly.

But the way it points (in relation to P(Zeus exists)) is clear.

I don't think it is. If Zeus really had eaten the homework, I wouldn't expect it to be reported in those terms. Some stories are evidence against their own truth -- if the truth were as the story says, that story would not have been told, or not in that way. (Fictionally, there's a Father Brown story hinging on that.)

And even if it theoretically pointed in the right direction, it is so weak as to be worthless. To say, "ah, but P(A|B)>P(A)!" is not to any practical point. It is like saying that a white wall is evidence for all crows being black. A white wall is also evidence, in that sense, for all crows being magenta, for the moon being made of green cheese, for every sparrow falling being observed by God, and for no sparrow falling being observed by God. Calling this "evidence" is like picking up from the sidewalk, not even pennies, but bottle tops.

I don't think it is. If Zeus really had eaten the homework, I wouldn't expect it to be reported in those terms. Some stories are evidence against their own truth -- if the truth were as the story says, that story would not have been told, or not in that way. (Fictionally, there's a Father Brown story hinging on that.)

What I was just about to say. See also Yvain on self-defeating arguments.

A white wall is also evidence, in that sense, for all crows being magenta, for the moon being made of green cheese,

Okay, but...

for every sparrow falling being observed by God, and for no sparrow falling being observed by God.

How so?

for every sparrow falling being observed by God, and for no sparrow falling being observed by God.

How so?

Every white wall is a non-sparrow not observed by God, hence evidence for God observing every sparrow's fall. It is also a, um, no, you're right, the second one doesn't work.

Every white wall is a non-sparrow not observed by God

How do we know that the wall is not observed by God?

Ah, quite so. God sees all, sparrows and walls alike. Both of those examples are broken.

An omnipotence-omniscience paradox: "God, look away!" - "I can't!"

“There's something a human could do that God couldn't do, namely committing suicide.”

-- someone long ago, IIRC (Google is turning up lots of irrelevant stuff)

That one's easily solvable, isn't it? God could look away if he wanted to, but chose not to.

If sparrows do not exist, then "every sparrow falling is observed by God" and "no sparrow falling is observed by God" are both true. (And of course, every white wall is a tiny bit of evidence for "sparrows do not exist", although not very good evidence since there are so many other things in the universe that also need to be checked for sparrow-ness.)

Well, we could use the word "evidence" in different ways (you requiring some magnitude-of-prior-shift).

But then you'd still need a word for "that-which-[increases|decreases]-the-probability-you-assign-to-a-belief". Just because that shift is tiny doesn't render it undefined or its impact arbitrary. You can say with confidence that 1/x remains positive for any positive x, however large, even a googolplex (btw, TIL that in that case 1/x would be called a googolminex).

Think of what you're advocating here: whatever would we do if we disallowed strictly-speaking-correct-nitpicks on LW?

Well, we could use the word "evidence" in different ways (you requiring some magnitude-of-prior-shift).

There's a handy table, two of them in fact, of terminology for strength of evidence here. Up to 5 decibans is "barely worth mentioning". How many microbans does "Zeus ate my homework" amount to?
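
For anyone unfamiliar with the unit in that table: a deciban is ten times the base-10 logarithm of the likelihood ratio,

    evidence in decibans = 10 * log10( P(E | H) / P(E | ~H) )

so 5 decibans corresponds to a likelihood ratio of about 3.2, and a microban to a ratio of about 1.00000023.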

Think of what you're advocating here: whatever would we do if we disallowed strictly-speaking-correct-nitpicks on LW?

You may be joking, but I do think LW (and everywhere else) would be improved if people didn't do that. I find nitpicking as unappealing as nose-picking.

Nitpicking is absolutely critical in any public forum. Maybe in private, with only people who you know well and have very strong reason to believe are very much more likely to misspeak than to misunderstand, nitpicking can be overlooked. Certainly, I don't nitpick every misspoken statement in private. But when those conditions do not hold, when someone is speaking on a subject I am not certain they know well, or when I do not trust that everyone in the audience is going to correctly parse the statement as misspoken and then correctly reinterpret the correct version, nitpicking is the only way to ensure that everyone involved hears the correct message.

Charitably I'll guess that you dislike nitpicking because you already knew all those minor points, they were obvious to anyone reading after all, and they don't have any major impact on the post as a whole. The problem with that is that not everyone who reads Less Wrong has a fully correct understanding of everything that goes into every post. They don't spot the small mistakes, whether those be inconsequential math errors or a misapplication of some minor rule or whatever. And the problem is that just because the error was small in this particular context, it may be a large error in another context. If you mess up your math when doing Bayes' Theorem, you may thoroughly confuse someone who is weak at math and trying to follow how it is applied in real life. In the particular context of this post, getting the direction of a piece of evidence wrong is inconsequential if the magnitude of that evidence is tiny. But if you are making a systematic error which causes you to get the direction of certain types of evidence, which are usually small in magnitude, wrong, then you will eventually make a large error. And unless you are allowed to call out errors dealing with small magnitude pieces of evidence, you won't ever discover it.

I'd also like to say that just because a piece of evidence is "barely worth mentioning" when listing out evidence for and against a claim, does not mean that that evidence should be immediately thrown aside when found. The rules which govern evidence strong enough to convince me that 2+2=3 are the same rules that govern the evidence gained from the fact that when I drop an apple, it falls. You can't just pretend the rules stop applying and expect to come out ok in every situation. In part you can gain practice from applying the rules to those situations, and in part it's important to remember that they do still apply, even if in the end you decide that their outcome is inconsequential.

Nitpicking is absolutely critical in any public forum.

I disagree. Not all things that are true are either relevant or important. Irrelevancies and trivialities lower discussion quality, however impeccable their truth. There is practically nothing that anyone can say, that one could not find fault with, given sufficient motivation and sufficient disregard for the context that determines what matters and what does not.

In the case at hand, "evidence" sometimes means "any amount whatever, including zero", sometimes "any amount whatever, except zero, including such quantities as 1/3^^^3", and sometimes "an amount worth taking notice of".

In practical matters, only the third sense is relevant: if you want to know the colour of crows, you must observe crows, not non-crows, because that is where the value of information is concentrated. The first two are only relevant in a technical, mathematical context.

The point of the Bayesian solution to Hempel's paradox is to stop worrying about it, not to start seeing purple zebras as evidence for black crows that is worth mentioning in any other context than talking about Hempel's paradox.

How many microbans does "Zeus ate my homework" amount to?

Few enough that it's in the "barely worth mentioning" bracket, of course. (Under any kind of resource constraint, it wouldn't be mentioned at all; however, that only relates to its infinitesimal weight, not the nature of what it is (evidence).)

You say that shouldn't be classified as evidence, I say it should. Note that the table is about strength of evidence.

Yes, in a world in which Zeus existed, people would not proclaim the importance of faith in Zeus, anymore than they proclaim the importance of faith in elephants or automobiles. Everyone would just accept that they exist.

Yes, in a world in which Zeus existed, people would not proclaim the importance of faith in Zeus

I don't know: consider the classic cargo cult. It proclaims the importance of faith in airplanes.

Or consider Christianity: people who fully believe in Jesus Christ (=from their point of view they live in the world in which Jesus exists) tend to proclaim the importance of faith in Jesus.

tend to proclaim the importance of faith in Jesus

Yes, that's the point - people don't tend to proclaim the importance of faith in things that actually exist. You won't hear them say "have faith in the existence of tables" or "have faith in the existence of chairs".

I would suspect that this is because a) everybody believes in tables and chairs (with the exception of a few very strange people, who are probably easy enough to spot), and b) nobody (again with a few odd exceptions) believes in any sort of doctrine or plan of action for chair-and-table-believers, so faith doesn't have many consequences (except for having somewhere to sit and place things on).

We, on the other hand, proclaim the importance of confidence in rational thought, for the same reasons that theists proclaim the importance of belief in their god: it is a belief which is not universal in the population, and it is a belief which we expect to have important consequences and prescriptions for action.

If you look into your spam folder, you'll find plenty of evidence for penis extension pills and the availability of large amounts of money in abandoned accounts at Nigerian banks.

This is actually a really tidy example of Bayesian thinking. People send various types of emails for a variety of reasons. Of those who send penis extension pill emails, there are (vaguely speaking) three possible groups:

  1. People who have invented penile embiggening pills and honestly want to sell them. (I've never confirmed anybody to be in this group, so it may be empty.)

  2. Scammers trying to find a sucker by spamming out millions of emails.

  3. Trolls.

If you see emails offering to "Eml4rge your m3mber!!", this is evidence for the existence of someone from one or more of these groups. Which group do you think is largest? Those spam emails are evidence for all of these, but not such strong evidence for choosing between them.

Don't spam algorithms actually use Bayes rule to filter spam from non-spam, updating when you click "this is spam" or "this is not spam"?

Yes, this is exactly how Paul Graham went about solving the spam problem.
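
For the curious, here is a minimal sketch of that kind of word-frequency filter, with toy training data and Laplace smoothing of my own choosing; it is not Paul Graham's actual algorithm, which adds many refinements:

    # Naive-Bayes-style spam scoring: treat words as independent and compare
    # how much more often each word shows up in spam than in non-spam.
    from collections import Counter
    import math

    spam_docs = ["enlarge your member now", "claim money from nigerian bank account now"]
    ham_docs = ["meeting moved to monday morning", "here are the bayes lecture notes"]

    def word_counts(docs):
        counts = Counter()
        for doc in docs:
            counts.update(doc.split())
        return counts

    spam_counts, ham_counts = word_counts(spam_docs), word_counts(ham_docs)
    spam_total, ham_total = sum(spam_counts.values()), sum(ham_counts.values())
    vocab = set(spam_counts) | set(ham_counts)

    def spam_log_odds(message, prior_odds=1.0):
        """Posterior log-odds of spam, assuming words are independent (the 'naive' part)."""
        log_odds = math.log(prior_odds)
        for word in message.lower().split():
            # Laplace smoothing keeps unseen words from zeroing out either hypothesis.
            p_word_given_spam = (spam_counts[word] + 1) / (spam_total + len(vocab))
            p_word_given_ham = (ham_counts[word] + 1) / (ham_total + len(vocab))
            log_odds += math.log(p_word_given_spam / p_word_given_ham)
        return log_odds

    print(spam_log_odds("enlarge your bank account now"))  # positive: leans spam
    print(spam_log_odds("monday lecture notes"))           # negative: leans ham

Clicking "this is spam" in a scheme like this just adds the message to the spam pile and updates the counts.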

The value of anecdotal evidence on a subject depends on how good the other sources are. For example, in a field like medicine, where something like 1 in 5 studies wind up retracted, anecdotal evidence is reasonably useful. To say nothing of the social "sciences".

You cannot expect that future evidence will sway you in a particular direction. "For every expectation of evidence, there is an equal and opposite expectation of counterevidence."

The (related) way I would expand this is "if you know what you will believe in the future, then you ought to believe that now."

Quoting myself from Yvain's blog:

Here’s a short and incomplete list of habits I would include in qualitative Bayes:

  1. Base rate attention.
  2. Consider alternative hypotheses.
  3. Compare hypotheses by likelihood ratios, not likelihoods (see the sketch after this list).
  4. Search for experiments with high information content (measured by likelihood ratio) and low cost.
  5. Conservation of evidence.
  6. Competing values should have some tradeoff between them.

Each one of those is a full post to explain, I think. I also think they’re strongly reinforcing; 3 and 4 are listed as separate insights, here, but I don’t think one is very useful without the other.
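
As a tiny illustration of habit 3 (toy numbers of my own): two hypotheses can both assign the evidence a high likelihood, yet the evidence barely distinguishes between them, because only the ratio enters the posterior odds.

    # Both hypotheses make the observation fairly likely...
    p_e_given_h1 = 0.90
    p_e_given_h2 = 0.80

    # ...so the raw likelihoods look impressive, but the Bayes factor is tiny.
    likelihood_ratio = p_e_given_h1 / p_e_given_h2   # 1.125

    prior_odds = 1.0                                 # start indifferent between H1 and H2
    posterior_odds = prior_odds * likelihood_ratio
    print(posterior_odds)                            # 1.125: the belief barely moves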

Another useful thing for qualitative Bayes from Jaynes - always include the background information I in the list of information you're conditioning on. It reminds you that your estimates are fully contextual on all your knowledge, most of which is unstated and unexamined.

Actually, this seems like a General Semantics meets Bayes kind of principle. Surely Korzybski had a catchy phrase for a similar idea. Anyone got one?

Actually, this seems like a General Semantics meets Bayes kind of principle. Surely Korzybski had a catchy phrase for a similar idea. Anyone got one?

Korzybski did "turgid" rather than "catchy", but this seems closely related to his insistence that characteristics are always left out by the process of abstraction, and that one can never know "all" about something. Hence his habitual use of "etc.", to the degree that he invented abbreviations for it.

Maybe this wasn't your intent, but framing this post as a rebuttal of Chapman doesn't seem right to me. His main point isn't "Bayesianism isn't useful"--more like "the Less Wrong memeplex has an unjustified fetish for Bayes' Rule" which still seems pretty true.

[1] See also Yvain's reaction to David Chapman's criticisms.

Chapman's follow-up.

I didn't get a lot out of Bayes at the first CFAR workshop, when the class involved mentally calculating odds ratios. It's hard for me to abstractly move numbers around in my head. But the second workshop I volunteered at used a Bayes-in-everyday-life method where you drew (or visualized) a square, and drew a vertical line to divide it according to the base rates of X versus not-X, and then drew a horizontal line to divide each of the slices according to how likely you were to see evidence H in the world where X was true, and the world where not-X was true. Then you could basically see whether the evidence had a big impact on your belief, just by looking at the relative size of the various rectangles. I have a strong ability to visualize, so this is helpful.

I visualize this square with some frequency when I notice an empirical claim about thing X presented with evidence H. Other than that, I query myself "what's the base rate of this?" a lot, or ask myself the question "is H actually more likely in the world where X is true versus false? Not really? Okay, it's not strong evidence."
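
Here is the same picture in numbers, with toy values of my own rather than anything from CFAR's materials: the vertical split is the base rate, each column is split by how likely the evidence is in that world, and the posterior is just the share of the "evidence observed" area that lies on the X side.

    p_x = 0.3                 # base rate of X: where the vertical line falls
    p_h_given_x = 0.8         # chance of seeing evidence H if X is true
    p_h_given_not_x = 0.2     # chance of seeing H if X is false

    # Areas of the four rectangles.
    x_and_h = p_x * p_h_given_x                            # 0.24
    not_x_and_h = (1 - p_x) * p_h_given_not_x              # 0.14
    x_and_not_h = p_x * (1 - p_h_given_x)                  # 0.06
    not_x_and_not_h = (1 - p_x) * (1 - p_h_given_not_x)    # 0.56

    # Seeing H confines you to the two H-rectangles; compare their sizes.
    p_x_given_h = x_and_h / (x_and_h + not_x_and_h)
    print(p_x_given_h)        # ~0.63: this evidence roughly doubles the credence in X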

"Absence of evidence isn't evidence of absence" is such a ubiquitous cached thought in rationalist communities (that I've been involved with) that its antithesis was probably the most important thing I learned from Bayesianism.

I find it interesting that Sir Arthur Conan Doyle, the author of the Sherlock Holmes stories, seems to have understood this concept. In his story "Silver Blaze" he has the following conversation between Holmes and a Scotland Yard detective:

Gregory (Scotland Yard detective): "Is there any other point to which you would wish to draw my attention?"

Holmes: "To the curious incident of the dog in the night-time."

Gregory: "The dog did nothing in the night-time."

Holmes: "That was the curious incident."

I am confused. I always thought that the "Bayes" in Bayesianism refers to the Bayesian Probability Model. Bayes' rule is a powerful theorem, but it is just one theorem, and is not what Bayesianism is all about. I understand that the video being criticized was specifically talking about Bayes' rule, but I do not think that is what Bayesianism is about at all. The Bayesian probability model basically says that probability is a degree of belief (as opposed to other models that only really work with possible worlds or repeatable experiments). I always thought the main thesis of Bayesianism was "The best language to talk about uncertainty is probability theory," which agrees perfectly with the interpretation that the name comes from the Bayesian probability model, and has nothing to do with Bayes' rule. Am I using the word differently than everyone else?

Am I using the word in a way differently than everyone else?

That's how I use it. This showed up in Yvain's response:

What can we call this doctrine? In the old days it was known as probabilism, but this is unwieldy, and it refers to a variety practiced before we really understood what probability was. I think “Bayesianism” is an acceptable alternative, not just because Bayesian updating is the fundamental operation of this system, but because Bayesianism is the branch of probability that believes probabilities are degrees of mental credence and that allows for sensible probabilities of nonrepeated occurrences like “there is a God.”

I always thought the main thesis of Bayesianism was "The best language to talk about uncertainty is probability theory,"

That sounds too weak. Bayes is famous because of his rule - surely, Bayesianism must invoke it.

Just because Bayes did something awesome, doesn't mean that Bayesianism can't be named after other stuff that he worked on.

However, Bayesianism does invoke Bayes' Theorem as part of probability theory. Bayes' Theorem is a simple and useful part of Bayesian probability, so it makes a nice religious symbol, but I don't see it as much more than that. Saying Bayesianism is all about Bayes' Rule is like saying Christianity is about crosses. It is a small part of the belief structure.

Seems more like saying that Christianity is all about forgiveness. There's a lot more to it than that, but you're getting a lot closer than 'crosses' would suggest.

Banish talk of "thresholds of belief" ... However, perhaps I could be so confident that my behavior would not be practically discernible from absolute confidence.

While this is true mathematically, I'm not sure it's useful for people. Complex mental models have overhead, and if something is unlikely enough then you can do better to stop thinking about it. Maybe someone broke into my office and when I get there on Monday I won't be able to work. This is unlikely, but I could look up the robbery statistics for Cambridge and see that this does happen. Mathematically, I should be considering this in making plans for tomorrow, but practically it's a waste of time thinking about it.

(There's also the issue that we're not good at thinking about small probabilities. It's very hard to keep unlikely possibilities from taking on undue weight except by just not thinking about them.)

Maybe someone broke into my office and when I get there on Monday I won't be able to work. This is unlikely, but I could look up the robbery statistics for Cambridge and see that this does happen. Mathematically, I should be considering this in making plans for tomorrow, but practically it's a waste of time thinking about it.

I think about such things every time I lock a door. Or at least, I lock doors because I have thought about such things, even if they're not at the forefront of my mind when I do them. Do you not lock yours? Do you have an off-site backup for your data? Insurance against the place burning down?

Having taken such precautions as you think useful, thinking further about it is, to use Eliezer's useful concept, wasted motion. It is a thought that, predictably at the time you think it, will as events transpire turn out to not have contributed in any useful way. You will go to work anyway, and see then whether thieves have been in the night.

Tiny probabilities do not, in general, map to tiny changes in actions. Decisions are typically discontinuous functions of the probabilities.
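
A toy sketch of that last point (the numbers are mine and purely illustrative): the credence can vary smoothly over orders of magnitude while the expected-cost comparison, and hence the action, flips at a single threshold.

    def should_lock(p_break_in, loss_if_unlocked=2000.0, cost_of_locking=0.01):
        """Lock iff the expected loss prevented exceeds the (tiny) cost of locking."""
        return p_break_in * loss_if_unlocked > cost_of_locking

    for p in (1e-8, 1e-6, 1e-5, 1e-4, 1e-2):
        print(p, should_lock(p))
    # The action flips abruptly near p = cost_of_locking / loss_if_unlocked = 5e-6,
    # even though nothing special happens to the probability there.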

I always lock doors without thinking, because the cost of thinking about whether it's worth my time is higher than the cost of locking the door.

"Someone will break into something I own someday" is much more likely than "someone will break into my office tonight". The former is likely enough that I do take general preparations (a habit of locking doors) but while there are specific preparations I would make to handle the intersection of that I planned to do at the office tomorrow and dealing with the aftermath of a burglary, that's unlikely enough to to be worth it.

Does locking doors generally lead to preventing break-ins? I mean, certainly in some cases (cars most notably) it does, but in general, if someone has gone up to your back door with the intent to break in, how likely are they to give up and leave upon finding it locked?

Mathematically, I should be considering this in making plans for tomorrow, but practically it's a waste of time thinking about it.

And then one day 4 years later you find out that a black swan event has occurred and because you never prepare for such things ('it's a waste of time thinking about it') you will face losses big enough to influence you greatly all at once.

Or not -- that's the thing with rare events.

You cannot expect that future evidence will sway you in a particular direction. "For every expectation of evidence, there is an equal and opposite expectation of counterevidence."

Well ... you can have an expected direction, just not if you account for magnitudes.

For example if I'm estimating the bias on a weighted die, and so far I've seen 2/10 rolls give 6's, if I roll again I expect most of the time to get a non-6 and revise down my estimate of the probability of a 6; however on the occasions when I do roll a 6 I will revise up my estimate by a larger amount.

Sometimes it's useful to have this distinction.
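
A numeric check of this, assuming (my assumption, not anything stated above) a uniform Beta(1, 1) prior over the die's chance of rolling a six:

    sixes, rolls = 2, 10
    a, b = 1 + sixes, 1 + (rolls - sixes)            # posterior Beta(3, 9) after 2/10 sixes
    p_six = a / (a + b)                              # current estimate: 0.25

    p_six_if_next_is_six = (a + 1) / (a + b + 1)     # ~0.308: rare, larger upward revision
    p_six_if_next_is_not = a / (a + b + 1)           # ~0.231: common, smaller downward revision

    # Conservation of expected evidence: the two possible revisions cancel in expectation.
    expected = p_six * p_six_if_next_is_six + (1 - p_six) * p_six_if_next_is_not
    print(expected)                                  # 0.25, exactly the current estimate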

Well ... you can have an expected direction, just not if you account for magnitudes.

Yes, on reflection it was a poor choice of words. I was using "expect" in that sense according to which one expects a parameter to equal zero if the expected value of that parameter is zero. However, while "expected value" has a well-established technical meaning, "expect" alone may not. It is certainly reasonably natural to read what I wrote as meaning "my opinion is equally likely to be swayed in either direction," which, as you point out, is incorrect. I've added a footnote to clarify my meaning.

I'm well aware of this. My point was that there's a subtle difference between "direction of the expectation" and "expected direction".

The expectation of what you'll think after new evidence has to be the same as you think now, so can't point in any particular direction. However "direction" is a binary variable (which you might well care about) and this can have a particular non-zero expectation.

I'm being slightly ambiguous as to whether "expected" in "expected direction" is meant to be the technical sense or the common English one. It works fine for either, but to interpret it as an expectation you have to choose an embedding of your binary variable in a continuous space, which I was avoiding because it didn't seem to add much to the discussion.

So to summarise in pop Bayesian terms, akin to "don’t be so sure of your beliefs; be less sure when you see contradictory evidence":

  1. There is always evidence; if it looks like the contrary, you are using too high a bar. (The plural of 'anecdote' is 'qualitative study'.)
  2. You can always give a guess; even if it later turns out incorrect, you have no way of knowing now.
  3. The only thing that matters is the prediction; hunches, gut feelings, hard numbers or academic knowledge, it all boils down to probabilities.
  4. Absence of evidence is evidence of absence, but you can't be picky with evidence.
  5. The maths don't lie; if it works, it is because somewhere there are numbers and rigour saying it should. (Notice the direction of the implication "it works" => "it has maths".)
  6. The more confident you are, the more surprised you can be; if you are unsure it means you expect anything.
  7. "Knowledge" is just a fancy-sounding word; ahead-of-time predictions or bust!

ETA:

  1. Choosing to believe is wishful thinking.

I'll add the Bayesian definition of evidence and an awareness of selection effects to the list.

Reading this clarified something for me. In particular: "Banish talk like 'There is absolutely no evidence for that belief'."

OK, I can see that mathematically there can be very small amounts of evidence for some propositions (e.g. the existence of the deity Thor.) However in practice there is a limit to how small evidence can be for me to make any practical use of it. If we assign certainties to our beliefs on a scale of 0 to 100, then what can I realistically do with a bit of evidence that moves me from 87 to 87.01? or 86.99? I don't think I can estimate my certainty accurately to 1 decimal place--in fact I'm not sure I can get it to within one significant digit on many issues--and yet there's a lot of evidence in the world that should move my beliefs by a lot less than that.

Mathematically it makes sense to update on all evidence. Practically, there is a fuzzy threshold beyond which I need to just ignore very weak evidence, unless there's so much of it that the sum total crosses the bounds of significance.

Practically, there is a fuzzy threshold beyond which I need to just ignore very weak evidence, unless there's so much of it that the sum total crosses the bounds of significance.

Consider the difficulties of programming something like that:

Ignore evidence. If the accumulated ignored evidence crosses some threshold, process the whole of it.

You see the problem. If the quoted sentence is your preferred modus operandi, you'll have to restrict what you mean by "ignore". You'll still need to file the evidence somewhere, and to evaluate it somehow, so that when the cumulative weight exceeds your given threshold, you'll still be able to update on it.
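
A sketch of that bookkeeping, with toy numbers (each ignored item is a likelihood ratio of 1.01, and the threshold is an arbitrary 5 decibans): even "ignored" evidence has to be logged and summed somewhere, or you can never notice when it stops being ignorable.

    import math

    threshold_db = 5.0        # below this running total, treat the evidence as ignorable
    accumulated_db = 0.0      # the evidence you have "filed away", in decibans

    for i in range(1, 201):
        accumulated_db += 10 * math.log10(1.01)      # each weak item adds ~0.043 decibans
        if accumulated_db >= threshold_db:
            print("worth acting on after", i, "items:", round(accumulated_db, 2), "decibans")
            break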