
This article is an attempt to summarize basic material, and thus probably won't have anything new for the hard core posting crowd. It'd be interesting to know whether you think there's anything essential I missed, though.

You've probably seen the word 'Bayesian' used a lot on this site, but may be a bit uncertain of what exactly we mean by that. You may have read the intuitive explanation, but that only seems to explain a certain math formula. There's a wiki entry about "Bayesian", but that doesn't help much. And the LW usage seems different from just the "Bayesian and frequentist statistics" thing, too. As far as I can tell, there's no article explicitly defining what's meant by Bayesianism. The core ideas are sprinkled across a large number of posts, and 'Bayesian' has its own tag, but there's not a single post that explicitly comes out to make the connections and say "this is Bayesianism". So let me try to offer my definition, which boils Bayesianism down to three core tenets.

We'll start with a brief example, illustrating Bayes' theorem. Suppose you are a doctor, and a patient comes to you, complaining about a headache. Further suppose that there are two reasons why people get headaches: they might have a brain tumor, or they might have a cold. A brain tumor always causes a headache, but exceedingly few people have a brain tumor. In contrast, a headache is rarely a symptom of a cold, but most people manage to catch a cold every single year. Given no other information, do you think it more likely that the headache is caused by a tumor, or by a cold?

If you thought a cold was more likely, well, that was the answer I was after. Even if a brain tumor caused a headache every time, and a cold caused a headache only one per cent of the time (say), having a cold is so much more common that it's going to cause a lot more headaches than brain tumors do. Bayes' theorem, basically, says that if cause A might be the reason for symptom X, then we have to take into account both the probability that A caused X (found, roughly, by multiplying the frequency of A with the chance that A causes X) and the probability that anything else caused X. (For a thorough mathematical treatment of Bayes' theorem, see Eliezer's Intuitive Explanation.)
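To make the arithmetic concrete, here is a minimal sketch of the headache example. The base rates (1 in 10,000 for a tumor, 1 in 10 for a cold at any given time) are made-up numbers chosen only for illustration; the "always" and "one per cent" figures come from the text above. It also assumes, as the example does, that these are the only two causes and that a patient has at most one of them.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Weigh each cause by (how common it is) x (how often it produces the symptom),
    then normalize so the weights sum to 1."""
    joint = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    total = sum(joint.values())
    return {cause: joint[cause] / total for cause in joint}

# Hypothetical base rates (assumptions, not from the article).
priors = {"tumor": Fraction(1, 10000), "cold": Fraction(1, 10)}
# Chance that each cause produces a headache (figures used in the article).
likelihoods = {"tumor": Fraction(1), "cold": Fraction(1, 100)}

post = posterior(priors, likelihoods)
for cause, p in post.items():
    print(f"P({cause} | headache) = {p} ~ {float(p):.3f}")
```

Under these assumed numbers the cold accounts for ten out of every eleven headaches, even though a tumor produces a headache a hundred times more reliably.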

There should be nothing surprising about that, of course. Suppose you're outside, and you see a person running. They might be running for the sake of exercise, or they might be running because they're in a hurry somewhere, or they might even be running because it's cold and they want to stay warm. To figure out which one is the case, you'll try to consider which of the explanations is true most often, and fits the circumstances best.

Core tenet 1: Any given observation has many different possible causes.

Acknowledging this, however, leads to a somewhat less intuitive realization. For any given observation, how you should interpret it always depends on previous information. Simply seeing that the person was running wasn't enough to tell you that they were in a hurry, or that they were getting some exercise. Or suppose you had to choose between two competing scientific theories about the motion of planets: a theory about the laws of physics governing the motion of planets, devised by Sir Isaac Newton, or a theory simply stating that the Flying Spaghetti Monster pushes the planets forwards with His Noodly Appendage. If both of these theories made the same predictions, you'd have to depend on your prior knowledge - your prior, for short - to judge which one was more likely. And even if they didn't make the same predictions, you'd need some prior knowledge that told you which of the predictions were better, or that the predictions matter in the first place (as opposed to, say, theoretical elegance).

Or take the debate we had on 9/11 conspiracy theories. Some people thought that unexplained and otherwise suspicious things in the official account had to mean that it was a government conspiracy. Others considered their prior for "the government is ready to conduct massively risky operations that kill thousands of its own citizens as a publicity stunt", judged that to be overwhelmingly unlikely, and thought it far more probable that something else caused the suspicious things.

Again, this might seem obvious. But there are many well-known instances in which people forget to apply this information. Take supernatural phenomena: yes, if there were spirits or gods influencing our world, some of the things people experience would certainly be the kinds of things that supernatural beings cause. But then there are also countless mundane explanations, from coincidences to mental disorders to an overactive imagination, that could cause them to be perceived. Most of the time, postulating a supernatural explanation shouldn't even occur to you, because the mundane causes already have lots of evidence in their favor and supernatural causes have none.

Core tenet 2: How we interpret any event, and the new information we get from anything, depends on information we already had.

Sub-tenet 1: If you experience something that you think could only be caused by cause A, ask yourself "if this cause didn't exist, would I regardless expect to experience this with equal probability?" If the answer is "yes", then it probably wasn't cause A.

This realization, in turn, leads us to

Core tenet 3: We can use the concept of probability to measure our subjective belief in something. Furthermore, we can apply the mathematical laws regarding probability to choosing between different beliefs. If we want our beliefs to be correct, we must do so.

The fact that anything can be caused by an infinite number of things explains why Bayesians are so strict about the theories they'll endorse. It isn't enough that a theory explains a phenomenon; if it can explain too many things, it isn't a good theory. Remember that if you'd expect to experience something even when your supposed cause was untrue, then that's no evidence for your cause. Likewise, if a theory can explain anything you see - if the theory allowed any possible event - then nothing you see can be evidence for the theory.
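The "no evidence" point can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `update` and all the numbers are assumptions, not anything from the article. It applies Bayes' theorem to a single hypothesis and shows that an observation you would expect equally often with or without the cause leaves your belief exactly where it started.

```python
from fractions import Fraction

def update(prior, p_obs_if_true, p_obs_if_false):
    """Posterior probability of a cause after one observation (Bayes' theorem)."""
    numerator = prior * p_obs_if_true
    return numerator / (numerator + (1 - prior) * p_obs_if_false)

prior = Fraction(1, 100)  # hypothetical starting belief in cause A

# Observation equally likely whether or not A is true: belief does not move.
unchanged = update(prior, Fraction(9, 10), Fraction(9, 10))

# Observation nine times likelier if A is true: belief rises.
raised = update(prior, Fraction(9, 10), Fraction(1, 10))

print(unchanged, raised)
```

A theory that "allows any possible event" is the first case for every observation, which is why nothing can ever count as evidence for it.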

At its heart, Bayesianism isn't anything more complex than this: a mindset that takes three core tenets fully into account. Add a sprinkle of idealism: a perfect Bayesian is someone who processes all information perfectly, and always arrives at the best conclusions that can be drawn from the data. When we talk about Bayesianism, that's the ideal we aim for.

Fully internalized, that mindset does tend to color your thought in its own, peculiar way. Once you realize that all the beliefs you have today are based - in a mechanistic, lawful fashion - on the beliefs you had yesterday, which were based on the beliefs you had last year, which were based on the beliefs you had as a child, which were based on the assumptions about the world that were embedded in your brain while you were growing in your mother's womb... it does make you question your beliefs more. Wonder about whether all of those previous beliefs really corresponded maximally to reality.

And that's basically what this site is for: to help us become good Bayesians.


is there a simple explanation of the conflict between bayesianism and frequentialism? I have sort of a feel for it from reading background materials but a specific example where they yield different predictions would be awesome. has such already been posted before?

11Cyan

Eliezer's views as expressed in Blueberry's links touch on a key identifying characteristic of frequentism: the tendency to think of probabilities as inherent properties of objects. More concretely, a pure frequentist (a being as rare as a pure Bayesian) treats probabilities as proper only to outcomes of a repeatable random experiment. (The definition of such a thing is pretty tricky, of course.)

What does that mean for frequentist statistical inference? Well, it's forbidden to assign probabilities to anything that is deterministic in your model of reality. So you have estimators, which are functions of the random data and thus random themselves, and you assess how good they are for your purpose by looking at their sampling distributions. You have confidence interval procedures, the endpoints of which are random variables, and you assess the sampling probability that the interval contains the true value of the parameter (and the width of the interval, to avoid pathological intervals that have nothing to do with the data). You have statistical hypothesis testing, which categorizes a simple hypothesis as “rejected” or “not rejected” based on a procedure assessed in terms of the sampling probability of an error in the categorization. You have, basically, anything you can come up with, provided you justify it in terms of its sampling properties over infinitely repeated random experiments.

Here is a more general definition of "pure frequentism" (which includes frequentists such as Reichenbach):

Consider an assertion of probability of the form "This X has probability p of being a Y." A frequentist holds that this assertion is meaningful only if the following conditions are met:

1. The speaker has already specified a determinate set X of things that actually have or will exist, and this set contains "this X".

2. The speaker has already specified a determinate set Y containing all things that have been or will be Ys.

The assertion is true if the proportion of elements of X that are also in Y is precisely p.

A few remarks:

1. The assertion would mean something different if the speaker had specified different sets X and Y, even though X and Y aren't mentioned explicitly in the assertion.

2. If no such sets had been specified in the preceding discourse, the assertion by itself would be meaningless.

3. However, the speaker has complete freedom in what to take as the set X containing "this X", so long as X contains "this X". In particular, the other elements don't have to be exactly like "this X", or be generated by exactly the same repeatable procedure,

...
7Mayo
I'm sorry to see such wrongheaded views of frequentism here. Frequentists also assign probabilities to events where the probabilistic introduction is entirely based on limited information rather than a literal randomly generated phenomenon. If Fisher or Neyman was ever actually read by people purporting to understand frequentist/Bayesian issues, they'd have a radically different idea. Readers of this blog should take it upon themselves to check out some of the vast oversimplifications... And I'm sorry but Reichenbach's frequentism has very little to do with frequentist statistics. Reichenbach, a philosopher, had an idea that propositions had frequentist probabilities. So scientific hypotheses--which would not be assigned probabilities by frequentist statisticians--could have frequentist probabilities for Reichenbach, even though he didn't think we knew enough yet to judge them. He thought at some point we'd be able to judge, for a hypothesis of a given type, how frequently hypotheses like it would be true. I think it's a problematic idea, but my point was just to illustrate that some large items are being misrepresented here, and people sold a wrongheaded view. Just in case anyone cares. Sorry to interrupt the conversation (errorstatistics.com)
1Cyan
Do you intend to be replying to me or to Tyrrell McAllister?
2PhilGoetz
Wait - Bayesians can assign probabilities to things that are deterministic? What does that mean? What would a Bayesian do instead of a T-test?
25wnoise

Wait - Bayesians can assign probabilities to things that are deterministic? What does that mean?

Absolutely!

The Bayesian philosophy is that probabilities are about states of knowledge. Probability is reasoning with incomplete information, not about whether an event is "deterministic", as probabilities do still make sense in a completely deterministic universe. In a poker game, there are almost surely no quantum events influencing how the deck is shuffled. Classical mechanics, which is deterministic, suffices to predict the ordering of cards. Even so, we have neither sufficient initial conditions (on all the particles in the dealer's body and brain, and any incoming signals), nor computational power to calculate the ordering of the cards. In this case, we can still use probability theory to figure out probabilities of various hand combinations that we can use to guide our betting. Incorporating knowledge of what cards I've been dealt, and what (if any) are public is straightforward. Incorporating players' actions and reactions is much harder, and not really well enough defined that there is a mathematically correct answer, but clearly we should use that knowledge ...
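As a small illustration of the card example above, here is a sketch of how "what cards I've been dealt" changes a probability even though the deck order is already fixed. The card encoding and the function name `prob_next_card_is` are assumptions made for this example. The key modelling assumption is that, given what I can see, every unseen card is equally likely to come next; that is a statement about my knowledge, not about the deck.

```python
from fractions import Fraction

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]  # 52 cards, e.g. "As"

def prob_next_card_is(rank, seen):
    """Probability that the next unseen card has the given rank,
    treating every card not in `seen` as equally likely."""
    unseen = [card for card in DECK if card not in seen]
    hits = [card for card in unseen if card[0] == rank]
    return Fraction(len(hits), len(unseen))

before = prob_next_card_is("A", seen=[])            # nothing known yet
after = prob_next_card_is("A", seen=["As", "Ah"])   # I hold two aces

print(before, after)
```

The deck did not change between the two calls; only the information did, and the probability moved with it.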

1Cyan
Very nice! I'd only replace "useful" with "plausible". (Sure, it's hard to define plausibility, but usefulness is not really the right concept.)
5wnoise
"Usefulness" certainly isn't the orthodox Bayesian phrasing. I call myself a Bayesian because I recognize that Bayes's Rule is the right thing to use in these situations. Whether or not the probabilities assigned to hypotheses "actually are" probabilities (whatever that means), they should obey the same mathematical rules of calculation as probabilities. But precisely because only the manipulation rules matter, I'm not sure it is worth emphasizing that "to be a good Bayesian" you must accord these probabilities the same status as other probabilities. A hardcore Frequentist is not going to be comfortable doing that. Heck, I'm not sure I'm comfortable doing that. Data and event probabilities are things that can eventually be "resolved" to true or false, by looking after the fact. Probability as plausibility makes sense for these things. But for hypotheses and models, I ask myself "plausibility of what? Being true?" Almost certainly, the "real" model (when that even makes sense) isn't in our space of models. For example, a common, almost necessary, assumption is exchangeability: that any given permutation of the data is equally likely -- effectively that all data points are drawn from the same distribution. Data often doesn't behave like that, instead having a time drift. Coins being tossed develop wear, cards being shuffled and dealt get bent. I really do prefer to think of some models being more or less useful. Of course, following this path shades into decision theory: we might want to assign priors according to how "tractable" the models are, including both in specification (stupid models that just specify what the data will be take lots of specification, so should have lower initial probabilities). Models that take longer to compute data probabilities should similarly have a probability penalty, not simply because they're implausible, but because we don't want to use them unless the data force us to.

...shades into decision theory...Models that take longer to compute data probabilities should similarly have a probability penalty, not simply because they're implausible, but because we don't want to use them unless the data force us to.

Whoa! That sounds dangerous! Why not keep the beliefs and costs separate and only apply this penalty at the decision theory stage?

2wnoise
Well, I said shaded into the lines of decision theory... Yes, it absolutely is dangerous, and thinking about it more I agree it should not be done this way. Probability penalties do not scale correctly with the data collected: they're essentially just a fixed offset. Modified utility of using a particular method really is different. If a method is unusable, we shouldn't use it, and methods that trade off accuracy for manageability should be decided at that level, once we can judge the accuracy -- not earlier. EDIT: I suppose I was hoping for a valid way of justifying the fact that we throw out models that are too hard to use or analyze -- they never make it into our set of hypotheses in the first place. It's amazing how often conjugate priors "just happen" to be chosen...
4Cyan
Plausibility of being true given the prior information. Just as Aristotelian logic gives valid arguments (but not necessarily sound ones), Bayes's theorem gives valid but not necessarily sound plausibility assessments. That's pretty much why I wanted to make the distinction between plausibility and usefulness. One of the things I like about the Cox-Jaynes approach is that it cleanly splits inference and decision-making apart.
2wnoise
Okay, sure we can go back to the Bayesian mantra of "all probabilities are conditional probabilities". But our prior information effectively includes the statement that one of our models is the "true one". And that's never the actual case, so our arguments are never sound in this sense, because we are forced to work from prior information that isn't true. This isn't a huge problem, but it in some sense undermines the motivation for finding these probabilities and treating them seriously -- they're conditional probabilities being applied in a case where we know that what is being conditioned on is false. What is the grounding to our actual situation? I like to take the stance that in practice this is still useful -- as an approximation procedure -- sorting through models that are approximately right.
3Cyan
One does generally resort to non-Bayesian model checking methods. Andrew Gelman likes to include such checks under the rubric of "Bayesian data analysis"; he calls the computing of posterior probabilities and densities "Bayesian inference", a preceding subcomponent of Bayesian data analysis. This makes for sensible statistical practice, but the underpinnings aren't strong. One might consider it an attempt to approximate the Solomonoff prior.
0wnoise
Yes, in practice people resort to less motivated methods that work well. I'd really like to see some principled answer that has the same feel as Bayesianism though. As it stands, I have no problem using Bayesian methods for parameter estimation. This is natural because we really are getting pdf(parameters | data, model). But for model selection and evaluation (i.e. non-parametric Bayes) I always feel that I need an "escape hatch" to include new models that the Bayes formalism simply doesn't have any place for.
0Cyan
I feel the same way.
3wedrifid
I am much more comfortable leaving probability as it is but using a different term for usefulness.
1nazgulnarsil
"the tendency to think of probabilities as inherent properties of objects." Yeah, this was my intuitive reason for thinking frequentists are a little crazy.
5byrnema
On the other hand, it's evidence to me that we're talking about different types of minds. Have we identified whether this aspect of frequentism is a choice, or just the way their minds work? I'm a frequentist, I think, and when I interrogate my intuition about whether 50% heads / 50% tails is a property of a fair coin, it returns 'yes'. However, I understand that this property is an abstract one, and my intuition doesn't make any different empirical predictions about the coin than a Bayesian would. Thus, what difference does it make if I find it natural to assign this property? In other words, in what (empirically measurable!) sense could it be crazy?
7wnoise
http://comptop.stanford.edu/preprints/heads.pdf Well, the immediate objection is that if you hand the coin to a skilled tosser, the frequencies of heads and tails in the tosses can be markedly different than 50%. If you put this probability in the coin, then you really aren't modeling things in a manner that accords with results. You can, of course, talk instead about a procedure of coin-tossing, that naturally has to specify the coin as well. Of course, that merely pushes things back a level. If you completely specify the tossing procedure (people have built coin-tossing machines), then you can repeatedly get 100%/0% splits by careful tuning. If you don't know whether it is tuned to 100% heads or 100% tails, is it still useful to describe this situation probabilistically? A hard-core Frequentist "should" say no, as everything is deterministic. Most people are willing to allow that 50% probability is a reasonable description of the situation. To the extent that you do allow this, you are Bayesian. To the extent that you don't, you're missing an apparently valuable technique.
2byrnema
The frequentist can account for the biased toss and determinism, in various ways. My preferred reply would be that the 50/50 is a property of the symmetry of the coin. (Of course, it's a property of an idealized coin. Heck, a real coin can land balanced on its edge.) If someone tosses the coin in a way that biases the coin, she has actually broken the symmetry in some way with her initial conditions. In particular, the tosser must begin with the knowledge of which way she is holding the coin -- if she doesn't know, she can't bias the outcome of the coin. I understand that Bayesians don't tend to abstract things to their idealized forms ... I wonder to what extent Frequentism does this necessarily. (What is the relationship between Frequentism and Platonism?)
7wnoise
Oh, absolutely. The typical way is choosing some reference class of idealized experiments that could be done. Of course, the right choice of reference class is just as arbitrary as the right choice of Bayesian prior. Whereas the Bayesian would argue that the 50/50 property is a symmetry about our knowledge of the coin -- even a coin that you know is biased, but that you have no evidence for which way it is biased. Well, I don't think Bayesians are particularly reluctant to look at idealized forms, it's just that when you can make your model more closely match the situation (without incurring horrendous calculational difficulties) there is a benefit to do so. And of course, the question is "which idealized form?" There are many ways to idealize almost any situation, and I think talking about "the" idealized form can be misleading. Talking about a "fair coin" is already a serious abstraction and idealization, but it's one that has, of course, proven quite useful. That's a very interesting question.
5Blueberry
To quote from Gelman's rejoinder that Phil Goetz mentioned, So, speaking very loosely, Bayesianism is to science, inductive logic, and Aristotelianism as frequentism is to math, deductive logic, and Platonism. That is, Bayesianism is synthesis; frequentism is analysis.
1byrnema
Interesting! That makes a lot of sense to me, because I had already made connections between science and Aristotelianism, pure math and Platonism.
7Blueberry
This and this might be the kind of thing you're looking for. Though the conflict really only applies in the artificial context of a math problem. Frequentialism is more like a special case of Bayesianism where you're making certain assumptions about your priors, sometimes specifically stated in the problem, for ease of calculation. For instance, in a Frequentialist analysis of coin flips, you might ignore all your prior information about coins, and assume the coin is fair.
2nazgulnarsil
thanks, that's what I was looking for. would it be correct to say that in the frequentist interpretation your confidence interval narrows as your trials approach infinity?
4wnoise
That is a highly desired property of Frequentist methods, but it's not guaranteed by any means.
6bill
If it helps, I think this is an example where they give different answers to the same problem. From Jaynes; see http://bayes.wustl.edu/etj/articles/confidence.pdf , page 22 for the details, and please let me know if I've erred or misinterpreted the example. Three identical components. You run them through a reliability test and they fail at times 12, 14, and 16 hours. You know that these components fail in a particular way: they last at least X hours, then have a lifetime that you assess as an exponential distribution with an average of 1 hour. What is the shortest 90% confidence interval / probability interval for X, the time of guaranteed safe operation? Frequentist 90% confidence interval: 12.1 hours - 13.8 hours. Bayesian 90% probability interval: 11.2 hours - 12.0 hours. Note: the frequentist interval has the strange property that we know for sure that the 90% confidence interval does not contain X (from the data we know that X <= 12). The Bayesian interval seems to match our common sense better.
8cupholder
Heh, that's a cheeky example. To explain why it's cheeky, I have to briefly run through it, which I'll do here (using Jaynes's symbols so whoever clicked through and has pages 22-24 open can directly compare my summary with Jaynes's exposition). Call N the sample size and θ the minimum possible widget lifetime (what bill calls X). Jaynes first builds a frequentist confidence interval around θ by defining the unbiased estimator θ∗, which is the observations' mean minus one. (Subtracting one accounts for the sample mean being >θ.) θ∗'s probability distribution turns out to be proportional to y^(N-1) exp(-Ny), where y = θ∗ - θ + 1. Note that y is essentially a measure of how far our estimator θ∗ is from the true θ, so Jaynes now has a pdf for that. Jaynes integrates that pdf to get y's cdf, which he calls F(y). He then makes the 90% CI by computing [y1, y2] such that F(y2) - F(y1) = 0.9. That gives [0.1736, 1.8529]. Substituting in N and θ∗ for the sample and a little algebra (to get a CI for θ rather than y) gives his θ CI of [12.1471, 13.8264]. For the Bayesian CI, Jaynes takes a constant prior, then jumps straight to the posterior being N exp(N(θ - x1)) for θ ≤ x1, where x1's the smallest lifetime in the sample (12 in this case). He then comes up with the smallest interval that encompasses 90% of the posterior probability, and it turns out to be [11.23, 12]. Jaynes rightly observes that the Bayesian CI accords with common sense, and the frequentist CI does not. This comparison is what feels cheeky to me. Why? Because Jaynes has used different estimators for the two methods [edit: I had previously written here that Jaynes implicitly used different estimators, but this is actually false; when he discusses the example subsequently (see p. 25 of the PDF) he fleshes out this point in terms of sufficient v. non-sufficient statistics.]. For the Bayesian CI, Jaynes effectively uses the minimum lifetime as his estimator for θ (by defining the likelihood to be solely a function of the
8wnoise
This example really is Bayesianism-done-straightforwardly. The point is that you really don't need to be sly to get reasonable results. A constant prior ends up using only the likelihoods. The jump straight to the posterior is a completely mechanical calculation, just products, and normalization. Each individual likelihood goes to zero for (x < θ). This means that product also does for the smallest (x1 < θ). You will get out the same PDF as Jaynes. CIs can be constructed many ways from PDFs, but constructing the smallest one will give you the same one as Jaynes. EDIT: for full effect, please do the calculation yourself.
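For anyone who wants to follow the suggestion to do the calculation, here is a rough sketch of both sides of the example in this thread. It is not Jaynes's own derivation: the function names are made up, the frequentist endpoints are simply the ones quoted above, and the code only checks their coverage rather than re-deriving them. It assumes the sample mean of N unit-exponential lifetimes is Gamma-distributed (shape N, rate N), and that the flat-prior posterior is proportional to exp(N(θ - x1)) for θ ≤ x1.

```python
import math

def erlang_cdf(y, n):
    """CDF at y of the mean of n independent unit-exponential lifetimes
    (a Gamma distribution with shape n and rate n)."""
    t = n * y
    return 1.0 - math.exp(-t) * sum(t**k / math.factorial(k) for k in range(n))

def bayes_interval(x_min, n, mass=0.9):
    """Shortest interval holding `mass` of a posterior proportional to
    exp(n * (theta - x_min)) on theta <= x_min. The density rises all the
    way up to x_min, so the shortest interval ends there."""
    return x_min + math.log(1.0 - mass) / n, x_min

failures = [12.0, 14.0, 16.0]
n = len(failures)
theta_star = sum(failures) / n - 1.0  # estimator: sample mean minus one

# Frequentist interval for theta as quoted in the thread.
freq_lo, freq_hi = 12.1471, 13.8264
# Convert to y = theta_star - theta + 1 and check the sampling coverage.
coverage = erlang_cdf(theta_star - freq_lo + 1.0, n) - erlang_cdf(theta_star - freq_hi + 1.0, n)

bayes_lo, bayes_hi = bayes_interval(min(failures), n)

print(f"frequentist [{freq_lo}, {freq_hi}], coverage ~ {coverage:.3f}")
print(f"Bayesian    [{bayes_lo:.4f}, {bayes_hi}]")
print("frequentist interval lies entirely above the first failure:", freq_lo > min(failures))
```

The coverage check comes out close to 0.9, so the frequentist interval is a legitimate 90% procedure in the sampling sense, yet it sits entirely above 12 hours, a region the data have already ruled out.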
0Cyan
I stopped reading cupholder's comment before the last paragraph (to write my own reply) and completely missed this! D'oh!
1Cyan
Jaynes does go on to discuss everything you have pointed out here. He noted that confidence intervals had commonly been held not to require sufficient statistics, pointed out that some frequentist statisticians had been doubtful on that point, and remarked that if the frequentist estimator had been the sufficient statistic (the minimum lifetime) then the results would have agreed. I think the real point of the story is that he ran through the frequentist calculation for a group of people who did this sort of thing for a living and shocked them with it.
0cupholder
You got me: I didn't read the what-went-wrong subsection that follows the example. (In my defence, I did start reading it, but rolled my eyes and stopped when I got to the claim that "there must be a very basic fallacy in the reasoning underlying the principle of confidence intervals".) I suspect I'm not the only one, though, so hopefully my explanation will catch some of the eyeballs that didn't read Jaynes's own post-mortem. [Edit to add: you're almost certainly right about the real point of the story, but I think my reply was fair given the spirit in which it was presented here, i.e. as a frequentism-v.-Bayesian thing rather than an orthodox-statisticians-are-taught-badly thing.]
1Cyan
Independently reproducing Jaynes's analysis is excellent, but calling him "cheeky" for "implicitly us[ing] different estimators" is not fair given that he's explicit on this point. It's a frequentism-v.-Bayesian thing to the extent that correct coverage is considered a sufficient condition for good frequentist statistical inference. This is the fallacy that you rolled your eyes at; the room full of shocked frequentists shows that it wasn't a strawman at the time. [ETA: This isn't quite right. The "v.-Bayesian" part comes in when correct coverage is considered a necessary condition, not a sufficient condition.] ETA: This is a really good point, and it makes me happy that you wrote your explanation. For people for whom Jaynes's phrasing gets in the way, your phrasing bypasses the polemics and lets them see the math behind the example.
0cupholder
I was wrong to say that Jaynes implicitly used different estimators for the two methods. After the example he does mention it, a fact I missed due to skipping most of the post-mortem. I'll edit my post higher up to fix that error. (That said, at the risk of being pedantic, I did take care to avoid calling Jaynes-the-person cheeky. I called his example cheeky, as well as his comparison of the frequentist CI to the Bayesian CI, kinda.) When I read Jaynes's fallacy claim, I didn't interpret it as saying that treating coverage as necessary/sufficient was fallacious; I read it as arguing that the use of confidence intervals in general was fallacious. That was what made me roll my eyes. [Edit to clarify: that is, I was rolling my eyes at what I felt was a strawman, but a different one to the one you have in mind.] Having read his post-mortem fully and your reply, I think my initial, eye-roll-inducing interpretation was incorrect, though it was reasonable on first read-through given the context in which the "fallacy" statement appeared.
0Cyan
Fair point.
0nazgulnarsil
excellent paper, thanks for the link.
0Jordan
My intuition would be that the interval should be bounded above by 12 - epsilon, since the probability that we got one component that failed at the theoretically fastest rate seems unlikely (probability zero?).
2Cyan
You can treat the interval as open at 12.0 if you like; it makes no difference.
2JGWeissman
If by epsilon, you mean a specific number greater than 0, the only reason to shave off an interval of length epsilon from the high end of the confidence interval is if you can get the probability contained in that epsilon-length interval back from a smaller interval attached to the low end of the confidence interval. (I haven't worked through the math, and the pdf link is giving me "404 not found", but presumably this is not the case in this problem.)
2Cyan
The link's a 404 because it includes a comma by accident -- here's one that works: http://bayes.wustl.edu/etj/articles/confidence.pdf.
0Jordan
Thanks, that makes sense, although it still butts up closely against my intuition.
3PhilGoetz
Andrew Gelman wrote a parody of arguments against Bayesianism here. Note that he says that you don't have to choose Bayesianism or frequentism; you can mix and match. I'd be obliged if someone would explain this paragraph, from his response to his parody: • “Why should I believe your subjective prior? If I really believed it, then I could just feed you some data and ask you for your subjective posterior. That would save me a lot of effort!”: I agree that this criticism reveals a serious incoherence with the subjective Bayesian framework as well with in the classical utility theory of von Neumann and Morgenstern (1947), which simultaneously demands that an agent can rank all outcomes a priori and expects that he or she will make utility calculations to solve new problems. The resolution of this criticism is that Bayesian inference (and also utility theory) are ideals or aspirations as much as they are descriptions. If there is serious disagreement between your subjective beliefs and your calculated posterior, then this should send you back to re-evaluate your model.

Nice explanation. My only concern is with the opening statement about "aiming low". It makes it difficult to send this article to people without them justifiably rejecting it out of hand as a patronizing act. When the intention to aim low is truly noble, perhaps it is just as accurately described as writing clearly, writing for non-experts, or maybe even just writing an "introduction".

5Kaj_Sotala
Good point. I changed "to aim low" to "to summarize basic material".
0[anonymous]
And besides, as a software developer with plenty of Bayesian theory behind me, I appreciate the simplicity of the article for the clarity it provides me. Thanks for "aiming low" ;-)

Great, great post. I like that it's more qualitative and philosophical than quantitative, which really makes it clear how to think like a Bayesian. Though I know the math is important, having this kind of intuitive, qualitative understanding is very useful for real life, when we don't have exact statistics for so many things.

Re: "Core tenet 1: For any given observation, there are lots of different reasons that may have caused it."

This seems badly phrased. It is normally previous events that cause observations. It is not clear what it means for a reason to cause something.

3Kaj_Sotala
Good point. That sentence structure was a carryover from Finnish, where you can say that reasons cause things. Would "Any given observation has many different possible causes" be better?
4Morthrod
Yes, that would be better.
2Kaj_Sotala
Changed.
[-]Jack70

I don't know if it belongs here or in a separate post but afaik there is no explanation of the Dutch book argument on Less Wrong. It seems like there should be. Just telling people that structuring your beliefs according to Bayes Theorem will make them accurate might not do the trick for some. The Dutch book argument makes it clear why you can't just use any old probability distribution.

8Kaj_Sotala
I thought about whether to include a Dutch Book discussion in this post, but felt it would have been too long and not as "deep core" as the other stuff. More like "supporting core". But yes, it would be good to have a discussion of that up on LW somewhere.
1wedrifid
3Jack
I'm on it.

Thanks Kaj,

As I stated in my last post, reading LW often gives me the feeling that I have read something very important, yet I often don't immediately know why what I just read should be important until I have some later context in which to place the prior content.

Your post just gave me the context in which to make better sense of all of the prior content on Bayes here on LW.

It doesn't hurt that I have finally dipped my toes in the Bayesian Waters of Academia in an official capacity with a Probability and Stats class (which seems to be a prerequisite for s...

Possible typo:

A theory about the laws of physics governing the motion of planets, devised by Sir Isaac Newton, or a theory simply stating that the Flying Spaghetti Monster pushes the planets forward>s< with His Noodly Appendage.

In the spirit of aiming low, I don't think you aimed nearly low enough. If I hadn't already read a small amount from the sequences I wouldn't have been able to pick up too much from this article. This reads as a great summary; I am not convinced it is a good explanation.

The rest of this comment is me saying the above in mo...

3Kaj_Sotala
This is excellent feedback; please, do go on. I did wonder if this was still too short and not aiming low enough. I chose to err on the side of brevity, partially because I was worried about ending up with a giant mammoth post and partially because I felt I'd just be repeating what Eliezer's said before. But yeah, looking at it now, I'm not at all convinced of how well I'd have gotten the message if my pre-OB self had read this. Interesting that you find the usage of "you" and "we" patronizing. I hadn't thought of it like that - I intended it as a way to make the post less formal and build a more comfortable atmosphere for the reader. Your rewording sounds good: not exactly the way I'd put it, but certainly something to build on. Hmm, what do people think - if we end up rewriting this, should I just edit this post? Or make an entirely new one? Perhaps keep this one as it is, but work the changes into a future one that's longer?
9MrHen
0Kaj_Sotala
Very interesting. Actually, I didn't seek to aim that low - I was targeting the average LW reader (or at least an average person who was comfortable with maths). However, I still find this to be very valuable, since I have played around with the idea of trying to write a book that'd attempt to sell (implicitly or explicitly) the idea of "maths / science, especially as applied to rationality / cognitive science is actually fun" to a lay audience. So I probably won't alter the original article as a reaction to this, but if you want to nevertheless help me in figuring out how to reach that audience, do continue. :)
0MrHen
Haha, will do. I do realize that some of what I am bringing up is extremely petty, but I have watched some of my articles get completely derailed by what I would consider to be a completely irrelevant point. Even amongst the high quality discussions in the comments I find myself needing to back up and ask a Really Obvious Question. This is likely a fault in the way I communicate (which is accentuated online) and also a glitch where people are not willing/able to drop subjects that are bugging them. If I was fundamentally opposed to the claim that all brain tumors caused headaches I would feel compelled to point it out in the comments. (This compulsion is something I am trying to curb.) In any case, I am glad the comments are helpful and I will continue as I find the time. If you ever start drafting something like what you mentioned I am willing to proofread and comment.
7pjeby
3wnoise
Personally, I think if it's just minor stylistic changes in expressing the same material, editing the post is the way to go; if it's adding more material, or expressing it radically differently, then a new post is appropriate.
0h-H
it's fine the way it is I think, it covers enough without being too specific. great post.

A frequentist asks, "did you find enough evidence?" A Bayesian asks, "how much evidence did you find?"

Frequentists can be tricky, by saying that a very small amount of evidence is sufficient; and they can hide this claim behind lots of fancy calculations, so they usually get away with it. This makes for better press releases, because saying "we found 10dB of evidence that X" doesn't sound nearly as good as saying "we found that X".

1PhilGoetz
Since when do frequentists measure evidence in decibels?
2JGWeissman
jimrandomh claimed that frequentists don't report amounts of evidence. So you object that measuring in decibels is not how they don't report it? If they don't report amounts of evidence, then of course they don't report it in the precise way in the example.
1toto
Frequentists (or just about anybody involved in experimental work) report p-values, which are their main quantitative measure of evidence.
6JGWeissman
Evidence, as measured in log odds, has the nice property that evidence from independent sources can be combined by adding. Is there any way at all to combine p-values from independent sources? As I understand them, p-values are used to make a single binary decision to declare a theory supported or not, not to track cumulative strength of belief in a theory. They are not a measure of evidence.
Log odds of independent events do not add up, just as the odds of independent events do not multiply. The odds of flipping heads is 1:1, the odds of flipping heads twice is not 1:1 (you have to multiply odds by likelihood ratios, not odds by odds, and likewise you don't add log odds and log odds, but log odds and log likelihood-ratios). So calling log odds themselves "evidence" doesn't fit the way people use the word "evidence" as something that "adds up". This terminology may have originated here: http://causalityrelay.wordpress.com/2008/06/23/odds-and-intuitive-bayes/ I'm voting your comment up, because I think it's a great example of how terminology should be chosen and used carefully. If you decide to edit it, I think it would be most helpful if you left your original words as a warning to others :)
0JGWeissman
By "evidence", I refer to events that change an agent's strength of belief in a theory, and the measure of evidence is the measure of this change in belief, that is, the likelihood-ratio and log likelihood-ratio you refer to. I never meant for "evidence" to refer to the posterior strength of belief. "Log odds" was only meant to specify a particular measurement of strength in belief.
0Paul Crowley
Can you be clearer? Log likelihood ratios do add up, so long as the independence criterion is satisfied (ie so long as P(E_2|H_x) = P(E_2|E_1,H_x) for each H_x).
Sure, just edited in the clarification: "you have to multiply odds by likelihood ratios, not odds by odds, and likewise you don't add log odds and log odds, but log odds and log likelihood-ratios".
1Morendil
As long as there are only two H_x, mind you. They no longer add up when you have three hypotheses or more.
0Paul Crowley
Indeed - though I find it very hard to hang on to my intuitive grasp of this!
Here is the post on information theory I said I would write: http://lesswrong.com/lw/1y9/information_theory_and_the_symmetry_of_updating/ It explains "mutual information", i.e. "informational evidence", which can be added up over as many independent events as you like. Hopefully this will have restorative effects for your intuition!
Don't worry, I have an information theory post coming up that will fix all of this :)
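
For readers who want to see the bookkeeping discussed in this sub-thread, here is a minimal sketch of updating in log odds for the two-hypothesis case. It assumes the pieces of evidence are conditionally independent given each hypothesis, as Paul Crowley's condition requires. The function names and the likelihood ratios are made up for illustration; they are not taken from any comment above.

```python
import math

def to_log_odds(p):
    """Convert a probability into natural-log odds."""
    return math.log(p / (1.0 - p))

def from_log_odds(log_odds):
    """Convert natural-log odds back into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def update(prior, likelihood_ratios):
    """Add the log likelihood ratio of each (assumed independent) piece of
    evidence to the prior log odds, then convert back to a probability."""
    log_odds = to_log_odds(prior)
    for lr in likelihood_ratios:
        log_odds += math.log(lr)
    return from_log_odds(log_odds)

# Hypothetical numbers: a 50% prior and three pieces of evidence whose
# likelihood ratios P(E|H) / P(E|not H) are 4, 0.5 and 3.
prior = 0.5
ratios = [4.0, 0.5, 3.0]

posterior = update(prior, ratios)
posterior_reordered = update(prior, list(reversed(ratios)))

# The same answer computed by multiplying odds directly: 1:1 times 4 * 0.5 * 3.
odds = (prior / (1.0 - prior)) * math.prod(ratios)
posterior_direct = odds / (1.0 + odds)
```

What gets added to the running total is the log likelihood ratio of each observation, not another log odds, and the order of the observations does not matter. As Morendil notes, this single running total only works with two hypotheses.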
1Cyan
There's lots of papers on combining p-values.
2JGWeissman
Well, just looking at the first result, it gives a formula for combining n p-values that as near as I can tell, lacks the property that C(p1,p2,p3) = C(C(p1,p2),p3). I suspect this is a result of unspoken assumptions that the combined p-values were obtained in a similar fashion (which I violate by trying to combine a p-value combined from two experiments with another obtained from a third experiment), which would be information not contained in the p-value itself. I am not sure of this because I did not completely follow the derivation. But is there a particular paper I should look at that gives a good answer?
0Cyan
I haven't actually read any of that literature -- Cox's theorem suggests it would not be a wise investment of time. I was just Googling it for you.
0JGWeissman
Fair enough, though it probably isn't worth my time either. Unless someone claims that they have a good general method for combining p-values, such that it does not matter where the p-values come from, or in what order they are combined, and can point me at one specific method that does all that.
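
The thread does not say which formula "the first result" gave, so the following is only a sketch using Fisher's method, one standard way of combining p-values, chosen here as an assumption. It shows the property JGWeissman describes: combining three p-values at once gives a different answer from combining two and then combining the result with the third. The function name is hypothetical.

```python
import math

def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) is chi-square distributed with 2k
    degrees of freedom under the null hypothesis. For even degrees of
    freedom the upper tail has the closed form
        exp(-X/2) * sum_{i=0}^{k-1} (X/2)**i / i!
    so no statistics library is needed.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    tail_sum = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail_sum

# Three hypothetical experiments, each reporting p = 0.1.
p1 = p2 = p3 = 0.1

all_at_once = fisher_combine([p1, p2, p3])
nested = fisher_combine([fisher_combine([p1, p2]), p3])
```

With these inputs `all_at_once` comes out near 0.032 and `nested` near 0.035. The combined p-value alone does not record how many experiments went into it, which is the missing information JGWeissman suspected.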

I recently started working through this Applied Bayesian Statistics course material, which has done wonders for my understanding of Bayesianism vs. the bag-of-tricks statistics I learned in engineering school.

6Seth_Goldin
So I finally picked up a copy of Probability Theory: The Logic of Science, by E.T. Jaynes. It's pretty intimidating and technical, but I was surprised how much prose there is, which makes it surprisingly palatable. We should recommend this more here on Less Wrong.
2Erebus
Just remember that Jaynes was not a mathematician and many of his claims about pure mathematics (as opposed to computations and their applications) in the book are wrong. Especially, infinity is not mysterious.
0thomblake
It should be obvious that infinity (like all things) is not inherently mysterious, and equally obvious that it's mysterious (if not unknown) to most people.
0Erebus
Infinity is mysterious was intended as a paraphrase of Jaynes' chapter on "paradoxes" of probability theory, and I intended mysterious precisely in the sense of inherently mysterious. As far as I know, Jaynes didn't use the word mysterious himself. But he certainly claims that rules of reasoning about infinity (which he conveniently ignores) are not to be trusted and that they lead to paradoxes.

Bayesianism is more than just subjective probability; it is a complete decision theory.

A decent summary is provided by Sven Ove Hansson:

1. The Bayesian subject has a coherent set of probabilistic beliefs.
2. The Bayesian subject has a complete set of probabilistic beliefs.
3. When exposed to new evidence, the Bayesian subject changes his (her) beliefs in accordance with his (her) conditional probabilities.
4. Finally, Bayesianism states that the rational agent chooses the option with the highest expected utility.

3wnoise
What Bayescraft covers is a matter of tendentious definitions. I personally do not consider decision theory a necessary part of it, though it is certainly part of what we're trying to capture at LessWrong.
7Douglas_Knight
I agree. The line between belief and decision is the line between 3 and 4 in that list and it is such a clean line that the von Neumann-Morgenstern axioms can be (and usually are) presented about a frequentist world.

"A might be the reason for symptom X, then we have to take into account both the probability that X caused A"

I think you have accidentally swapped some variables there

0Kaj_Sotala
Thanks, fixed.

It seems there are a few meta-positions you have to hold before taking Bayesianism as talked about here; you need the concept of Winning first. Bayes is not sufficient for sanity, if you have, say, an anti-Occamian or anti-Laplacian prior.

What this site is for is to help us be good rationalists; to win. Bayesianism is the best candidate methodology for dealing with uncertainty. We even have theorems that show that in its domain it's uniquely good. My understanding of what we mean by Bayesianism is updating in the light of new evidence, and updating correctly within the constraints of sanity (cf Dutch books).

3Seth_Goldin
We can discuss both epistemic and instrumental rationality.
3prase
You are right that Bayesianism isn't sufficient for sanity, but why should it prevent a post explaining what Bayesianism is? It's possible to be a Bayesian with wrong priors. It's also good to know what Bayesianism is, especially when the term is so heavily used. My understanding is that the OP is doing a good job keeping concepts of winning and Bayesianism separated. The contrary would conflate Bayesianism with rationality.
3Kevin
Jonathan's post doesn't seem like much of an argument but more of a criticism. There's lots more to write on this topic.

The penultimate paragraph about our beliefs isn't about Bayesianism so much as heuristics and biases. Unless you were a Bayesian from birth, for at least part of your life your beliefs evolved in a crazy fashion not entirely governed by Bayes' theorem. It is for this reason that we should be suspicious of the beliefs based on assumptions we've never scrutinized.

0Kaj_Sotala
Thanks! And interestingly, I find myself looking at my upvotes here and there and wondering what the appropriate "conversion rate" is for purposes of feeling good over a successful post. I've now gotten 31 upvotes there, but only 13 here. Obviously getting upvotes over there is easier than over here, so I shouldn't value this as much as if I'd got 13 + 31 = 46 upvotes here. On the other hand, I should probably allow myself a small bonus for writing a cross-domain post that is good enough to get upvotes on both sites. Hum. Man, this is tough.
2Kevin
By any standard you had a successful Hacker News post -- it was on the front page for most of the morning, which is good. The number of votes is not meaningful at all on Hacker News so there's no conversion rate. Also, I strongly suspect that many of the initial early votes on HN came from primary LW users following my link and then upvoting, possibly even people that didn't upvote it on LW.

The 'Intuitive Explanation' link has changed to http://yudkowsky.net/rational/bayes

Or take the debate we had on 9/11 conspiracy theories. Some people thought that unexplained and otherwise suspicious things in the official account had to mean that it was a government conspiracy. Others considered their prior for "the government is ready to conduct massively risky operations that kill thousands of its own citizens as a publicity stunt", judged that to be overwhelmingly unlikely, and thought it far more probable that something else caused the suspicious things.

Don't forget the prior: "The official account of big conflicts...

4fubarobfusco
"Governments in general, and the U.S. in specific, have a history of lying to justify war. I can think of several incidents where an official casus belli turned out to be either a lie, as in the second Gulf of Tonkin incident or the Iraqi WMD allegation; or at least significantly doubtful, such as the sinking of the Maine. In these cases, the 'conspiracy theorists' and peace activists were right; and I can't think of any where they were wrong. So they have more credibility than the official report."
2ChristianKl
Knowing that the official report contains information that's false, doesn't lead you to know what's true.

Others considered their prior for "the government is ready to conduct massively risky operations that kill thousands of its own citizens as a publicity stunt", judged that to be overwhelmingly unlikely,

Here I have to take objection: you framed it as a publicity stunt but actually 9-11 has shaped everything in the USA: domestic policies, foreign policies, military spending, the identity of the nation as a whole (It's US vs. THEM) etc... So there is a lot at stake.

Btw, as far as the willingness of the government to kill its own citizens goes, more...

7Jack
The controlling feature for this prior isn't "willingness to kill own citizens" or "publicity stunt" but "massively risky". "Massively risky" is actually an incredible understatement. We're talking about people already at the top of the social hierarchy risking death and eternal shame for them and their families in hopes the hundreds of people part of the conspiracy keep quiet and that no damning evidence of a remarkably complicated plot is left behind. The government's willingness to kill its own citizens, such as it is, less often carries over to civilians and even less often carries over to rich white people on Wall Street. And for something that has helped shape the country... well, remarkably little has changed in the direction that administration wanted things to go. Indeed, why in all those years of waning popularity wouldn't they try something like it again (maybe foil the attempt this time)? If they're so powerful, why not get someone else elected President?
6Alicorn
You know, I have little interest in 9/11 Truth, but I have no patience for the "but it would be so obvious" reply to Truthers. Here is how that conversation translates in my head: Truther: I think the towers came down due to a deliberate demolition by our government. I think this because thus and so. Non-Truther: But the government would never have done anything so easy to find out about, because it would carry massive risk. Everybody would know about it. Truther: Well, if people were paying attention to thus and so, they'd know - Non-Truther: BUT SINCE I DIDN'T ALREADY KNOW ABOUT THUS AND SO IT'S CLEARLY NOT SOMETHING EVERYBODY KNOWS ABOUT AND I CAN'T HEAR YOU NANANANANANANANA.
2Jack
Just to clarify: Do you think that is what I'm doing here?
3Alicorn
It was at least strongly reminiscent, enough that under your comment seemed like a good place to put mine, but I did not intend to attack you specifically.
1PeerInfinity
obligatory XKCD comic: http://xkcd.com/690/ (actually, that's not as relevant as I first thought, but I'll go ahead and post it here anyway)
0ata
A little bit more relevant: http://imgur.com/bx1th.png
1[anonymous]
I believe you were unfairly voted down. Your recasting shows that this is essentially an appeal to authority, with the authority being "everyone else".
-3roland
Well, there is a lot of evidence left behind and that has been cited over and over. AFAIK none of the people killed was exceptionally rich and/or powerful. Wait, what??? Someone else? What are you talking about, every President in the last decades has been a member of one of the same two parties. Obama has not significantly changed the foreign policy and is moving in the same direction.
0Jack
Well we're talking about the prior. Obviously we can then update on the evidence whatever that is. People will also disagree about what the evidence means but the point is this is a really unlikely event you guys are claiming took place. We can interpret the evidence but strange coincidences or some video footage not being released is not close to sufficient for me to suddenly start believing 9/11 was an inside job. I don't know what exceptionally means here but, ya know, the WTC wasn't a homeless shelter. ... Look, I have no idea what your particular conspiracy is. So it is a little hard to examine the supposed motivations. My comments made sense given certain assumptions about what the motivations of such a conspiracy would be. Obviously they aren't your assumptions so share yours.
-5roland
5Jonathan_Graehl
Well argued, but if you credit the U.S. government such brazen cruelty toward the citizens it nominally serves, then why would the government need a pretense at all? Why not invade with only forged documents and lies? No self-inflicted wound should be necessary; the U.S. military may not fear intervention by other nations' forces if they appear to only pick on a few small oil-rich nations.
3roland
Forged documents and lies are not enough to convince public opinion or, better, to arouse strong emotions; something more salient is needed. You have to remember, at 9-11 basically the whole world stood still watching the events unfold. Wikipedia: http://en.wikipedia.org/wiki/September_11_attacks#cite_note-155 Btw article 5 allows the use of armed (military) force. This was the official NATO position even before there was any investigation as to who was supposedly behind the "attacks". Anyone arguing against military action can be and still is decried as unpatriotic, callous towards the families of those who died. You cannot achieve this with just a batch of documents.

I think this parenthetical statement should maybe be a footnote or something, because it makes the and part of the sentence too far away from the both part. Or maybe put it in the following sentence? I got a little lost.

Doesn't "Bayesianism" basically boil down to the idea that one can think of beliefs in terms of mathematical probabilities?

-1PhilGoetz
That's like saying that Sunni beliefs boil down to belief in Islam.
2brazil84
Following your analogy, what is the equivalent to Shia Islam? Put another way: Bayesianism as opposed to what?
2PhilGoetz
Frequentism, according to the posters here. Unless I misunderstand what you mean by thinking of a belief in terms of probabilities.
8wnoise
But the standard Frequentist stance is that probabilities are not degrees of belief, but solely long term frequencies in random experiments.
4PhilGoetz
Most "frequentists" aren't such sticklers about terminology. Most people who attach probabilities to beliefs in knowledge representations - say, AI systems - are more familiar with frequentist than Bayesian methodology.
4wnoise
Okay, so most people who use statistics don't know what they're talking about. I find that all too plausible.
-1brazil84
I looked up "Frequentism" on Wikipedia . . . .I don't understand your point. What concept am I omitting by characterizing "Bayesianism" the way I did?
4PhilGoetz
Google frequentist instead of frequentism. It's the usual way of doing statistics and working with probabilities.
0brazil84
I did and I still don't understand your point. Again my question: Exactly what concept am I omitting by characterizing "Bayesianism" the way I did?
0Cyan
I PM'ed you regarding this thread. (I mention it here because I seem to recall that you're subject to a bug that prevents you from getting message/reply notifications.)

Core tenet 3: We can use the concept of probability to measure our subjective belief in something. Furthermore, we can apply the mathematical laws regarding probability to choosing between different beliefs. If we want our beliefs to be correct, we must do so.

Frequently misunderstood. E.g. you have propositions A and B, you mistakenly consider that probably either one of them will happen, and you may give me money if you judge P(A)/P(B) > some threshold.

If both A and B happen to be unlikely, I can use that to make arguments which only prompt you to...

Sub-tenet 1: If you experience something that you think could only be caused by cause A, ask yourself "if this cause didn't exist, would I regardless expect to experience this with equal probability?" If the answer is "yes", then it probably wasn't cause A.

I don't understand this at all - if you experience something that you think could only be caused by A, then the question you're supposed to ask yourself makes no sense whatsoever: absent A, you would expect to never experience this thing, per the original condition! And if the a...

5JGWeissman
The point is that people can erroneously report, even to themselves, that they believe their experience could only be caused by cause A. Asking the question if you would still anticipate the experience if cause A did not exist is a way of checking that you really believe that your experience could only be caused by cause A. More generally, it is useful to examine beliefs you have expressed in high level language, to see if you still believe them after digging deeper into what that high level language means.
0FAWS
I think that the inconsistency of such a position was the point. It would probably be better phrased as "... something that has to be caused by cause A" (or possibly just "proof of A"), which is effectively equivalent, but IMO something that someone who would answer yes to the following question could plausibly have claimed to believe (i. e. I wouldn't be very surprised by the existence of people who are that inconsistent in their beliefs) .
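
A short worked version of the check in Sub-tenet 1 may help here. It is a sketch in standard notation, with E standing for the experience and A for the proposed cause (these symbols are introduced for illustration only). It assumes the experience has nonzero probability, and it shows that answering "yes" to the question means the experience leaves the probability of A exactly where it was.

```latex
\begin{align*}
P(A \mid E) &= \frac{P(E \mid A)\,P(A)}
                   {P(E \mid A)\,P(A) + P(E \mid \neg A)\,P(\neg A)} \\[4pt]
\text{If } P(E \mid A) = P(E \mid \neg A) = q > 0:\quad
P(A \mid E) &= \frac{q\,P(A)}{q\,\bigl(P(A) + P(\neg A)\bigr)} = P(A)
\end{align*}
```

So an experience that is equally expected with or without A is no evidence for A. Strictly speaking the posterior equals the prior, so "it probably wasn't cause A" holds to the extent that A was improbable to begin with.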
[-][anonymous]00

Further suppose that there are two reasons for why people get headaches: they might have a brain tumor, or they might have a cold.

Or, if you're very unlucky, you could have a headache and a brain tumor.... :3

A brain tumor always causes a headache, but exceedingly few people have a brain tumor. In contrast, a headache is rarely a symptom for cold, but most people manage to catch a cold every single year. Given no other information, do you think it more likely that the headache is caused by a tumor, or by a cold?

Given no other information, we don't know which is more likely. We need numbers for "rarely", "most", and "exceedingly few". For example, if 10% of humans currently have a cold, and 1% of humans with a cold have a heada...
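
To make the dependence on numbers concrete, here is a minimal sketch of the headache example. The 10% cold rate and the 1% headache rate for colds are the figures suggested in the comment above; the tumor prevalence of 1 in 10,000 is purely a made-up placeholder, and the sketch follows the article's simplification that a cold and a tumor are the only, mutually exclusive, causes of a headache.

```python
# Figures suggested in the comment above.
p_cold = 0.10                 # fraction of people who currently have a cold
p_headache_given_cold = 0.01  # fraction of colds that come with a headache

# From the article: a brain tumor always causes a headache.
p_headache_given_tumor = 1.0

# Hypothetical placeholder, NOT from the article or the comments.
p_tumor = 0.0001

# Probability of "this cause and a headache" for each candidate cause.
joint_cold = p_cold * p_headache_given_cold
joint_tumor = p_tumor * p_headache_given_tumor

# Normalise over the two causes the simplified example allows.
total = joint_cold + joint_tumor
posterior_cold = joint_cold / total
posterior_tumor = joint_tumor / total

print(f"P(cold | headache)  = {posterior_cold:.3f}")
print(f"P(tumor | headache) = {posterior_tumor:.3f}")
```

Under these assumed numbers the cold is about ten times more likely than the tumor. Raising the placeholder `p_tumor` above 0.001 would flip the answer, which is the commenter's point that the conclusion depends on the actual rates.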

You're missing the point. This post is suitable for an audience whose eyes would glaze over if you threw in numbers, which is wonderful (I read the "Intuitive Explanation of Bayes' Theorem" and was ranting for days about how there was not one intuitive thing about it! it was all numbers! and graphs!). Adding numbers would make it more strictly accurate but would not improve anyone's understanding. Anyone who would understand better if numbers were provided has their needs adequately served by the "Intuitive" explanation.

[-]pjeby150

Agreed, I did not find the "Intuitive Explanation" to be particularly intuitive even after multiple readings. Understanding the math and principles is one thing, but this post actually made me sit up and go, "Oh, now I see what all the fuss is about," outside a relatively narrow range of issues like diagnosing cancer or identifying spam emails.

Now I get it well enough to summarize: "Even if A will always cause B, that doesn't mean A did cause B. If B would happen anyway, this tells you nothing about whether A caused B."

Which is both a "well duh" and an important idea at the same time, when you consider that our brains appear to be built to latch onto the first "A" that would cause B, and then stubbornly hang onto it until it can be conclusively disproven.

That's a "click" right there, that makes retroactively comprehensible many reams of Eliezer's math rants and Beisutsukai stories. (Well, not that I didn't comprehend them as such... more that I wasn't able to intuitively recreate all the implications that I now think he was expecting his readers to take away.)

So, yeah... this is way too important of an idea to have math associated with it in any way. ;-)

3PlatypusNinja
Personally it bothers me that the explanation asks a question which is numerically unanswerable, and then asserts that rationalists would answer it in a given way. Simple explanations are good, but not when they contain statements which are factually incorrect. But, looking at the karma scores it appears that you are correct that this is better for many people. ^_^;
2SilasBarta
I thought Truly Part of You is an excellent introduction to rationalism/Bayesianism/Less Wrong philosophy that avoids much use of numbers, graphs, and technical language. So I think it's more appropriate for the average person, or for people that equations don't appeal to. Does anyone who meets that description agree? And could someone ask Alicorn if she prefers it?
2djcb
Hmmmm.... that's an interesting article too, but it focuses on a different question, the question of what knowledge really means, and uses AI concepts to discuss that (somewhat related to Searle's Chinese Room gedankenexperiment.) However, I think the article discussed here is a bit more directly connected to Bayesianism. It's clear what Bayes Theorem means, but what many people today mean by Bayesianism is somewhat of a loose extrapolation of that -- or even just a metaphor. I think the article does a good job at explaining the current use.
[-]woozle-30

Okay, I'm rising to the bait here...

I would really appreciate it if people would be more careful about passing on memes regarding subjects they have not researched properly. This should be a basic part of "rationalist etiquette", in the same way that "wash your hands before you handle food" is part of common eating etiquette.

I say this because I'm finding myself increasingly irritated by casual (and ill-informed) snipes at the 9/11 Truth movement, which mostly tries very hard to be rational and evidence-based:

Or take the debate we had

...
[-]Jack280

may believe it likely that the government did something horrendous, but we realize the evidence is weak and circumstantial

Did you read the actual post about Bayesianism? Part of the point is you're not allowed to do this! One can't both think something is likely and think the evidence is weak and circumstantial! Holding a belief but not arguing for it because you know you don't have the evidence is a defining example of irrationality. If you don't think the government was involved, fine. But if you do you're obligated to defend your belief.

Off Topic: I'm not going to go through every one of your positions but... how long have you been researching the issue? I haven't looked up the answer for every single thing I've heard truthers argue- I don't have the time. But every time I do look something up I find that the truthers just have no idea what they're talking about. And some of the claims don't even pass the blush test. For example, your first "unanswered" question just sounds crazy! I mean, HOLY SHIT! the hijackers' names aren't on the manifest! That is huge! And yet, of course they absolutely are on the flight manifests and, indeed, they flew under their own names. Indeed, we even have seating charts. For example, Mohamed Atta was in seat 8D. That's business class, btw.

Ah, but... what are the odds that A HIJACKER WOULD FLY IN BUSINESS CLASS??!?

3wedrifid
I hear business class gives better 'final meals'.

For example, your first "unanswered" question just sounds crazy! I mean, HOLY SHIT! the hijackers' names aren't on the manifest! That is huge! And yet, of course they absolutely are on the flight manifests and, indeed, they flew under their own names. Indeed, we even have seating charts. For example, Mohamed Atta was in seat 8D. That's business class, btw.

This is a crowning moment of awesome.

4Baruta07
Warning: TvTropes may ruin your life. TvTropes should be used at your discretion. (Most Tropers agree that excessive use of TvTropes may be conducive to cynicism and overvaluation of most major media. TvTropes can cause such symptoms as becoming dangerously genre savvy, spending increasing amounts of time on TvTropes, and a general increase in the number of tropes you use in a conversation. Please think twice before using TvTropes.)
2Jack
Does this mean if we're in a simulation written for entertainment I'm about to get killed off?
1wedrifid
(Please consider, for the sake of wedrifid's productivity if nothing else, including at least the explicit use of the word 'trope' by way of warning when linking to that black hole of super-stimulus.)
3Peter_de_Blanc
One definitely can. What else is one supposed to do when evidence is weak and circumstantial? Assign probabilities that sum to less than one?
1Jack
If the evidence for a particular claim is weak and circumstantial one should assign that claim a low probability and other, competing, possibilities higher probabilities.
2Peter_de_Blanc
What if the evidence for those is also weak and circumstantial? Or what if one had assigned that claim a very high prior probability?
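The exchange above can be made concrete with a minimal sketch. It assumes three mutually exclusive, exhaustive hypotheses and made-up numbers; the function name `posterior` and the labels "A", "B", "C" are hypothetical and chosen only for illustration. It shows that when the evidence is weak for every hypothesis, the posterior still sums to one and a claim with a very high prior stays likely.

```python
def posterior(priors, likelihoods):
    """Return normalized posterior probabilities via Bayes' theorem.

    priors:      dict mapping hypothesis -> prior probability (sums to 1)
    likelihoods: dict mapping hypothesis -> P(evidence | hypothesis)
    """
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical numbers: hypothesis "A" starts with a very high prior.
priors = {"A": 0.90, "B": 0.07, "C": 0.03}

# Weak, circumstantial evidence: nearly as probable under every hypothesis.
likelihoods = {"A": 0.50, "B": 0.55, "C": 0.52}

post = posterior(priors, likelihoods)
for hypothesis, probability in post.items():
    print(hypothesis, round(probability, 3))
```

With these numbers the evidence slightly favors "B", yet "A" remains by far the most probable hypothesis, because weak evidence barely moves a strong prior.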
1wedrifid
You're really not. You are not epistemically obliged to accept the challenge of another individual and subject your reasoning to their judgement in the form they desire. That is sometimes a useful thing to do and sometimes it is necessary for the purpose of persuasion. Of course, it's usually more practical to attack their beliefs instead. That tends to give far more status.
1Jack
No. Wrong! You totally are obligated.
-1wedrifid
Are you being facetious or not?
0Jack
Well, a little of both. Your position doesn't seem like the kind of thing it makes sense to argue about, so I figured I'd make my point through demonstration and let it rest.
-1wedrifid
It seems you demonstrated my point.
0Jack
1. Normic questions just aren't the same as factual questions. There is no particular reason to expect eventual agreement on the former, even in principle, so ending conversations is just fine and to be expected. 2. *Edit: Second point was based on a misunderstanding of the objection.
0[anonymous]
I am actually quite offended at the accusation and do not believe you have due cause to make it. The presumption that individuals must accept any challenge and 'defend' their beliefs is a tactic that is commonly exploited. It can be used to imply "you have to convince me, and if I can resist believing you then I am high status". It is something that I object to vocally and is just not part of rationality as I understand it. 'Defensible', just like 'burden of proof' just isn't a bayesian concept, for all the part it plays in traditional rationality. I actually didn't think you would find my correction of a minor point objectionable. I had assumed you used the phrase 'obligated to defend' offhandedly and my reply was a mere tangent. I expected you to just revise it to something like "But if you do then don't expect to be taken seriously unless you can defend your belief". I claim two. I don't think that warranted an upvote because the point it made was not a good one and it also sub-communicated the attitude that you made explicit here. I also downvoted your original comment once it became clear that you present the normative assertion as a true part of your point rather than an accident of language. Come to think of it I originally upvoted the comment so that would count twice. I left the immediate parent untouched because although it is offensive and somewhat of a reputational attack in that sense it at least is forthright and not underhanded. Outside of this context the last comment of yours I recall voting on is this one, which I considered quite insightful. Please refrain from making such accusations again in the future without consideration. That I disagree with a single phrase doesn't warrant going personal. I didn't even take note of which author had said 'are obligated to defend' when I replied, much less seek to steal their status.
2[anonymous]
Whoa! On reflection this looks like an extended misunderstanding. This isn't especially surprising as we've had trouble communicating before. I apologize for offending you. In making the comment I truly didn't mean it as a personal insult- though I can see how it came off that way. There is a not insignificant tendency around here to A) place truth-seeking as secondary to winning and B) reduce things to status games. So in your comment I pattern matched this with that tendency. And so in saying that persuasion and status seemed to be what you were concerned with I thought I was basically just recognizing the position you had taken. There isn't an explicit transition to this second part. I can see in retrospect that this was a comment about defending beliefs. You're saying, no it is not an obligation, just sometimes a good idea, here is when it is (pragmatically) a good idea. What I saw the first time was "No, there isn't any obligation like this. Here are the concerns that should instead enter into the decision to defend beliefs: Status and persuasion." Even if the expectation that someone defends their beliefs doesn't rise to the level of an obligation it still seems like the pro-social reasons for doing it have to do with truth-seeking and sharing information. So when all I see is persuasion and status I inferred that you weren't concerned with these other things. Does that make it clear where I was getting it from, even if I got it wrong? It wasn't a particularly deliberate phrasing. That said, I think it is a defensible, even obvious, rule of discourse. Of course, one way of describing what happens to someone when they don't obey such rules is just that they are no longer taken seriously. Your tone in the first comment, didn't suggest to me that you were only making a minor point and is part of the reason I interpreted it as differing from my own view more radically than it apparently does. And, I mean, an obligation that people be prepared to give reasons f
0wedrifid
Hi Jack, thanks for that. I deleted my reply. I can see why you would object to that first interpretation. I too like to keep my 'winning' quite separate from my truth seeking and would join you in objecting to exhortations that people should explain reasons for their beliefs only for pragmatic purposes. It may be that my firm disapproval of mixing epistemic rationality with pragmatics was directed at you, not the mutual enemy so pardon me if that is the case. I certainly support giving explanations and justifications for beliefs. The main reason I wouldn't support it as an obligation is for the kind of thing that you thought I was doing to you. Games can be played with norms and I don't want people who are less comfortable with filtering out those sort of games to feel obligated to change their beliefs if they cannot defend them according to the criteria of a persuader.
-7woozle

Well, the main thing that'd cause me to mistrust your judgment there, as phrased, is A8. Pre-9/11, airlines had an explicit policy of not resisting hijackers, even ones armed only with boxcutters, because they thought they could minimize casualties that way. So taking over an airplane using boxcutters pre-9/11 is perfectly normal and expected and non-anomalous; and if someone takes exception to that event, it probably implies that in general their anomaly-detectors are tuned too high.

I also suspect that some of these questions are phrased a bit promptingly, and I would ask others, like, "Do you think that malice is a more likely explanation than stupidity for the level of incompetence displayed during Hurricane Katrina? What was to be gained politically from that? Was that level of incompetence more or less than the level of hypothesized government incompetence that you think is anomalous with respect to 9/11?" and so on.

-2woozle
That is a valuable point, and I have amended my A8 response to "MAYBE". The one detail I'm still not sure of is whether pilots would have relinquished control under those circumstances. Can anyone point to the actual text of the "Common Strategy"?

"Pilots for 911 Truth" has this to say:

"Screw Loose Change" seems to find this statement incredibly offensive, but offers only an emotional argument in response (argument from outrage?) and ignores the original point that these pilots were experienced in this sort of combat and certainly could have fought off attackers with boxcutters, with the "Common Strategy" being the only possible constraint on doing so.

I've added your proposed questions to the questionnaire, somewhat modified. My answers are:

* NO: not more likely, just possible -- what actually happened must be determined by the evidence. David Brin, for example, argues that said incompetence was a by-product of a "war on professionalism" waged by the Bush administration. (I would also argue that the question as phrased implies that it is reasonable to judge the question of {whether malice was involved} entirely on the basis of {how "likely" it seems}, and that this is therefore privileging the hypothesis that malice was not involved.)
* "starving the beast", albeit in a somewhat broader sense than described by Wikipedia: shrink the government by rendering it incompetent, thus eroding support (and hence funding) for government activities
* I'm not sure what you're getting at here; my immediate answer is "THAT DEPENDS" -- given the range of possible scenarios in which the government is complicit, the incompetence:malice ratio has a wide range of possible values. I don't know if I'm answering the question in the spirit in which it was asked, however.

I've rephrased that last question as a matter of consistency: "Do you believe that the levels of government malice OR stupidity/incompetence displayed regarding Katrina are consistent with whatever levels of go
-6David_J_Balan

The problem you have is the one shared by everyone from devotees of parapsychology to people who believe Meredith Kercher was killed in an orgy initiated by Amanda Knox: your prior on your theory is simply way too high.

Simply put, the events of 9/11 are so overwhelmingly more likely a priori to have been the exclusive work of a few terrorists than the product of a conspiracy involving the U.S. government, that the puzzling details you cite, even in their totality, fail to make a dent in a rational observer's credence of (more or less) the official story.

You might try asking yourself: if the official story were in fact correct, wouldn't you nevertheless expect that there would be strange facts that appear difficult to explain, and that these facts would be seized upon by conspiracy theorists, who, for some reason or another, were eager to believe the government may have been involved? And that they would be able to come up with arguments that sound convincing?

I want to stress that it is not the fact that the terrorists-only theory is officially sanctioned that makes it the (overwhelming) default explanation; as the Kercher case illustrates, sometimes the official story is an impl...

"Not silencing skeptical inquiry" is a great-sounding applause light

The main issue with it has been noted multiple times by people like Dawkins: there is an effort asymmetry between plucking a false but slightly believable theory out of thin air, and actually refuting that same theory. Making shit up takes very little effort, while rationally refuting random made-up shit takes the same effort as rationally refuting theories whose refutation yields actual intellectual value. Creationists can open a hundred false arguments at very little intellectual cost, and if they are dismissed out of hand by the scientific establishment they get to cry "suppression of skeptical inquiry".

This feels related to pjeby's recent comments about curiosity. The mere feeling that "there's something odd going on here", followed by the insistence that other people should inquire into the odd phenomenon, isn't valid curiosity. That's only ersatz curiosity. Real curiosity is what ends up with you actually constructing a refutable hypothesis, and subjecting it to at least the kind of test that a random person from the Internet would perform - before actually publishing your hypothesis, and insisting that others should consider it carefully.

Inflicting random damage on other people's belief networks isn't promoting "skeptical inquiry", it's the intellectual analogue of terrorism.

6Paul Crowley
I like this comment lots, but I think this comparison is inadvisable hyperbole.
4Morendil
Perhaps "asymmetric warfare" would be a better term than "terrorism". More general, and without the connotations which I agree make that last line something of an exaggeration.
-1woozle
Again, you're addressing a straw man -- not my actual arguments. I do not claim that the government was responsible for 9/11; I believe the evidence, if properly examined, would probably show this -- but my interest is in showing that the existing explanations are not just inadequate but clearly wrong. So, okay, how would you tell the difference between an argument that "sounds convincing" and one which should actually be considered rationally persuasive? My use of the "applause light" was an attempt to use emotion to get through emotional barriers preventing rational examination. Was it inappropriate? I agree. Many of the conclusions reached by the 9/11 Commission are, however, not among that small proportion. Many questions to which we need answers were not even addressed by the Commission. (Your statement here strikes me as a "curiosity stopper".) This is the problem, yes. What's your point? None that I can think of. Again, what's your point? I am not "dismissing" the dominant conclusion, I am questioning it. I have, in fact, done substantial amounts of research (probably more than anyone reading this). If anyone is actually dismissing an idea with substantial numbers of adherents, it is those who dismiss "truthers" without actually listening to their arguments. Are you arguing that "people are irrational, so you might as well give up"?
7komponisto
This is a flat-out Bayesian contradiction. It's not an easy problem, in general -- hence LW! But we can always start by doing the Bayesian calculation. What's your prior for the hypothesis that the U.S. government was complicit in the 9/11 attacks? What's your estimate of the strength of each of those pieces of evidence you think is indicative of a conspiracy? You misunderstood. I was talking about your failure to dismiss 9/11 conspiracy theories. I was asking whether there were any conspiracy theories that you would be willing to dismiss without research.
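The calculation being asked for can be sketched in odds form: start from prior odds and multiply by one likelihood ratio per piece of evidence. This is only a sketch under stated assumptions. The function name `update_odds` and every number below are hypothetical, and the pieces of evidence are assumed to be conditionally independent given each hypothesis, which real evidence often is not.

```python
import math


def update_odds(prior_probability, likelihood_ratios):
    """Combine a prior with several likelihood ratios, in log-odds form.

    Each likelihood ratio is P(evidence | H) / P(evidence | not H).
    Assumes the pieces of evidence are conditionally independent.
    """
    prior_odds = prior_probability / (1.0 - prior_probability)
    log_odds = math.log(prior_odds) + sum(math.log(lr) for lr in likelihood_ratios)
    posterior_odds = math.exp(log_odds)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs: a very low prior and four modest pieces of evidence.
prior = 1e-4
ratios = [2.0, 1.5, 3.0, 1.0]  # a ratio of 1.0 means "no evidence either way"

posterior_probability = update_odds(prior, ratios)
print(posterior_probability)
```

Working in log-odds keeps the arithmetic additive. With these made-up inputs the combined likelihood ratio is 9, so the posterior ends up roughly nine times the prior and still below 0.001.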
-3woozle
Again, I think this question is a diversion from what I have been arguing; its truth or falseness does not substantially affect the truth or falseness of my actual claims (as opposed to beliefs mentioned in passing). That said, I made a start at a Bayesian analysis, but ran out of mental swap-space. If someone wants to suggest what I need to do next, I might be able to do it. Also vaguely relevant -- this matrix is set up much more like a classical Bayesian word-problem: it lists the various pieces of evidence which we would expect to observe for each known manner in which a high-rise steel-frame building might run down the curtain and join the choir invisible, and then shows what was actually observed in the cases of WTC1, 2, and 7. Is there enough information there to calculate some odds, or are there still bits missing? No, not really. I think of that as my "job" at Issuepedia: don't dismiss anything without looking at it. Document the process of examination so that others don't have to repeat it, and so that those who aren't sure what to believe can quickly see the evidence for themselves (rather than having to go collect it) -- and can enter in any new arguments or questions they might have. Does that process seem inherently flawed somehow? I'm not sure what you're suggesting by your use of the word "failure" here.
9komponisto
3Morendil
This thread doesn't belong under the "What is Bayesianism" post. I advise taking it to the older post that discussed "Truthers".
-8roland
7Douglas_Knight
I would add to Eliezer's comment about A8 that it suggests that your community is bad at filtering good arguments from bad. Similarly, your failure to distance yourself from words like "Truther" is another failure of filtering. It suggests that you are less interested in being listened to than in passing some threshold that allows you to be upset about being ignored. It's like a Hindu whining about being persecuted for using a swastika. Maybe it's not "fair." Life isn't fair. That's normal. Most news stories contain non-explanations. When there's an actual opposition, the non-explanations take over. If you want to calibrate, you could look at Holocaust and HIV denial. I'm told they are well described by the above quote. or any medical controversy. Often it is best to silence incompetent skeptical inquiry.
-2woozle
I used the term "truther" as an attempt to be honest -- admitting that I pretty much agree with them, rather than trying to pretend to be a devil's advocate or fence-sitter. I don't see how that's a failure of filtering. The rest of your first paragraph is basically ad-hominem, as far as serious discussion of this issue goes. If I'm upset, I try not to let it dominate the conversation -- this is a rationalist community, after all, and I am a card-carrying rationalist -- but I also believe it to be justified, for reasons I explained earlier. "That's normal" -- so are you in the "people aren't rational so you might as well give up" camp along with komponisto? What's your point? Holocaust denial and HIV denial are easily refuted by the available evidence -- along with global warming denial, evolution denial, moon landing denial, and most religions. 9/11 anomalies manifestly are not, given that I've been trying for years to elicit rational rebuttals and have come up with precious little. Please feel free to send me more. Do you really believe this? Why? Who determines that it is incompetent?
6Douglas_Knight
Even the Frequentists (remember Bayes? It's a song about Bayes) agree that the probability of the evidence given the null hypothesis is an important number to consider. That is why I talk about what is normal, and why it is relevant that "Conspiracy theorists will find suspicious evidence, regardless of whether anything suspicious happened." Yet people don't bother to refute them. Instead they pretend to respond.
-5woozle
2Kaj_Sotala
Sorry. I was merely trying to provide an example, not to snipe. If you want to provide a reformulation of that paragraph that better reflects your views, I'll change it.
-3woozle
Kaj, I've always enjoyed your posts, so I felt bad picking on you and I apologize if I jumped down your throat. It seemed time to say something about this because I've been seeing it over and over again in lots of otherwise very rational/reality-based contexts, and your post finally pushed that button.

For reformulating your summary, I'd have to go read the original discussion, but you didn't link to it. It's not that it needs to reflect my views, it's that I think we need a more... rigorous? systematic?... way of looking at controversies. Yes, many of them can be dismissed without further discussion -- global warming denial, evolution denial, holocaust denial, et freaking cetera -- but there are specific reasons we can dismiss them, and I don't think those reasons apply to 9/11 (not even to the official story -- parts of it seem very likely to be true).

Proposed Criteria for Dismissing a Body of Belief

Terminology:

* a "claim" is an argument favoring or supporting the body of belief
* a "refutation" is a responding argument which shows the claim to be invalid (in a nested structure -- responses to refutations are also "claims", responses to those claims are also "refutations", etc)

Essential criteria:

* the work has been done of examining the claims and refuting them
* no claims remain unrefuted

A further cue, sufficient but not necessary:

* those promoting the ideology never bring up the refutations of their claims unless forced to do so, even though there is reason to believe they are well aware of those refutations

Any objection to those ground rules? The first set is required so that the uninformed (e.g. those new to the discussion) will have a reference by which to understand why the seemingly-persuasive arguments presented in favor of the given belief system are, in fact, wrong; the final point is a sort of short-cut so we don't waste time dealing with people who are clearly being dishonest. I submit that, by these rules, we can safely dismis
2Kaj_Sotala
Sure, no problem. The original 9/11 discussion began as a thread in The Correct Contrarian Cluster and was then moved to The 9/11 Meta-Truther Conspiracy Theory. Your criteria sound good in principle. My only problem with them is that determining when a claim has really been refuted isn't trivial, especially for people who aren't experts in the relevant domain.
-10roland
0Kevin
What were your thoughts on Eliezer's Meta-Truther Conspiracy post? If there were a conspiracy, government inaction given foreknowledge of the attacks seems orders of magnitude more likely than any sort of controlled demolition, even for WTC7. http://lesswrong.com/lw/1kj/the_911_metatruther_conspiracy_theory/
0woozle
He brings up a lot of hypotheses; let me see if I can (paraphrase and) respond to the major ones.

* "9/11 conspiracy theorists" are actually acting on behalf of genuine government conspirators. Their job is to plant truly unbelievable theories about what happened so that people will line up behind the official story and dismiss any dissenters as "just loony conspiracy theorists". Well, yes, there's evidence that this is what has happened; it is discussed extensively here.
* The idea that the towers were felled by controlled demolition is loony. No, it isn't. There is now a great deal of hard evidence pointing in this direction. It may turn out to be wrong, but it is absolutely not loony. See this for some lines of reasoning.
* "This attack would've had the same political effect whether the buildings came down entirely or just the top floors burned." If anyone really believes that, I'll be happy to explain why I don't.
* The actual government involvement was to stand aside and allow the attack, which was in fact perpetrated by middle eastern agents, to succeed. This is the lesser of the two major "conspiracy" theories, known as "let it happen on purpose" (LIHOP) and "make it happen on purpose" (MIHOP). MIHOP is generally presumed to be a core belief of all "truthers", though this is not in fact the case; there does not appear to be any clear consensus about which scenario is more likely, and (as I said earlier) the actual core belief which defines the "truther" movement is that the official story is significantly wrong and a proper investigation is needed in order to determine what really happened. Imagine, for example, what the Challenger investigation would have found if Richard Feynman hadn't been there.
* Conspiracy theorists are all (or mostly) anti-government types. Well, I can't speak for the rest of them, but I'm not. I strongly dislike how the government operates, but I see it as an essential invention -- something to repair, not discard. The
9dripgrind

Oh, and to try and make this vaguely on topic: say I was trying to do a Bayesian analysis of how likely woozle is to be right. Should I update on the fact that s/he is citing easily debunked facts like "the hijackers weren't on the passenger manifest", as well as on the evidence presented?

5LucasSloan
Yes. A bad standard of accepting evidence causes you to lose confidence in all of the other evidence.
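One way to sketch why a single debunked claim lowers confidence in the rest: treat "this source has a reliable standard of evidence" as a hypothesis and update it when a claim turns out false. The two-type model, the function names, and all numbers here are assumptions made for illustration, not anything stated in the thread.

```python
def reliability_after_error(p_reliable, acc_reliable, acc_unreliable):
    """P(source is reliable | one of its checked claims turned out false)."""
    error_if_reliable = 1.0 - acc_reliable
    error_if_unreliable = 1.0 - acc_unreliable
    numerator = p_reliable * error_if_reliable
    return numerator / (numerator + (1.0 - p_reliable) * error_if_unreliable)


def trust_in_unchecked_claim(p_reliable, acc_reliable, acc_unreliable):
    """Probability that a not-yet-checked claim from the source is true."""
    return p_reliable * acc_reliable + (1.0 - p_reliable) * acc_unreliable


# Hypothetical numbers: a reliable source is right 95% of the time,
# an unreliable one only 50% of the time.
ACC_RELIABLE, ACC_UNRELIABLE = 0.95, 0.50
p_reliable_before = 0.70

trust_before = trust_in_unchecked_claim(p_reliable_before, ACC_RELIABLE, ACC_UNRELIABLE)
p_reliable_after = reliability_after_error(p_reliable_before, ACC_RELIABLE, ACC_UNRELIABLE)
trust_after = trust_in_unchecked_claim(p_reliable_after, ACC_RELIABLE, ACC_UNRELIABLE)

print(trust_before, trust_after)
```

Under these assumptions, one checked-and-false claim shifts most of the probability toward "unreliable", and trust in every other unchecked claim from the same source falls with it.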
-3woozle
I am "happy to take it as fact" until I find something contradictory. When that happens, I generally make note of both sources and look for more authoritative information. If you have a better methodology, I am open to suggestions. The "Wikipedia standard" seems to work pretty well, though -- didn't someone do a study comparing Wikipedia's accuracy with Encyclopedia Britannica's, and they came out about even? I wasn't intending to be snide; I apologize if it came across that way. I meant it sincerely: Jack found an error in my work, which I have since corrected. I see this as a good thing, and a vital part of the process of successive approximation towards the truth. I also did not cite the 6 living hijackers as a "killer anomaly" but specifically said it didn't seem to be worth worrying about -- below the level of my "anomaly filter". Just as an example of my thought-processes on this: I haven't yet seen any evidence that the "living hijackers" weren't simply people with the same names as some of those ascribed to the hijackers. I'd need to see some evidence that all (or most) of the other hijackers had been identified as being on the planes but none of those six before thinking that there might have been an error... and even then, so what? If those six men weren't actually on the plane, that is a loose end to be explored -- why did investigators believe they were on the plane? -- but hardly incriminating. I verify when I can, but I am not paid to do this. This is why my site (issuepedia.org) is a wiki: so that anyone who finds errors or omissions can make their own corrections. I don't know of any other site investigating 9/11 which provides a wiki interface, so I consider this a valuable service (even if nobody else seems to). The idea that this is unlikely is one I have seen repeatedly, and it makes sense to me: if someone came at me with a box-cutter, I'd be tempted to laugh at them even if I wasn't responsible for a plane-load of passengers -- and I've n

I am "happy to take it as fact" until I find something contradictory. When that happens, I generally make note of both sources and look for more authoritative information. If you have a better methodology, I am open to suggestions.

So your standard of accepting something as evidence is "a 'mainstream source' asserted it and I haven't seen someone contradict it". That seems like you are setting the bar quite low. Especially because we have seen that your claim about the hijackers not being on the passenger manifest was quickly debunked (or at least, contradicted, which is what prompts you to abandon your belief and look for more authoritative information) by simple googling. Maybe you should, at minimum, try googling all your beliefs and seeing if there is some contradictory information out there.

I wasn't intending to be snide; I apologize if it came across that way. I meant it sincerely: Jack found an error in my work, which I have since corrected. I see this as a good thing, and a vital part of the process of successive approximation towards the truth.

I suggest that a better way to convey that might have been "Sorry, I was wrong" rather than "...

8Jack
As a matter of fact there are conspiracy theorists about many important public events, cf. the moon landing, JFK, etc. Before there even was a 9/11 Truth movement, people could have predicted there would be conspiracy theorists. It is just that kind of society-changing event that will generate conspiracy theories. Given that, the existence of conspiracy theorists pointing out anomalies in the official story isn't evidence the official story is substantially wrong, since it would be happening whether or not the official story was substantially wrong. It's like running a test for a disease that will say positive 50% of the time if the patient has the disease and negative 50% of the time if the patient doesn't have the disease. That test isn't actually testing for that disease, and these anomalies aren't actually providing evidence for or against the official account of 9/11. (I think this comment is Bayesian enough that it is on topic, but the whole 9/11 conversation needs to be moved to the comments under Eliezer's Meta-truthers post. Feel free to just post a new comment there.)
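The disease-test analogy can be checked with a few lines of arithmetic. This is a minimal sketch: the function name `posterior_given_positive` and the 1% prior are hypothetical, and the second test is added only as a contrast.

```python
def posterior_given_positive(prior, p_pos_if_true, p_pos_if_false):
    """P(hypothesis | positive result) by Bayes' theorem."""
    numerator = prior * p_pos_if_true
    denominator = numerator + (1.0 - prior) * p_pos_if_false
    return numerator / denominator


prior = 0.01  # hypothetical base rate of the disease

# The test from the comment: positive 50% of the time either way,
# so the likelihood ratio is 0.5 / 0.5 = 1.
uninformative = posterior_given_positive(prior, 0.5, 0.5)

# A contrasting, made-up test that really does discriminate.
informative = posterior_given_positive(prior, 0.9, 0.05)

print(uninformative, informative)
```

When a result is equally probable whether or not the hypothesis is true, the posterior equals the prior. Only a result that is more probable under one hypothesis than the other moves the estimate.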
0woozle
Correct. What is evidence that the official story is substantially wrong is, well, the evidence that the official story is substantially wrong. (Yes, I need to reorganize that page and present it better.) Also, does anyone deny that some "conspiracy theories" do eventually turn out to be true? (Can comment-threads be moved on this site?)
4Jack
Comments can't be moved. Just put a hyperlink in this thread (at the top, ideally) and link back with a hyperlink in the new thread. That list of evidence is almost all exactly the kind of non-evidence we're talking about. In any event like this one would expect to find weird coincidences and things that can't immediately be explained, no matter how the event actually happened. That means your evidence isn't really evidence. Start a new thread and I'll try and say more.
-8roland
-3roland
I just read this comment; I'm so glad that I'm not the only one who is very skeptical in regard to the official account. Here is the comment I wrote: http://lesswrong.com/lw/1to/what_is_bayesianism/1omc

I guess this is the wrong place for this comment, but I don't know where else to put it, and after reading the extensive threads on 9/11 below I felt this was a valid point. If someone objects to this being here I'll move it to somewhere more appropriate. It looks like I'm a bit out of date with the discussion anyway.

Firstly I should say I'm still very undecided on the matter. I've heard a lot of convincing evidence for both sides of the story, and I know many intelligent people whose opinion I respect on both sides of the fence. I do however think that it...