Is this a fair representation of frequentists versus bayesians? I feel like every time the topic comes up, 'Bayesian statistics' is an applause light for me, and I'm not sure why I'm supposed to be applauding.

1) The neutrino detector is evidence that the Sun has exploded. It's showing an observation which is 36^H^H 35 times more likely to appear if the Sun has exploded than if it hasn't (likelihood ratio of 35:1). The Bayesian just doesn't think that's strong enough evidence to overcome the prior odds, i.e., after multiplying the prior odds by 35 they still aren't very high.

2) If the Sun has exploded, the Bayesian doesn't lose very much from paying off this bet.
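
The arithmetic behind point 1 can be sketched as follows. This is a minimal illustration, not anything from the comic: the one-in-a-billion prior is a made-up number chosen only to show how little a 35:1 likelihood ratio moves a very small prior.

```python
from fractions import Fraction

# The detector lies only when both dice come up six (probability 1/36).
P_LIE = Fraction(1, 36)

# Probability the detector says "yes" under each hypothesis.
p_yes_given_exploded = 1 - P_LIE      # it tells the truth: 35/36
p_yes_given_not_exploded = P_LIE      # it lies: 1/36

likelihood_ratio = p_yes_given_exploded / p_yes_given_not_exploded  # 35


def posterior_probability(prior: Fraction) -> Fraction:
    """P(exploded | detector says yes), via the odds form of Bayes' theorem."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# HYPOTHETICAL prior: one in a billion. The comic gives no number.
hypothetical_prior = Fraction(1, 10**9)
posterior = posterior_probability(hypothetical_prior)
print(f"likelihood ratio = {likelihood_ratio}:1")
print(f"posterior = {float(posterior):.3e}")
```

With any prior this small, the posterior stays tiny, which is the sense in which the evidence is real but not strong enough.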

Can you explain that more clearly? It seems that the sun exploding is so
unlikely that the outcome doesn't matter. Perhaps you are confusing odds and
probability?

Because the stupider the prediction is that somebody is making, the harder it is to get them to put their money where their mouth is. The Bayesian is hoping that $50 is a price the other guy is willing to pay to signal his affiliation with the other non-Bayesians.

I'll bet $50 that the sun hasn't just gone nova even in the presence of a
neutrino detector that says it has.
If I lose, I lose what $50 is worth in a world where the sun just went nova. If
I win, I win $50 worth in a world where it didn't. That's a sucker bet even as
the odds of the sun just having gone nova approach 1-Epsilon.

So, what's the default clause on a contract for booze for immediate delivery?
What makes you think a rational agent will fulfill the contract?

7gwern11y

Maybe the ship's markets are built on Bitcoin and smart contracts with
capability-based architectures & automation - they can't not deliver. (Hey, it's
the future...)

-1roland11y

If the Sun has exploded wouldn't you also feel the heat wave when the neutrino
detector has gone off since heat radiation moves at the speed of light? So if
you are not feeling/seeing anything it means the Sun hasn't exploded, right?

8JoshuaZ11y

No. For example, in a supernova, the neutrinos leave the star a few hours before
the light does (since they don't get slowed down by all the mass in between).
That's why for example we were able to detect SN 1987A
[http://en.wikipedia.org/wiki/SN_1987A]'s neutrinos before the light arrived.
Similarly, the Supernova Early Warning System
[http://en.wikipedia.org/wiki/Supernova_Early_Warning_System] has been set up so
that astronomers can point their telescopes in the right direction before any of
the light gets to us (because we can detect and pinpoint a close supernova from
the burst of neutrinos).

The main thing that jumps out at me is that the strip plays on a caricature of frequentists as unable or unwilling to use background information. (Yes, the strip also caricatures Bayesians as ultimately concerned with betting, which isn't always true either, but the frequentist is clearly the butt of the joke.) Anyway, Deborah Mayo has been picking on the misconception about frequentists for a while now: see here and here, for examples. I read Mayo as saying, roughly, that of course frequentists make use of background information, they just don't do it by writing down precise numbers that are supposed to represent either their prior degree of belief in the hypothesis to be tested or a neutral, reference prior (or so-called "uninformative" prior) that is supposed to capture the prior degree of evidential support or some such for the hypothesis to be tested.

I agree, but noticing 2 requires looking into how they've done the calculations,
so simply knowing its bayesian isn't enough.

0khafra11y

It might be enough. If it's published in a venue where the authors would get
called on bullshit priors, the fact that it's been published is evidence that
they used reasonably good priors.

1JonathanLivengood11y

The point applies well to evidentialists but not so well to personalists. If I
am a personalist Bayesian -- the kind of Bayesian for which all of the nice
coherence results apply -- then my priors just are my actual degrees of belief
prior to conducting whatever experiment is at stake. If I do my elicitation
correctly, then there is just no sense to saying that my prior is bullshit,
regardless of whether it is calibrated well against whatever data someone else
happens to think is relevant. Personalists simply don't accept any such
calibration constraint.
Excluding a research report that has a correctly elicited prior smacks of
prejudice, especially in research areas that are scientifically or politically
controversial. Imagine a global warming skeptic rejecting a paper because its
author reports having a high prior for AGW! Although, I can see reasons to allow
this sort of thing, e.g. "You say you have a prior of 1 that creationism is
true? BWAHAHAHAHA!"
One might try to avoid the problems by reporting Bayes factors as opposed to
full posteriors or by using reference priors accepted by the relevant community
or something like that. But it is not as straightforward as it might at first
appear how to both make use of background information and avoid idiosyncratic
craziness in a Bayesian framework. Certainly the mathematical machinery is
vulnerable to misuse.

3JonathanLivengood11y

That depends heavily on what "the method" picks out. If you mean that the
machinery of a null hypothesis significance test against a fixed-for-all-time
significance level of 0.05, then I agree, the method doesn't promote good
practice. But if we're talking about frequentism, then identifying the method
with null hypothesis significance testing looks like attacking a straw man.

3Luke_A_Somers11y

I know a bunch of scientists who learned a ton of canned tricks and take the
(frequentist) statisticians' word on how likely associations are... and the
statisticians never bothered to ask how a priori likely these associations were.
If this is a straw man, it is one that has regrettably been instantiated over
and over again in real life.

4ChristianKl11y

If not using background information means you can publish your paper with
frequentist methods, scientists often don't use background information.
Those scientists who use less background information get more
significant results. Therefore they get more published papers. Then they get
more funding than the people who use more background information. It's publish
or perish.

3JonathanLivengood11y

You could be right, but I am skeptical. I would like to see evidence --
preferably in the form of bibliometric analysis -- that practicing scientists
who use frequentist statistical techniques (a) don't make use of background
information, and (b) publish more successfully than comparable scientists who do
make use of background information.

Andrew Gelman on whether this strip is fair to frequentists:

I think the lower-left panel of the cartoon unfairly misrepresents frequentist statisticians. Frequentist statisticians recognize many statistical goals. Point estimates trade off bias and variance. Interval estimates have the goal of achieving nominal coverage and the goal of being informative. Tests have the goals of calibration and power. Frequentists know that no single principle applies in all settings, and this is a setting where this particular method is clearly inappropriate.

...the test with 1/36 chance of error is inappropriate in a classical setting where the true positive rate is extremely low.

The error represented in the lower-left panel of the cartoon is not quite not a problem with the classical theory of statistics—frequentist statisticians have many principles and hold that no statistical principle is all-encompassing (see here, also the ensuing discussion), but perhaps it is a problem with textbooks on classical statistics, that they typically consider the conditional statistical properties of a test (type 1 and type 2 error rates) without discussing the range of applicability of the method. In the conte

No, it's not fair. Given the setup, the null hypothesis would be, I think, 'neither the Sun has exploded nor the dice come up 6', and so when the detector goes off we reject the 'neither x nor y' in favor of 'x or y' - and I think the Bayesian would agree too that 'either the Sun has exploded or the dice came up 6'!

Um, I don't think the null hypothesis is usually phrased as, "There is no effect and our data wasn't unusual" and then you conclude "our data was unusual, rather than there being no effect" when you get data with probability < .05 if the Sun hasn't exploded. This is not a fair steelmanning.

I don't follow. The null hypothesis can be phrased in all sorts of ways based on
what you want to test - that there's no effect, that the effect between two
groups (eg. a new drug and an old drug) is the same etc.
I don't know that my frequentist example does conclude the 'data was unusual'
rather than 'there was an effect'. I am not sure how a frequentist would break
apart the disjunction, or indeed, if they even would without additional data and
assumptions.

0[anonymous]11y

Null hypotheses are phrased in terms of presumed stochastic data-generating
mechanism; they do not address the data directly. That said, you are right about
the conclusion one is to draw from the test. Fisher himself phrased it as

4Manfred11y

Given the statement of the problem, this null hypothesis is not at all
probabilistic - we know it's false using deduction! This is an awful strange
thing for a hypothesis to be in a problem that's supposed to be about
probabilities.

0gwern11y

Since probabilistic reasoning is a superset of deductive logic (pace our Saint
Jaynes, RIP), it's not a surprise if some formulations of some problems turn out
that way.

0Manfred11y

Ah, you mean like in chapter 1 of his book? :P
Anyhow, I think this should be surprising. Deductive logic is all well and good,
but merely exercising it, with no mention of probabilities, is not the
characteristic behavior of something called an "interpretation of probability."
If I run a vaccine trial and none of the participants get infected, my deductive
conclusion is "either the vaccine worked, or it didn't and something else made
none of the participants get infected - QED." And then I would submit this to
The Lancet, and the reviewers would write me polite letters saying "could you do
some statistical analyses?"

0gwern11y

And you might say 'well, I don't know what 'something else' is, I can't define
it as a limit of any frequency!' At least, not with more info than is presented
in a 3 panel comic. ('I am pretty darn sure about that disjunction, though.')

0MixedNuts11y

"The machine has malfunctioned."

6Eliezer Yudkowsky11y

Why, I deny that, for the machine worked precisely as XKCD said it did.

2noen11y

I think the null hypothesis is "the neutrino detector is lying" because the
question we are most interested in is if it is correctly telling us the sun has
gone nova. If H0 is the null hypothesis and µ1 is the chance of a neutrino event
and µ2 is the odds of double sixes then H0 = µ1 - µ2. Since the odds of two dice
coming up sixes is vastly larger than the odds of the sun going nova in our
lifetime the test is not fair.

2gwern11y

I don't think one would simply ignore the dice, and what data is the frequentist
drawing upon in the comic which specifies the null?

-2noen11y

How about "the probability of our sun going nova is zero and 36 times zero is
still zero"?
Although... continuing with the XKCD theme if you divide by zero perhaps that
would increase the odds. ;)

2Cyan11y

Since the sun going nova is not a random event, strict frequentists deny that
there is a probability to associate with it.

2noen11y

Among candidate stars for going nova I would think you could treat it as a
random event. But Sol is not a candidate and so doesn't even make it into the
sample set. So it's a very badly constructed setup. It's like looking for a
needle in 200 million haystacks but restricting yourself only to those haystacks
you already know it cannot be in. Or do I have that wrong?

0Cyan11y

I'm going to try the Socratic method...
Is a coin flip a random event?

2[anonymous]11y

taboo random event.
it's deterministic, but you don't know how it will come out.

Another clarification: Induction and deduction in Bayesian data analysis
[http://www.stat.columbia.edu/~gelman/research/unpublished/philosophy_online4.pdf].

Only if the moon is, in fact, in the night sky at the time.

0brilee11y

No... because the time it takes the sun's increased brilliance to reach the moon
and reflect to the Earth is the same as the time it takes for the Earth to be
wiped out by the energy wave.

0tim11y

This assumes that the supernova is expanding at the speed of light.
According to Wikipedia:
"The explosion expels much or all of a star's material at a velocity of up to
30,000 km/s (10% of the speed of light), driving a shock wave into the
surrounding interstellar medium."

Some frequentists (not they) do those things. Most others are busy with good
work, e.g. discovering things like the Higgs. (Concerning ESP, this
[http://dbem.ws/ResponsetoWagenmakers.pdf] also springs to mind.)

4A1987dM11y

I think they discovered the Higgs in spite of being frequentists (or, more
likely, of paying lip service to frequentism), not thanks to being frequentists.

3AlexSchell11y

I agree, but notice that now you're no longer talking about the practitioners,
you're talking about the doctrine. The existence of some ghost-hunting
frequentists doesn't prevent the frequentist in the comic from being a straw man
of actual frequentists (because they're highly atypical; cf. the existence of
ghost-hunting Bayesians).

I found it hilarious. I think it's the first time I've seen Bayesians mentioned outside LW, and since there seems to be a lot of betting, wagers, and problems hinging on money, I think both are equally appropriate. Insightful for being mostly entertainment (the opposite of the articles here - aiming to be insightful, usually ending up entertaining as well?), but my warning light also went off. Perhaps I'm already too attached to the label... I'll try harder than usual to spot cult behaviour now.

I felt that the comic is quite entertaining. This marks the first time that I have seen Bayes be mentioned in the mainstream (if you can call XKCD the internet mainstream). Hopefully it will introduce Bayes to a new audience.

I feel that it is a good representation of frequentists and Bayesians. A Bayesian would absolutely use this as an opportunity to make a buck.

My general impression is that Bayes is useful in diagnosis, where there's a relatively uncontroversially already-known base rate, and frequentism is useful in research, where the priors are highly subject to disagreement.

Why this isn't necessarily true:
If we look at Bayes' theorem (that picture above, with P(A|B) pronounced
"probability of A if we learn B"), our probability of A after getting evidence B
is equal to P(A) before you saw the evidence (the "prior probability"), times a
factor P(B|A)/P(B).
This factor is called the "likelihood ratio," and it tells you how much impact
the evidence should have on your probability - what it means is that the more
unexpected the evidence would be if A wasn't true, the more the evidence
supports A. Like how UFO abduction stories aren't very convincing, because we'd
expect them to happen even if there weren't any aliens (so P(B|A)/P(B) is close
to 1, so multiplying by that factor doesn't change our belief).
Anyhow, because Bayes' theorem can be split up into parts like this, research
papers don't have to rely on priors! Each paper could just gather some evidence,
and then report the likelihood ratio - P(evidence | hypothesis)/P(evidence).
Then people with different priors would just multiply their prior, P(A), by the
likelihood ratio, and that would be Bayes' theorem, so they would each get
P(A|B). And if you want to gather evidence from multiple papers, you can just
multiply them together.
Although, that's only in a fairy-tale world with e.g. no file-drawer effect. In
reality, more care would be necessary - the point is just that differing priors
don't halt science.
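
One way to make the "report a factor, let each reader apply their own prior" idea concrete is the odds form of Bayes' theorem. Note this is a swapped-in variant: the reported factor here is P(evidence | H) / P(evidence | not H), not the P(B|A)/P(B) factor above, because the odds-form ratio does not itself depend on the reader's prior. The sketch also assumes the papers' evidence is conditionally independent, and every number in it is hypothetical.

```python
from math import prod


def update(prior: float, likelihood_ratios: list[float]) -> float:
    """Apply reported likelihood ratios to a reader's own prior probability."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# HYPOTHETICAL ratios P(evidence | H) / P(evidence | not H) from three papers.
reported_ratios = [4.0, 2.5, 3.0]

# Two readers with different HYPOTHETICAL priors for the same hypothesis H.
skeptic = update(0.05, reported_ratios)
believer = update(0.50, reported_ratios)
print(f"skeptic:  {skeptic:.3f}")
print(f"believer: {believer:.3f}")
```

Both readers use the same published numbers and still end up in different places, but neither needed the authors to pick a prior for them.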

9Cyan11y

That's not true in general
[http://lesswrong.com/lw/dlq/where_did_mathematics_begin_to_disagree_between/718r].

2Manfred11y

Fair enough. Can I take your point to be "when things get super complicated,
sometimes you can make conceptual progress only by not worrying about keeping
track of everything?" The only trouble is that once you stop keeping track of
probability/significance, it becomes difficult to pick it up again in the future
- you'd need to gather additional evidence in a better-understood way to check
what's going on. Actually, that's a good analogy for hypothesis generation, with
the "difficult to keep track of" stuff becoming the problem of uncertain priors.

3Cyan11y

My point is more like: If scientific interest only rests on some limited aspect
of the problem, you can't avoid priors by, e.g., simply reporting likelihood
ratios. Likelihood ratios summarize information about the entire problem,
including the auxiliary, scientifically uninteresting aspects. The Bayesian way
of making statements free of the auxiliary aspects (marginalization
[http://en.wikipedia.org/wiki/Marginal_distribution]) requires, at the very
least, a prior over those aspects.
I'm not sure if I agree or disagree with the third sentence on down because I
don't understand what you've written.

0jsalvatier11y

You can also do Bayesian analysis with 'non-informative' priors or
weakly-informative priors. As an example of the latter: if you're trying to
figure out the mean change in Earth's surface temperature you might say 'it's
almost certainly more than -50C and less than 50C'.

2Pavitra11y

Unfortunately, if there is disagreement merely about how much prior uncertainty
is appropriate, then this is sufficient to render the outcome controversial.

3jsalvatier11y

I think your initial point is wrong.
There are 3 situations
1. Clear prior info: Bayes works well.
2. Controversial prior info, but posterior dominated by likelihood: Choose weak
enough priors to convince skeptics. Bayes works well.
3. Controversial prior info, posterior not dominated by likelihood: If you
choose very weak priors skeptics won't be convinced. If you choose strong
priors skeptics won't be convinced. Bayes doesn't work well. Frequentism
will also not work well unless you sneak in strong assumptions.
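
What "posterior dominated by likelihood" means in situations 2 and 3 can be sketched with a conjugate Beta-binomial model. This is only an illustration under assumed numbers: the two priors and the data counts below are hypothetical, and the Beta-binomial setup is my choice of example, not something specified in the comment.

```python
def beta_posterior_mean(a: float, b: float, successes: int, trials: int) -> float:
    """Posterior mean of a Beta(a, b) prior after binomial data (conjugate update)."""
    return (a + successes) / (a + b + trials)


# HYPOTHETICAL priors on a success rate: one weak, one strongly skeptical.
weak_prior = (1.0, 1.0)        # Beta(1, 1), i.e. uniform
skeptical_prior = (2.0, 18.0)  # Beta(2, 18), prior mean 0.10

# HYPOTHETICAL data with the same 70% observed success rate at two sample sizes.
for successes, trials in [(7, 10), (7000, 10000)]:
    weak = beta_posterior_mean(*weak_prior, successes, trials)
    skeptical = beta_posterior_mean(*skeptical_prior, successes, trials)
    print(f"n={trials}: weak={weak:.3f}, skeptical={skeptical:.3f}, gap={weak - skeptical:.3f}")
```

With ten observations the two priors give very different answers (situation 3); with ten thousand the gap nearly vanishes (situation 2).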

2Cyan11y

You can get frequentism to work well by its own lights by throwing away
information. The canonical example here would be the Mann-Whitney U
[http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U] test. Even if the prior
info and data are both too sparse to indicate an adequate sampling
distribution/data model, this test will still work (for frequentist values of
"work").

Cox's theorem is a theorem. I get that the actual Bayesian methods can be infeasible to compute in certain conditions so people like certain approximations which apply when priors are non-informative, samples are large enough, etc., but why can't they admit they're approximations to something else, rather than come up with this totally new, counter-intuitive epistemology where it's not allowed to assign probabilities to fixed but unknown parameters, which is totally at odds with commonsensical usage (norma... (read more)

Because they don't accept the premises of Cox's theorem -- in particular, the
one that says that the plausibility of a claim shall be represented by a single
real number. I'm thinking of Deborah Mayo here (referenced upthread
[http://lesswrong.com/r/discussion/lw/fe5/xkcd_frequentist_vs_bayesians/7skv]).

7Paul Crowley11y

Have you tried offering de Finetti's choice to them? I had a go at one
probability-resister here
[http://ciphergoth.livejournal.com/372288.html?thread=3986496#t3986496] and
basically they squirmed like a fish on a hook.

3Cyan11y

Mayo sees the process of science as one of probing a claim for errors by
subjecting it to "severe" tests. Here the severity of a test (vis-a-vis a
hypothesis) is the sampling probability that the hypothesis fails to pass
the test given that the hypothesis does not, in fact, hold true. (Severity is
calculated holding the data fixed and varying hypotheses.) This is a
process-centred view of science: it sees good science as founded on
methodologies that rarely permit false hypotheses to pass tests.
Her pithy slogan for the contrast between her view and Bayesian epistemology is
"well-probed versus highly probable". I expect that even if she were willing to
offer betting odds on the truth of a given claim, she would still deny that her
betting odds have any relevance to the process of providing a warrant for
asserting the claim.

0roystgnr11y

You know, it's actually possible for a rational person to be unable to give
consistent answers to de Finetti's choice under certain circumstances. When the
person offering the bet is a semi-rational person who wants to win money and who
might have unknown-to-me information, that's evidence in favor of the position
they're offering to take. Because I should update in the direction of their
implied beliefs no matter which side of the bet they offered me, there will be a
range around my own subjective probability in which I won't want to take any
bet.
Sure, when you're 100% sure that the person offering the bet is a nerd who's
solely trying to honestly elicit some Bayesian subjective probability estimate,
then you're safe taking either side of the same probability bet. But I'll bet
your estimate of that likelihood is less than 100%.

4JonathanLivengood11y

I don't see how this applies to ciphergoth's example. In the example under
consideration, the person offering you the bet cannot make money, and the person
offered the bet cannot lose money. The question is, "For which of two events
would you like to be paid some set amount of money, say $5, in case it occurs?"
One of the events is that a fair coin flip comes up heads. The other is an
ordinary one-off occurrence, like the election of Obama in 2012 or the sun
exploding tomorrow.
The goal is to elicit the degree of belief that the person has in the one-off
event. If the person takes the one-off event when given a choice like this, then
we want to say (or de Finetti wanted to say, anyway) that the person's prior is
greater than 1/2. If the person says, "I don't care, let me flip a coin," like
ciphergoth's interlocutor did, then we want to say that the person has a prior
equal to 1/2. There are still lots of problems, since (among other things) in
the usual personalist story, degrees of belief have to be infinitely precise --
corresponding to a single real number -- and it is not clear that when a person
says, "Oh, just flip a coin," the person has a degree of belief equal to 1/2, as
opposed to an interval-valued degree of belief centered on 1/2 or something like
that.
But anyway, I don't see how your point makes contact with ciphergoth's.

0roystgnr11y

For a rational person with infinite processing power, my point doesn't apply.
You can also neglect air resistance when determining the trajectory of a
perfectly spherical cow in a vacuum.
For a person of limited intelligence (i.e. all of us), it's typically necessary
to pick easily-evaluated heuristics that can be used in place of detailed
analysis of every decision. I last used my "people offering me free stuff out of
nowhere are probably trying to scam me somehow" heuristic while opening mail a
few days ago [http://www.fodors.com/news/story_4845.html]. If ciphergoth's
interlocutor had been subconsciously thinking the same way, then this time they
missed a valuable opportunity for introspection, but it's not immediately
obvious that such false positive mistakes are worse than the increased
possibility of false negatives that would be created if they instead tried to
successfully outthink every "cannot lose" bet that comes their way.

0lmm11y

The person offering the bet still (presumably) wants to minimize their loss, so
they would be more likely to offer it if the unknown occurrence was impossible
than if it was certain.

3Sniffnoy11y

So, as to Savage's theorem...?

5Cyan11y

I'd expect Mayo to say something along the lines of -- translating on the fly
into LW-ese -- preferences ought not to enter into the question of how best to
establish map-territory correspondence.

1Matt_Simpson11y

I poked around, but couldn't find anything where Mayo talked about Cox's Theorem
and its premises. Did you have something particular in mind?

5Cyan11y

Ah, found it
[http://errorstatistics.com/2012/08/27/knowledgeevidence-are-not-captured-by-mathematical-probability/]:

3Matt_Simpson11y

Thanks!

2Cyan11y

As far as I know, she is not familiar with Cox's Theorem at all, nor does she
explicitly address the premise in question. I've been following her blog from
the start, and I tried to get her to read about Cox's theorem two or three
times. I stopped after I read a post which made it clear to me that she thinks
that encoding the plausibility of a claim with a single real number is not
necessary -- not useful, even -- to construct an account of how science uses
data to provide a warrant for a scientific claim. Unfortunately I don't remember
when I read the post...

In my opinion, sort of. Munroe probably left out the reasoning of the Bayesian for comic effect.

But the answer is that the Bayesian would be paying attention to the prior probability that the sun went out. Therefore, he would have concluded that the sun didn't actually go out and that the dice rolled six twice for a completely different reason.

The p-value for this problem is not 1/36. Notice that we have the following two hypotheses, namely

H0: The Sun didn't explode,
H1: The Sun exploded.

Then,

p-value = P("the machine returns yes", when the Sun didn't explode).

Now, note that the event

"the machine returns yes"

is equivalent to

"the neutrino detector measures the Sun exploding AND tells the true result" OR "the neutrino detector does not measure the Sun exploding AND lies to us".

Assuming that the dice throwing is independent of the neutrino detector measuremen... (read more)

(1/36)(1+34p0) is bounded by 1/36, so I think a classical statistician would be
happy to say that the evidence has a p-value of 1/36 here. Same for any test
where H_0 is a composite hypothesis: you just take the supremum.
A bigger problem with your argument is that it is a fully general
counter-argument against frequentists ever concluding anything. All data has to
be acquired before it can be analysed statistically, all methods of acquiring
data have some probability of error (in the real world) and the probability of
error is always 'unknowable', at least in the same sense that p0 is in your
argument.
You might as well say that a classical statistician would not say the sun had
exploded because he would be in a state of total Cartesian doubt about
everything.

0patriota9y

For this problem, the p-value is bounded by 1/36 from below, that is, p-value >=
1/36. The supremum of (1/36)(1+34p0) is 35/36 and the infimum is 1/36.
Therefore, I'm not taking the supremum; the cartoon actually took the infimum.
When you take the infimum you are assuming the neutrino detector measures
without errors, and this is a problem. The p-value, for this example, is a number
between 1/36 and 35/36.
I did not understand "the big problem" with my argument...
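
The bounds quoted in this exchange can be checked directly. In this sketch p0 is taken to be the probability that the detector registers an explosion when there is none; the definition is cut off in the comment above, so that reading is an assumption on my part. The formula itself is the one given in the thread.

```python
from fractions import Fraction


def p_value(p0: Fraction) -> Fraction:
    """P(machine says yes | Sun did not explode), using the thread's formula."""
    return Fraction(1, 36) * (1 + 34 * p0)


# p0 is a probability, so it ranges over [0, 1]; the formula is increasing in p0.
lowest = p_value(Fraction(0))   # detector never misreads
highest = p_value(Fraction(1))  # detector always misreads
print(lowest, highest)
```

So 1/36 is the smallest value the expression can take and 35/36 the largest, which is the point in dispute about infimum versus supremum.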

The point depends on differences between confidence intervals and credible
intervals.
Roughly, frequentist confidence intervals, but not Bayesian credible intervals,
have the following coverage guarantee: if you repeat the sampling and analysis
procedure over and over, in the long-run, the confidence intervals produced
cover the truth some percentage of the time corresponding to the confidence
level. If I set a 95% confidence level, then in the limit, 95% of the intervals
I generate will cover the truth.
Bayesian credible intervals, on the other hand, tell us what we believe (or
should believe) the truth is given the data. A 95% credible interval contains
95% of the probability in the posterior distribution (and usually is centered
around a point estimate). As Gelman points out, Bayesians can also get a kind of
frequentist-style coverage by averaging over the prior. But in Wasserman's
cartoon, the target is a hard-core personalist who thinks that probabilities
just are degrees of belief. No averaging is done, because the credible intervals
are just supposed to represent the beliefs of that particular individual. In
such a case, we have no guarantee that the credible interval covers the truth
even occasionally, even in the long-run.
Take a look here
[http://stats.stackexchange.com/questions/2272/whats-the-difference-between-a-confidence-interval-and-a-credible-interval]
for several good explanations of the difference between confidence intervals and
credible intervals that are much more detailed than my comment here.
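
The long-run coverage guarantee can be illustrated with a small simulation. This is a sketch under assumptions that are mine, not the commenter's: normally distributed data, a known standard deviation, and arbitrary values for the true mean, sample size, and number of repetitions.

```python
import random
import statistics


def ci_covers(true_mean: float, sigma: float, n: int, rng: random.Random) -> bool:
    """Draw one sample and report whether its 95% z-interval covers true_mean."""
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    half_width = 1.96 * sigma / n ** 0.5  # known-sigma interval
    return abs(statistics.fmean(sample) - true_mean) <= half_width


rng = random.Random(0)  # fixed seed so the run is repeatable
repeats = 20_000
hits = sum(ci_covers(true_mean=3.0, sigma=2.0, n=25, rng=rng) for _ in range(repeats))
coverage = hits / repeats
print(f"empirical coverage: {coverage:.3f}")
```

The fraction of intervals that cover the true mean should land close to 0.95 when the modelling assumptions hold, which is exactly the condition the rest of this subthread argues about.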

1gwern11y

Right. This is what my comment there was pointing out: in his very own example,
physics, 95% CIs do not get you 95% coverage since when we look at particle
physics's 95% CIs, they are too narrow. Just like his Bayesian's 95% credible
intervals. So what's the point?

0JonathanLivengood11y

I suspect you're talking past one another, but maybe I'm missing something. I
skimmed the paper you linked and intend to come back to it in a few weeks, when
I am less busy, but based on skimming, I would expect the frequentist to say
something like, "You're showing me a finite collection of 95% confidence
intervals for which it is not the case that 95% of them cover the truth, but the
claim is that in the long run, 95% of them will cover the truth. And the claim
about the long run is a mathematical fact."
I can see having worries that this doesn't tell us anything about how confidence
intervals perform in the short run. But that doesn't invalidate the point
Wasserman is making, does it? (Serious question: I'm not sure I understand your
point, but I would like to.)

0gwern11y

Well, I'll put it this way - if we take as our null hypothesis 'these 95% CIs
really did have 95% coverage', would the observed coverage-rate have p<0.05? If
it did, would you or he resort to 'No True Scotsman' again?
(A hint as to the answer: just a few non-coverages drive the null down to
extremely low levels - think about multiplying 0.05 by 0.05...)
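
The hint can be made concrete with a binomial tail probability. This is a rough sketch that assumes the intervals miss independently of one another; the counts used (5 misses out of 20 intervals) are hypothetical and are not taken from the particle-physics data under discussion.

```python
from math import comb


def prob_at_least(k: int, n: int, miss_rate: float = 0.05) -> float:
    """P(at least k of n independent intervals miss) if each misses with miss_rate."""
    return sum(
        comb(n, j) * miss_rate**j * (1 - miss_rate) ** (n - j)
        for j in range(k, n + 1)
    )


# HYPOTHETICAL counts: 20 published intervals, of which 5 miss the later value.
p = prob_at_least(5, 20)
print(f"P(>=5 misses out of 20 | true 95% coverage) = {p:.4f}")
```

Under the null of genuine 95% coverage, even a handful of misses in a modest collection of intervals is already well below the 0.05 threshold.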

0JonathanLivengood11y

Yeah, I still think you're talking past one another. Wasserman's point is that
something being a 95% confidence interval deductively entails that it has the
relevant kind of frequentist coverage. That can no more fail to be true than 2+2
can stop being 4. The null, then, ought to be simply that these are really 95%
confidence intervals, and the data then tell against that null by undermining a
logical consequence of the null. The data might be excellent evidence that these
aren't 95% confidence intervals. Of course, figuring out exactly why they aren't
is another matter. Did the physicists screw up? Were their sampling assumptions
wrong? I would guess that there is a failure of independence somewhere in the
example, but again, I haven't read the paper carefully or really looked at the
data.
Anyway, I still don't see what's wrong with Wasserman's reply. If they don't
have 95% coverage, then they aren't 95% confidence intervals.
So, is your point that we often don't know when a purportedly 95% confidence
interval really is one? Or that we don't know when the assumptions are satisfied
for using confidence intervals? Those seem like reasonable complaints. I wonder
what Wasserman would have to say about those objections.

0gwern11y

I'm saying that this stuff about 95% CI is a completely empty and broken
promise; if we see the coverage blown routinely, as we do in particle physics in
this specific case, the CI is completely useless - it didn't deliver what it was
deductively promised. It's like having a Ouija board which is guaranteed to be
right 95% of the time, but oh wait, it was right just 90% of the time so I guess
it wasn't really a Ouija board after all.
Even if we had this chimerical '95% confidence interval', we could never know
that it was a genuine 95% confidence interval. I am reminded of Borges:
It is universally admitted that the 95% confidence interval is a result of good
coverage; such is declared in all the papers, textbooks, biographies of
illustrious statisticians and other texts whose authority is unquestionable...
(Given that "95% CIs" are not 95% CIs, I will content myself with honest
credible intervals, which at least are what they pretend to be.)

My immediate takeaway from the strip was something like: "I'm the only one I know who's going to get the joke, and there is something cool about that."

The satisfaction that thought gives me makes me suspect I'm having a mental error, but I haven't identified it yet.

Two subtleties here:

1) The neutrino detector is evidence that the Sun has exploded. It's showing an observation which is 36^H^H 35 times more likely to appear if the Sun has exploded than if it hasn't (likelihood ratio of 35:1). The Bayesian just doesn't think that's strong enough evidence to overcome the prior odds, i.e., after multiplying the prior odds by 35 they still aren't very high.

2) If the Sun has exploded, the Bayesian doesn't lose very much from paying off this bet.

Nitpick, the detector lies on double-six regardless of the outcome, so the likelihood ratio is 35:1, not 36:1.
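The 35:1 figure and the odds update can be sketched in a few lines. The likelihoods follow from the comic's setup (the detector lies only on a double six); the prior odds of one in a billion are a hypothetical placeholder, since nobody in the thread commits to a number.

```python
from fractions import Fraction

# The detector lies only when both dice show six.
p_lie = Fraction(1, 36)

# Probability the detector says "yes" under each hypothesis.
p_yes_given_exploded = 1 - p_lie       # it tells the truth: 35/36
p_yes_given_not_exploded = p_lie       # it lies: 1/36

likelihood_ratio = p_yes_given_exploded / p_yes_given_not_exploded

# Hypothetical prior odds that the Sun has just exploded (an assumption).
prior_odds = Fraction(1, 10**9)

# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)

print(f"likelihood ratio: {likelihood_ratio}:1")
print(f"posterior probability of explosion: {float(posterior_prob):.2e}")
```

With any prior remotely this small, multiplying by 35 still leaves the posterior tiny, which is the point of subtlety 1.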

I just want to know why he's only betting $50.

Because the stupider the prediction is that somebody is making, the harder it is to get them to put their money where their mouth is. The Bayesian is hoping that $50 is a price the other guy is willing to pay to signal his affiliation with the other non-Bayesians.

Fair? No. Funny? Yes!

The main thing that jumps out at me is that the strip plays on a caricature of frequentists as unable or unwilling to use background information. (Yes, the strip also caricatures Bayesians as ultimately concerned with betting, which isn't always true either, but the frequentist is clearly the butt of the joke.) Anyway, Deborah Mayo has been picking on the misconception about frequentists for a while now: see here and here, for examples. I read Mayo as saying, roughly, that of course frequentists make use of background information, they just don't do it by writing down precise numbers that are supposed to represent either their prior degree of belief in the hypothesis to be tested or a neutral, reference prior (or so-called "uninformative" prior) that is supposed to capture the prior degree of evidential support or some such for the hypothesis to be tested.

And bad Bayesians use crazy priors,

1) There is no framework so secure that no one is dumb enough to foul it up.

2) By having to use a crazy prior explicitly, this brings the failure point forward in one's attention.

Andrew Gelman on whether this strip is fair to frequentists:

...

No, it's not fair. Given the setup, the null hypothesis would be, I think, 'neither the Sun has exploded nor the dice come up 6', and so when the detector goes off we reject the 'neither x nor y' in favor of 'x or y' - and I think the Bayesian would agree too that 'either the Sun has exploded or the dice came up 6'!

Um, I don't think the null hypothesis is usually phrased as, "There is no effect and our data wasn't unusual" and then you conclude "our data was unusual, rather than there being no effect" when you get data with probability < .05 if the Sun hasn't exploded. This is not a fair steelmanning.

Two clarifications: Frequentist vs Bayesian breakdown: interpretation vs inference; Beyond Bayesians and Frequentists.

Regarding the comic: if the sun exploded and it's nighttime, you could still find out by looking to see if the moon just got a lot brighter.

Y'all are/were having a better discussion here than we've had on my blog for a while....came across by chance. Corey understands error statistics.

I wish the frequentist were a straw man, but they do do stuff nearly that preposterous in the real world. (ESP tests spring to mind.)

I found it hilarious, I think it's the first time I've seen bayesians mentioned outside LW, and since it seems to be a lot of betting, wagers, problems hinging on money, I think both are equally appropriate. Insightful for being mostly entertainment (the opposite of the articles here - aiming to be insightful, usually ending up entertaining as well?), but my warning light also went off. Perhaps I'm already too attached to the label... I'll try harder than usual to spot cult behaviour now.

I felt that the comic is quite entertaining. This marks the first time that I have seen Bayes be mentioned in the mainstream (if you can call XKCD the internet mainstream). Hopefully it will introduce Bayes to a new audience.

I feel that it is a good representation of frequentists and Bayesians. A Bayesian would absolutely use this as an opportunity to make a buck.

My general impression is that Bayes is useful in diagnosis, where there's a relatively uncontroversially already-known base rate, and frequentism is useful in research, where the priors are highly subject to disagreement.

Cox's theorem is a theorem. I get that the actual Bayesian methods can be infeasible to compute in certain conditions so people like certain approximations which apply when priors are non-informative, samples are large enough, etc., but why can't they admit they're approximations to something else, rather than come up with this totally new, counter-intuitive epistemology where it's not allowed to assign probabilities to fixed but unknown parameters, which is totally at odds with commonsensical usage (norma...

In my opinion, sort of. Munroe probably left out the reasoning of the Bayesian for comic effect.

But the answer is that the Bayesian would be paying attention to the prior probability that the sun went out. Therefore, he would have concluded that the sun didn't actually go out and that the dice rolled six twice for a completely different reason.

The p-value for this problem is not 1/36. Notice that we have the following two hypotheses, namely

H0: The Sun didn't explode, H1: The Sun exploded.

Then,

p-value = P("the machine returns yes", when the Sun didn't explode).

Now, note that the event

"the machine returns yes"

is equivalent to

"the neutrino detector measures the Sun exploding AND tells the true result" OR "the neutrino detector does not measure the Sun exploding AND lies to us".

Assuming that the dice throwing is independent of the neutrino detector measuremen...
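The comment is cut off at this point, so the following is only one possible reading of where the decomposition leads, not the commenter's own calculation. It assumes the dice are independent of the measurement and introduces a hypothetical parameter `false_alarm`: the probability that the detector measures an explosion when the Sun has not exploded.

```python
from fractions import Fraction


def p_yes_given_no_explosion(false_alarm: Fraction) -> Fraction:
    """P("the machine returns yes" | the Sun didn't explode).

    false_alarm is a hypothetical parameter (not from the comic):
    P(detector measures an explosion | no explosion).
    """
    p_lie = Fraction(1, 36)  # the machine lies only on a double six
    # (measures explosion AND tells the truth) OR (measures none AND lies),
    # with the dice assumed independent of the measurement.
    return false_alarm * (1 - p_lie) + (1 - false_alarm) * p_lie


# A perfect measurement recovers the comic's 1/36.
print(p_yes_given_no_explosion(Fraction(0)))
# Any measurement error moves the value away from 1/36.
print(p_yes_given_no_explosion(Fraction(1, 100)))
```

On this reading, the p-value equals 1/36 only if the measurement itself never errs; otherwise it depends on the measurement error rate as well as the dice.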

Can someone help me understand the point being made in this response? http://normaldeviate.wordpress.com/2012/11/09/anti-xkcd/
