"Friends do not let friends compute p values."


To test whether DID-patients were really affected by interidentity amnesia or whether they were simulating their amnesia, the authors assessed the performance of four groups of subjects on a multiple-choice recognition test. The dependent measure was the number of correct responses. The first group were the DID-patients, the second group were Controls, the third group were controls instructed to simulate interidentity amnesia (Simulators), and the fourth group were controls who had never seen the study list and were therefore True amnesiacs.

...

For instance, consider again the case of the Huntjens et al. study on DID discussed in Section 9.2.6 and throughout this book. For the data from the study, hypothesis H1a states that the mean recognition scores µ for DID-patients and True amnesiacs are the same and that their scores are higher than those of the Simulators: µcon > {µamn = µpat} > µsim, whereas hypothesis H1b states that the mean recognition scores µ for DID-patients and Simulators are the same and that their scores are lower than those of the True amnesiacs: µcon > µamn > {µpat = µsim}. Within the frequentist paradigm, a comparison of these models is problematical. Within the Bayesian paradigm, however, the comparison is natural and elegant.

What does this mean? How does nesting work? How does frequentism fail? How does Bayesianism succeed? I do not understand the example at all.

You can comfortably do Bayesian model comparison here: put priors on µcon, µamn, and µsim, and let µpat equal either µamn (under hypothesis Hamn) or µsim (under hypothesis Hsim), with Hamn and Hsim mutually exclusive. Integrating out µcon, µamn, and µsim then gives a marginal odds ratio for Hamn vs Hsim, which tells you how to update.
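To make that recipe concrete, here is a minimal Monte Carlo sketch in Python. Everything in it is invented for illustration: the scores are made-up numbers (not the Huntjens et al. data), and the Uniform(0, 15) priors and known within-group spread σ = 1.5 are simplifying assumptions, not the model anyone actually fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical recognition scores (number of correct responses) for each
# group -- made-up numbers, NOT the Huntjens et al. data.
scores = {
    "con": np.array([12.0, 13, 11, 14]),
    "amn": np.array([6.0, 7, 5, 6]),
    "pat": np.array([6.0, 5, 7, 6]),
    "sim": np.array([3.0, 2, 4, 3]),
}

SIGMA = 1.5      # assumed known within-group spread, for simplicity
N_DRAWS = 100_000

def group_loglik(x, mu):
    """Normal log-likelihood of one group's scores, for each prior draw of mu."""
    z = (x[:, None] - mu[None, :]) / SIGMA
    return (-0.5 * z**2 - np.log(SIGMA * np.sqrt(2 * np.pi))).sum(axis=0)

def log_marginal(tie):
    """Monte Carlo estimate of log p(data | H), where H ties mu_pat to
    mu_amn (H_amn) or to mu_sim (H_sim). Priors on the free means are
    independent Uniform(0, 15)."""
    mu_con = rng.uniform(0, 15, N_DRAWS)
    mu_amn = rng.uniform(0, 15, N_DRAWS)
    mu_sim = rng.uniform(0, 15, N_DRAWS)
    mu_pat = mu_amn if tie == "amn" else mu_sim
    ll = (group_loglik(scores["con"], mu_con)
          + group_loglik(scores["amn"], mu_amn)
          + group_loglik(scores["sim"], mu_sim)
          + group_loglik(scores["pat"], mu_pat))
    m = ll.max()                        # log-mean-exp, for numerical stability
    return m + np.log(np.exp(ll - m).mean())

# Bayes factor: how strongly the data favor H_amn over H_sim.
bf = np.exp(log_marginal("amn") - log_marginal("sim"))
print(f"Bayes factor for H_amn over H_sim: {bf:.1f}")
```

Because the fake patient scores track the amnesiac scores, the Bayes factor here comes out well above 1, i.e. the data favor Hamn; with real data the same machinery would adjudicate in whichever direction the evidence points.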

The standard frequentist method being discussed is nested hypothesis testing, where you test a null hypothesis H0 against an alternative H1 in which H0 is nested. For instance, you could easily test the null µcon >= µamn >= µpat = µsim against the alternative µcon >= µamn >= µpat >= µsim, since the null is just the alternative with one extra equality constraint. For hypotheses that are not nested in each other, such as H1a versus H1b in the quoted passage, the methodology is weaker, or at least less standard.
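Setting aside the inequality constraints (which require specialized order-restricted methods), the nested part, the extra equality µpat = µsim under the null, can be tested with an ordinary likelihood-ratio test. A sketch in Python, again with made-up scores and a simplifying normal model with a plug-in common scale:

```python
import math
import numpy as np

# Hypothetical scores -- not the real data.
pat = np.array([6.0, 5, 7, 6])
sim = np.array([3.0, 2, 4, 3])

def loglik(x, mu, sigma):
    """Normal log-likelihood of sample x at mean mu, scale sigma."""
    return float(np.sum(-0.5 * ((x - mu) / sigma) ** 2
                        - np.log(sigma * math.sqrt(2 * math.pi))))

both = np.concatenate([pat, sim])
sigma = both.std()              # plug-in common scale, for simplicity

# H0: mu_pat = mu_sim (one shared mean); H1: the two means are free.
ll0 = loglik(both, both.mean(), sigma)
ll1 = loglik(pat, pat.mean(), sigma) + loglik(sim, sim.mean(), sigma)

lr = 2 * (ll1 - ll0)            # asymptotically chi-squared with df = 1
p = math.erfc(math.sqrt(lr / 2))  # chi2(1) survival function
print(f"LR = {lr:.2f}, p = {p:.4g}")
```

This works precisely because H0 is H1 plus one constraint, so the statistic has a standard reference distribution; for two non-nested hypotheses like H1a and H1b, no analogous off-the-shelf recipe applies.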

I'm a beginning experimental linguist currently enrolled in a frequentist statistics course in my PhD program. I need to be able to use statistical methods to show that my experiments are valid and show real effects.

Could I successfully use Bayesian statistical analysis in lieu of ANOVAs and p-values in real theoretical work? I have other reasons to want to drop this statistics class (like taking a different class that interests me more), so if learning frequentist statistics in this class is really going to be less useful than learning Bayesian methods on my own, I would love to know that.

Any input, particularly from someone with experience in academics, would be greatly appreciated.

Science papers are surprisingly conformist. If you want to get published, you do it the way everybody else does. If you want to push Bayesian analyses, you are probably better off doing it alongside p values, instead of as a replacement for them.

Do bear in mind, though, that p-values and ANOVAs aren't *wrong*. They're specialized tools that researchers tend to misuse. Having a Bayesian background should help you understand what they are and aren't suitable for.

LWers may find useful two recent articles that summarize, for cognitive scientists, why Bayesian inference is superior to frequentist inference.

Kruschke - What to believe: Bayesian methods for data analysis

Wagenmakers et al - Bayesian versus frequentist inference

(The quote "Friends do not let friends compute p values" comes from the first article.)