"Statistically significant results" mean that there's a 5% chance that results are wrong in addition to chance that the wrong thing was measures, chance that sample was biased, chance that measurement instruments were biased, chance that mistakes were made during analysis, chance that publication bias skewed results, chance that results were entirely made up and so on.

"Not statistically significant results" mean all those, except chance of randomly mistaken results even if everything was ran correct is not 5%, but something else, unknown, and dependent of strength of the effect measured (if the effect is weak, you can have study where chance of false negative is over 99%).

So whether results are statistically significant or not is really not that useful.

For example, here's a survey of civic knowledge. Plus or minus 3% measurement error? Not this time: they just completely made up the results.

Take-home exercise: what do you estimate the Bayesian chance of published results being wrong to be?


"Statistically significant results" mean that there's a 5% chance that results are wrong

Wrong. It means that the researcher defined a class of results such that the class had less than a 5% chance of occurring if the null hypothesis were true, and that the actual outcome fell into this class.

There are all sorts of things that can go wrong with that, but, even leaving all those aside, it doesn't mean there's a 5% chance the results are wrong. Suppose you're investigating psychic powers, and that the journals have (as is usually the case!) a heavy publication bias toward positive results. Then the journal will be full of statistically significant results and they will all be wrong.
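To make the psychic-powers point concrete, here is a hedged simulation of my own (not the commenter's): every study tests a nonexistent effect, only p < 0.05 results get published, and the journal still fills up with papers, all of them wrong.

```python
# Assumed setup: 10,000 attempted studies, no real effect anywhere, and a
# journal that only publishes "significant" results.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
published = 0
for _ in range(10_000):
    psychics = rng.normal(0.0, 1.0, 50)   # no real effect in either group
    controls = rng.normal(0.0, 1.0, 50)
    _, p = ttest_ind(psychics, controls)
    if p < 0.05:                          # only "significant" results get published
        published += 1

print(f"Published papers (all false positives): {published}")  # around 500
```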

In fairness, your last point isn't really about confidence levels. A journal that only accepted papers written in the Bayesian methodology, but had the same publication bias, would be just as wrong.

A journal that reported likelihood ratios would at least be doing better.

A journal that actually cared about science would accept papers before the experiment had been done, with a fixed statistical methodology submitted with the paper in advance rather than data-mining the statistical significance afterward.
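For what it's worth, here is a rough sketch of what reporting a likelihood ratio could look like (the numbers are assumed, not from the thread): compare how well two specific hypotheses about the effect size predict the observed sample mean, rather than only asking whether the null is rejected.

```python
# Assumed example: a study of 50 subjects observes a mean of 0.3 (unit variance).
# Compare the likelihood of that outcome under "no effect" vs "effect of 0.5".
import numpy as np
from scipy.stats import norm

observed_mean = 0.3        # hypothetical study outcome
n = 50
sem = 1.0 / np.sqrt(n)     # standard error of the mean, assuming unit variance

likelihood_h0 = norm.pdf(observed_mean, loc=0.0, scale=sem)
likelihood_h1 = norm.pdf(observed_mean, loc=0.5, scale=sem)

print(f"Likelihood ratio H1:H0 = {likelihood_h1 / likelihood_h0:.1f}")  # about 3.5
```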

A journal that actually cared about science

Is this meant to suggest that journal editors literally don't care about science that much, or simply that "people are crazy, the world is mad"?

Not an objection, but a lot of the articles in that journal would be "here's my reproduction of the results I got last year and published then".

...which is a really good thing, on reflection.

I'm confused by your remark. "5% chance of false positive" obviously means P(positive results|null hypothesis true) = 5%; P(null hypothesis true|positive results) is subjective and has no particular meaningful value, so I couldn't have been talking about that.
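A small worked example (with assumed numbers) shows why the two conditional probabilities come apart: P(null true | positive result) depends on the prior and on statistical power, neither of which the p value contains.

```python
# Assumed numbers for illustration only.
prior_effect_real = 0.1   # assumed: 1 in 10 tested hypotheses is true
power = 0.5               # assumed: chance a real effect reaches p < 0.05
alpha = 0.05              # false positive rate when the null is true

p_positive = prior_effect_real * power + (1 - prior_effect_real) * alpha
p_null_given_positive = (1 - prior_effect_real) * alpha / p_positive

print(f"P(null true | significant result) = {p_null_given_positive:.0%}")  # ~47%
```

With these numbers nearly half of the "significant" results are false positives, even though alpha is 5%.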

"Statistically significant results" mean that there's a 5% chance that results are wrong

Hmm. Assuming the experiment was run correctly, it means there's a less than 5% chance that data this extreme would have been generated if the null hypothesis - that nothing interesting was happening - were true. The actual chance can be specified as e.g. 1%, 0.01%, or whatever.

Also, assuming everything was done correctly, it's really the conclusions drawn from the results, rather than the results themselves, that might be wrong...

The point is that this chance, no matter how small, is in addition to the massive number of things that could have gone wrong.

And with negative results you don't even have that.

Yes, classical hypothesis testing is of questionable value - with a precise enough estimate we will almost always reject the null hypothesis, but who cares? I think that "the chance the results are wrong" is not the most helpful way to think about research in many areas.

Of course, it is important to remind ourselves of the many types of mistakes possible in our research.

Five minutes on Google didn't turn up the study I'm thinking of, but I remember reading a study claiming that around 2/3 of all published studies were false positives due to publication bias (the p value they reported was sufficiently small to believe it).

I did, however, find a metastudy that studied publication bias in papers about publication bias (link).

They found "statistically insignificant" (p = 0.13) evidence for false positives there too.

The way I tend to deal with published studies now is to treat them as weak evidence unless I'm interested enough to look further.

If the p value is not really low, I'll make guesses at how popular a topic of study it is (how many times can you try for a positive result?), how I heard about the study (more possibility for selection bias), how controversial the topic is (how strong is the urge to fudge something?), and what my prior probability would be.

For example, when someone tells me about a study that claims "X causes cancer" and 1) p = .04, 2) it would somehow benefit the person if the claim were true, 3) I see no prior reason for a link between X and cancer, and 4) I see possible other causes for the correlation that were not obviously corrected for, then I assign very little weight to this evidence.

If I find a study by googling the topic, p = 0.001, the topic isn't all that controversial, and no one would even think to test it if they did not assign it high prior probability, then I'll file it under "known".
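Putting rough numbers on the "X causes cancer, p = .04" case above (both the prior and the likelihood ratio are my own assumptions, chosen to be on the generous side):

```python
# Assumed numbers: a skeptical prior plus a generous reading of a p ~ .04 result.
prior_odds = 1 / 99          # assumed: 1% prior that X really causes cancer
likelihood_ratio = 3         # assumed: a fairly generous Bayes factor for the study
posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)

print(f"Posterior probability: {posterior_prob:.0%}")   # about 3%
```

Which matches the intuition: a low prior plus a modest p value still leaves the claim unlikely.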