Why epidemiology will not correct itself



We're generally familiar here with the appalling state of medical and dietary research, where most correlations turn out to be bogus. (If we're not, I have collected a number of links on the topic in my DNB FAQ, see http://www.gwern.net/DNB%20FAQ#flaws-in-mainstream-science-and-psychology - the best one to read first is probably Ioannidis's “Why Most Published Research Findings Are False”.)

I recently found a talk arguing that this problem is worse than one might assume, with false positive rates above 80%, and, more interestingly, explaining why the rate is so high and will remain high for the foreseeable future. The speaker, Young, asserts, pointing to papers and textbooks by epidemiologists, that they are perfectly aware of what the Bonferroni correction does (and why one would use it) and that they choose not to use it because they do not want to risk any false negatives. (Young also conducts some surveys showing less interest in public sharing of data and other good things like that, but that seems to me to be much less important than the statistical tradeoffs.)
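
To make the tradeoff concrete, here is a minimal simulation sketch. It is my own illustration, not something taken from the talk. It assumes 20 independent tests in which every null hypothesis is true, so each p-value is uniform on [0, 1]; the function name `familywise_error_rate` and all of its parameters are hypothetical choices for this example.

```python
import random


def familywise_error_rate(n_tests, alpha, n_trials=2000, bonferroni=False, seed=0):
    """Estimate the chance of at least one false positive across n_tests
    independent tests when every null hypothesis is actually true."""
    rng = random.Random(seed)
    # Bonferroni: divide the significance threshold by the number of tests.
    threshold = alpha / n_tests if bonferroni else alpha
    studies_with_false_positive = 0
    for _ in range(n_trials):
        # Assumption: under a true null, a p-value is uniform on [0, 1].
        if any(rng.random() < threshold for _ in range(n_tests)):
            studies_with_false_positive += 1
    return studies_with_false_positive / n_trials


uncorrected = familywise_error_rate(20, 0.05)
corrected = familywise_error_rate(20, 0.05, bonferroni=True)
print("uncorrected:", uncorrected)
print("Bonferroni: ", corrected)
```

Analytically, the uncorrected rate is 1 - (1 - 0.05)^20, or roughly 0.64, while the corrected rate stays near 0.05. The price is that each individual test must now clear a threshold of 0.0025, which is exactly what raises the risk of false negatives that the epidemiologists say they want to avoid.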

There are three papers online that seem representative:

  1. Rothman (1990)
  2. Perneger (1998)
  3. Vandenbroucke, PLoS Med (2008)

Reading them is a little horrifying when one considers the costs of the false positives, all the people trying to stay healthy by following what is only random noise, and the general (and justified!) contempt for science among those aware of the false positive rate. (I expand on this line of thought on Reddit. The recent kerfuffle about whether salt really is bad for you - medical advice that has stressed millions and will cost more millions due to New York City's war on salt - is a reminder of what is at stake.)

The take-away, I think, is to resolutely ignore anything to do with diet & exercise that is not a randomized trial. Correlations may be worth paying attention to in other areas but not in health.