Even if you have a nail, not all hammers are the same



(Related to Over-encapsulation and Subtext is not invariant under linear transformation)

Between 2004 and 2007, Goran Bjelakovic et al. published 3 famous meta-analyses of vitamin supplements, concluding that vitamins don't help people but instead kill people.  This is now the accepted dogma; if you ask your doctor about vitamins, she's likely to tell you not to take them, based on reading either one of these articles or one of the many summaries of them in secondary sources like The Mayo Clinic Journal.

The 2007 study claims that beta-carotene and vitamins A and E are positively correlated with death - the more you take, the more likely you are to die. Therefore, vitamins kill.  The conclusion on E requires a little explanation, but the data on beta-carotene and A is simple and specific:

Univariate meta-regression analyses revealed significant influences of dose of beta carotene (Relative Risk (RR), 1.004; 95% CI, 1.001-1.007; P = .012), dose of vitamin A (RR, 1.000006; 95% CI, 1.000002-1.000009; P = .003), ... on mortality.

This appears to mean that, for each mg of beta carotene that you take, your risk of death increases by a factor (RR) of 1.004; for each IU of vitamin A that you take, by a factor of 1.000006.  "95% CI, 1.001-1.007" means that the standard deviation of the sample indicates a 95% probability that the true RR lies somewhere between 1.001 and 1.007.  "P = .012" means that there's only a 1.2% chance that you would be so unlucky as to get a sample giving that result, if in fact the true RR were 1.

A risk factor of 1.000006 doesn't sound like much; but I'm taking 2,500 IU of vitamin A per day.  That gives a 1.5% increase in my chance of death!  (Per 3.3 years.)  And look at those P-values: .012, .003!
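The arithmetic behind that 1.5% can be checked in a few lines. This is only a sketch, and it assumes that each IU multiplies the risk by the reported per-IU RR; the names RR_PER_IU, DOSE_IU and compounded_risk are mine, not the paper's.

```python
RR_PER_IU = 1.000006   # relative risk per IU of vitamin A, as quoted from the paper
DOSE_IU = 2500         # my daily dose, from the text above

def compounded_risk(rr_per_unit, dose):
    """Relative risk if each unit of dose multiplies risk by rr_per_unit (an assumption)."""
    return rr_per_unit ** dose

rr = compounded_risk(RR_PER_IU, DOSE_IU)
print(f"relative risk at {DOSE_IU} IU/day: {rr:.4f}")
print(f"increase over baseline: {(rr - 1) * 100:.2f}%")
```

Under that assumption the relative risk comes out at about 1.015, which is where the 1.5% figure comes from.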

So why do I still take vitamins?

What all of these articles do, in excruciating detail with regard to sample selection (though not so much with regard to the math), is to run a linear regression on a lot of data from studies of patients taking vitamins.  A linear regression takes a set of data where each datapoint looks like this:

     Y = a_1 X_1 + c

and a multiple linear regression takes a set of data where each datapoint usually looks like this:

     Y = a_1 X_1 + a_2 X_2 + ... + a_n X_n + c

where Y and all the X_i's are known.  In this case, Y is a 1 for someone who died and a 0 for someone who didn't, and each X_i is the amount of some vitamin taken.  In either case, the regression finds the values for a_1, ..., a_n, c that best fit the data, meaning they minimize the sum, over all data points, of the squared error of the value predicted for Y: (Y - (a_1 X_1 + a_2 X_2 + ... + a_n X_n + c))^2.
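For the single-variable case, the least-squares fit has a closed form, sketched below. The doses and outcomes are made-up numbers for illustration only, and fit_line and squared_error are hypothetical helper names, not anything from the papers.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = a1*X1 + c; returns (a1, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    c = mean_y - a1 * mean_x
    return a1, c

def squared_error(xs, ys, a1, c):
    """Sum over all data points of (Y - (a1*X1 + c))^2."""
    return sum((y - (a1 * x + c)) ** 2 for x, y in zip(xs, ys))

# Made-up data: dose taken, and outcome (1 = died, 0 = survived)
doses = [0, 0, 10, 10, 20, 20, 30, 30]
died = [0, 0, 0, 1, 0, 1, 1, 1]

a1, c = fit_line(doses, died)
print(f"a1 = {a1:.3f}, c = {c:.3f}")
print(f"squared error at the fit: {squared_error(doses, died, a1, c):.3f}")
```

Any other choice of a1 and c gives a larger squared error on the same data, which is all that "best fit" means here.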

Scientists love linear regression.  It's simple, fast, and mathematically pure.  There are lots of tools available to perform it for you.  It's a powerful hammer in a scientist's toolbox.

But not everything is a nail.  And even for a nail, not every hammer is the right hammer.  You shouldn't use linear regression just because it's the "default regression analysis".  When a paper says they performed "a regression", beware.

A linear analysis assumes that if 10 milligrams is good for you, then 100 milligrams is ten times as good for you, and 1000 milligrams is a hundred times as good for you.

This is not how vitamins work.  Vitamin A is toxic in doses over 15,000 IU/day, and vitamin E is toxic in doses over 400 IU/day (Miller et al. 2004, Meta-Analysis: High-Dosage Vitamin E Supplementation May Increase All-Cause Mortality;  Berson et al. 1993, Randomized trial of vitamin A and vitamin E supplementation for retinitis pigmentosa.). The RDA for vitamin A is 2500 IU/day for adults. Good dosage levels for vitamin A appear to be under 10,000 IU/day, and for E, less than 300 IU/day. (Sadly, studies rarely discriminate in their conclusions between dosage levels for men and women.  Doing so would give more useful results, but make it harder to reach the coveted P < .05 or P < .01.)

Quoting from the 2007 JAMA article:

The dose and regimen of the antioxidant supplements were: beta carotene 1.2 to 50.0 mg (mean, 17.8 mg), vitamin A 1333 to 200 000 IU (mean, 20 219 IU), vitamin C 60 to 2000 mg (mean, 488 mg), vitamin E 10 to 5000 IU (mean, 569 IU), and selenium 20 to 200 μg (mean 99 μg) daily or on alternate days for 28 days to 12 years (mean 2.7 years).

The mean values used in the study of both A and E are in ranges known to be toxic. The maximum values used were ten times the known toxic levels, and about 20 times the beneficial levels.

17.8 mg of beta-carotene translates to about 30,000 IUs of vitamin A, if it were converted to vitamin A. This is also a toxic value. It is surprising that beta-carotene showed toxicity, though, since common wisdom is that beta-carotene is converted to vitamin A only as needed.

Vitamins, like any medicine, have an inverted-J-shaped response curve. If you graph their health effects, with dosage on the horizontal axis, and some measure of their effects - say, change to average lifespan - on the vertical axis, you would get an upside-down J. (If you graph the death rate on the vertical axis, as in this study, you would get a rightside-up J.) That is, taking a moderate amount has some good effect; taking a huge amount has a large bad effect.

If you then try to draw a straight line through the J that best-matches the J, you get a line showing detrimental effects increasing gradually with dosage. The results are exactly what we expect. Their conclusion, that "Treatment with beta carotene, vitamin A, and vitamin E may increase mortality," is technically correct. Treatment with anything may increase mortality, if you take ten times the toxic dose.
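Here is a small sketch of that effect. The death-rate curve is invented purely for illustration (a mild benefit up to a dose of 10 units, steady harm above it); it is not data from any of the studies, and death_rate and fit_line are hypothetical names.

```python
def death_rate(dose):
    """Made-up J-shaped curve: mild benefit up to dose 10, steady harm above it."""
    if dose <= 10:
        return 0.10 - 0.002 * dose
    return 0.08 + 0.004 * (dose - 10)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

doses = list(range(0, 55, 5))            # 0, 5, ..., 50
rates = [death_rate(d) for d in doses]
slope, intercept = fit_line(doses, rates)

print(f"fitted slope per unit dose: {slope:+.4f}")
print(f"true rate at dose 0: {death_rate(0):.2f}, at dose 10: {death_rate(10):.2f}")
```

The fitted slope is positive, so the straight line reports harm at every dose, even though the curve it was fitted to has a lower death rate at dose 10 than at dose 0.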

For a headache, some people take four 200 mg tablets of aspirin. Ten tablets of aspirin might be toxic. If you made a study averaging in people who took from 1 to 100 tablets of aspirin for a headache, you would find that "aspirin increases mortality".

(JAMA later published 4 letters criticizing the 2007 article.  None of them mentioned the use of linear regression as a problem.  They didn't publish my letter - perhaps because I didn't write it until nearly 2 months after the article was published.)

Anyone reading the study should have been alerted to this by the fact that all of the water-soluble vitamins in the study showed no harmful effects, while all of the fat-soluble vitamins "showed" harmful effects. Fat-soluble vitamins are stored in the fat, so they build up to toxic levels when people take too much for a long time.

A better methodology would have been to use piecewise (or "hockey-stick") regression, which assumes the data is broken into 2 sections (typically one sloping downwards and one sloping upwards), and tries to find the right breakpoint, and perform a separate linear regression on each side of the break that meets at the break.  (I almost called this "The case of the missing hockey-stick", but thought that would give the answer away.)
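One way to sketch a hockey-stick fit, under my own assumptions: model the response as y = b0 + b1*x + b2*max(0, x - breakpoint), which forces the two line segments to meet at the break, then try each candidate breakpoint and keep the one with the smallest squared error. The data is the same kind of made-up J-shaped curve as before, and solve, fit_hockey_stick and best_breakpoint are hypothetical helpers, not the method of any statistics package.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_hockey_stick(xs, ys, breakpoint):
    """Least-squares fit of y = b0 + b1*x + b2*max(0, x - breakpoint); returns (beta, sse)."""
    rows = [[1.0, x, max(0.0, x - breakpoint)] for x in xs]
    # Normal equations: (R^T R) beta = R^T y
    rtr = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    beta = solve(rtr, rty)
    sse = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2 for r, y in zip(rows, ys))
    return beta, sse

def best_breakpoint(xs, ys, candidates):
    """Try every candidate breakpoint; return (breakpoint, beta, sse) with the lowest sse."""
    fits = ((c,) + fit_hockey_stick(xs, ys, c) for c in candidates)
    return min(fits, key=lambda t: t[2])

# Made-up J-shaped death rates: benefit up to dose 10, harm above it
doses = list(range(0, 55, 5))
rates = [0.10 - 0.002 * d if d <= 10 else 0.08 + 0.004 * (d - 10) for d in doses]

c, beta, sse = best_breakpoint(doses, rates, doses[1:-1])
print(f"breakpoint: {c}")
print(f"slope below break: {beta[1]:+.4f}, slope above break: {beta[1] + beta[2]:+.4f}")
```

On this toy data the search recovers the break at dose 10, with a negative slope below it and a positive slope above it, which is the information the single straight line throws away.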

Would these articles have been accepted by the most-respected journals in medicine if they evaluated a pharmaceutical in the same way?  I doubt it; or else we wouldn't have any pharmaceuticals.  Bias against vitamins?  You be the judge.

Meaningful results have meaningful interpretations

The paper states the mortality risk in terms of "relative risk" (RR).  But relative risk is used for studies of 0/1 conditions, like smoking/no smoking, not for studies that use regression on different dosage levels.  How do you interpret the RR value for different dosages?  Is it RR x dosage?  Or RR^dosage (each unit multiplies risk by RR)?  The difference between these interpretations is trivial for standard dosages.  But can you say you understand the paper if you can't interpret the results?
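To see how far apart the two readings get, here is a rough comparison using the reported vitamin A figure. It rests on an assumption of mine: I read "RR x dosage" as excess risk growing linearly with dose, i.e. 1 + (RR - 1) x dosage, since literally multiplying RR by the dose gives nonsense. The function names are hypothetical.

```python
RR_PER_IU = 1.000006  # reported relative risk per IU of vitamin A

def additive_rr(dose):
    """Assumed reading of 'RR x dosage': excess risk scales linearly with dose."""
    return 1 + (RR_PER_IU - 1) * dose

def multiplicative_rr(dose):
    """'RR^dosage': each unit of dose multiplies risk by RR."""
    return RR_PER_IU ** dose

# A standard dose, then the mean and maximum vitamin A doses quoted from the JAMA article
for dose in (2500, 20219, 200000):
    print(f"{dose:>7} IU: additive {additive_rr(dose):.3f}, "
          f"multiplicative {multiplicative_rr(dose):.3f}")
```

At 2,500 IU the two agree to about three decimal places; at the study's maximum dose of 200,000 IU they differ by more than a full unit of relative risk.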

To answer this question, you have to ask exactly what type of regression the authors used.  Even if a linear non-piecewise regression were correct, the best regression analysis to use in this case would be a logistic regression, which estimates the probability of a binary outcome conditioned on the regression variables. The authors didn't consider it necessary to report what type of regression analysis they performed; they reported only the computer program (STATA) and the command ("metareg").  The STATA metareg manual is not easy to understand, but three things are clear:

  • It doesn't use the word "logistic" anywhere, and it doesn't use the logistic function, so it isn't logistic regression.
  • It does regression on the log of the risk ratio between two binary cases, a "treatment" case and a "no-treatment" case; and computes regression coefficients for possibly-correlated continuous treatment variables (such as vitamin doses).
  • It doesn't directly give relative risk for the correlated variables.  It gives regression coefficients telling the change in log relative risk per unit of (in this case) beta carotene or vitamin A.  If anything, the reported RR is probably e^r, where r is the computed regression coefficient.  This means the interpretation is that risk is proportional to RR^dosage.

Since there is no "treatment/no treatment" case for this study, but only the variables that would be correlated with treatment/no treatment, it would have been impossible to put the data into a form that metareg can use.  So what test, exactly, did the authors perform?  And what do the results mean?  It remains a mystery to me - and, I'm willing to bet, to every other reader of the paper.


Bjelakovic et al. 2007, "Mortality in randomized trials of antioxidant supplements for primary and secondary prevention: Systematic review and meta-analysis", Journal of the American Medical Association, Feb. 28 2007. See a commentary on it here.

Bjelakovic et al. 2006, "Meta-analysis: Antioxidant supplements for primary and secondary prevention of colorectal adenoma", Alimentary Pharmacology & Therapeutics 24, 281-291.

Bjelakovic et al. 2004, "Antioxidant supplements for prevention of gastrointestinal cancers: A systematic review and meta-analysis," The Lancet 364, Oct. 2 2004.