RAND Health Insurance Experiment critiques

by Dustin · 1 min read · 18th Feb 2012 · 17 comments

I have neither the qualifications nor the access to properly understand these two paywalled critiques of the RAND Health Insurance Experiment.

Health Plan Switching and Attrition Bias in the RAND Health Insurance Experiment

The Rand Health Insurance Study: A Summary Critique

Has there been any talk about either of these on OB/LW? If not, why not? And could anyone with access to the papers comment on how much weight they carry?

I post this here because the RAND results are brought up so often in discussions on this site; I hope others find it an appropriate venue.

The attrition bias has been mentioned repeatedly, but I haven't seen a good response from Robin.

"but I haven't seen a good response from Robin."

That is...disappointing.

Robin's a ton of fun to read and he publishes some pretty good stuff, but the more I read of him, the more I get the impression that he likes to look at an interesting idea (like the RAND study showing lots of medical spending to be useless) and then neither does any follow-up nor guards against confirmation bias.

His daily blog posts might not be the best way to assess that. It can be hard to fill that quota regularly. Do you have some other examples?

Well, the majority of his apparent output is that blog, so I don't think it's unfair to point to it; and it being a blog doesn't mean he cannot cover followups and null results or results contradicting his pet theories like construal theory. If anything, those should be cheap posts to make - they have built-in interest as contradictions.

Besides that, there's... what? His papers, which I've already praised and which I point to routinely (eg. about 10 comments ago). As a professor, you'd think he'd have grad students, but in the many years I've been reading OB, I think there was just one double-post on a grad student's thesis, and he mentions his publications about as often, so either he's not publishing very much or he is remarkably modest about it.

The point was that some posts get much less work than others because of the fixed pseudo-daily schedule. I agree that he tends to publicly ignore counterevidence, null results, etc. Any particular topics on the blog stand out for you?

Not off-hand; I haven't kept lists or done any explicit tests of him.

Some of his theses are so incredibly broad that I'm sure there must be plenty of counter-evidence or null results, like his farmer-forager thesis, but none of them overlap with my own particular areas of interest such that I could confidently say 'yes, that is clear-cut confirmation bias' (like I can with some advocates of dual n-back or various supplements).

The only example I can think of right now is the topic of perpetuities/compound-interest charity: he had the thesis that they are helpful and also stymied by laws. I provided additional examples of the latter, which he happily posted to OB; I recently provided partial counter-examples to the former, the Islamic world's experience with the perpetual charities called _waqf_s, which he has - as far as I can tell - ignored twice now.

I'm curious about the n-back confirmation bias and the evidence those folk neglected.

Oh, that's easily explained. By going through the Brain Workshop mailing list archives and then keeping on top of all subsequent emails, I've managed to compile a fair number of failures-to-replicate in http://www.gwern.net/DNB%20FAQ#criticism and also deeply troubling criticism of studies that were reported at complete face value in places like Wired (for Jaeggi 2008) or the Wall Street Journal (for Jaeggi 2011, which we criticized here).

And I know that the failures to replicate are not widely known because I also have a Google Alert set up for dual n-back and I see how it's being discussed on blogs and forums, which invariably cite - if they cite anything - only the positive results. Then there are the people on the mailing list, who enjoy discussing positive results but ignore or insult the other results. (I fear Moody's essay has caused his name to be taken in vain more than once over the years.)

Thanks. What's your take on the claim that stereotype encouragement, along the lines of "you're part of a group that's good at math" or "this is a test that you'll be good at," can boost performance above baseline on high-stakes tests? I've heard this claimed with regard to men and Asian-Americans, but worried about publication and reporting biases.

I don't know much about stereotype encouragement. Mostly I hear about stereotype threat, which strikes me as more than a little suspicious - smells like a Clever Hans or publication bias sort of situation.

There was earlier discussion of publication bias on this topic here (such bias would make sense given the attractiveness of the claim, along with general psychology research standards). This article is paywalled, but if it matches the abstract and is itself kosher, it shows a devastating pattern in the published studies:

Abstract of the paper "Can stereotype threat explain the gender gap in mathematics performance and achievement?": Men and women score similarly in most areas of mathematics, but a gap favoring men is consistently found at the high end of performance. One explanation for this gap, stereotype threat, was first proposed by Spencer, Steele, and Quinn (1999) and has received much attention. We discuss merits and shortcomings of this study and review replication attempts. Only 55% of the articles with experimental designs that could have replicated the original results did so. But half of these were confounded by statistical adjustment of preexisting mathematics exam scores. Of the unconfounded experiments, only 30% replicated the original. A meta-analysis of these effects confirmed that only the group of studies with adjusted mathematics scores displayed the stereotype threat effect. We conclude that although stereotype threat may affect some women, the existing state of knowledge does not support the current level of enthusiasm for this as a mechanism underlying the gender gap in mathematics. We argue there are many reasons to close this gap, and that too much weight on the stereotype explanation may hamper research and implementation of effective interventions.

That was a surprisingly difficult article to obtain; Google Scholar failed, my UWash access as usual didn't work on Psycnet, a straight fulltext search failed, and when I finally found the journal in Ebscohost, the PDF download didn't work! After a while, I figured out that I could email the citation - with PDF attached - to myself. Here it is:

http://dl.dropbox.com/u/5317066/2012-stoet.pdf

EDIT: I had to laugh at this from the conclusion (or should that be cry?):

Third, we only included published studies. We believe that this is reasonable, because it is difficult to determine the scientific credibility of unpublished data. Furthermore, we do not think that a possible file drawer effect, which is the likelihood of missing articles that have not been published, would change our conclusion. More likely than not, unpublished studies would have found no differences between experimental conditions, although we can only speculate about this.
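
To make the file-drawer point concrete, here is a minimal sketch with purely hypothetical numbers (none of them come from the paper). It assumes that every successful replication gets published while a null result is published only with some probability; the function name and its parameters are illustrative, not anything the authors used.

```python
def published_replication_rate(n_success, n_null, p_publish_null):
    """Expected share of *published* studies that replicated the effect.

    Assumption (hypothetical model): every successful replication is
    published, while each null result is published with probability
    p_publish_null.
    """
    if not 0.0 <= p_publish_null <= 1.0:
        raise ValueError("p_publish_null must be between 0 and 1")
    expected_published_nulls = n_null * p_publish_null
    total_published = n_success + expected_published_nulls
    if total_published == 0:
        raise ValueError("no studies would be published")
    return n_success / total_published


# Hypothetical example: 6 successes and 34 nulls were actually run.
true_rate = 6 / (6 + 34)
observed_all = published_replication_rate(6, 34, 1.0)     # every null published
observed_drawer = published_replication_rate(6, 34, 0.4)  # 60% of nulls filed away

print(f"true rate:              {true_rate:.2f}")
print(f"no file drawer:         {observed_all:.2f}")
print(f"with file drawer (0.4): {observed_drawer:.2f}")
```

Under these assumptions the published record overstates the true replication rate, so any unpublished nulls would push the real figure below the published one rather than above it.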

Back when he was arguing against smoking being unhealthy, he was noticeably more reluctant to read anti-smoking papers than pro-smoking papers (eg, ignoring an anti-smoking paper because the author was a "partisan", but accepting a pro-smoking tobacco-industry-funded paper because it "looked professional").

Without addressing the substance of the papers, I can say that in practice most health economists still think the RAND experiment is really important. Robin Hanson is not unusual in this regard: the RAND experiment was featured prominently in the two graduate health econ classes I have taken, and in recent textbooks and papers.

What happens when health care providers are also health insurers? Hayen finds:

negative link between integration and costs. Regarding quality, evidence is mixed.