What Would it Take to "Prove" a Speculative Cause?
My previous essay, Why I'm Skeptical About Unproven Causes (And You Should Be Too), generated a lot of discussion here and on the Effective Altruist blog. Some related questions came up a lot: what does it take to prove a cause? What separates "proven" from "speculative" causes? And how do you get a "speculative" cause to move into the "proven" column? I think this discussion is important enough to merit elaboration at length, so that's what I'll do in this essay.
Proven Cause vs. Speculative Cause
My prime examples of proven causes are GiveWell's top charities. These organizations -- The Against Malaria Foundation (AMF), GiveDirectly, and Schistosomiasis Control Initiative (SCI) -- are rolling out programs that have been the target of significant scientific scrutiny. For example, delivering long-lasting insecticide-treated anti-malaria nets (what AMF does) has been studied in 23 different randomized controlled trials (RCTs). GiveWell has also published thorough reviews of all three organizations (see reviews for AMF, GiveDirectly, and SCI).
On the other hand, a speculative cause is a cause where the case is made entirely by intuition and speculation, with zero scientific study. For some of these causes, scientific study may even be impossible.
Now, I think 23 RCTs is a very high burden to meet. Instead, we should recognize that being "proven" is not a binary yes or no, but rather a sliding scale. Even AMF isn't completely proven -- there are still some areas of concern or potential weaknesses in the case for AMF. Likewise, other organizations working in the area, like Nothing But Nets, are nearly as proven, but lack the key elements of transparency and track record I'd need to be confident enough in them. And AMF is a lot more proven than GiveDirectly, which is potentially more proven than SCI given recent developments in deworming research.
Ideally, we'd take a Bayesian approach, where we have a certain prior estimate about how cost-effective the organization is, and then update our cost-effectiveness estimate based on additional evidence as it comes in. For reasons I argued earlier and that GiveWell has argued in "Why We Can't Take Expected Value Estimates Literally (Even When They're Unbiased)", "Maximizing Cost-Effectiveness Estimates via Critical Inquiry", and "Some Considerations Against More Investment in Cost-Effectiveness Estimates", I think our prior estimate should be quite skeptical (i.e. expect cost-effectiveness to be not as good as AMF / much closer to average than naïvely estimated) until proven otherwise.
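To make the Bayesian idea concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that both the prior and the error in the naïve estimate are normally distributed; the function name `bayesian_update` and every number are hypothetical and not taken from any real charity evaluation. The point is only that the same optimistic estimate moves a skeptical prior very little when it is noisy, and a lot when it is well measured.

```python
def bayesian_update(prior_mean, prior_sd, estimate, estimate_sd):
    """Combine a normal prior with a noisy normal estimate (conjugate update).

    Returns the posterior mean and standard deviation. Each input is
    weighted by its precision (1 / variance), so a noisy estimate
    counts for less than a precise one.
    """
    prior_precision = 1.0 / prior_sd ** 2
    estimate_precision = 1.0 / estimate_sd ** 2
    posterior_precision = prior_precision + estimate_precision
    posterior_mean = (
        prior_mean * prior_precision + estimate * estimate_precision
    ) / posterior_precision
    posterior_sd = posterior_precision ** -0.5
    return posterior_mean, posterior_sd


# Hypothetical units: "good done per dollar", with 1.0 as the skeptical
# prior (roughly an average charity). All numbers are made up.
prior_mean, prior_sd = 1.0, 1.0
naive_estimate = 100.0  # an optimistic back-of-the-envelope figure

# Same naive estimate, two different levels of measurement noise.
speculative = bayesian_update(prior_mean, prior_sd, naive_estimate, estimate_sd=50.0)
well_studied = bayesian_update(prior_mean, prior_sd, naive_estimate, estimate_sd=1.0)

print("speculative posterior mean:", speculative[0])
print("well-studied posterior mean:", well_studied[0])
```

Under these assumptions, the speculative estimate barely moves the posterior away from the prior, while the well-studied one pulls it about halfway toward the estimate. More measurement is what shrinks `estimate_sd`.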
Right now, I consider AMF, GiveDirectly, and SCI to be the only sufficiently proven interventions, but I'm open to other organizations also entering this area. Of course, this doesn't mean that all other organizations must be speculative -- instead there is a middle ground of organizations that are neither speculative nor "sufficiently proven".
From Speculative to Proven
So how does a cause become proven? Through more measurement. I think this is best described through examples:
Vegan Outreach and The Humane League work to persuade people to reduce the amount of meat in their diets in order to avoid cruelty in factory farms. They do this through leafleting and Facebook ads. Naïve cost-effectiveness estimates would guess that, even under rather pessimistic assumptions, this kind of advocacy is very cost-effective, perhaps around $0.02 to $65.92 to reduce one year of suffering on a factory farm.
But we can't be sure enough about this, and I don't think this estimate is reliable. We can make it better with additional study, though. I think that if we ran three or so more studies that were relatively independent (taking place in different areas and run by different researchers), addressed current problems with the studies (like lack of a control group), had longer time-frames and larger sample sizes, and still pointed toward a conversion rate of 1% or more, then I would start donating to this kind of outreach instead, believing it to be "sufficiently proven".
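As a rough sketch of how results from several controlled studies might be checked against that 1% bar, the Python below pools hypothetical study counts and computes an approximate 95% confidence interval for the difference in conversion rates between the leafleted and control groups. The study numbers are invented for illustration, the function name `pooled_effect` is hypothetical, and reading "conversion rate" as treated-minus-control is my assumption. Simple pooling is also the crudest possible way to combine studies; a real analysis would likely weight or model each study separately.

```python
import math


def pooled_effect(studies):
    """Pool studies and estimate the treated-minus-control conversion rate.

    `studies` is a list of tuples:
        (treated_converted, treated_n, control_converted, control_n)
    Returns (difference, lower_95, upper_95) using a normal approximation.
    """
    treated_converted = sum(s[0] for s in studies)
    treated_n = sum(s[1] for s in studies)
    control_converted = sum(s[2] for s in studies)
    control_n = sum(s[3] for s in studies)

    p_treated = treated_converted / treated_n
    p_control = control_converted / control_n
    difference = p_treated - p_control

    # Standard error of a difference between two independent proportions.
    standard_error = math.sqrt(
        p_treated * (1 - p_treated) / treated_n
        + p_control * (1 - p_control) / control_n
    )
    margin = 1.96 * standard_error
    return difference, difference - margin, difference + margin


# Hypothetical data: three independent studies, each with a control group.
hypothetical_studies = [
    (30, 1000, 10, 1000),
    (25, 1000, 8, 1000),
    (35, 1000, 12, 1000),
]

difference, lower, upper = pooled_effect(hypothetical_studies)
THRESHOLD = 0.01  # the 1% conversion rate discussed above

print("estimated conversion difference:", difference)
print("approximate 95% interval:", (lower, upper))
print("lower bound clears 1%:", lower >= THRESHOLD)
```

Comparing the lower bound of the interval to the threshold, rather than the point estimate, is one conservative way to decide whether the evidence "still points toward" 1% or more.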
Another example could be 80,000 Hours, an organization that encourages people to shoot for higher-impact careers through its free careers advising and resources. One could select a group of people who seem like good candidates for careers advice, give them all an initial survey asking specific things about their current thoughts on careers, and then randomly accept or deny them for careers advice. Then follow up with everyone a year or two later and see what initial careers they ended up in, how they got the jobs, and, for the group that got advising, how valuable the advising was in retrospect. With continued follow-up, one could measure the difference in expected impact between the two groups and figure out how good 80K is at careers advice.
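The design described above is essentially a small randomized trial. Here is a minimal Python sketch of its two mechanical steps: random assignment and a difference-in-means comparison at follow-up. The applicant names and the idea of a single numeric "impact score" per person are hypothetical placeholders; how one would actually score expected career impact is the hard part and is not addressed here.

```python
import random
from statistics import mean


def randomly_assign(applicants, seed=0):
    """Shuffle applicants and split them into (advised, not_advised) halves."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(applicants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(impact_scores, advised, not_advised):
    """Average follow-up score of the advised group minus the control group."""
    advised_mean = mean(impact_scores[name] for name in advised)
    control_mean = mean(impact_scores[name] for name in not_advised)
    return advised_mean - control_mean


# Hypothetical pool of good candidates for careers advice.
applicants = [f"applicant_{i}" for i in range(20)]
advised, not_advised = randomly_assign(applicants, seed=42)

print("advised:", len(advised), "not advised:", len(not_advised))

# A year or two later, follow-up surveys would yield one score per person,
# e.g. impact_scores = {"applicant_0": 3.5, ...}, after which:
#     difference_in_means(impact_scores, advised, not_advised)
# estimates the effect of advising.
```

Because assignment is random, the two groups should be similar on average in everything except the advising, so a persistent gap in follow-up scores can be attributed to the advising rather than to who chose to seek it out.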
Perhaps even The Machine Intelligence Research Institute (MIRI) could benefit from more measurement. The trouble is that it's working on a problem (making sure that advanced artificial intelligence goes well for humanity) that's so distant, it's difficult to get feedback. But they could still potentially assess the successes or failures of their attempts to influence the AI community, and they could still try to solicit more external reviews of their work from independent AI experts. I'm not close enough to MIRI to know whether these would be good or bad ideas, but it seems plausible at first glance that even MIRI could be better measured.
And it wouldn't be too difficult to expand this to other areas. For example, I think GiveWell's tracking of money moved is reliable enough and their commitment to self-evaluation (and external review) strong enough that I would strongly consider funding them before any of their top charities, if they ever had any room for more funding (which they currently do not; they urge you to donate to their top charities instead). Effective Animal Activism could do the same and, I think, have even higher success, because I think it's moderately likely that if someone starts donating to animal charities after joining EAA, there are few other things that could have influenced them.
Of course, these forms of measurement have their problems, and no measurement -- even two dozen RCTs -- will be perfect. But some level of feedback and measurement is incredibly necessary to avoid our own biases and failures in naïve estimation.
The Proven and The Promising: My Current Donation Strategy
My current donation strategy is to separate organizations into three categories: proven, promising, and not promising.
Proven organizations are the ones that I talked about earlier -- AMF, GiveDirectly, and SCI.
Promising organizations are organizations I think have a hope of becoming proven someday. They're organizations practicing interventions that intuitively seem like they would have high upside (like 80K in getting people into better careers and The Humane League in persuading a bunch of people to become vegetarian), have a good commitment to transparency and self-measurement (The Humane League shines here), and have opportunities for additional money to be converted into additional information on their impact.
My goal in donating would be to first ensure the survival of all promising organizations (make sure they have enough funding to stay around) and then try to buy information from promising organizations as much as I can. For example, I'd be interested in funding more studies about vegetarian outreach or making sure 80K has the money they need to hire a new careers advisor.
Once these needs are met, I'll save a fair amount of my donation to meet future needs down the road. But then, I'll spend some on proven organizations to (a) achieve impact, (b) continue the incentive for organizations to want to be proven, and (c) show public support for those organizations and donating in general.
...Now I just need to actually get some more money.
I'd like to thank Jonas Vollmer for having the critical conversation with me that inspired this piece.