I am highly grateful to Alexey Morgunov and Adam Casey for reviewing and commenting on an earlier draft of this post, and pestering me into migrating the content from many emails to a somewhat coherent post.
Will Crouch has posted about the Centre for Effective Altruism and in a follow-up post discussed questions in more detail. The general sense of the discussion of that post was that the arguments were convincing and that donating to CEA is a good idea. Recently, he visited Cambridge, primarily to discuss 80,000 hours, and several Cambridge LWers spoke with him. These discussions caused a number of us to substantially downgrade our estimates of the effectiveness of CEA, and made our concerns more concrete.
We're aware that our kind often don't cooperate well, but we are concerned that at present CEA's projects are unlikely to cash out into large numbers of people changing their behaviour. Ultimately, we are concerned that the space for high impact meta-charity is limited, and that if CEA is suboptimal this will have large opportunity costs. We want CEA to change the world, and would prefer this happens quickly.
The key argument in favour of donating money to CEA which was presented by Will was that by donating $1 to CEA you produce more than $1 in donations to the most effective charities. We present some apparent difficulties with this remaining true on the margin. We also present more general worries with CEA as an organisation under these headings:
Cost effectiveness estimates
Impact of 80,000 hours advice
Content of 80,000 hours advice
The 80,000 hours pledge
Scope and Goals
Speed of growth
It is worrying how little of the key information about CEA is publicly available. This makes assessment hard. In contrast to GiveWell, CEA programs are not particularly open about where their money is spent, what their marginal goals are, or what they are doing internally. As presented online, the majority of both 80,000 hours' and GWWC's day to day activity is maintaining blogs. These blogs are not substantial by comparison to, say, OB in terms of their frequency of content or their frequency of insight. Concretely, it does not seem that CEA is being transparent in the sense of GiveWell.
Qn: How does CEA think its programs would score on a GiveWell assessment?
Qn: Does CEA think that GiveWell’s assessments systemically go wrong?
Qn: Does CEA consider the blogs to be a substantial source of impact? What external assessments or objective data support a claim of impact from the blogs?
Cost effectiveness estimates
As presented online and in person, CEA does not present as having credible models for their future impact. The GWWC site, for example, claims that from 291 members there will be £72.68M pledged. This equates to £250K / person over the course of their life. Claiming that this level of pledging will occur requires either unreasonable rates of donation or multi-decade payment schedules. If, in line with GWWC's projections, around 50% of people will maintain their donations, then assuming a linear drop off the expected pledge from a full time member is around £375K. Over a lifetime, this is essentially £10K / year. It seems implausible that expected mean annual earnings for GWWC members are of order £100K.
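The per-member arithmetic above can be reproduced with a short sketch. The pledge total, the member count and the £375K figure are taken from the paragraph; the 10% pledge rate and the 37.5-year giving period are assumptions chosen to match the "£10K / year" figure, not numbers stated by GWWC, and the function names are hypothetical.

```python
# Rough check of the per-member pledge arithmetic (a sketch, not GWWC's model).

TOTAL_PLEDGED_GBP = 72_680_000      # quoted from the GWWC site in the text
MEMBERS = 291                       # quoted from the GWWC site in the text
FULL_MEMBER_PLEDGE_GBP = 375_000    # the post's estimate for a member who keeps giving
GIVING_YEARS = 37.5                 # ASSUMPTION: giving period implied by "£10K / year"
PLEDGE_PERCENT = 10                 # ASSUMPTION: the standard 10% pledge


def mean_pledge(total_gbp, members):
    """Mean lifetime pledge per member."""
    return total_gbp / members


def annual_donation(lifetime_pledge_gbp, years):
    """Average yearly donation needed to meet a lifetime pledge."""
    return lifetime_pledge_gbp / years


def implied_income(annual_donation_gbp, pledge_percent):
    """Annual income at which the donation equals the pledged percentage."""
    return annual_donation_gbp * 100 / pledge_percent


if __name__ == "__main__":
    per_member = mean_pledge(TOTAL_PLEDGED_GBP, MEMBERS)
    per_year = annual_donation(FULL_MEMBER_PLEDGE_GBP, GIVING_YEARS)
    income = implied_income(per_year, PLEDGE_PERCENT)
    print(f"Mean pledge per member: £{per_member:,.0f}")
    print(f"Annual donation for a full member: £{per_year:,.0f}")
    print(f"Implied annual income at {PLEDGE_PERCENT}%: £{income:,.0f}")
```

Changing `GIVING_YEARS` or `PLEDGE_PERCENT` shows how sensitive the implied income is to those two assumptions.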
Qn: On what basis does GWWC assert that its near 300 members are credibly precommitted to donating £72.68M?
Looking at valuing marginal impacts, it would be hoped CEA's programs are better. For example, it has been stated that GWWC has an internal price of around £1700 for new pledges. This does not appear to extend to new programs, or to portions of 80,000 hours. In recent conversation with Will Crouch, he was asked what marginal value was placed on having a new intern in Cambridge (UK). There was no numerate response. Indeed, the assorted estimates do not cohere. If new pledges are worth £10K / year in expectation, and even 10% of the donations flow into CEA, then an intern generating 20 marginal pledges is a winning proposition for CEA at their stated wage level. If the horizon time for 20 pledges from one worker is larger than CEA can afford to wait, then it is not clear that CEA has an effective program for using their interns.
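The intern example reduces to a small calculation, sketched below. All inputs are the figures quoted in the paragraph above; the function names are hypothetical, and CEA's stated intern wage is not given here, so the comparison against it is left open.

```python
# Sketch of the intern break-even reasoning, using only figures quoted in the text.

PLEDGES_FROM_INTERN = 20            # marginal pledges generated by one intern
PLEDGE_VALUE_GBP_PER_YEAR = 10_000  # expected donations per pledge per year
SHARE_TO_CEA_PERCENT = 10           # "even 10% of the donations flow into CEA"
INTERNAL_PRICE_GBP = 1_700          # reported internal price of one new pledge


def annual_inflow_to_cea(pledges, value_per_year, share_percent):
    """Yearly money reaching CEA from a set of new pledges (integer pounds)."""
    return pledges * value_per_year * share_percent // 100


def value_at_internal_price(pledges, price_per_pledge):
    """Value of the same pledges at the reported internal price."""
    return pledges * price_per_pledge


if __name__ == "__main__":
    inflow = annual_inflow_to_cea(
        PLEDGES_FROM_INTERN, PLEDGE_VALUE_GBP_PER_YEAR, SHARE_TO_CEA_PERCENT
    )
    priced = value_at_internal_price(PLEDGES_FROM_INTERN, INTERNAL_PRICE_GBP)
    print(f"Annual inflow to CEA from 20 pledges: £{inflow:,}")
    print(f"Value of 20 pledges at the internal price: £{priced:,}")
```

The two valuations differ because one is a recurring yearly flow and the other a one-off price, which is part of why the estimates do not obviously cohere.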
Qn: What does GWWC or 80,000 hours see as the marginal impact of one additional grad student in full time labour?
Qn: What is the horizon against which CEA programs are acting?
The primary less visible activity of both GWWC and 80,000 hours is research. For GWWC, there are questions to be resolved about how best to Earn to Give, whether there are other activities which are less immediately fiscal but of high impact, and broadly how to identify near optimal opportunities for donation. For 80,000 hours, there is a need to establish how to optimise career paths for a broad set of potential terminal goals. Neither project appears to be bearing visible fruit. In conversation with Will Crouch, he observed that 80,000 hours don't know much about the burnout rates of various careers, the wage progressions or the likelihood of career progression.
At present, this means that 80,000 hours is not publicly presenting things which are better than other sources of advice. There is a need for the best current knowledge to be available quickly; there are people who are deciding careers now who are unlikely to do reliably better than average on the basis of the information that 80,000 hours has made public. It seems implausible that new results are reliably coming in so quickly that the time spent publishing the internal state of the art will substantively slow down further improvements. There is a strong sense in which they are being graded on their speed, with publication being the mediator of impact. It also seems plausible that the publishing and research are substantially orthogonal, and would use different people. Hence from the outside, the lack of published concrete advice seems to be a substantial reason to think that there is no internal art.
Qn: What is 80,000 hours producing with their current research time? What is the planned schedule? What constitutes success?
There is a similar concern with the output of GWWC. Of their listed papers, only one (by Toby Ord) is substantive. The remainder are not written as if there is a pressing need to have results that are concise, clear and better than other available materials. For example, there is an extended article on investing vs. giving, and another on the distinction between income and happiness. The former does little more than list factors that might be relevant, with no attempt to discern which of these effects are largest or a sense of what ranges of reasonable looking assumptions give. The second observes correctly that the impact of monetary loss on donors may be overestimated, but then doesn't even question how impacts on recipients should be converted into hedonic terms. As a document, it seems to have been written to convince rather than elucidate truth. Neither paper drives an update to a belief that the current researchers at GWWC are effectively seeking to identify close to optimal opportunities or to reason coherently about the impact of interventions.
More worrying is the absolute lack of material. Whilst the number of active researchers is difficult to discern from the website, it seems plausible that GWWC has had at least 6 people researching for it for at least the last year. There is no matching level of output; in academe one would expect to see several papers per year per person, and the primary claim of GWWC is that there are low hanging fruit in terms of the optimisation of donation and the ability of people to donate. So a priori, if GWWC were efficiently researching, one would expect it to be finding and publicising its results.
Qn: What is GWWC producing with their current research time? What is the planned schedule? What constitutes success?
Impact of the 80,000 hours advice
In conversation with Will, he asserted that on the basis of self-reports, something like 20-25% of those involved in 80K have changed or substantially rethought their career choice. This implies immediately that 75-80% haven't, and in practice that number will be higher because the figures are self-reported. This substantially reduces the likely impact of 80,000 hours as a program. Indeed, it seems to be a near fatal problem for GWWC, in that if the 80,000 hours population is representative of pledges, then most of the GWWC pledges are earning in line with typical postgraduates, which makes it much harder to raise the mean value of each pledge to £250K as is required.
Methods of achieving this impact do not seem to be well attested. Will was asked what the internal value of a paid worker in Cambridge might be. A broad response was that it might improve the ability to give advice, but it was not suggested that this was based on hard data. This is a little troubling, because it implies that the effectiveness of 1-1 Skype interventions or 1-1 in person interventions is not known on a per hour basis. Absent this kind of data, it's difficult to see how 80,000 hours can be effectively optimising their impact.
Qn: Does 80,000 hours have data on the relative effectiveness of their activities?
Qn: How does CEA square a lack of reported career changes with GWWC's numbers, given background over-life earnings?
Content of 80,000 hours’ advice
In conversation, earning to give was suggested as being the baseline to measure against. Will noted explicitly that it's hard to know what kinds of careers are substantively better than others in a data-driven way. He was then very quick to hedge that by saying that of course research was valuable, and of course political activism could be valuable, and of course being a program manager at the World Bank could be valuable (which would naturally require you to do a PhD first), and of course being an entrepreneur could be valuable. It was not suggested that clearly at most one of these was optimal, or that people might ultimately be in a position where they trade off what they would have chosen to do in isolation against world optimising goals. We came away from this with the concern that 80,000 hours is not being epistemically vicious, and so is not willing to say things that might cause people to be unhappy. In particular, it seemed that there was more pressure to preserve the fuzzies that people were getting out of being affiliated to 80,000 hours than there was to make the advice good, and so most potential career paths were deemed to be OK.
Qn: Does 80,000 hours offer information that causes a substantial reduction in the space of careers that are considered optimal?
Qn: How does 80,000 hours square a lack of reported career changes with their advice being good?
The 80,000 hours pledge
It was noted that the pledge had been substantially weakened, to "I intend, at least in part, to use my career in an effective way to make the world a better place." My recollection is that it used to be more like "I will use my career to most effectively reduce global poverty". There wasn't any particular defence of the choice of wording or any indication that there had been deep thought about precisely what that pledge should constitute.
The core mechanism by which 80,000 hours or GWWC will achieve long term impact has to be maintaining people's desire to act over a long period. In turn, it seems that the primary intermediate goal is to build a strong social structure to encourage adherence to these pledges. The pledges are then the key totems around which a community will be built, and so there should be massive pressure to optimise these and the surrounding social structures. This does not seem to have occurred.
Qn: What are the design decisions behind the pledge, and what motivated the change in pledge?
Qn: To what extent is the wording of the pledge thought to be important?
Speed of growth
It was stated that around 1/3 of the Oxford undergraduate population (~4000 people) are on the mailing lists. Of that, there are around 300 members and a few dozen are coming to each event. By comparison, enterprising college societies in Cambridge (TMS, TCSS) have well in excess of 1000 undergraduates on their mailing lists, and get 80-100 people to their talks. When TCSS advertised an event to 1/3 of Cambridge, upwards of 600 people attended. From some organisational point of view, 80,000 hours Oxford could probably extract another factor of 5-10 out of its talk attendance. Whilst that won't factor through directly to the pledges, it seems unlikely that there would not be substantial growth there. In both of the Cambridge societies, the operating scale of the society has been doubled in a single year, by ensuring a reliable stream of events and getting networks in place to advertise widely.
It does not seem like the organisations are optimising for growth and retention of a population of attendees. This would provide a pool of people broadly on board with the aims of the organisations, and substantially enriched for likely pledges. It is very plain that such optimisation has not been codified and sent to other new chapters; the Cambridge GWWC chapter does not behave as if such guidance exists.
Qn: What optimisation has GWWC / 80,000 hours attempted in terms of the structure of their chapters?
Scope and goals
Taking a larger scale view, lots of these concerns ultimately cash out in a concern that a large fraction of the people involved with 80,000 hours or GWWC behave like dilettantes. There is an apparent desire to feel comfortable about career choice, think about dealing with poverty and get involved with 501(c)(3)s/NGOs/UK charities. However, as organisations, they are not behaving as we would expect of a group of people who seriously expect to vector hundreds of millions of pounds over the next decade, which is what continued linear growth would imply.
Nor do they seem to act as if they wish to seriously optimise the world. For example, the World Bank throws ~$43B/year around. Which is easier: to scale GWWC up by a factor of ~17000, or to double the mean effectiveness of the World Bank? This should not be a hypothetical question; it should be answered. There doesn't seem to be an acceptance that large social structures are going to be needed to support GWWC style donation for a lifetime, in the fashion of, say, the Rotary clubs.
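The scale-up factor can be turned around to show the baseline it implies. The sketch below only restates the two figures quoted above; the resulting annual flow is derived from them and is not a number reported by GWWC.

```python
# Back out the annual GWWC money flow implied by the "~17000x" comparison.

WORLD_BANK_ANNUAL_USD = 43_000_000_000  # "~$43B/year", from the text
SCALE_FACTOR = 17_000                   # "a factor of ~17000", from the text

# If scaling GWWC by SCALE_FACTOR matches the World Bank's budget, the
# current flow must be the budget divided by that factor (about $2.5M/year).
implied_gwwc_annual_usd = WORLD_BANK_ANNUAL_USD / SCALE_FACTOR

# Doubling the World Bank's mean effectiveness is treated here as
# equivalent to adding one more World Bank budget's worth of impact.
equivalent_gain_usd = WORLD_BANK_ANNUAL_USD

if __name__ == "__main__":
    print(f"Implied GWWC annual flow: ${implied_gwwc_annual_usd:,.0f}")
    print(f"Gain from doubling World Bank effectiveness: ${equivalent_gain_usd:,}")
```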
Qn: Where does CEA see its projects in 10 years? 20? 40?
Did you send this article to Will or somebody else at CEA before posting it? Holden Karnofsky let me comment on a copy of his critique of SI before he published it. That procedure is what I would call "common courtesy," and also it reduces the chance that you'll grossly mislead readers about an organization that you know far less about than the organization's principals do.
The primary source of the post was an extensive email exchange with Adam Casey (currently working full time at CEA). Since we are friends, this was not primarily in an official capacity. I also asked Adam to cross check the numbers whilst wearing a more official hat.
I was encouraged by him and Alexey Morgunov (Cambridge LWer) to make the substance of this public immediately after Will Crouch came up to Cambridge.
Can I just make clear my role here. 1) I've had general conversation with Jonathan about CEA and MIRI, in which several of these criticisms were raised. 2) I checked over the numbers for the GWWC impact assessment. 3) I've also said that criticism in public is often productive, and that public scrutiny of CEA on LW would be helpful for people choosing between it and MIRI. 4) I saw a draft of the post before it was published.
I want to make it very clear that: 1) I do not endorse this post. 2) I did not do detailed fact-checking for it. 3) I do not want this post to be seen as my view, either in a private capacity or with my CEA hat on. 4) I'm deeply sorry if my actions have harmed the reputation of CEA. This was never my intention.
First off, thanks for putting so much time into writing this extensive list of questions and doubts you have about CEA. Unlike for-profit activities, we don't have immediate feedback effects telling us when we're doing well and when we're doing badly, so criticism is an important countermeasure to make sure we do things as well as possible. We therefore really welcome people taking a critical eye to our activities.
As the person who wrote the original CEA material here on LessWrong, and the person who you mention above, I feel I should be the one to collate a response to your questions. However, because of other commitments (managing; fundraising; writing my first piece for a magazine column), it will be a few days before I can get this to you in a form I'd feel happy with. I hope that's ok.
Before then I'll just mention a few things in order to make things a bit clearer to the audience.
In what you wrote, a couple of comments made it sound as if you'd had an in-depth conversation with me on these issues, whereas really the context of the only exchange we've had is my giving a short talk to a group of about 15 people, of very varied backgrounds. You asked a few questions and there was discussion afterwards, but this must have only taken up about 10-15 minutes of time. Though I would very much like to, I haven't ever spoken with you or Alexey one-on-one.
Similarly, in your response to Luke you say that Adam works full-time at CEA. I think there's some disagreement between the two of you on the extent to which he had signed off on the content. But, at any rate, it's worth noting that Adam is an intern at CEA. This means he does contribute a full working week for CEA, but he is not an employee. He's therefore not the person to go to when it comes to high-level evaluation of CEA.
You mention an internal estimate of £1700 for the value of a new pledge. None of us are familiar with this figure, and we're confused about where it could have come from.
You suggest that CEA has ~4000 people on its mailing lists. The correct figure is less than half that (unless you include TLYCS, which you might have been thinking of, which does have in excess of 4000 on its mailing list).
You estimate GWWC's research capacity at 6 staff for last year. This is actually more than an order of magnitude higher than the true figure. In fact, the average number of paid employees (full-time equivalent) we have had working on all aspects of 80,000 Hours and Giving What We Can over the last six months is only 3.7.
As a more general point, I think we should also be careful to distinguish whether CEA has acted optimally in terms of utility-maximization (to which the answer is certainly not), and whether it gets a return on investment which is better than 1:1.
In my follow-up comment, I'll talk about some of the many concerns you've raised that we share, and the issues over which we might be making big mistakes. I'll also be able to give a bit more background about our activities, and I'll be able to answer your questions. Thanks again for taking the time to comment.
I'm glad to hear that a general response is being collated; if there are things where CEA can improve it would seem like a good idea to do them, and if I'm wrong I would like to know that. Turning to the listed points:
I went into that conversation with a number of questions I sought answers to, and either asked them or saw the data coming up from other questions. I knew your time was valuable and mostly targeted at other people there.
Adam explicitly signed off on my comment to Luke. He saw the draft post, commented on it, recommended it be put here and received the original string of emails in the context of being a friend, and a person I knew would have a closer perspective on the day to day running of CEA than myself.
£1700 came from Jacob (Trefethen), in conversation shortly after you were in Cambridge, and purporting to be from internal numbers. I had asked whether CEA has an internal price at which new pledges would be bought, on the basis that one should exist, and it would be important for valuing a full-time Cambridge position.
~4K is 1/3 of the Oxford undergrad population, which was the figure I had heard quoted in the discussion in Cambridge.
GWWC lists 8 people as a sample of past-and-present researchers, a research manager and a research director. I estimated that half of the former set would have moved on, and thus that 6 people were at least engaged in part time research for GWWC.
I am concerned both about utility-maximisation and the ROI. It seems easier to fix efficiency problems whilst institutions are still small, or create alternate more efficient institutions if need be; ideally groups akin to CEA's projects are going to move budgets of O(10^9 / year), and I want to see that used as effectively as possible.
In terms of ROI, I don't put large weight in the estimated returns absent a calculation or substantial trust in the instrumental rationality of the organisation making the claims. To take the canonical example, GiveWell provides some measure of each; CEA's projects need to be at least as credible.
Thanks again for taking the critique in the spirit that was intended.
(part 1)
Summary
Thanks once again, Jonathan, for taking the time to write publicly about CEA, and to make some suggestions about ways in which CEA might be falling short. In what follows I’ll write a candid response to your post, which I hope you’ll take as a sign of respect: this is LW and I know that honesty in this community is valued far more than sugarcoating. Ultimately, we’re all aiming here to proportion our beliefs to our evidence, and beating around the bush doesn’t help with that aim.
In your post you raise some important issues — often issues that those within CEA have also been thinking about. In general, however, the methodology by which you researched and wrote your post was poor. For this reason, there are crucial factual errors in your post that could easily have been avoided, and errors of argumentation that border on embarrassing. This is unfortunate. Powerful criticism of CEA’s activities is extremely important to us: in fact, in the absence of more direct forms of feedback (like profit and loss), it’s vital. But writing poorly researched and poorly thought-through criticism adds more noise than signal; this makes it harder for us in the future to distinguish the incisive and well-evidenced criticism from the rest, which just harms everyone.
I’ll mention some of the issues that you’ve raised that I think are important to think about, before going on to detail some of the mistakes you make in your post. I’ll note just now that, because of other commitments, this post will be the last I make on this thread.
Some Important Points
Individuals vs Large Organizations
You ask why we focus on individuals, rather than large foundations, or governments, or intergovernmental institutions like the World Bank. This is a good question, and something we wrestle with. Indeed, it’s also something we’ve pursued. The media attention generated by Giving What We Can has provided a platform for Dr Toby Ord, the principal founder of GWWC, to travel to and speak to the UK Secretary of State for Development, the UK’s Department for International Development, the Centre for Global Development, 10 Downing Street, the Disease Control Priorities Network, the WHO and as it happens, the World Bank, about aid cost effectiveness and how to increase it. He has already had some success in this regard, which wouldn’t have been possible without GWWC, and he expects to spend a significant proportion of his career on this issue.
The question of whether to spend marginal resources influencing individuals versus governmental and international organisations is non-trivial to answer: international organisations have larger budgets, but are more difficult to access and more difficult to influence. If you think it obvious that we should be influencing the latter, I’d be interested to know your reasons. Later in this response, I’ll discuss your suggestion in more depth.
Transparency
You raised concerns about the transparency of GWWC and 80,000 Hours. I agree that this is something that both organisations could work on. We have taken steps so far in the direction of transparency, especially in making the organisations transparent to donors and potential donors. Both 80k and GWWC have in-depth 6-monthly reviews, where their progress is assessed internally by the trustees (myself, Nick Beckstead and Toby Ord), and externally, by people, often donors, within the effective altruism community who are not closely involved with the running of the organisation. GWWC has posted on this here, and noted that if you wanted to read the reports from the review you are able to request them. 80,000 hours will make a similar post soon.
In addition, at the request of Giles, I opened CEA up for questioning on LessWrong, and wrote a detailed response to the questions posted there. I try to provide in-depth responses to any questions I receive via e-mail. And I provide the spreadsheet and explanation of an in-depth calculation of GWWC’s impact per dollar to anyone who asks (accurate as of ~March 2012 – we plan to do this annually).
One issue in keeping a start-up organisation transparent is that the nature of our activities changes rapidly. The very idea of 80,000 Hours as primarily a service organization, providing free careers advice, was only thought up in early July 2012. People switch positions regularly while we get a better understanding of whose comparative advantage lies where. It’s difficult to be transparent and non-misleading when you know that the facts might change radically within the space of a few months. There are also many things to be done, and investing in increased transparency has to be weighed against raising more money, or pledges, or making more career changes. So far, we’ve focused on being transparent to our donors and potential donors, which I still think is the right call, but it’s important to think about and reassess this on a regular basis. I’d welcome further thoughts on whether you think that we’ve made the wrong trade-off here.
Publishing
You briefly suggest the idea that we should use publications as a metric of research output. This is also something that’s worth thinking about. Publishing increases one’s academic reputability, and the scrutiny of peer review improves the reliability of one’s research. However, it is far more time-consuming than one might expect, because one has to tailor one’s research to the norms of the journal, and is especially slow if one is publishing within philosophy journals. (A paper of mine was under review for 10 months at one journal.) It also biases research towards ideas that are publishable, even if less important. So it’s a difficult issue.
For reasons of time, GiveWell don’t publish at all (but the resulting lack of peer review is something I’ve raised as a concern about their research); whereas, in order to boost reputation, MIRI are aiming to publish. At the moment, publishing isn’t a high priority for us, but we do some. I’ve published the central argument in favour of earning to give (it’s forthcoming in Ethical Theory and Moral Practice, available here), and I’m planning to write a book on effective altruism over the next year, from which I might publish a few articles. But beyond that, we’d rather focus on getting the ideas right. However, that’s something we could easily be mistaken about, and is worthy of discussion.
With these points noted, I’ll move on to the mistakes made within the post.
Some misleading aspects of the post
Factual Errors
I mention these in my other comment on this post.
One other thing to note is that the 80k pledge was never focused on global poverty. The previous declaration was:
"I declare that I aim to pursue a career as an effective altruist. This means that I intend to:
(i) Devote a significant proportion of my time or resources to helping others.
(ii) Use the time or resources I give as effectively as possible in helping others.
(iii) Choose my career based at least in part on how it enables me to further my altruistic aims."
And prior to that the declaration was:
"I pledge that, over my lifetime, I will dedicate 10% of my time or money (or any combination of the two) to those causes that I believe will do the most good with the resources I give them. I understand that it is difficult to know the best way of doing good in the world, and so I will choose those cause(s) on the basis of the best evidence that is available to me at the time. Further, I will deliberately pursue a career that will considerably improve my ability to further those causes I believe to be best."
The new declaration is: "I intend, at least in part, to use my career in an effective way to make the world a better place." More discussion on these changes later.
Misleading statements
“In recent conversation with Will Crouch”… “In conversation with Will Crouch”… “these discussions”
I mention this in my other post but it’s worth repeating. Though your post suggests that we had at least two one-on-one conversations, this never happened. We spoke only during a question-and-answer session after a short talk I gave.
“There wasn't any particular defence of the choice of wording [of the 80k declaration of intent] or any indication that there had been deep thought about precisely what that pledge should constitute.”
This is technically true. However, it’s misleading insofar as I wasn’t asked why the declaration of intent was changed, nor was I asked how much time had gone into thinking about revising the declaration of intent.
“The key argument in favour of donating money to CEA which was presented by Will was that by donating $1 to CEA you produce more than $1 in donations to the most effective charities. We present some apparent difficulties with this remaining true on the margin.”
This suggests that your post was primarily about difficulties with inferring marginal cost-effectiveness from past average cost-effectiveness. I think that that’s a very important topic (hey, maybe 99.9% of the value of CEA comes from me! In which case marginal cost-effectiveness would be much lower than past average cost-effectiveness), but as far as I can tell in your post you don’t address that issue anywhere.
(part 3; final part)
Second: The GWWC Pledge. You say:
“The GWWC site, for example, claims that from 291 members there will be £72.68M pledged. This equates to £250K / person over the course of their life. Claiming that this level of pledging will occur requires either unreasonable rates of donation or multi-decade payment schedules. If, in line with GWWC's projections, around 50% of people will maintain their donations, then assuming a linear drop off the expected pledge from a full time member is around £375K. Over a lifetime, this is essentially £10K / year. It seems implausible that expected mean annual earnings for GWWC members is of order £100K.”
Again, there are quite a few mistakes:
First, in comments you twice say that “£112.8M” has been pledged rather than “$112.8M”. I know that’s just a typo but it’s an important one.
Second, you say that the GWWC site claims that, “there will be £72.68M pledged” (future tense). It doesn’t, it says, “$112.8mn pledged” (past tense). It’s a pretty important difference – the pledging is something that has happened, not something that will happen. This might partly explain the confusion discussed in point 4, below.
Third, and more substantively, you don’t consider the idea, raised in other comments, that some donors might be donating considerably more than 10%, or that some donors might be donating considerably more than the mean. Both are true of GWWC pledgers.
Fourth, you seem to wilfully misunderstand the verb ‘to pledge’. I regularly make the following statement: “I have pledged to give everything I earn above £20 000 p.a. [PPP and inflation-adjusted to Oxford 2009]”. Am I lying when I say that? Using synonyms, I could have said “I promise to give…”, “I commit to give…” or “I sincerely intend to give…”. None of these entail “I am certain that I will donate everything above £20 000 p.a.”. Using my belief that I will earn on average over £42 000 p.a. [PPP and inflation-adjusted to Oxford 2009] over the course of my life, and that I will work until I’m 68, I can infer that I’ve pledged to give over £1 000 000 over the course of my life, which is also something I say. Am I lying when I say that? (Also note that if only 73 people made the same pledge as me, then we would have jointly pledged the current GWWC amount).
Fifth, I don’t know why you took us to use the $100mn pledged figure as an estimate of our impact. In fact you had evidence to the contrary. In a blog post that you cite I said: “As of last March, we’d invested $170 000’s worth of volunteer time into Giving What We Can, and had moved $1.7 million to GiveWell or GWWC top-recommended development charities, and raised a further $68 million in pledged donations. Taking into account the facts that some proportion of this would have been given anyway, there will be some member attrition, and not all donations will go to the very best charities (and using data for all these factors when possible), we estimate that we had raised $8 in realised donations and $130 in future donations for every $1’s worth of volunteer time invested in Giving What We Can.” (emphasis added).
Finally, I think that the GWWC pledge is misleading only if it’s taken to be a measure of our impact. But we don’t advertise it as that. We could try to make it some other number. We could adjust the number downwards, in order to take into account: how much would have been given anyway; member attrition; a discount rate. Or we could adjust the number upwards, in order to take into account: overgiving; real growth of salaries, and inflation. It could also be adjusted downward to take into account that not all donations are to GW or GWWC recommended charities, or (perhaps) upwards to take into account the idea that we will have better evidence about the best giving opportunities in a few years’ time, and thereby be able to donate to charities better than AMF, SCI or DtW. But any number we gave based on these adjustments would be more misleading and arbitrary than the literal amount pledged. It would also be more confusing for the large majority of our website viewers who haven’t thought about things like counterfactual giving or whether the discount rate should be positive or negative over the next few years; they’re used to the social norm which is to advertise pledges as stated. Until you, no-one who does understand issues such as counterfactual giving and discount rates has understood the amount pledged figure as an impact-assessment.
In comments there was some uncertainty about how we come up with the total pledged figure. What we do is as follows. Each member, when they return their pledge form, states a) what percentage they commit to (or, if taking the Further Pledge, the baseline income above which they give everything); b) their birthdate; c) their expected average earnings per annum. Assuming a (conservative) standard retirement age, that allows us to calculate their expected donations. In some cases, members understandably don’t want to reveal their expected earnings. What we used to do, in such cases, is to use the mean earnings of all the other members who have given their incomes. However, when, recently, one member joined with very large expected earnings (pursuing earning to give), we raised the question whether this method suffers from sample bias, because people who expect to earn a lot will be more likely to report. I’m not sure that’s true: I could imagine that people who earn more often don’t want to flaunt that fact. However, wanting to be conservative, we decided instead to use the mean earnings of the country in which the member works.
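As a rough illustration of the procedure just described, here is a minimal Python sketch. The retirement age of 65, the `Member` fields, the function names, the single country mean and the example members are all assumptions made for illustration; the comment does not state the actual retirement age used, and this is not GWWC's spreadsheet.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the comment only says "a (conservative) standard retirement age".
ASSUMED_RETIREMENT_AGE = 65


@dataclass
class Member:
    age: int                                         # derived from the stated birthdate
    pledge_fraction: Optional[float] = None          # e.g. 0.10 for a 10% pledge
    further_pledge_baseline: Optional[float] = None  # Further Pledge: give everything above this
    expected_income: Optional[float] = None          # expected average earnings p.a.; None if undisclosed


def expected_lifetime_pledge(member: Member, country_mean_income: float,
                             retirement_age: int = ASSUMED_RETIREMENT_AGE) -> float:
    """Undiscounted pledge for one member (no attrition, no discount rate)."""
    # Fall back to the mean earnings of the member's country when income is undisclosed.
    income = member.expected_income if member.expected_income is not None else country_mean_income
    years_left = max(retirement_age - member.age, 0)
    if member.further_pledge_baseline is not None:
        annual = max(income - member.further_pledge_baseline, 0.0)
    else:
        annual = (member.pledge_fraction or 0.0) * income
    return annual * years_left


def total_pledged(members, country_mean_income: float) -> float:
    """Sum of the undiscounted pledges; one country mean is used for simplicity."""
    return sum(expected_lifetime_pledge(m, country_mean_income) for m in members)


# Hypothetical members, for illustration only.
example_members = [
    Member(age=25, pledge_fraction=0.10, expected_income=30_000),
    Member(age=30, pledge_fraction=0.10),  # income undisclosed
    Member(age=22, further_pledge_baseline=20_000, expected_income=42_000),
]
example_total = total_pledged(example_members, country_mean_income=26_000)
print(f"total pledged: {example_total:,.0f}")
```

The sketch deliberately applies none of the adjustments discussed above (counterfactual giving, attrition, discounting), since the figure it mimics is the literal amount pledged.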
Bottom Line for Readers
If you’re interested in the question of whether 80,000 Hours and Giving What We Can have acted optimally or will act optimally in the future, the answer is simple: certainly not. We inevitably do some things worse than we could have done, and we value your input on concrete suggestions about how our organisations can improve.
If you’re interested in the question of whether $1 invested in 80,000 Hours or Giving What We Can produces more than $1’s worth of value for the best causes, read here, here, here and here and, most of all, contact me for the calculations and, if you’d like, our latest business plan, at will dot crouch at 80000hours.org. So far, I haven’t seen any convincing arguments to the conclusion that we fail to have a ROI greater than 1; however, it’s something I’d love additional input on, as the outside view makes me wary about believing that I work for the best charity I know of.
Thanks for writing this. I found it illuminating.
In the future, I'd suggest posting multipart comments like this as replies to one another, so it's easier to read them in order.
(part 2) The most important mistakes in the post
Bizarre Failures to Acquire Relevant Evidence
As lukeprog noted, you did not run this post by anyone within CEA who had sufficient knowledge to correct you on some of the matters given above. Lukeprog describes this as ‘common courtesy’. But, more than that, it’s a violation of a good epistemic principle that one should gain easily accessible relevant information before making a point publicly.
The most egregious violation of this principle is that, though you say you focus on the idea that donating to CEA has a ROI greater than 1, and though you repeatedly ask for a ‘calculation’ of impact and claim that CEA is not credible for not being able to provide such a calculation, you haven’t contacted me for the calculation of GWWC’s impact per dollar invested. This isn’t something I’ve been shy about — in a blog post that you link to (as well as elsewhere) I prominently describe this calculated impact-assessment, and invite people to contact me if they want the spreadsheet with the calculation. Insofar as this was the cornerstone of your concern, it’s odd that you didn’t contact me for the spreadsheet. Comments on that impact-assessment would have been helpful, but as far as I’m aware you haven’t read it.
Another example is where you suggest that little thought went into the change of the 80,000 Hours’ declaration of intent. Again, this is information that would have been easily accessible via a quick email to me or Ben Todd. As it happens, the declaration has gone through several iterations; there has been discussion on the core 80,000 Hours’ lists; Ben, myself and others have independently written proposals; and we commissioned one of our best interns to research the topic as part of our general marketing strategy. We concluded that having a lower initial barrier to entry was wise, because it would increase the total number of members, allow us to be more mainstream, and increase the total (though not the proportion) of members who make significant changes to their careers and thereby make the world a significantly better place. (We are also currently discussing whether to introduce a further pledge along the lines of “I intend to dedicate my life to whatever does the most good.”) It wouldn’t be an underestimate to say that several person-weeks of thought and research have gone into the pledges.
A further example is where you guess the number of researchers we have. Again, you could have e-mailed for this information, rather than trying to guess on the basis of the names listed on the website. For this reason, you substantially overestimated how many person-hours we command. Across CEA, over the last six months we have had the equivalent of 3.7 full-time staff. The first 2.6 of these started in July last year, another joined in late September and another in January. GWWC currently has the equivalent of two full-time staff; 80,000 Hours has the equivalent of two and a half full-time staff. For this reason (and perhaps also the planning fallacy), I think you severely overestimate the amount of research we could reasonably expect to deliver in that time.
Another example is where you quote the number of people we have on our mailing lists. This is a good example, because it’s one where I spoke incorrectly in Cambridge. I said that one third of Oxford students were on our mailing list; what I should have said was that about 20% of students coming through fresher’s fair were on our mailing list. It’s precisely errors like these — easy to make in the context of an impromptu group discussion — that show the value of making sure that one’s evidence is reliable.
A further example is where you say “it has been stated that GWWC has an internal price of around £1700 for new pledges” and then, in your response to my query about where this number came from, said that it came from Jacob Trefethen — a volunteer at a chapter, and not currently involved with core GWWC and 80k activities. Again, this is not the sort of evidence on which it’s rational to base a critique — when the option of simply asking me or someone else who works on strategy within CEA was merely an email away.
Another example was: “a large fraction of the people involved with 80,000 hours or GWWC behave like dilettantes”… “Nor do they seem to act as if they wish to seriously optimise the world.” But, as far as I know, you know only one person who works at CEA, Adam Casey, who is an unpaid intern, and you have about one hour’s worth of contact with me. I doubt that, if you knew us personally, and not through material written for an audience encountering the ideas of effective altruism for the first time, you would doubt our intention and commitment to "seriously optimise the world" as you put it. Seeing as this is LessWrong, I'll quote Eliezer Yudkowsky (stated in an independent internet conversation on Ycombinator). In response to the question, “What application of $4B would, right now, generate the most utility for humanity?” he replied: “If you know the word "utility", the people who actually seriously try to figure out the answer to that question live at:
Embarrassingly Poor Arguments
First: You ask: “For example, the world bank throws ~$43B/year around. Which is easier: To upscale GWWC by a factor of ~17000, or double the mean effectiveness of the World Bank? This should not be a hypothetical question; it should be answered.”
There are a few mistakes here:
First, your comment suggests that you know that we haven’t thought about this. But that’s misleading, because you haven’t ever asked us if we’ve thought about it.
Second, I have no idea where your numbers come from. After searching (inc. here) I still don’t know where the $43bn number comes from. And, after trying to figure it out, I also don’t know where your “17 000” figure comes from. GWWC has so far moved $2.5 million and raised $100mn in pledges. Even discounting the literal pledges by 99% and valuing them at $1mn (which would be far too steep in my view), the appropriate figure would be 12 300. So, whatever the basis, 17 000 seems too high.
Third, even neglecting the above points, your figure would only be correct if the cost-effectiveness of the World Bank’s spending were the same as the cost-effectiveness of GWWC top-recommended charities. But we think, and presumably you agree, that the cost-effectiveness of GWWC’s top-recommended charities is significantly better than the World Bank’s mean cost-effectiveness. Aside from anything else, there’s a major difference between donations and loans.
Fourth, if you want to maximize impact yours is not the correct question to ask. If it will get progressively harder to grow GWWC, and if one thinks that the likelihood of achieving either outcome is very low (both reasonable assumptions), then it could be true that (i) it is easier to double the mean effectiveness of the World Bank than to increase GWWC’s size by a factor of 17000 and (ii) that one ought to use one’s marginal time and resources to grow GWWC. The reason these could both be true is that the marginal benefits from growing GWWC are greater than the marginal benefits of trying to double the effectiveness of the World Bank. Given this, it’s unclear why this question “should be answered”.
Fifth, the question implicitly neglects the fact that growing GWWC has substantial knock-on benefits, including increasing the ability of some GWWC members to influence major international organisations like the World Bank (see the background on Toby’s activities, above).
In general: i) Starting with something smaller and easier to achieve has instrumental cumulative benefits and option value in a way that staking everything on one big goal does not. ii) Directly doubling the effectiveness of the World Bank – and other similar projects – is not the comparative advantage of existing EAs in Oxford. Given our success generating and mobilising talented altruists, I think the team here will have greater success taking an indirect route than by attempting to do it directly ourselves. We can use e.g. 80,000 Hours to identify precisely those who have or could develop the requisite skills, credentials and values, and provide them the encouragement, information and practical assistance required to get into positions of major influence over aid effectiveness. Finding and convincing someone to pursue this career is much easier than dedicating your entire life to it yourself, which is what led us to set up 80,000 Hours in the first place.
That’s not to say we aren’t open to the idea. It’s one of my main concerns about my current activities. But it’s misleading to suggest that you have good evidence to believe that we haven’t considered it.
Eliezer's HN comment: http://news.ycombinator.com/item?id=4726651
What track record is Eliezer referring to? Is there some external organization that evaluates GiveWell? I don't see any quick, easy way to evaluate their impact or figure out if they're working as advertised.
The $43bn figure (the amount the World Bank (WB) lent in 2011) can be found on the WB website here. The factor of 17000 comes (I think) from dividing $43 bn by the expected annual donations from the pledges ($43 bn / ($112 mn in pledges / 45 years of work) ~ 17000).
However, obviously, as you state, doubling the effectiveness of WB activities will not have the same impact as bringing CEA up to the size of the WB, unless one (unrealistically) assumes that the GWWC recommended charities are only twice as effective as the average WB intervention (though ideally one should take into account the diminishing marginal returns of GWWC and 80k).
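For anyone who wants to check the two scale factors that have come up in this thread, here is a short Python sketch. It only reproduces the arithmetic; the variable names are made up for illustration, and the second reading is my best guess at how the 12 300 figure in Will's comment was obtained.

```python
WORLD_BANK_ANNUAL_LENDING = 43e9  # USD, the 2011 figure cited above

# Reading 1: annualise the pledged total over a working life.
PLEDGED_TOTAL = 112e6             # USD pledged
WORKING_YEARS = 45
annual_pledged = PLEDGED_TOTAL / WORKING_YEARS
factor_annualised = WORLD_BANK_ANNUAL_LENDING / annual_pledged

# Reading 2 (assumed basis of Will's figure): money moved so far
# plus the $100mn in pledges discounted by 99%.
MONEY_MOVED = 2.5e6               # USD moved so far
DISCOUNTED_PLEDGES = 1e6          # $100mn valued at 1%
factor_discounted = WORLD_BANK_ANNUAL_LENDING / (MONEY_MOVED + DISCOUNTED_PLEDGES)

print(f"annualised pledges:         ~{factor_annualised:,.0f}x")
print(f"moved + discounted pledges: ~{factor_discounted:,.0f}x")
```

Note that the two readings compare different things: the first sets one year of World Bank lending against one year of pledged giving, while the second sets it against a cumulative total.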
I think that the question that this article is trying to answer should have been made clearer, because there are very different answers to the questions:
1. What impact has CEA produced so far?
2. What impact has CEA produced per dollar invested so far?
3. What impact has CEA produced per dollar invested, assuming each volunteer hour costs an average of $x, so far?
4. What impact would CEA produce if I gave them $1000?
5. What impact would CEA produce if I gave them £1m?
I think that the most useful question to answer is probably #4 or #5, but many of the criticisms here seem to be along the lines of #1, identifying poor outputs without any reference to what the inputs were. For example, the comments on lack of transparency, poor research outputs and poor chapter growth could all just be because CEA was using its limited resources more effectively elsewhere. Indeed, if you think CEA should be more transparent etc., maybe you should give them some money so they've got resources to put into this ;)
This also goes for digs at Will for not having certain Fermi estimates to hand and, in fact, I wouldn't be surprised if Ben (80k Executive Director) had these estimates and Will did not, since Ben is full-time 80k and Will is trying to put no more than 1 day a week into CEA as a whole, last I heard. Ben also suggested to me recently that I come up with BOTE estimates for The Life You Can Save's (TLYCS) top potential activities, so he's clearly thinking in the right way.
The second main point I wanted to make is that I don't think LWers are CEA's target audience; it seems more likely to me that CEA wants to create a load of "dilettantes" over a few improved LW-types. So GWWC acts like most charities and simply announces on their website how much money has been pledged, rather than giving an estimate of how much their current members will probably donate along with a complicated explanation of where that figure comes from. Come on, even the typical man on the street looks at a figure like that on a website and thinks "I bet much less will actually be donated"; it's not like GWWC are deliberately trying to deceive people. And maybe 80k are focusing on the best way to market EtG, and are less concerned with coming up with detailed advice for people who already accept EtG as a baseline, because they're trying to bring lots of people on the fringes of the EA movement a bit closer, rather than improving a few existing hardcore EAs (which seems to be more the Leverage Research approach). That would make claims like "There is a strong sense in which they are being graded on their speed, with publication being the mediator of impact." just not the case. Maybe I'm projecting a bit though, since TLYCS is aiming to make a ton of potential EAs into EAs more than improving the effectiveness of existing EAs.
A couple more little things:
The Peter Singer talk got about 150 people, which I think was poor. But then, I organised it, and I'm no longer part of CEA ;)
I got the impression that a lot of thought went into both versions of 80k's declaration.
Disclaimer 1: This is my first comment on LW, so apologies if I haven't got the swing of the conventions yet.
Disclaimer 2: I volunteer for TLYCS and have volunteered for GWWC in the past. There is a weak relationship between TLYCS and CEA at the moment but since this post does not mention TLYCS and since I know relatively little about the operations of GWWC/80k lately, I am treating TLYCS as separate from CEA in this thread and please do not take me to be representing CEA.
Disclaimer 3: Since this article was just about CEA's weaknesses, though of course strengths exist, I in turn have only tried to deny some of the weaknesses, though of course weaknesses exist.
I agree that this is very difficult, and am not entirely happy with the latest version. But it's hard to come up with an acceptable version that doesn't just re-state consequentialism. For example, if you explicitly mention global poverty, you force people to donate there even if Xrisk charities might be better for the world as a whole.
disclaimer: I volunteer for GWWC. As it would be tiresome to mention this on every thread in this conversation, I shan't do so again, and will trust this suffices for transparency, especially as I am not a salaried employee.
Reduces it from what? There's a point at which it's more cost-effective to just find new people than to carry on working to persuade existing ones. My intuition doesn't say much about whether this happy point is above or below 25%.
Good point about self-reporting potentially exaggerating the impact though.
The pledging back-of-the-envelope calculation got me curious, because I had been assuming GWWC wouldn't flat out lie about how much had been pledged (they say "We currently have 291 members ... who together have pledged more than 112 million dollars" which implies an actual total not an estimate).
On the other hand, it's just measuring pledges, it's not an estimate of how much money anyone expects to actually materialise. It hadn't occurred to me that anyone would read it that way - I may be mistaken here though, in which case there's a genuine issue with how the number is being presented.
Anyway, I still wasn't sure the pledge number made sense so I did my own back-of-the-envelope:
£72.68M pledged
291 members
£250K pledged per person over the course of their life
40 years average expected time until retirement (this may be optimistic. I get the impression most members are young though)
£6.2K average pledged per member per year
That would mean people are expecting to make £62K per year averaged over their entire remaining career, which still seems very optimistic. But:
So I think this passes the laugh test for me, as a measure of how much people might conceivably have pledged, not how much they'll actually deliver.
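The back-of-the-envelope above is easy to re-run with different inputs, so here is a minimal Python sketch of it. The 10% pledge fraction is an assumption (it is what turns £6.2K per year into £62K of income), and the function name is just for illustration.

```python
def implied_annual_income(pledged_total, members, years, pledge_fraction):
    """Return (pledge per member, pledge per member per year, implied income p.a.)."""
    per_member = pledged_total / members
    per_member_per_year = per_member / years
    # Assumption: every member pledges the same fraction of income.
    income = per_member_per_year / pledge_fraction
    return per_member, per_member_per_year, income


per_member, per_year, income = implied_annual_income(
    pledged_total=72.68e6,  # GBP
    members=291,
    years=40,               # assumed average time until retirement
    pledge_fraction=0.10,   # assumed standard 10% pledge
)
print(f"per member: £{per_member:,.0f}")
print(f"per member per year: £{per_year:,.0f}")
print(f"implied income: £{income:,.0f} p.a.")
```

Changing `years` or `pledge_fraction` shows how sensitive the implied income is to those two guesses.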
Incidentally, in case it's useful to anyone... The way I originally processed the $112M figure (or $68M as it then was), was something along the lines of:
aha! This is money that's expected to roll in over the next several decades. We really have no idea what the EA movement will turn into over that time, so should apply big future discounting when it comes to estimating our impact
(note it looks like Will was more optimistic, applying 67% cynicism to get from $400 to $130)
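To make the parenthetical concrete, here is a small Python sketch using only the figures from the blog post Will quotes ($170 000 of volunteer time, $1.7 million moved, $68 million pledged, and the reported $8 and $130 per $1). The variable names are made up, and the "implied discount" is just the gap between the raw and reported ratios, not a description of how the actual estimate was built.

```python
VOLUNTEER_TIME_VALUE = 170_000  # USD of volunteer time invested
MONEY_MOVED = 1.7e6             # USD moved to top-recommended charities
PLEDGED = 68e6                  # USD in pledged donations

# Raw ratios, before any adjustment for counterfactuals or attrition.
raw_realised_per_dollar = MONEY_MOVED / VOLUNTEER_TIME_VALUE
raw_future_per_dollar = PLEDGED / VOLUNTEER_TIME_VALUE

# Figures reported after adjustment.
REPORTED_REALISED = 8
REPORTED_FUTURE = 130

implied_realised_discount = 1 - REPORTED_REALISED / raw_realised_per_dollar
implied_future_discount = 1 - REPORTED_FUTURE / raw_future_per_dollar

print(f"raw: ${raw_realised_per_dollar:.0f} realised, ${raw_future_per_dollar:.0f} future per $1")
print(f"implied discounts: {implied_realised_discount:.1%} realised, {implied_future_discount:.1%} future")
```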
I agree that it'd be good to get more people. However, the chapters are operating under a constraint that other societies are not. Science societies can put on any talk they think will be interesting; GWWC has to put on talks about Effective Altruism that people will find interesting.
And while I don't know about TCSS, when OUSS holds talks, they'll get fewer than 100 people - there's at most a factor of 2 there. Again, I don't know about TCMS, but OUIS get an average of 40 people or so, not suggesting any capacity for improvement. True, they'll get around 150 for the very big names (think Sir Roger Penrose), but there just isn't anyone equivalently famous for GWWC to use, except Peter Singer - and I think when he was invited there was a very big crowd (though I forget how large).
disclaimer: I volunteer for GWWC.
Talking about effective altruism is a constraint, as is talking about mathematics. Being a subject society makes it easier to get people from that subject to attend; it also makes it harder to convince people from outside that subject to even consider coming.
TMS pulls 80+ people to most of its talks, which are not generally from especially famous mathematicians. TCSS got 600 people for a Penrose-Rees event. Both TCSS and TMS have grown rapidly in 18-24 months, having existed for far longer. This seems to indicate that randomly selected student societies have low hanging fruit. It doesn't seem incongruous to suggest that OUIS, OUSS and GWWC have the capacity to at least double their attendances -- the TMS did in one term, and doubled the number of events (so a x4 in person-talks).
One constrains you to a subject with thousands of high-status practitioners and hundreds of students - the other restricts you to a subject with one high-status practitioner and no students.
Whose status ordering are you using? Getting someone who is not a mathematician to TMS is harder; within the Natural Sciences it is possible, and there are O(1) Computer Scientists, philosophers or others. For the historians, classicists or other subjects, mathmos are not high status. In terms of EtG, these groups are valuable - most traders are not quants.
This doesn't follow. If you were already going to be an Investment Banker, you wouldn't change your goals based on 80k's advice, but you'd still earn more than the average graduate.
This holds for graduates who earn less than average as well. Is there data showing that the predominant source of career changes are people who would otherwise earn substantially less than mean? Is there data suggesting that the career changes are increasing incomes substantially?
Qn: On what basis does GWWC assert that its near 300 members are credibly precommitted to donating £72.68M?
While I don't know what the official answer would be, I think there is basically a linguistic one. To have pledged does not require one to have credibly precommitted. I can pledge to resist torture for 40 days just by saying so, but this doesn't mean I've credibly precommitted to doing so.
Of course, this only clears GWWC of accusations of excessive optimism. The question of how many members will maintain the pledge remains, and I don't see how we can get a very good idea for a few years (until we have enough data points to extrapolate the parameters of the distribution).
In that case, having a claim on every page of the GWWC site claiming that £112.8M have been pledged seems deceptive. 291 people have pledged, and [by a black box that doesn't trivially correspond to reality] that's become £112.8M. I know that at least 3 people in Cambridge have seen that statistic and promptly laughed at GWWC. The numbers are implausible enough that <5s Fermi estimates seem to refute it, and then the status of GWWC as somewhat effective rational meta-charity is destroyed. Why would someone trust GWWC's assessment of charities or likely impact over, say, GiveWell, if the numbers GWWC display are so weird and lacking in justification?
If someone has pledged to give 10% of X, and you estimate X to be about Y, I think it's not totally unreasonable to suggest Y*0.1 has been pledged, especially if that person doesn't disagree with your estimate of X.
Do you really think that their pledge total is something other than the (undiscounted) sum of what those 291 people have pledged to give over the course of their lives? You think they're just making it up?
I do not think that they are "making it up"; that phrase to me seems to attach all sorts of deliberate malfeasance that I do not wish to suggest. I think that to an outside observer the estimate is optimistic to the point of being incredible, and reflecting poorly on CEA for that.
These 291 people haven't pledged dollar values. They've pledged percentage incomes. To turn that into a dollar value you need to estimate whole-life incomes. Reverse engineering an estimate of income (assuming that most people pledge 10%, and a linear drop off in pledgers with 50% donating for 40 years), yields mean annual earnings over a lifetime of ~£100K. That's about the 98th centile for earnings in the UK.
Ah; I thought that people were pledging a percent and then listing their estimated income. Looking at an old email from gwwc I see in their discussion of pledge calculations:
which makes me think they are working off of self-reported estimated lifetime incomes. Though they might be extrapolating from people who submitted income amounts to the people who didn't.
I don't think that's the right way to interpret "pledge". If 100 people pledge to keep smoking and then only 50 end up going through with it, I would still say that you had 100 pledges.
£112.8M over 291 people and 40 years is ~£10K, or an estimated income of £100K at 10%. (Which is the same number you got, which I think means you didn't actually apply your 50% figure.)
An average income of £100K is high, but looking over their members, remembering that their students are Oxford students, and figuring that people who go into banking or something via EtG might give larger percents (while still having more to keep for themselves) I think their numbers are not "optimistic to the point of being incredible".
I am Alexey Morgunov and I do publicly endorse this post as accurately representing my views, questions and concerns. I was part of many discussions involving (but not limited to) Jonathan Lee and Adam Casey that ultimately led to this post, I was involved (to a small extent) in writing this post and I approved of its content before it was published.
Some further disclaimers: I am not working or volunteering for any of the CEA/GWWC/80,000h or related organisations. I am technically an 80,000h member, but having still signed the old pledge, I no longer consider myself to be officially a member, given the changed nature of the organisation.
The aim of this post is not to rain fire and destruction on CEA, who are ultimately doing a job whose goals I strongly support and agree with. The aim of this post is to initiate public discussion about the concerns that myself and others have had with the CEA. Will, thank you for your correct approach to our criticism. I am looking forward to reading your reply.
I disagree with Luke about the "common courtesy" procedure. In my opinion, in a situation where the organisation being criticised is actually (claiming to be) operating with large sums of donations, it should stand under fully open public scrutiny. Maybe it is my cultural origin but pre-moderated criticism does lend itself to easy speculation regarding corruption, and makes me trust the discourse far less. (I am NOT implying anything, nor accusing anyone, my statement is purely theoretical.) Furthermore, given the constructive nature of the criticism as well as the self-selective audience of LW rather than a wider medium, I do believe that publishing our criticism without prior consultation with Will or any other official from the CEA remains the right thing to do.
Do you still disagree with that procedure now that we know Jonathan's post contained numerous errors and misrepresentations? (See: 1, 2, 3, 4.) I'll bet that more people read the original (highly inaccurate) post than came back to read wdcrouch's corrections in the comments many days later.
The CEA is claiming to be influencing a lot of donations, but that's not the same as operating with them. As in, people are not giving large sums to the CEA. (Though I think people should, as Will's argument that their ROI is highly positive convinces me.)