Thanks! I think these questions do matter, though it's certainly exploratory work, and at the moment tractability is still unclear. I talk about some of the practical upshots in the final essay.
Thanks. On the EA Forum, Ben West makes the clarification: I'm not assuming x-risk is lower than 10%; in my illustrative example I suggest 20%. (My own views are a little lower than that, but not an OOM lower, especially given that this % is about locking into near-0 value futures, not just extinction.) I didn't mean to place a lot of weight on the superforecasters' estimates.
Actually, because I ultimately argue that Flourishing is only 10% or less, for this part of the argument to work (i.e., for Flourishing to be greater in scale than Surviving), I only need x-risk this century to be less than 90%! (Though the argument gets a lot weaker the higher your p(doom).)
The point is that we're closer to the top of the x-axis than we are to the top of the y-axis.
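To spell out the arithmetic behind this (a simplified sketch; the decomposition and notation here are just the rough version I have in mind): let S be the probability of Surviving (i.e. avoiding a near-zero-value future) and F be expected Flourishing conditional on survival, as a fraction of the best feasible future. Then, roughly:

$$\text{EV of the future} \approx S \times F$$

$$\text{Scale(Surviving)} \approx (1-S)\,F \qquad \text{Scale(Flourishing)} \approx S\,(1-F)$$

$$S\,(1-F) > (1-S)\,F \iff S > F$$

So if F is 10% or less, Flourishing has greater scale than Surviving whenever S > 0.1, i.e. whenever x-risk this century is below 90%.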
Ah, by the "software feedback loop" I mean: "At the point at which AI has automated AI R&D, does a doubling of cognitive effort result in more than a doubling of output? If yes, there's a software feedback loop - you get (for a time, at least) accelerating rates of algorithmic efficiency progress, rather than just a one-off gain from automation."
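To make that condition concrete, here's a toy discrete model (purely illustrative - the framing in terms of a returns parameter r and all the numbers are made up for the example, not taken from any particular analysis). The assumption doing the work is that, once AI R&D is automated and compute is held fixed, effective research effort scales with algorithmic efficiency itself:

```python
# Toy illustration of the software feedback loop condition (illustrative only,
# not a calibrated model). Assumptions: AI R&D is fully automated, compute is
# held fixed, and effective research effort is proportional to current
# algorithmic efficiency. The parameter r is the number of doublings of
# algorithmic efficiency produced per doubling of cumulative research input,
# so each successive efficiency doubling needs 2**(1/r) times as much
# incremental input as the previous one.

def doubling_times(r: float, n_doublings: int = 8) -> list[float]:
    """Time taken for each successive doubling of algorithmic efficiency."""
    times = []
    for k in range(n_doublings):
        input_needed = 2 ** ((k + 1) / r) - 2 ** (k / r)  # extra input for the (k+1)-th doubling
        effort = 2 ** k                                    # effort ~ current efficiency
        times.append(input_needed / effort)
    return times

for r in (0.7, 1.0, 1.5):
    ts = doubling_times(r)
    if ts[-1] < ts[0] - 1e-9:
        trend = "accelerating (feedback loop)"
    elif ts[-1] > ts[0] + 1e-9:
        trend = "slowing (one-off gain)"
    else:
        trend = "steady exponential"
    print(f"r = {r}: doubling times {[f'{t:.2f}' for t in ts]} -> {trend}")
```

With r > 1, doubling times shrink geometrically (accelerating algorithmic progress, for as long as the assumptions hold); with r = 1 they stay constant (ordinary exponential growth); with r < 1 they lengthen, so automation gives a boost but not a loop.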
I see now why you could understand "RSI" to mean "AI improves itself at all over time". But even so, the claim would still hold: even if (implausibly) AI gets no smarter than human-level, you'd still get accelerated tech development, because the quantity of AI research effort would grow at a much faster rate than the quantity of human research effort.
There's definitely a new trend towards custom-website essays. Forethought, though, is a website for lots of research content (like Epoch), not just PrepIE.
And I don't think it's because people are getting more productive thanks to reasoning models - AI was helpful for PrepIE, but more like a 10-20% productivity boost than a 100% boost, and I don't think AI was used much for SA, either.
Thanks - appreciate that! It comes up a little differently for me, but still an issue - we've asked the devs to fix.
Argh! Original post didn't go through (probably my fault), so this will be shorter than it should be:
First point:
> I know very little about CEA, and a brief check of their website leaves me a little unclear on why Luke recommends them, aside from the fact that they apparently work closely with FHI.
CEA = Giving What We Can, 80,000 Hours, and a bit of other stuff
Reason: donations to CEA predictably increase the size and strength of the EA community, a good proportion of whom take long-run considerations very seriously and will donate to / work for FHI/MIRI, or otherwise pursue careers with the aim of extinction risk mitigation. It's plausible that $1 to CEA generates significantly more than $1's worth of x-risk-value. [Note: I'm a trustee and founder of CEA.]
Second point:
Don't forget CSER. My view is that they are even higher-impact than MIRI or FHI (though I'd defer to Sean_o_h if he disagreed). Reason: marginal donations will be used to fund program management + grantwriting, which would turn ~$70k into a significant chance of ~$1-$10mn, and launch what I think might become one of the most important research institutions in the world. They have all the background in place (high-profile people on the board; an already-written previous grant proposal that very narrowly missed out on being successful). High leverage!
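(As a rough expected-value sketch - the probability and grant size here are purely illustrative, not considered estimates: if ~$70k of program management and grantwriting bought, say, a 1-in-3 chance of a ~$3mn grant, that's an expected (1/3) × $3mn ≈ $1mn from $70k, i.e. roughly 14x leverage on the donation.)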
> CEA and CFAR don't do anything, to my knowledge, that would increase these odds, except in exceedingly indirect ways.
People from CEA, in collaboration with FHI, have been meeting with people in the UK government, and are producing policy briefs on unprecedented risks from new technologies, including AI (the first brief will go on the FHI website in the near future). These meetings arose as a result of GWWC media attention. CEA's most recent hire, Owen Cotton-Barratt, will be helping with this work.
> your account of effective altruism seems rather different from Will's: "Maybe you want to do other things effectively, but then it's not effective altruism". This sort of mixed messaging is exactly what I was objecting to.
I think you've revised the post since you initially wrote it? If so, you might want to highlight that in the italics at the start, as otherwise it makes some of the comments look weirdly off-base. In particular, I took the initial post to aim at the conclusion:
But now the post reads more like the main conclusion is:
I think the simple answer is that "effective altruism" is a vague term. I gave you what I thought was the best way of making it precise. Weeatquince and Luke Muehlhauser wanted to make it precise in a different way. We could have a debate about which is the more useful precisification, but I don't think here is the right place for that.
On either way of making the term precise, though, EA is clearly not trying to be the whole of morality, or to give any one very specific conception of morality. It doesn't make a claim about side-constraints; it doesn't make a claim about whether doing good is supererogatory or obligatory; it doesn't make a claim about the nature of welfare. EA is a broad tent, and deliberately so: very many different ethical perspectives will agree, for example, that it's important to find out which charities do the most to improve the welfare of those living in extreme poverty (as measured by QALYs etc.), and then to encourage people to give to those charities. If so, then we've got an important activity that people of very many different ethical backgrounds can get behind - which is great!
Hey Rob, thanks for writing this, and sorry for the slow response. In brief, I think you do misunderstand my views, in ways that Buck, Ryan and Habryka point out. I’ll clarify a little more.
Some areas where the criticism seems reasonable:
Now, clarifying my position:
Here’s what I take IABI to be arguing (written by GPT5-Pro, on the basis of a pdf, in an attempt not to infuse my biases):
And what readers will think the book is about (again written by GPT5-Pro):
The core message of the book is not merely “AI x-risk is worryingly high” or “stopping or slowing AI development would be one good strategy among many.” I wouldn’t disagree with the former at all, and my disagreement with the latter would be more about the details.
Here’s a different perspective:
This is my overall strategic picture. If the book had argued for that (or even just the “kitchen sink” approach part) then I might have disagreed with the arguments, but I wouldn’t feel, “man, people will come away from this with a bad strategic picture”.
(I think the whole strategic picture would include:
But I think that’s fine not to discuss in a book focused on AI takeover risk.)
This is also the broad strategic picture, as I understand it, of e.g. Carl, Paul, Ryan, Buck. It’s true that I’m more optimistic than they are (on the 80k podcast I say 1-10% range for AI x-risk, though it depends on what exactly you mean by that) but I don’t feel deep worldview disagreement with them.
With that in mind, some reasons why I think the promotion of the Y&S view could be meaningfully bad:
Overall, I feel pretty agnostic on whether Y&S shouting their message is on net good for the world.
I think I’m particularly triggered by all this because of a conversation I had last year with someone who takes AI takeover risk very seriously and could double AI safety philanthropy if they wanted to. I was arguing they should start funding AI safety, but the conversation was a total misfire because they conflated “AI safety” with “stop AI development”: their view was that that will never happen, and they were actively annoyed that they were hearing what they considered to be such a dumb idea. My guess was that EY’s TIME article was a big factor there.
Then, just to be clear, here are some cases where you misunderstand me, just focusing on the most-severe misunderstandings:
I really don’t think that!
I vibeswise disagree, because I expect massive acceleration and I think that’s *the* key challenge: See e.g. PrepIE, 80k podcast.
But there is a grain of truth in that my best guess is a more muted software-only intelligence explosion than some others predict. E.g. a best guess where, once AI fully automates AI R&D, we get 3-5 years of progress in 1 year (at current rates), rather than 10+ years’ worth, or rather than godlike superintelligence. This is the best analysis I know of on the topic. This might well be the cause of much of the difference in optimism between me and e.g. Carl.
(Note I still take the much larger software explosions very seriously (e.g. 10%-20% probability). And I could totally change my mind on this — the issue feels very live and open to me.)
Definitely disagree with this one! In general, society having more options and levers just seems great to me.
Definitely disagree!
Like, my whole bag is that I expect us to fuck up the future even if alignment is fine!! (e.g. Better Futures)
Definitely disagree! From my POV, it’s the IABI perspective that is closer to putting all the eggs in one basket, rather than advocating for the kitchen sink approach.
But “10% chance of ruin” is not what EY&NS, or the book, is arguing for, and isn’t what I was arguing against. (You could logically have the view of “10% chance of ruin and the only viable way to bring that down is a global moratorium”, but I don’t know anyone who has that view.)
Also not true, though I am more optimistic than many on the takeover side of things.
Also not true - e.g. I write here about the need to slow the intelligence explosion.
There’s a grain of truth in that I’m pretty agnostic on whether speeding up or slowing down AI development right now is good or bad. I flip-flop on it, but I currently lean towards thinking that speeding up at the moment is mildly good, for a few reasons: it stretches out the IE by bringing it forwards, it means there’s more of a compute constraint and so the software-only IE doesn’t go as far, and it means society wakes up earlier, giving more time to invest in alignment of more-powerful AI.
(I think if we’d gotten to human-level algorithmic efficiency at the Dartmouth conference, that would have been good, as compute build-out is intrinsically slower and more controllable than software progress (until we get nanotech). And if we’d scaled up compute + AI to 10% of the global economy decades ago, and maintained it at that level, that also would have been good, as then the frontier pace would be at the rate of compute-constrained algorithmic progress, rather than the rate we’re getting at the moment from both algorithmic progress AND compute scale-up.)
In general, I think that how the IE happens and is governed is a much bigger deal than when it happens.
Which Will-version are you thinking of? Even in DGB I wrote about preventing near-term catastrophes as a top cause area.
Again really not intended! I think I’ve been clear about my views elsewhere (see previous links).
Ok, that’s all just spelling out my views. Going back, briefly, to the review. I said I was “disappointed” in the book — that was mainly because I thought this was Y&S’s chance to give the strongest version of their arguments (though I understood they’d be simplified or streamlined), and the arguments I read were worse than I expected (even though I didn’t expect to find them terribly convincing).
Regarding your object-level responses to my arguments — I don’t think any of them really support the idea that alignment is so hard that AI takeover x-risk is overwhelmingly likely, or that the only viable response is to delay AI development by decades. E.g.
But if it’s a matter of “insufficiency”, the question is how one can be so confident that any work we do now (including with ~AGI assistance, including if we’ve bought extra time via control measures and/or deals with misaligned ~AGIs) is insufficient, such that the only thing that makes a meaningful difference to x-risk, even in expectation, is a global moratorium. And I’m still not seeing the case for that.
(I think I’m unlikely to respond further, but thanks again for the engagement.)