- I think the book's thesis is basically right — that if anyone builds superintelligent AI in the next decade or two, it'll have a terrifyingly high (15%+) chance of causing everyone to die in short order
I think this is an absurd statement of the book's thesis: the book is plainly saying something much stronger than that. How did you pick 15% rather than, for example, 80% or 95%?
I am basically on the fence about the statement you have made. For reasons described here, I think P(human extinction|AI takeover) is like 30%, and I separately think P(AI takeover) is like 45%.
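Under the simplifying assumption (not stated by the commenter) that extinction from AI happens essentially only via takeover, those two estimates multiply out as:

$$P(\text{extinction}) \approx P(\text{extinction}\mid\text{takeover}) \times P(\text{takeover}) \approx 0.30 \times 0.45 \approx 0.135,$$

i.e. roughly 13.5%, just under the 15% figure in the bullet above, which fits with being "basically on the fence" about it.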
- I think the world where the book becomes an extremely popular bestseller is much better on expectation than the world where it doesn't
I think this is probably right, but it's unclear.
- I generally respect MIRI's work and consider it underreported and underrated
It depends on what you mean by "work". I think their work making AI risk arguments at the level of detail represented by the book is massively underreported and underrated (and Eliezer's work on making a rationalist/AI safety community is very underrated). I think that their technical work is overrated by LessWrong readers.
Yeah, there’s always going to be a gray area when everyone is being asked whether their complex belief-state maps to one side of a binary question like endorsing a statement of support.
I’ve updated the first bullet to just use the book’s phrasing. It now says:
(I agree-reacted to the statement, because I think polling is often a very reasonable default activity. I am not upvoting the post, especially not in service of turning the frontpage post list into a space for signaling the positions of LW rather than a space for interesting and novel arguments and/or well-written explanations.)
(Added: To be clear, Liron did not ask for upvotes; but the post talked about what gets upvoted on the frontpage and then posted the poll claim, which seemed suggestive to me of a standard terrible failure mode for discussion spaces; so I want to disclaim upvoting whose goal is to make the frontpage signal positions to low-engagement LW readers.)
"If Anyone Builds It, Everyone Dies" is more true than not[1].
Personally, my p(doom)≈60%, which may be reduced by ~10%-15% by applying the best known safety techniques, but then we've exhausted my epistemic supply of "easy alignment worlds".
After that is the desert of "alignment is actually really hard" worlds. We may get another 5% from mildly smart AI systems refusing to construct their successors because they know they can't solve alignment under those conditions.
So the title of that book is more correct than not. AI-assisted alignment feels the most promising to me, but also reckless as hell. Best would be human intelligence enhancement through some kind of genetech or neurotech. It feels like very few people with influence are planning for "alignment is really hard" worlds.
I nevertheless often dunk on MIRI because I would like them to share more of their agent foundations thoughts, and because I think the arguments don't rise above the level of "pretty good heuristics". Definitely not to the level of "physical law", which is what we've usually relied on to "predict endpoints but not trajectories".
The statement, not the book. The book was (on a skim) underwhelming, but I expected that, and it probably doesn't matter very much since I'm Not The Target Audience. ↩︎
I could be described as supporting the book in the sense that I preordered it, and I bought two extra copies to give as gifts. I'm planning to give one of them tomorrow to an LLM-obsessed mathematics professor at San Francisco State University. But the reason I'm giving him the book is that I want him to read it and think carefully about the arguments on the merits, because I think that mitigating the risk of extinction from AI should be a global priority. It's about the issue, not supporting MIRI or any particular book.
The purpose of this post is to build mutual knowledge that many (most?) of us on LessWrong support If Anyone Builds It, Everyone Dies.
Inside LW, not every user is a long-timer who has already seen consistent signals of support for these kinds of claims. A post like this could make the difference between strengthening and weakening the perception of how much everyone knows that everyone knows (...) that everyone supports the book.
Externally, people who wonder how seriously the book is being taken may check LessWrong and look for an indicator of how much support the book has from the community that Eliezer Yudkowsky originally founded.
The LessWrong frontpage, where voting is generally based on "whether users want to see more of a kind of content", wouldn't by default turn a large amount of internal support for IABIED into a frontpage that signals support; it would look more like an active discussion of various aspects of the book, including interesting & valid nitpicks and disagreements.
I support If Anyone Builds It, Everyone Dies.
That is:
The famous 2023 Center for AI Safety Statement on AI risk reads: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."
I'm extremely happy that this statement exists and has so many prominent signatories. While many people considered it too obvious and trivial to need stating, many others who weren't following the situation closely (or were motivated to think otherwise) had assumed there wasn't this level of consensus on the content of the statement across academia and industry.
Notably, the statement wasn't a total consensus that everyone signed, or that everyone who signed agreed with passionately, yet it still documented a meaningfully widespread consensus, and was a hugely valuable exercise. I think LW might benefit from having a similar kind of mutual-knowledge-building Statement on this occasion.