My Objections to "We’re All Gonna Die with Eliezer Yudkowsky"
## Introduction

I recently watched Eliezer Yudkowsky's appearance on the Bankless podcast, where he argued that AI was nigh-certain to end humanity. Since the podcast, some commentators have offered pushback against the doom conclusion. However, one sentiment I saw was that optimists tended not to engage with the specific arguments pessimists like Yudkowsky offered.

Economist Robin Hanson points out that this pattern is very common for small groups which hold counterintuitive beliefs: insiders develop their own internal language, which skeptical outsiders usually don't bother to learn. Outsiders then make objections that focus on broad arguments against the belief's plausibility, rather than objections that focus on specific insider arguments.

As an AI "alignment insider" whose current estimate of doom is around 5%, I wrote this post to explain some of my many objections to Yudkowsky's specific arguments. I've split this post into chronologically ordered segments of the podcast in which Yudkowsky makes one or more claims with which I particularly disagree.

I have my own view of alignment research: shard theory, which focuses on understanding how human values form, and on how we might guide a similar process of value formation in AI systems. I think that human value formation is not that complex, and does not rely on principles very different from those which underlie the current deep learning paradigm. Most of the arguments you're about to see from me are less:

> I think I know of a fundamentally new paradigm that can fix the issues Yudkowsky is pointing at.

and more:

> Here's why I don't agree with Yudkowsky's arguments that alignment is impossible in the current paradigm.

## My objections

### Will current approaches scale to AGI?

Yudkowsky apparently thinks not ...and that the techniques driving current state-of-the-art advances, by which I think he means the mix of generative pretraining + small amounts of reinforcement learning such as with ChatGPT, aren't r