Yep. I agree with this. As I wrote, I think it's a key skill to manage to hold the heart of the issue in a way that is clear and raw, while also not going overboard. There's a milquetoast failure mode and an exaggeration failure mode and it's important to dodge both. I think the quoted text fails to thread the needle, and was agreeing with Ryan (and you) on that.
Upvoted! You've identified a bit of text that is decidedly hyperbolic, and is not how I would've written things.
Backing up, there is a basic point that I think The Problem is making, that I think is solid and I'm curious if you agree with. Paraphrasing: Many people underestimate the danger of superhuman AI because they mistakenly believe that skilled humans are close to the top of the range of mental ability in most domains. The mistake can be seen by looking at technology in general: once machines that can do comparable work get built, specialized machines are approximately always better than the direct power that individual humans can bring to bear. (This is a broader pattern than with mental tasks, but it still applies for AI.)
The particular quoted section of text argues for this in a way that overstates the point. Phrases like "routinely blow humans out of the water," "as soon as ... at all," "vastly outstrips," and "barely [worth] mentioning" are rhetorically bombastic and unsubtle. Reality, of course, is subtle and nuanced and complicated. Hyperbole is a sin, according to my aesthetic, and I wish the text had managed not to exaggerate.
On the other hand, smart people are making an important error that they need to snap out of, and fighting words like the ones The Problem uses are helpful in foregrounding that mistake. There are, I believe, many readers who would glaze over a toned-down version of the text but will correctly internalize the severity of the mistake when it's presented in a bombastic way. Punchy text can also be fun to read, which matters.
On the other other hand, I think this is sort of what writing skill is all about? Like, can you make something that's punchy and holds the important thing in your face in a way that clearly connects to the intense, raw danger while also being technically correct and precise? I think it's possible! And we should be aspiring to that standard.
All that said, let's dig into more of the object-level challenge. If I'm reading you right, you're saying something like: in most domains, AI capabilities have been growing at a pace where the gap between "can do at all" and "vastly outstrips humans" is at least years and sometimes decades, and it is importantly wrong to characterize this as "very soon afterwards." I notice that I'm confused about whether you think this is importantly wrong in the sense of invalidating the basic point that people neglect how much room there is above humans in cognitive domains, or whether you think it's importantly wrong because it conflicts with other aspects of the basic perspective such as takeoff speeds and the importance of slowing down before we have AGI vs muddling through. Or maybe you're just arguing that it's hyperbolic, and wish the language were softer?
On some level you're simply right. If we think of Go engines using MCTS as being able to play "at all" in 2009, then it took around 8 years (AlphaGo Zero) to vastly outstrip any human. Chess makes your point even more strongly, with human-comparable engines existing in the mid-60s and it taking ~40 years to become seriously superhuman. At essays, coding, and buying random things on the internet, AI is obviously still only comparable to humans, and those capabilities have arguably been around since ~2020 (less obviously with the buying random things, but w/e). Recognizing whether an image contains a dog was arguably "at all" in 2012 with AlexNet, and became vastly superhuman by ~2017.
On another level, I think you're wrong. Note the use of the phrase "narrow domains" in the sentence before the one you quote. What is a "narrow domain"? Essay writing is definitely not narrow. Playing Go is a reasonable choice of "narrow domain," but detecting dogs is an even better one. Suppose that you want to detect dogs for a specific task where you need an error rate below 10%, and skilled humans who are trying have an error rate of ~5% (i.e. it's comparable to ImageNet). If you need <10%, then AlexNet is not able to do that narrow task! It is not "at all." Maybe GoogLeNet counts (in 2014) or maybe Microsoft's ResNet (in 2015). At this point you have a computer system with ability comparable to a skilled human who is trying to do the task. Is AI suddenly able to vastly outstrip human ability? Yes! The AI can identify images faster, more cheaply, and with no issues of motivation or fatigue. The world suddenly went from "you basically need a human to do this task" to "obviously you want to use an AI to do this task." One could argue that Go engines instantly went from "can't serve as good opponents to train against" to "vastly outstripping the ability of any human to serve as a training opponent" in a similar way.
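To make the threshold framing concrete, here's a minimal sketch (the top-5 error figures are approximate and from memory, so treat the specific numbers as illustrative rather than authoritative) of how a task-specific error threshold turns gradual progress into an apparent jump from "can't do it at all" to "obviously better than hiring a human":

```python
# Illustrative sketch: a narrow task defined by an error-rate threshold.
# Error rates are approximate ImageNet top-5 figures, recalled from memory.
REQUIRED_ERROR = 0.10   # the narrow task needs better than 10% error
HUMAN_ERROR = 0.05      # a skilled, motivated human is around 5%

models = [
    ("AlexNet",   2012, 0.153),  # approximate
    ("GoogLeNet", 2014, 0.067),  # approximate
    ("ResNet",    2015, 0.036),  # approximate
]

for name, year, error in models:
    if error >= REQUIRED_ERROR:
        print(f"{year} {name}: can't do the narrow task at all ({error:.1%} error)")
    else:
        # Once below the threshold, the AI also wins on speed, cost, and stamina,
        # so for this task it immediately "vastly outstrips" any human.
        print(f"{year} {name}: does the task, faster and cheaper than any human "
              f"({error:.1%} error vs. human ~{HUMAN_ERROR:.0%})")
```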
(Chess is, I think, a weird outlier due to how it was simultaneously tractable to basic search, and a hard enough domain that early computers just took a while to get good.)
Suppose that I simply agree. Should we rewrite the paragraph to say something like "AI systems routinely outperform humans in narrow domains. When AIs become at all competitive with human professionals on a given task, humans usually cease to be able to compete within just a handful of years. It would be unexpected if this pattern suddenly stopped applying for all the tasks that AI can't yet compete with human professionals on."? Do you agree that the core point would remain if we did that rewrite? How would you feel about a simple footnote that says "Yes, we're being hyperbolic here, but have you noticed the skulls of people who thought machines would not outstrip humans?"
Armstrong is one of the authors on the 2015 Corrigibility paper, which I address under the Yudkowsky section (sorry, Stuart!). I also have three of his old essays listed on the 0th essay in this sequence:
While I did read these as part of writing this sequence, I didn't feel like they were central/foundational/evergreen enough to warrant a full response. If there's something Armstrong wrote that I'm missing or a particular idea of his that you'd like my take on, please let me know! :)
You have correctly identified that giving a corrigible superintelligence to most people will result in doom. This is why I think it's vital that power over superintelligence be kept in the hands of a benevolent governing body. And yes, since this is probably an impossible ask, I think we should basically shut down AI development until we figure out how to select for benevolence and wisdom.
Still, I think corrigibility is a better strategy than the approaches currently being taken by frontier labs (which are even more doomed).
I just encountered this, and I really appreciate you writing it! I feel like you very much got the essence of what I was hoping to communicate. :D
My reading of the text might be wrong, but it seems like bacteria count as living beings with goals? More speculatively, possible organisms that might exist somewhere in the universe also count toward the consensus? Is this right?
If so, a basic disagreement is that I don't think we should hand over the world to a "consensus" that is a rounding error away from 100% inhuman. That seems like a good way of turning the universe into ugly squiggles.
If the consensus mechanism has a notion of power, such that creatures that are disempowered have no bargaining power in the mind of the AI, then I have a different set of concerns. But I wasn't able to quickly determine how the proposed consensus mechanism actually works, which is a bad sign from my perspective.
I agree that if everyone in my decision-theoretic reference class stopped trying to pause AI (perhaps because of being hit by buses), the chance of a pause is near 0.
You are right and I am wrong. Oops. After writing my comment I scrolled up to the top of my post, saw the graph from Manifold (not Metaculus), thought "huh, I forgot the market was so confident" and edited in my parenthetical without thinking. This is even more embarrassing because no market question is actually about the probability conditional on no pause occurring, which is a potentially important factor. I definitely shouldn't have added that text. Thank you.
(I will point out, as a bit of an aside, that economically transformative AI seems like a different threshold than AGI. My sense is that if an AGI takes a million dollars an hour to run an instance, it's still an AGI, but it won't be economically transformative unless it's substantially superintelligent or becomes much cheaper.
Still, I take my lumps.)
Cool. Your definition of AGI seems reasonable. Sounds like we probably disagree about confidence and timelines. (My confidence, I believe, matches Metaculus. [Edit: It doesn't! I'm embarrassed to have claimed this.])
I agree that we seem not to be on the path of pausing. Is your argument "because pausing is extremely unlikely per se, most of the timelines where we make it to 2050 don't have a pause"? If one assumes that we won't pause, I agree that the majority of probability mass for X doesn't involve a pause, for all X, including making it to 2050.
I generally don't think it's a good idea to put a probability on things where you have a significant ability to decide the outcome (e.g. the probability of getting divorced), and instead encourage you to believe in pausing.
I appreciate your point about this being a particularly bad place to exaggerate, given that it's a cruxy point of divergence with our closest allies. This makes me update harder towards the need for a rewrite.
I'm not really sure how to respond to the body of your comment, though. Like, I think we basically agree on most major points. We agree that the failure mode the relevant text of The Problem is highlighting is real and important. We agree that doing Control research is important, and that if things are slow/gradual, this gives it a better chance of working. And I think we agree that it might end up being too fast and sloppy to actually save us. I'm more pessimistic about the plan of "use the critical window of opportunity to make scientific breakthroughs that save the day," but I'm not sure that matters? Like, does "we'll have a 3-year window of working on near-human AGIs before they're obviously superintelligent" change the takeaway?
I'm also worried that we're diverging from the question of whether the relevant bit of source text is false. Not sure what to do about that, but I thought I'd flag it.