Also known as Raelifin: https://www.lesswrong.com/users/raelifin
Thank you for this response. I think it really helped me understand where you're coming from, and it makes me happy. :)
I really like the line "their case is maybe plausible without it, but I just can't see the argument that it's certain." I actually agree that IABIED fails to provide an argument that it's certain that we'll die if we build superintelligence. Predictions are hard, and even though I agree that some predictions are easier, there's a lot of complexity and path-dependence and so on! My hope is that the book persuades people that ASI is extremely dangerous and worth taking action on, but I'd definitely raise an eyebrow at someone who did not have Eliezer-level confidence going in, but then did have that level of confidence after reading the book.
There's a motte argument that says "Um actually the book just says we'll die if we build ASI given the alignment techniques we currently have" but this is dumb. What matters is whether our future alignment skill will be up to the task. And to my understanding, Nate and Eliezer both think that there's a future version of Earth which has smarter, more knowledgeable, more serious people that can and should build safe/aligned ASI. Knowing that a godlike superintelligence with misaligned goals will squish you might be an easy call, but knowing exactly what the state of alignment science will be when ASI is first built is not.
(This is why it's important that the world invests a whole bunch more in alignment research! (...in addition to trying to slow down capabilities research.))
It seems like maybe part of the issue is that you hear Nate and Eliezer as saying "here is the argument for why it's obvious that ASI will kill us all" and I hear them as saying "here is the argument for why ASI will kill us all" and so you're docking them points when they fail to reach the high standard of "this is a watertight and irrefutable proof" and I'm not?
On a different subtopic, it seems clear to me that we think about the possibility of a misaligned ASI taking over the world pretty differently. My guess is that if we wanted to focus on syncing up our worldviews, that is where the juicy double-cruxes are. I'm not suggesting that we spend the time to actually do that--just noting the gap.
Thanks again for the response!
@Max H may have a different take than mine, and I'm curious for his input, but I find myself still thinking about serial operations versus parallel operations. Like, I don't think it's particularly important to the question of whether AIs will think faster to ask how many transistors operating in parallel will be needed to capture the equivalent information processing of a single neuron, but rather how many serial computations are needed. I see no reason it would take that many serial operations to capture a single spike, especially in the limit of e.g. specialized chips.
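To gesture at what I mean concretely, here's a back-of-the-envelope sketch (every number below is an illustrative assumption, not a measurement):

```python
# Back-of-the-envelope: how fast could digital hardware emulate a neuron's
# spiking, as a function of the *serial* work per spike?
# All numbers are illustrative assumptions.

clock_hz = 1e9               # assumed switching rate of the digital hardware
spike_hz = 100               # fast biological neuron: ~100 spikes per second
parallel_transistors = 1e6   # assumed transistors running side by side; note
                             # that this number never enters the speed math

for serial_ops_per_spike in (10, 100, 1_000, 10_000):
    time_per_emulated_spike = serial_ops_per_spike / clock_hz   # seconds
    speedup_vs_neuron = (1 / spike_hz) / time_per_emulated_spike
    print(f"serial depth {serial_ops_per_spike:>6}: "
          f"~{speedup_vs_neuron:,.0f}x faster than a 100 Hz neuron")
```

The parallel transistor count affects cost and chip area, but not how many spike-equivalents fit into a second of wall-clock time; only the serial depth and the clock rate do.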
Yeah, sorry. I should've been more clear. I totally agree that there are ways in which brains are super inefficient and weak. I also agree that on restricted domains it's possible for current AIs to sometimes reach comparable data efficiency.
Ah, I hadn't thought about that misreading being a source of confusion. Thanks!
Sweet. Thanks for the thoughtful reply! Seems like we mostly agree.
I don't have a good source on data efficiency, and it's tagged in my brain as a combination of "a commonly believed thing" and "somewhat apparent in how many epochs of training on a statement it takes to internalize it combined with how weak LLMs are at in-context learning for things like novel board games" but neither of those is very solid and I would not be that surprised to learn that humans are not more data efficient than large transformers that can do similar levels of transfer learning or something. idk.
So it sounds like your issue is not with any of the facts (transistor speeds, neuron speeds, AIs being faster than humans), but rather with the claim that comparing clock speeds against how many times a neuron can spike in a second is a valid way to reason about whether AI will think faster than humans?
I'm curious what sort of argument you would make to a general audience to convey the idea that AIs will be able to think much faster than humans. Like, what do you think the valid version of the argument looks like?
IABI says: "Transistors, a basic building block of all computers, can switch on and off billions of times per second; unusually fast neurons, by contrast, spike only a hundred times per second. Even if it took 1,000 transistor operations to do the work of a single neural spike, and even if artificial intelligence was limited to modern hardware, that implies human-quality thinking could be emulated 10,000 times faster on a machine—to say nothing of what an AI could do with improved algorithms and improved hardware."
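For concreteness, here is the arithmetic I read that passage as doing (a sketch using the book's stated numbers, not a claim about any real chip):

```python
# The book's stated numbers, taken at face value.
transistor_switches_per_sec = 1e9  # "billions of times per second" (lower end)
neuron_spikes_per_sec = 100        # "unusually fast neurons ... a hundred times per second"
transistor_ops_per_spike = 1_000   # the book's generous conversion factor

spike_equivalents_per_sec = transistor_switches_per_sec / transistor_ops_per_spike
speedup = spike_equivalents_per_sec / neuron_spikes_per_sec
print(speedup)  # prints 10000.0 -- the "10,000 times faster" figure in the quote
```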
@EigenGender says "aahhhhh this is not how any of this works" and calls it an "egregious error". Another poster says it's "utterly false."
I am confused what the issue is, and it would be awesome if someone can explain it to me.
Where I'm coming from, for context:
Anyway, like I said, I'm confused. I respect IABI's critics and am hoping to learn where my model is wrong.
I appreciate your point about this being a particularly bad place to exaggerate, given that it's a cruxy point of divergence with our closest allies. This makes me update harder towards the need for a rewrite.
I'm not really sure how to respond to the body of your comment, though. Like, I think we basically agree on most major points. We agree that the failure mode the relevant text of The Problem is highlighting is real and important. We agree that doing Control research is important, and that if things are slow/gradual, this gives it a better chance of working. And I think we agree that it might end up being too fast and sloppy to actually save us. I'm more pessimistic about the plan of "use the critical window of opportunity to make scientific breakthroughs that save the day," but I'm not sure that matters? Like, does "we'll have a 3-year window of working on near-human AGIs before they're obviously superintelligent" change the takeaway?
I'm also worried that we're diverging from the question of whether the relevant bit of source text is false. Not sure what to do about that, but I thought I'd flag it.
Yep. I agree with this. As I wrote, I think it's a key skill to manage to hold the heart of the issue in a way that is clear and raw, while also not going overboard. There's a milquetoast failure mode and an exaggeration failure mode and it's important to dodge both. I think the quoted text fails to thread the needle, and was agreeing with Ryan (and you) on that.
Upvoted! You've identified a bit of text that is decidedly hyperbolic, and is not how I would've written things.
Backing up, there is a basic point that I think The Problem is making, which I think is solid, and I'm curious whether you agree with it. Paraphrasing: Many people underestimate the danger of superhuman AI because they mistakenly believe that skilled humans are close to the top of the range of mental ability in most domains. The mistake can be seen by looking at technology in general: once machines that can do comparable work get built, the specialized machine is approximately always better than the direct power an individual human can bring to bear. (This pattern is broader than mental tasks, but it applies to AI as well.)
The particular quoted section of text argues for this in a way that overstates the point. Phrases like "routinely blow humans out of the water," "as soon as ... at all," "vastly outstrips," and "barely [worth] mentioning" are rhetorically bombastic and unsubtle. Reality, of course, is subtle and nuanced and complicated. Hyperbole is a sin, according to my aesthetic, and I wish the text had managed not to exaggerate.
On the other hand, smart people are making an important error that they need to snap out of, and fighting words like the ones The Problem uses are helpful in foregrounding that mistake. There are, I believe, many readers who would glaze over a toned-down version of the text but will correctly internalize the severity of the mistake when it's presented in a bombastic way. Punchy text can also be fun to read, which matters.
On the other other hand, I think this is sort of what writing skill is all about? Like, can you make something that's punchy and keeps the important thing in your face in a way that clearly connects to the intense, raw danger, while also being technically correct and precise? I think it's possible! And we should be aspiring to that standard.
All that said, let's dig into more of the object-level challenge. If I'm reading you right, you're saying something like: in most domains, AI capabilities have grown at a pace where the gap between "can do at all" and "vastly outstrips humans" is at least years and sometimes decades, and it is importantly wrong to characterize this as "very soon afterwards." I notice that I'm confused about whether you think this is importantly wrong in the sense of invalidating the basic point that people neglect how much room there is above humans in cognitive domains, or whether you think it's importantly wrong because it conflicts with other aspects of the basic perspective, such as takeoff speeds and the importance of slowing down before we have AGI vs. muddling through. Or maybe you're just arguing that it's hyperbolic, and you just wish the language was softer?
On some level you're simply right. If we think of Go engines using MCTS as being able to play "at all" in 2009, then it took around 8 years (AlphaGo Zero) to vastly outstrip any human. Chess supports your point even more strongly, with human-comparable engines existing in the mid-1960s and it taking ~40 years for engines to become seriously superhuman. Essays, coding, and buying random things on the internet are obviously still domains where AI is merely comparable to humans, and AI has arguably been able to do them "at all" since ~2020 (less obviously for buying random things, but w/e). Recognizing whether an image has a dog was arguably "at all" in 2012 with AlexNet, and became vastly superhuman by ~2017.
On another level, I think you're wrong. Note the phrase "narrow domains" in the sentence before the one you quote. What is a "narrow domain"? Essay writing is definitely not narrow. Playing Go is a reasonable choice of "narrow domain," but detecting dogs is an even better one. Suppose you want to detect dogs for a specific task where you need a <10% error rate, and skilled humans who are trying get a ~5% error rate (i.e., it's comparable to ImageNet). If you need <10%, then AlexNet is not able to do that narrow task! It is not "at all." Maybe GoogLeNet counts (in 2014), or maybe Microsoft's ResNet (in 2015). At that point you have a computer system with ability comparable to a human who is skilled at the task and trying to do it. Is AI suddenly able to vastly outstrip human ability? Yes! The AI can identify images faster, more cheaply, and with no issues of motivation or fatigue. The world suddenly went from "you basically need a human to do this task" to "obviously you want to use an AI to do this task." One could argue that Go engines instantly went from "can't serve as good opponents to train against" to "vastly outstripping the ability of any human to serve as a training opponent" in a similar way.
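To put rough numbers on that (these are approximate published top-5 ImageNet error figures; treat them as ballpark):

```python
# Approximate top-5 error rates on ImageNet classification (ballpark figures).
top5_error = {
    "AlexNet (2012)":   0.153,
    "GoogLeNet (2014)": 0.067,
    "ResNet (2015)":    0.036,
}
human_error = 0.05       # an often-cited estimate for a skilled, trying human
task_threshold = 0.10    # the hypothetical task above: need <10% error to count as "at all"

for name, err in top5_error.items():
    can_do_at_all = err < task_threshold
    beats_human = err < human_error
    print(f"{name}: {err:.1%} error | at all: {can_do_at_all} | beats human: {beats_human}")
```

On these numbers, the first system that clears the "can do the task at all" bar is already roughly at the skilled-human level, and within a year or so it's clearly past it, which is the jump I'm pointing at.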
(Chess is, I think, a weird outlier due to how it was simultaneously tractable to basic search, and a hard enough domain that early computers just took a while to get good.)
Suppose that I simply agree. Should we re-write the paragraph to say something like "AI systems routinely outperform humans in narrow domains. When AIs become at all competitive with human professionals on a given task, humans usually cease to be able to compete within just a handful of years. It would be unexpected if this pattern suddenly stopped applying for all the tasks that AI can't yet compete with human professionals on."? Do you agree that the core point would remain, if we did that rewrite? How would you feel about a simple footnote that says "Yes, we're being hyperbolic here, but have you noticed the skulls of people who thought machines would not outstrip humans?"
Agreed. Thanks for pointing out my failing, here. I think this is one of the places in my rebuttal where my anger turned into snark, and I regret that. Not sure if I should go back and edit...