It seems to me worth trying to slow down AI development to steer successfully around the shoals of extinction and out to utopia.

But I was thinking lately: even if I didn’t think there was any chance of extinction risk, it might still be worth prioritizing a lot of care over moving at maximal speed. There are many different possible AI futures, and I think there’s a good chance that the initial direction affects the long-term path, and that different long-term paths lead to different places. The systems we build now will shape the next systems, and so forth. If the first human-level-ish AI is brain emulations, I expect a quite different sequence of events than if it is GPT-ish.

People genuinely pushing for AI speed over care (rather than just feeling impotent) apparently think there is negligible risk of bad outcomes, but they are also asking us to take the first future to which there is a path. Yet the space of possible futures is large, and arguably we are on a rare plateau from which we could climb very different hills and reach much better futures.


What is the mechanism, specifically, by which going slower will yield more "care"? What is the mechanism by which "care" will yield a better outcome? I see this model asserted pretty often, but no one ever spells out the details.

I've studied the history of technological development in some depth, and I haven't seen anything to convince me that there's a tradeoff between development speed on the one hand, and good outcomes on the other.

If you go slower, you have more time to find desirable mechanisms. That's pretty much it I guess.


More information usually means better choices, and when has it ever been the case that the first design of something was also the best one? And wherever convention has locked us onto a path determined by early constraints, suboptimal results abound (e.g. the QWERTY keyboard). The worry about AI is that it might run away from us so fast that it has that sort of lock-in on steroids.

Disclaimer: I don't necessarily support this view, I thought about it for like 5 minutes but I thought it made sense.

If we were to slow things down the same way regulation has slowed other technologies, then that might make sense, but I'm not sure you can take the outside view here.

Yes, we can do the same as for other technologies by leaving it to the standard government procedures for making legislation, and in that case I might agree with you that slowing down might not lead to better outcomes. Yet we don't have to do this. We can use other processes that might lead to much better decisions. What about proper value sampling techniques like digital liquid democracy? I think we can do a lot better than we have in the past by thinking about what mechanism we want to use.

Also, as a potential example, cloning technology came to mind in the last 5 minutes. If we had just gone full speed with that tech, things would probably have turned out badly?

Do you think it's worth slowing down other technologies to ensure that we push for care in how we use them over the benefit of speed? It's true that the stakes are lower for other technologies, but that mostly just means that both the upside potential and the downside risks are lower compared to AI, which doesn't by itself imply that we should go quickly.


I don't know what Katja thinks, but for me at least: I think AI might pose much more lock-in than other technologies. I.e., I expect that we'll have much less of a chance (and perhaps much less time) to redirect course, adapt, learn from trial and error, etc. than we typically do with a new technology. Given this, I think going slower and aiming to get it right on the first try is much more important than it normally is.  


What you are missing here is:

  • Existential risk apart from AI
  • People are dying / suffering as we hesitate

Yes, there is a good argument that we need to solve alignment first to get ANY good outcome, but once an acceptable outcome is reasonably likely, hesitation is probably bad. Especially if you consider the likelihood that mere humans can accurately predict, let alone precisely steer a transhuman future.


From a purely utilitarian standpoint, I'm inclined to think that the cost of delaying is dwarfed by the number of future lives saved by getting a better outcome, assuming that delaying does increase the chance of a better future.

That said, after we know there's "no chance" of extinction risk, I don't think delaying would likely yield better future outcomes. On the contrary, I suspect getting the coordination necessary to delay means it's likely that we're giving up freedoms in a way that may reduce the value of the median future and increase the chance of stuff like totalitarian lock-in, which decreases the value of the average future overall.

I think you're correct that there's also the "other existential risks exist" consideration to balance in the calculation, although I don't expect it to be clear-cut.

I kind of agree with this, and this is where I fundamentally differ from a lot of e/accs and AI progress boosters.

However, I think 2 things matter here that limit the force of this, though I don't know to what extent:

  1. People have pretty different values, and while I mostly don't consider it a bottleneck to alignment as understood on LW, it does impact this post specifically because there are differences in what people consider the best future, and this is why I'm unsure that we should pursue your program specifically.

  2. I think there are semi-reasonable arguments that lock-in concerns are somewhat overstated, and while I don't totally buy them, they are at least somewhat reasonable, and thus I don't fully support the post at this time.

However, this post offers a lot of food for thought, especially given that my world model of AI development is skewed much more towards optimistic outcomes than that of most of LW, so thank you for at least trying to argue for a slowdown without assuming existential risk.

  1. What plateau? Why pause now (vs. say 10 years ago)? Why not wait until after the singularity and impose a "long reflection" when we will be in an exponentially better place to consider such questions?
  2. Singularity 5-10 years from now vs. 15-20 years from now determines whether or not some people I personally know and care about will be alive.
  3. Every second we delay the singularity leads to a "cosmic waste" as millions more galaxies move permanently behind the event horizon defined by the expanding universe.
  4. Slower is not prima facie safer. To the contrary, the primary mechanism for slowing down AGI is "concentrate power in the hands of a small number of decision makers," which in my current best guess increases risk.
  5. There is no bright line for how much slower we should go. If we accept without evidence that we should slow down AGI by 10 years, why not 50? Why not 5000?



At the moment the A.I. world is dominated by an almost magical belief in large language models. Yes, they are marvelous, a very powerful technology. By all means, let's understand and develop them. But they aren't the way, the truth and the light. They're just a very powerful and important technology. Heavy investment in them has an opportunity cost: less money to invest in other architectures and ideas.

And I'm not just talking about software, chips, and infrastructure. I'm talking about education and training. It's not good to have a whole cohort of researchers and practitioners who know little or nothing beyond the current orthodoxy about machine learning and LLMs. That kind of mistake is very difficult to correct in the future. Why? Because correcting it means education and training. Who's going to do it if no one knows anything else? 

Moreover, in order to exploit LLMs effectively we need to understand how they work. Mechanistic interpretability is one approach. But we're not doing enough of it, and by itself it won't do the job. People need to know more about language, linguistics, and cognition in order to understand what those models are doing.