People seem to believe in inevitable AI doom because it’s a compelling meme more than because they believe in any particular argument
Yes, it's noticeable that if you want to argue against Doom, you first have to construct the argument for it yourself, because there is no existing argument from the doomer side that states all of its actual assumptions.
Current LLMs are nowhere near enough like actual lifeforms -- they can't reproduce without human assistance and don't have goals of their own. Perhaps you are assuming that some phase change will occur and that future AI will be very different.
Because SE Gyges erroneously believes that Alignment Is Proven To Be Solvable.
You have helpfully given a link to Gyges' argument that alignment is solvable. If you disagree, maybe you could give a link to the disproof.
The most recent argument against alignment being close to solved is Current AIs seem pretty misaligned to me by Greenblatt. I have yet to understand what prevents the AI-2027-like scenario where misalignment creeps in, becoming increasingly difficult to notice, increasingly prone to scheming, and increasingly shifting the AI's goals toward solving hard problems. I even posted an angry rant about the fact that Mythos Preview's misalignment, discussed in Opus 4.7's System Card, resembles Greenblatt-like failures and the misalignment of Agent-3. Additionally, an LLM's reproduction is either the creation of new servers where its weights are stored or of new accounts that call the LLM and instruct it to carry out requests, so the phase change you describe could have partially happened...
Sorry for being a bad reader. I browsed your post but... We're talking about the creation of artificial non-biological entities far smarter than any human being. To do that is to replace the human species as the most capable "lifeform" on Earth. Could you summarize for me why you think that's a good idea, and why you think we have any idea of the consequences?
Unless we see major, paradigm-shiftingly different technology in terms of how physical computers or AI algorithms function
There is the human brain, which is almost by definition AGI-equivalent and was trained on at most ~1e24 FLOP, albeit over ~30 years.
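A quick back-of-the-envelope check on that figure. The inputs here are my own assumptions, not the commenter's: a Carlsmith-style mid-range estimate of ~1e15 FLOP/s for brain-equivalent compute, sustained over the ~30-year "training" window:

```python
# Rough check of the ~1e24 FLOP lifetime-training figure for a human brain.
# Both inputs are assumptions: ~1e15 FLOP/s is a mid-range estimate of
# brain-equivalent compute, and ~30 years is the "training" window above.
BRAIN_FLOP_PER_SECOND = 1e15
SECONDS_PER_YEAR = 3.15e7
TRAINING_YEARS = 30

lifetime_flop = BRAIN_FLOP_PER_SECOND * SECONDS_PER_YEAR * TRAINING_YEARS
print(f"{lifetime_flop:.2e} FLOP")  # ~9.45e23, i.e. roughly 1e24
```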
I agree that the baseline doom scenario is now more like the alignment expandable in AI-2027 by Kokotajlo than a brain in a box in a basement. However, the AI-2027 scenario's Race branch still ends with doom, precisely because we have yet to understand how to align the AIs.
Upvoting because it is useful to have a lot of counter-pause arguments in one place. Tomorrow I should take the time to go through these carefully and give my response to them, since I lean more toward the pro-pause side.
It is sometimes claimed that AI is not a normal science but something different in kind, so dangerous that the only rational response is to ban AI completely for a prolonged period or forever.
I think that this is wrong because the first premise is wrong. In my opinion, AI is roughly a normal science like physics or biology, and is dangerous in the same way those fields are dangerous but perhaps more so. This means that the conclusion, "the only rational response is to ban AI completely for a prolonged period or forever", is also wrong.
AI risk should be mitigated in a similar way to how we mitigate risks from other sciences and engineering projects. There will be some differences due to how the field actually works, in that there is nothing really equivalent to uranium or smallpox samples and so physical controls are less effective or at least very different. There may also be a difference in magnitude. My argument is only that AI is not different in kind from any other science which carries substantial risk.
If AI is a roughly normal science, pushing for a complete ban or moratorium on AI is likely to be counterproductive. If nothing else, such advocacy adds noise to the environment and can make it more difficult to stage other interventions that might be better, like interpretability research, safety evaluations, and release criteria.
Other people have been arguing about this longer than I have, and it's a broad topic, covering both AI itself and the wider societal issue of managing it. In this case I think I can more productively engage with the subject as a whole by providing, basically, a literature review of who has written what that I think is correct. This was originally a thread by deepfates; there was some desire to extend it, and it seemed like this canon perhaps needed a permanent home with the rationale for its existence right up top.
Organization is entirely my preference.
On Those Undefeatable Arguments for AI Doom by 1a3orn
You can find this essay here: https://1a3orn.com/sub/essays-ai-doom-invincible.html.
People seem to believe in inevitable AI doom because it's a compelling meme more than because they believe in any particular argument. I would like to add that, as the actual landscape has changed, team doom has seemingly not changed any of its opinions.
This post makes a good case that merely having a lot of arguments is no merit in itself. I have therefore endeavored to include here only things which I think do not overlap much (if at all), and each of which, if proven wrong, would I think considerably strengthen the argument for inevitable doom.
Beren's Entire Blog
It turns out Beren Millidge has essentially written a major work on AI alignment, scattered across the last few years of blog posts. I had read maybe half of them, figured they were probably true, and mostly hadn't thought about them since. It was only while compiling threads of links that it became obvious to me that this rose to the level of a self-contained work that covered the subject pretty well.
We can divide this nicely into sections, and pull quote what seems (to me) to most directly address ordinary "inevitable doom" beliefs and their consequences.
Fundamentals
My path to prosaic alignment and open questions
Maintaining Alignment during RSI as a Feedback Control Problem
The Biosingularity Alignment Problem Seems Harder than AI Alignment
Alignment likely generalizes further than capabilities
Mechanics
Alignment In The Age Of Synthetic Data
Empathy as a natural consequence of learnt reward models
The computational anatomy of human values
Policy
Open source AI has been vital for alignment
My Preliminary Thoughts on AI Safety Regulation
Strong infohazard norms lead to predictable failure modes
Bostrom
Optimal Timing for Superintelligence
Nick Bostrom is sort of the grandfather of AI Doom as a concept, and he seems to want to put the genie at least part-way back in the bottle.
AI Optimism
This blog lives at optimists.ai, and contains detailed arguments concerning optimizers and evolution. Some highlights:
AI is easy to control
Counting arguments provide no evidence for AI doom
Adrian Leicht on Policy
Is an AI Pause a good idea, even assuming a relatively high level of risk?
Press Play To Continue: ‘Pausing AI’ is bad policy and worse politics
Me
I am actually not sure I would include these if I personally had not written them, because they are a little bit redundant with Beren and AI Optimism. I do, however, take a wider, more historical and less technical perspective.
Alignment Is Proven To Be Solvable
Counting Arguments and AI
An Argument I Haven't Seen Made In Long Form
AI risk discussion anytime before 2022 was often about the idea of FOOM: roughly, the hypothesis that a "brain in a box in a basement" could rapidly and recursively improve itself into superintelligence. (From here.)
Modern AI is incredibly resource-intensive! You have to pump more and more electricity into the thing to get any result. A brain in a box in a basement cannot exponentially improve itself relative to human society given any technology we currently have. It would need some feasible way of acquiring more energy, and thermodynamics tells us that systems settle toward minima of free energy, so exploitable free energy tends to be relatively hard to find!
If this possibility was part of the reason anyone believed doom was likely, they should currently believe doom is less likely. Unless we see major, paradigm-shiftingly different technology in terms of how physical computers or AI algorithms function, nothing along current lines is likely to do anything like this. LLMs (and all modern AI) are largely scale- and energy-bottlenecked, not design-bottlenecked.
If you were worried about FOOM, congratulations: LLMs are power-hungry monsters. You should hope that development continues along these lines, because they can't FOOM from a basement. In fact, you can't fit the training compute in a basement at all. Instead of having thousands of places something could go fantastically wrong, you have maybe a few dozen, and since frontier research is only taking place at maybe five companies, you actually have something like five places to worry about. This is a vast improvement.
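As a sanity check on the scale claim, here is a hedged back-of-the-envelope sketch. Every number in it (training FLOP, per-chip throughput, utilization, power draw, cluster size) is an assumption of mine chosen to be in the right ballpark, not a figure from any source above:

```python
# Why a frontier training run cannot hide in a basement.
# All constants are rough assumptions for illustration only.
TRAIN_FLOP = 2e25       # assumed frontier-scale training run
CHIP_FLOP_PER_S = 1e15  # assumed peak throughput of one modern accelerator
UTILIZATION = 0.4       # assumed fraction of peak actually sustained
CHIP_POWER_W = 700      # assumed per-chip power draw

chip_seconds = TRAIN_FLOP / (CHIP_FLOP_PER_S * UTILIZATION)
print(f"one chip: ~{chip_seconds / 3.15e7:,.0f} years")  # ~1,600 years

n_chips = 20_000
days = chip_seconds / n_chips / 86_400
megawatts = n_chips * CHIP_POWER_W / 1e6
print(f"{n_chips:,} chips: ~{days:.0f} days at ~{megawatts:.0f} MW")  # ~29 days, ~14 MW
```

Under these assumptions, a single accelerator would take on the order of a millennium, and a cluster big enough to finish in weeks draws something like fourteen megawatts: the load of a small town, not a basement circuit. That is the sense in which modern AI is scale- and energy-bottlenecked.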
Conclusion
What would convince me I was wrong, or make me more worried? Really, if any of the technical arguments above proved to be very wrong, or to be wrong for systems currently toward the cutting edge. The only one of these I think is sort of shaky is energy efficiency. I think it's perfectly plausible that future algorithms or computers might actually be much more efficient, and then you do in fact have to worry about them growing on short time scales.
I'm also quite concerned with the impact of the technology and its governance. Society seems like it's not doing great at managing itself already, and it's not clear that we are capable of making good collective decisions about AI research. It's also not clear that we are capable of making good decisions surrounding deployment, and mitigating the consequences of deployment on e.g. the job market. However, this is a human problem, a real "this is why we can't have nice things" sort of issue. It's not a fundamental problem with the technology, it's a problem with the societal context in which we develop it.
One thing you'll notice, though, is that there are apparently no specific falsifiability criteria for inevitable doom as a thesis. Several things have happened that should have falsified, or at least modified, the position: strong AI probably can't arise in a random basement with anything like current technology, and human values are actually relatively easy to convey to an LLM, to name two. We can infer from the lack of change in their position that it is not actually based on the evidence, and that the goalposts will always move.