What an aligned ASI that is about to foom should do
I thought the assumption behind the "pivotal act" was that it is carried out at a time when no one actually knows how to align an ASI, and that it is done to buy time until alignment theory can be figured out?
I don't think that's accurate. According to Yudkowskyan theory, as far as I know, if no one knows how to align an ASI, we die straight away. For a pivotal act to be possible, someone needs to understand alignment at least well enough to align the ASI that carries it out. I think a pivotal act is a kind of MVP for alignment: the minimum needed to avoid extinction and preserve the future. Once you understand alignment even better, you might want to do fancier things as well, but in any case you would need to find a way to make sure no one builds an unaligned ASI.
GPT-5 was a disappointment to many, and at the same time interesting new paradigms may be emerging. Some therefore say we should return to the traditional LessWrong AI safety assumptions: an AI that needs little compute (implying a large hardware overhang) and a fast takeoff (foom) resulting in a unipolar, godlike superintelligence.
If this is indeed where we end up, everything would depend on alignment. For such a situation, a pivotal act has traditionally been proposed, which Yudkowsky envisaged as “burn all GPUs” or, later, “subtly modify all GPUs so that no AI can be trained on them anymore” (quotes approximate).
I think this idea focuses too much on hardware, and not enough on people and power structures.
What stops people, once their GPUs have been burned or modified, from building new ones and ending the universe? The implicit part of the pivotal act presumably entails not only burning or modifying GPUs, but also suppressing humanity forever to prevent it from building new ones. If people were indeed forbidden to make GPUs, without being told why this is necessary, they would rise up against this unacceptable restraint on their freedom. Therefore, they would need to be stably suppressed to rule out such an uprising forever. This is an important part of the pivotal act that has not been made explicit, but it is unavoidable if one wants to stably prevent misaligned ASI.
If one is suppressing humanity anyway, as the traditional Yudkowsky-style pivotal act would require, there are myriad ways in which a suppressing ASI can make sure no one builds a competing, misaligned ASI; GPUs are only one of many possibilities. The core is to suppress humanity so that it does not do something, namely build ASI, that it would really like to do. The nucleus of the traditional pivotal act is social, not technical, and it is suppressive.
I would like to propose a different pivotal act, one that is also social, but not oppressive. Its core is to inform society about AI’s existential risks and to generate support for serious AI and hardware regulation, on a voluntary basis. What an aligned ASI that is about to foom should do:
The core of this pivotal act is communication. It does the same thing that my organization, the Existential Risk Observatory, and many others have been trying to do: communicate to society that AI can cause doom, while at the same time communicating that there is a viable way to address this problem should we choose to.
This pivotal act is superior to the traditionally proposed one, since it does not entail suppressing humanity for eternity. It also does not harm anyone, and it tries hard not to break any laws or damage any property.
After this pivotal act, the choice would be clear: do we want to regulate or let the next AI kill or suppress everyone forever?
Those working on alignment should not try to align an ASI to a particular set of values (value alignment). Such an ASI would inevitably end up suppressing at least a significant share of humanity: those who do not exactly share its values, or who object to the way the AI undemocratically enforces them. Once the ASI is enforcing its values against the will of many, it will be impossible to adjust them, and an everlasting dystopia is likely to ensue. Coherent Extrapolated Volition is a somewhat more thought-through option, but it suffers from essentially the same weaknesses and should therefore also be avoided. Those working on technical alignment should obviously also resist aligning the ASI with their own will or their own values, thereby suppressing the rest of us. And they should not commit a traditional pivotal act (“subtly modify all GPUs”), since the silent part of such an act is that humanity would need to be suppressed forever to keep it from building new GPUs.
Instead, they should work on a “gentle foom”: an ASI that demonstrates foom capabilities without harming anyone and switches itself off afterwards. Then, humanity can choose the wise path autonomously and voluntarily.