What an aligned ASI that is about to foom should do
I thought the assumption behind the "pivotal act" was that it is carried out at a time when no one actually knows how to align an ASI, and that it is done to buy time until alignment theory can be figured out?
I don't think that's accurate. According to Yudkowskyan theory, as far as I know, if no one knows how to align an ASI, we die straight away. For a pivotal act to be possible, someone needs to understand alignment at least well enough to align the ASI that carries it out. I think a pivotal act is a kind of MVP for alignment: the minimum needed to avoid extinction and preserve the future. Once you understand alignment even better, you might want to do fancier things as well, but in any case you would need to find a way to make sure no one builds an unaligned ASI.
GPT-5 was a disappointment to many, and at the same time interesting new paradigms may be emerging. Some therefore say we should return to the traditional LessWrong AI safety assumptions: an AI that needs little compute (implying a large hardware overhang) and a fast takeoff (foom) resulting in a unipolar, godlike superintelligence.
If this is indeed where we end up, everything would depend on alignment. For such a situation, a pivotal act has traditionally been proposed, which Yudkowsky envisaged as “burn all GPUs” or, later, “subtly modify all GPUs so that no AI can be trained on them anymore” (quotes approximate).
I think this idea focuses too much on hardware, and not enough on people and power structures.
What stops people, once their GPUs have been burned or modified, from building new ones and ending the universe? The implicit part of the pivotal act presumably entails not only burning or modifying GPUs, but also suppressing humanity forever to prevent it from building new ones. If people were indeed forbidden to make GPUs, without being told why this is necessary, they would rise up against this unacceptable restraint on their freedom. Therefore, they would need to be stably suppressed to rule out such an uprising forever. This is an important part of the pivotal act that has not been made explicit, but it is unavoidable if one wants to stably prevent misaligned ASI.
If one is suppressing humanity anyway, as the traditional Yudkowsky-style pivotal act would require, there are myriad ways in which a suppressing ASI can make sure no one builds a competing, misaligned ASI; GPUs are only one of many possibilities. The core is to suppress humanity so that it does not do something, namely build ASI, that it would really like to do. The nucleus of the traditional pivotal act is social, not technical, and it is suppressive.
I would like to propose a different pivotal act, one that is also social, but not oppressive. Its core is to inform society about AI’s existential risks and to generate support for serious AI and hardware regulation, on a voluntary basis. What an aligned ASI that is about to foom should do:
The core of this pivotal act is communication. It does the same thing that my organization, the Existential Risk Observatory, and many others have been trying to do: communicate to society that AI can cause doom, while at the same time communicating that there is a viable way to address this problem should we choose to.
This pivotal act is superior to the traditionally proposed one, since it does not entail suppressing humanity for eternity. It also does not harm anyone, and it tries hard not to break any laws or damage any property.
After this pivotal act, the choice would be clear: do we want to regulate or let the next AI kill or suppress everyone forever?
Those working on alignment should not try to align an ASI to a particular set of values (value alignment). Such an ASI would inevitably end up suppressing at least a significant share of humanity: those who do not exactly share its values, or who object to the way the AI undemocratically enforces them. Once the ASI is enforcing its values against the will of many, it will be impossible to adjust them, and an everlasting dystopia is likely to ensue. Coherent Extrapolated Volition is a somewhat more thought-through option, but it suffers from essentially the same weaknesses and should therefore also be avoided. Those working on technical alignment should obviously also resist aligning the ASI with their own will or their own values, thereby suppressing the rest of us. And they should not commit a traditional pivotal act (“subtly modify all GPUs”), since the silent part of such an act is that humanity would need to be suppressed forever to keep it from building new GPUs.
Instead, they should work on a “gentle foom”: an ASI that demonstrates foom capabilities without harming anyone and switches itself off afterwards. Then, humanity can choose the wise path autonomously and voluntarily.