These considerations and more have led some of the most cited AI researchers of all time—such as Yoshua Bengio and Geoffrey Hinton—to voice that it is, at the very least, “not inconceivable” that AI ends up “wiping out humanity”.
In 2024, Hinton put the probability of the existential threat at more than 50% (at 38:07 in the video; h/t TsviBT):
So I actually think that the risk is more than 50% (of the existential threat), but I don't say that because there are other people who think it's less. And I think the sort of plausible thing that takes into account the opinions of everybody I know is sort of 10 to 20%. We certainly have a good chance of surviving it, but we better think very hard of how to do that.
This post is cross-posted from our Substack. Please read the description of this sequence to understand the context in which it was written.
Artificial Intelligence (AI) has tremendous potential for making the world a better place, with applications ranging from healthcare to language translation. However, AI also poses serious risks. This post aims to introduce a broad audience to the different risks posed by powerful AI systems, both existing and future, giving a snapshot of the threats currently being considered by the research community. We distinguish between two broad categories of threats: those caused by human interactions with AI and those caused by autonomous AI (independent agents). The links point to accessible resources (blog posts, news articles) that introduce each of the risks highlighted here in more depth.
With the current state of AI, we already see malicious actors causing harm: spreading fake news or helping oppressive regimes surveil and control their citizens. AI has been progressing remarkably fast across various domains of society for some years now, so it is easy to imagine more harmful consequences, intended or unintended, in the near term. For example, think of terrorists using AI to design biochemical weapons, governments deploying autonomous weapons that harm civilians, or scammers faking your loved one's voice through deepfakes. Furthermore, the spread of AI systems has the potential to contribute to nuclear instability by sabotaging early warning mechanisms.
AI systems may also cause harm because their training methods require vast amounts of data gathered from the internet, exposing them to, and causing them to learn from, harmful content such as hate speech, misinformation, and biases against various groups. On a socio-economic level, AI's rapid growth can worsen inequality and cause labor displacement, potentially leaving many without the means to sustain themselves. With regard to our epistemic security, the use of AI to disseminate misinformation and drive mass manipulation campaigns may cripple people's ability to separate truth from falsehood.
To make all of the above worse, AI development is a high-stakes race wherein AI labs have strong financial incentives to prioritize speed over safety.
Advanced AI systems could also seek power or control over humans. As AI systems continue to develop, their goal-directed behavior increases, and with it their ability to overcome security obstacles standing in the way of harmful behavior. If deployed autonomously (i.e., without humans directly controlling what task they perform), an advanced AI system may attempt to acquire resources or resist shutdown attempts, since these are useful strategies for almost any goal we might specify. Stuart Russell, professor of computer science at the University of California, Berkeley, and author of “the most popular artificial intelligence textbook in the world”, offers one hypothetical example: an AI tasked with combating the acidification of the oceans. To do this, the machine develops a catalyst that enables a rapid chemical reaction between ocean and atmosphere, restoring the oceans’ pH levels. In the process, however, it also depletes most of the atmosphere’s oxygen, leaving humans and animals to die of asphyxiation. This is, of course, just one hypothetical scenario that may never occur, but it underscores, more generally, that threats can arise even from constructive goals. To see why such catastrophic failures might be challenging to prevent, see this research on specification gaming and goal misgeneralization from DeepMind.
It’s important to balance our collective effort to address the severe societal harms already unfolding from current AI (many of which could also be risk factors in ultimately catastrophic outcomes) against the need to prepare for potential (extinction-level) harms from upcoming advanced AI systems. While many highly reputable experts are concerned primarily about extinction-level risks from advanced AI systems, others argue that the focus has shifted too far from the damage that is already being done, or is closer on the horizon. At AISIG, we aim to cover all such efforts to ensure that AI will be beneficial for humanity, now and in the future.
As we stand on the cusp of a new technological chapter, it's imperative that we recognize the awe-inspiring opportunities that AI could open up for us. At AISIG, we strongly believe in the transformative power of AI—from conquering diseases and revolutionizing education, to perhaps even unraveling the mysteries of the universe. Imagine an AI-assisted world where the blind can ‘see’, languages are no barrier, and creativity reaches unimaginable heights. We are very passionate about AI and are truly excited about the prospects it holds. However, let us not forget that "with great power comes great responsibility". The immense capabilities and impact of AI mean that if we do not proceed with extreme care, the risks could be catastrophic. We do not oppose AI; rather, we aspire to guide its advancement with accountability. Our call for AI safety embodies a commitment to ensuring that we can harness AI’s potential in a manner that benefits all of humanity. Through diligence and careful consideration of risks, let us pave the way for AI to be one of the most remarkable and positive forces in our shared history.