Reframing the AI Risk

Here are some of my thoughts after reflecting on this post for a day. These ideas are somewhat disconnected from one another but hopefully in aggregate provide some useful commentary on different aspects of the "reframing AI risk" proposal:

The results of the AI Safety Arguments Competition should be released soon (Thomas Woodside told me last week they were wrapping up review of argument submissions). If it went well, then we may see some compelling reframings coming out of that.
I agree that AGI discussion has historically been stigmatized, but it seems to be becoming less so. DeepMind isn't shy about using the term AGI on its website and in its podcast. The same goes for OpenAI, which mentions "artificial general intelligence" in the headline of its About page, and whose CEO tweeted "AGI is gonna be wild" earlier this year. Elon Musk tweeted a few weeks ago that he thinks we'll see AGI by 2029. Google Trends shows that searches on the terms "AGI" and "artificial intelligence" have been at historically high levels since around 2016.

Projecting forward, I believe that as increasingly impressive large-model and multimodal feats like GPT-N, DALL-E, etc. continue to be announced, it will become more natural for ML experts and other people following these developments to think and talk about AGI. Instead of seeming like a wacky, far-off sci-fi idea, AGI starts to look more like something that's coming down the pipe. So while I think reframing AI risk to get around AGI stigma might be useful for getting traction on alignment ideas sooner, I don't think the benefit would be as large as it would be if AGI skepticism stayed constant.
Changing the terminology has costs and risks. You touch on this in your post (the "Robust" point), but I think it's worth emphasizing the care that needs to be taken with a sudden change. I worry that if we start using a term like "advanced software security" instead of "AGI alignment", some ML researchers will be trying to decode what you're saying. Then when they probe your models and realize what you're talking about, their reaction will be "Oh look, the paranoid AGI transhumanists are getting sneaky and using euphemisms now". Changing the terminology could also be confusing to people who are sympathetic or open to taking the risks seriously.
I thought this post from Matthew Yglesias was interesting; in it he argues for roughly the opposite approach to the one in this post. He says that we should embrace analogies to The Terminator movies to make AI risk more concrete and relatable, at least when discussing it with broader audiences.
There is something that has been bothering me about the dynamics of the ML/AI field for a while now with respect to AGI risk. Your post reminded me of it again, so I'm going to finally try to write it down here. The opinions of ML researchers and engineers across the software industry are granted elevated respect on this topic. For example, the influential Grace et al. 2017 survey on AI timelines surveyed researchers from NIPS/NeurIPS and ICML, which are conferences for people who work in ML broadly, not just for AGI researchers. Also, anecdotally, I have several friends who work in ML (but not on AGI research), and when I initially started talking to them about AGI risk they scoffed and looked down their noses (though over time they have become more sympathetic).

My point is that working on narrow ML systems and AGI research are very different things. We currently treat anyone in the ML industry as having an expert opinion on AGI, but I don't think we should. Working on a spam classifier model or self-driving cars does not qualify you to talk about AGI. In fact, it tends to bias you to think that AGI is more of a pipe dream than you would otherwise, because you're accustomed to dealing with the embarrassing day-to-day failures of present-day ML systems, and you assume that's the state of the art while not having time to read about recent research out of DeepMind and OpenAI. I would be interested in a reframing that rebalanced the default status/respect granted in AGI risk discussions from "anyone who works in ML" to "people who work at AGI research labs and people who have been specifically studying AGI".

Evan R. Murphy (3y)

Anon User (3y)

I think the framing that "AI is software" that has "bugs" is dangerously misleading. It incorrectly implies that we know what we want AI to do, and just need to be a little more careful about how we program it. But in reality, today's AIs are not programmed, but instead semi-randomly chosen by a process that we do not fully understand and are not fully in control of. I think it's the latter part that needs to be emphasized: we are not in control, we do not know how to regain control of something that keeps getting further and further away from us, and a runaway crash is the only possible outcome of the current trajectory we are on.

Jeff Rose (3y)

In addition to being misleading, this just makes AI one more (small) facet of security. But security is broadly underinvested in and there is limited government pushback. In addition, there is already a security community which prioritizes other issues and thinks differently. So this would place AI in the wrong metaphorical box.

While I'm not a fan of the proposed solution, I do want to note that it's good that people are beginning to look at the problem.

Igor Ivanov (3y)

ChatGPT was recently launched, and it is so powerful that it made me think about the problem of misuse of a powerful AI. It's a very powerful tool. No one really knows how to use it, but I am sure we will soon see it used as a tool for unpleasant things.

But I also see more and more people perceiving AI as a living entity with agency. People are having conversations with ChatGPT as they would with a human.

Evan R. Murphy (3y)

Interesting proposal. Just finished reading and will be thinking on it.

One candidate for an alternative to "AGI safety" that is less precise but also less fraught is "ML safety", a term which I've noticed Dan Hendrycks using.

Thane Ruthenis (3y)

Would be very interested in your feedback, once you've thought it through.

Just shared some of my thoughts in a top-level comment.

Framing	Treatment A	Treatment B
Positive	"Saves 200 lives"	"A 33% chance of saving all 600 people, 66% possibility of saving no one."
Negative	"400 people will die"	"A 33% chance that no people will die, 66% probability that all 600 will die."

