When discussing AI risks, talk about capabilities, not intelligence

Vika


11th Aug 2023

AI Alignment Forum. Linkpost from vkrakovna.wordpress.com


Public discussions about catastrophic risks from general AI systems are often derailed by using the word “intelligence”. People often have different definitions of intelligence, or associate it with concepts like consciousness that are not relevant to AI risks, or dismiss the risks because intelligence is not well-defined. I would advocate for using the term “capabilities” or “competence” instead of “intelligence” when discussing catastrophic risks from AI, because this is what the concerns are really about. For example, instead of “superintelligence” we can refer to “super-competence” or “superhuman capabilities”.

When we talk about general AI systems posing catastrophic risks, the concern is about losing control of highly capable AI systems. Definitions of general AI that are commonly used by people working to address these risks are about general capabilities of the AI systems:

PASTA definition: “AI systems that can essentially automate all of the human activities needed to speed up scientific and technological advancement”.
Legg-Hutter definition: “An agent’s ability to achieve goals in a wide range of environments”.

We expect that AI systems that satisfy these definitions would have general capabilities including long-term planning, modeling the world, scientific research, manipulation, deception, etc. While these capabilities can be attained separately, we expect that their development is correlated, e.g. all of them likely increase with scale.

There are various issues with the word “intelligence” that make it less suitable than “capabilities” for discussing risks from general AI systems:

Anthropomorphism: people often specifically associate “intelligence” with being human, being conscious, being alive, or having human-like emotions (none of which are relevant to or a prerequisite for risks posed by general AI systems).
Associations with harmful beliefs and ideologies.
Moving goalposts: impressive achievements in AI are often dismissed as not indicating “true intelligence” or “real understanding” (e.g. see the “stochastic parrots” argument). Catastrophic risk concerns are based on what the AI system can do, not whether it has “real understanding” of language or the world.
Stronger associations with less risky capabilities: people are more likely to associate “intelligence” with being really good at math than being really good at politics, while the latter may be more representative of capabilities that make general AI systems pose a risk (e.g. manipulation and deception capabilities that could enable the system to overpower humans).
High level of abstraction: “intelligence” can take on the quality of a mythical ideal that can’t be met by an actual AI system, while “competence” is more conducive to being specific about the capability level in question.

It’s worth noting that I am not suggesting that we always avoid the term “intelligence” when discussing advanced AI systems. Those who are trying to build advanced AI systems often want to capture different aspects of intelligence or endow the system with real understanding of the world, and it’s useful to investigate and discuss to what extent an AI system has (or could have) these properties. I am specifically advocating avoiding the term “intelligence” when discussing catastrophic risks, because AI systems can pose these risks without possessing real understanding or some particular aspects of intelligence.

The basic argument for catastrophic risk from general AI has two parts: 1) the world is on track to develop generally capable AI systems in the next few decades, and 2) generally capable AI systems are likely to outcompete or overpower humans. Both parts of this argument are easier to discuss and operationalize by referring to capabilities rather than intelligence:

For #1, we can see a trend of increasingly general capabilities, e.g. from GPT-2 to GPT-4. Scaling laws for model performance as compute, data and model size increase suggest that this trend is likely to continue. Whether this trend reflects an increase in “intelligence” is an interesting question to investigate, but in the context of discussing risks, it can be a distraction from considering the implications of rapidly increasing capabilities of foundation models.
For #2, we can expect that more generally capable entities are likely to dominate over less generally capable ones. There are various historical examples of this, e.g. humans causing other species to go extinct. While there are various ways in which other animals may be more “intelligent” than humans, the deciding factor was that humans had more general capabilities like language and developing technology, which allowed them to control and shape the environment. The best threat models for catastrophic AI risk focus on how the general capabilities of advanced AI systems could allow them to overpower humans.

As the capabilities of AI systems continue to advance, it’s important to be able to clearly consider their implications and possible risks. “Intelligence” is an ambiguous term with unhelpful connotations that often seems to derail these discussions. Next time you find yourself in a conversation about risks from general AI where people are talking past each other, consider replacing the word “intelligent” with “capable”. In my experience, this can make the discussion clearer, more specific, and more productive.

(Thanks to Janos Kramar for helpful feedback on this post.)


7 comments

tailcalled (1y)

I think discussions about capabilities raise the question "why create AI that is highly capable at deception etc.? seems like it would be safer not to".

The problem that occurs here is that some ways to create capabilities are quite open-ended, and risk accidentally creating capabilities for deception due to instrumental convergence. But at that point it feels like we are getting into the territory that is best thought of as "intelligence", rather than "capabilities".

Daniel Kokotajlo (1y)

Nevertheless, I still think we should go with "capabilities" instead of "intelligence." If someone says to me "why create AI that is highly capable at deception etc.?" I plan to say basically "Good question! Are you aware that multiple tech companies are explicitly trying to create AI that is highly capable at EVERYTHING, a.k.a. AGI, or even superintelligence, and that they have exceeded almost everyone else's expectations in the past few years and seem to be getting close to succeeding?"

dr_s (1y)

One thing that I think is also worth stressing:

the companies are trying to do it
they think they are close to succeeding
it is not clear if they really think it, or merely say so, but the effort they seem to be putting into it suggests they do have some confidence; and while they may be wrong, they are the ones who would know best, as they're directly working with the things.

So the question is not really "do you think it is absolutely guaranteed that AGI will be created within the next 10 years?", but rather "do you think it is absolutely impossible that it will?". Any small amount of probability is at least worth some thought! I get that lots of people are somewhat skeptical of their claims, which makes sense, but you have to at least consider the possibility that they're right.

Vika (1y)

I agree that a possible downside of talking about capabilities is that people might assume they are uncorrelated and we can choose not to create them. It does seem relatively easy to argue that deception capabilities arise as a side effect of building language models that are useful to humans and good at modeling the world, as we are already seeing with examples of deception / manipulation by Bing etc.

I think the people who think we can avoid building systems that are good at deception often don't buy the idea of instrumental convergence either (e.g. Yann LeCun), so I'm not sure that arguing for correlated capabilities in terms of intelligence would have an advantage.

dr_s (1y)

I think that's the meaning of "general capabilities" though. If you think about an AI good at playing chess, it's not weird to think it might learn to use feints to deceive the opponent simply as a part of its chess-goodness. A similar principle applies here; in fact, I think game analogies might be a very powerful tool when discussing this!

Rob Bensinger (1y)

My own suggestion would be to use a variety of different phrasings here, including both "capabilities" and "intelligence", and also "cognitive ability", "general problem-solving ability", "ability to reason about the world", "planning and inference abilities", etc. Using different phrases encourages people to think about the substance behind the terminology -- e.g., they're more likely to notice their confusion if the stuff you're saying makes sense to them under one of the phrasings you're using, but doesn't make sense to them under another of the phrasings.

Phrases like "cognitive ability" are pretty important, I think, because they make it clearer why these different "capabilities" often go hand-in-hand. They also clarify that the central problems are related to minds / intelligence / cognition / etc., not (for example) the strength of a robotic arm, even though that too is a "capability".

dr_s (1y)

Agreed on this. Mostly, shifting away from using "intelligence" directly removes us from the philosophical morass that term invites, such as "is it really intelligence if the thing that invented the super nanotech that is paperclipping you isn't conscious or self-aware enough to possess intentionality?". No need to debate functionalism - robot tigers tear you up just as well as real tigers do!

There's also a fundamental (and, IMO, very stupid) proxy war being fought over this: some humanities-oriented people really want to stress that they think STEM people are too self-important and absorbed in their own form of intelligence, want to make it clear that other kinds of intelligence aren't any lesser, and thus attach to AI the kind of intelligence of its creators. The problem is that maybe there was a Japanese poet who alone had the sensitivity and empathy to finally grasp the essence of life; but if he was in Hiroshima in August 1945, he got vaporized along with thousands of others by Dr. Oppenheimer & co.'s invention. That doesn't mean that one kind of intelligence is superior to the other, but it makes abundantly clear which kind of intelligence is more dangerous, and that's really what we're worried about.

(and yeah, of course, social intelligence as exhibited by con-men is also terribly dangerous; short term, probably more than scientific capabilities! And that's just a third kind of thing that both camps tend to see as low status and downplay)
