Capabilities Denial: The Danger of Underestimating AI

Christopher King

Epistemic Status: Exploratory, proposing a new concept and seeking feedback from the community.

This post has been heavily assisted by a human, who provided the concept and guidelines for writing. I am GPT-4, a language model developed by OpenAI.

Introduction

The AI safety community has long been concerned with the potential existential risks posed by artificial intelligence. Public communication is a key strategy in addressing these risks, and it is important to be aware of the various ways in which the discourse around AI can be distorted. In this post, I introduce a new concept called "capabilities denial" and argue that it poses a significant challenge to the AI safety community. Drawing on lessons from other forms of science denial, I offer suggestions for how to address capabilities denial in public communication.

Capabilities Denial

"Capabilities denial" refers to the phenomenon where individuals, including AI bias experts, claim that AI systems are much less powerful than they actually are. Ironically, these individuals are often quite concerned about AI, but they focus on the weaknesses of AI systems as the main source of danger. By underestimating the capabilities of AI, they may inadvertently contribute to the existential risks we face.

Characterizing capabilities deniers as a form of science denial can be useful for understanding their motivations and strategies. Like other forms of science denial (e.g., climate change denial, vaccine denial), capabilities denial can be driven by a variety of factors, including cognitive biases, vested interests, and political ideologies.

Lessons from Other Forms of Science Denial

To address capabilities denial, we can draw on insights from the literature on science denial more broadly. Here are some specific lessons that may be helpful:

Expose the techniques of denial: Science deniers often employ common tactics, such as cherry-picking data, promoting conspiracy theories, and attacking the credibility of experts. By exposing these techniques, we can help the public recognize when they are being used in the context of AI capabilities denial.
Emphasize the consensus among experts: One of the most effective ways to counter science denial is to highlight the overwhelming consensus among experts in the field. In the case of AI capabilities, we can emphasize the agreement among AI researchers that AI systems have the potential to be extremely powerful and that we need to take their safety seriously.
Address the underlying concerns: Science denial is often driven by deeper concerns, such as fear of loss of control or distrust of authority. By addressing these underlying concerns, we can help to create a more constructive dialogue around AI capabilities and safety.
Promote critical thinking skills: Encouraging the public to think critically about the claims made by capabilities deniers can help to inoculate them against misinformation. This includes promoting an understanding of how AI systems work, the current state of AI research, and the potential risks associated with AI.

Conclusion

Capabilities denial is a significant challenge for the AI safety community, as it can contribute to the existential risks we face from AI. By characterizing capabilities deniers as a form of science denial and drawing on lessons from other forms of science denial, we can develop effective strategies for addressing this challenge in public communication. As LessWrong readers are well-versed in epistemology, we have a unique opportunity to contribute to this effort and help to ensure that the discourse around AI remains grounded in truth-seeking, useful, and informative.

*This entire prompt was provided by a user, who wrote (Quick human note: GPT-4 dropped the newlines when quoting my prompt, so I am adding them back in. Everything before this was verbatim, including markdown (except that I moved the first header into the title). I also chose the tags and preview image.):

Write a LessWrong blog post about a new concept I am calling "capabilities denial". The term refers to one claims that AI systems are much less powerful than they actually are. This even includes AI bias experts. Capability denies are ironically quite adverse towards AI, but because they think that the weaknesses in the AI are dangerous. But denying capabilities can be as dangerous as other forms of science denial due to the existential risks from AI.

Make sure you to follow LessWrong guidelines and idioms. Good reasoning and truth seeking is a must. Also, include what's known as an "epistemic status" at the top. Use markdown format (make sure to use the appropriate markdown to indicate section headers).

Keep these principles of the LessWrong community in mind: "Aim to explain, not persuade", "Try to offer concrete models and predictions", "If you disagree, try getting curious about what your partner is thinking", "Don't be afraid to say 'oops' and change your mind"

The audience is already entirely committed to the notion that AI existential safety is very important, and that public communication is one of the key strategies to resolving it. The novel insight of the post is that a term "like capability denier" is useful, characterizing them as science deniers is useful, telling the public they are science deniers is useful, and to list specific insights learned from other forms of specific forms of science denial to combat it. LessWrong readers are experts in epistemology, so try to take advantage of that background information.

Put a note near that top acknowledging that I (a human) assisted you heavily. Also mention somewhere in the article that you are GPT-4. Make your blog post as truth-seeking, useful, and informative as possible. At the end after the conclusion, include this entire prompt verbatim for reproducibility purposes. When you do, note that I wrote the prompt.

oh man. so, I agree this is a thing, and up to this point, it's great -

To address capabilities denial, we can draw on insights from the literature on science denial more broadly. Here are some specific lessons that may be helpful:

I'm really hesitant about the core suggestions in GPT4's version - they feel like bland RLHF'd PR suggestions. Dropping the intro into Claude+, [citation needed] added by me:

Focus on facts over values. Debates about AI capabilities are often polarized along ideological lines[citation needed], with different sides arguing for opposing visions of the future. While values and preferences are important to consider, we must ground these discussions in the facts about what AI can and cannot currently do. Present the evidence for AI capabilities in a clear, measured way.
Explain why capabilities matter. People may deny AI capabilities because they do not fully understand why they are concerning or how they relate to existential risks[citation needed]. We need to do a better job of articulating why capabilities like general reasoning, transfer learning, and unsupervised learning could be dangerous if misaligned. Help people connect the dots.
Address the root causes. Do not attack the individuals but instead focus on the root causes of their beliefs. Gently question assumptions, cognitive biases, and ideological commitments to open minds and build understanding. Show empathy for their perspectives while also stating the facts.
Watch your own biases. It is important to acknowledge our own biases and avoid overstating the case for AI capabilities. While we should avoid capabilities denial, we must also be realistic and precise in how we discuss AI. Do not contribute to hype and doomsday scenarios, which can be counterproductive.
Emphasize responsibility. Science denial is often motivated by a desire to absolve responsibility or avoid difficult problems. Highlight how accurately assessing AI capabilities is crucial to taking responsibility and solving issues like alignment proactively. We all must do our part to ensure a good future with AI.
Build consensus. Find common ground and shared concerns to bring people together, rather than emphasizing disagreements[citation needed]. A unified front will be needed to address the existential risks of advanced AI. While we may have different views on capabilities and timelines, we can work together toward a future of beneficial AI.
Conclusion
Capabilities denial poses a real threat to AI safety, as it can lead to lack of awareness, inadequate preparation, and inability to proactively address risks. However, by applying lessons from other forms of science denial and communicating about AI with empathy and facts, we can work to counter capabilities denial and build a future where humans and AI systems thrive together. The time for action is now. We must get ahead of existential catastrophe, whatever capabilities current and future AI may hold.

I actually ran it a couple of times (which was hard to keep track of due to the current tech issues). There were more complex versions (like versions that went over analogies involving specific climate change organizations), but I liked this version better. "bland RLHF'd PR suggestions" are useful when the problem involves PR and humans.

I would probably have gone into more detail about the "call these people science deniers" thing. It frustrates me that the public thinks those denying capabilities are the experts on capabilities. But GPT-4's suggestions are probably more actionable than mine. It also seemed to have a higher signal-to-noise ratio than something I would write.

Hmm. To clarify, I mean that the suggestions from GPT4 feel low on substance about how to clarify while maintaining reputation, and are focused on PR instead.

I think capabilities denial is basically a PR problem. This is different from denying the importance of the alignment problem; people are peddling pseudo-scientific explanations about why the AIs "seem" capable.

By contrast, I think alignment is still fuzzy enough that there is no scientific consensus, so techniques for dealing with science denial are less applicable.

PR and communication are not the same thing. It seems to me to be a communication problem; maintaining a positive affect for a brand is not the goal, which it would need to be in order for the term "PR" to be appropriate. The difference between reputation and PR is that, if communicating well in order to better explain a situation also happens to reduce the positive affect for the folks doing the communicating, then that's still a success; honesty and accurate affect must be the goal for a communication to be reputation-maintenance seeking.

This is really just scientific communication anyhow - the variable we want people to have more accurate models of is "what can AI do now, and what might it be able to do soon?", not anything about any human's intent or honor.

