Alignment is hard. Communicating that, might be harder

1st Sep 2022

Crossposted from the EA Forum: https://forum.effectivealtruism.org/posts/PWKWEFJMpHzFC6Qvu/alignment-is-hard-communicating-that-might-be-harder

Note: this is my attempt to articulate why I think it's so difficult to discuss issues concerning AI safety with non-EAs/Rationalists, based on my experience. Thanks to McKenna Fitzgerald for our recent conversation about this, among other topics.

The current 80,000 Hours list of the world's most pressing problems ranks AI safety as the number one cause in the highest priority area section. And yet, it's a topic never discussed in the news. Of course, that's not because journalists and reporters mind talking about catastrophic scenarios. Most professionals in the field are perfectly comfortable talking about climate change, wars, pandemics, wildfires, etc. So what is it about AI safety that keeps it from being a legitimate topic for a panel on TV?

The EA consensus is roughly that being blunt about AI risks in the broader public would cause social havoc. And this is understandable; the average person seems to interpret the threats from AI either as able to provoke socio-economic shifts similar to those that occurred because of novel technologies during the Industrial Revolution (mostly concerning losing jobs), or as violently disastrous as in science fiction films (where e.g., robots take over by fighting wars and setting cities on fire).

If that's the case, taking seriously what Holden Karnofsky describes in The Most Important Century as well as what many AI timelines suggest (i.e., that humanity might be standing at a very crucial point in its trajectory) could easily be interpreted in ways that would lead to social collapse just by talking about what the problem concerns. Modern AI Luddites would potentially form movements to "prevent the robots from stealing their jobs". Others would be anxiously preparing to physically fight and so on.

But even if the point about AGI doesn't get misinterpreted (if, in other words, all the technical details could be distilled in a way that made the arguments accessible to the public), it's very likely that it would trigger the same social and psychological mechanisms, resulting in chaos manifested in various ways (for example, a stock market collapse because of short timelines, or people breaking laws and moral codes since nothing matters if the end is near).

From Eliezer Yudkowsky's Twitter account @ESYudkowsky.

From that conclusion, it makes sense to not want alignment in the news. But what about all the people who would understand and help solve the problem if only they knew about it and how important it is? In other words, shouldn't the communication of science also serve the purpose of attracting thinkers to work on the problem? This looks like the project of EA so far; it's challenging and sometimes risky in itself, but the community seems, overall, to have a good grasp of its necessity and to strategize accordingly.

Now, what about all the non-EA/Rationalist circles that produce scholarly work but refuse to put AI ethics/safety in the appropriate framework? Imagine trying to talk about alignment when your interlocutor thinks that Bostrom's paperclips argument is unconvincing and too weird to be good analytic philosophy, that it must be science fiction. There are many background reasons why they may think that. The friction created by this resistance and sci-fi labelling is unlikely to be helpful if we are to deal with short timelines and the urgency of value alignment. What's also not going to be helpful is endorsing the conceptual, metalinguistic disputes some philosophers of AI are engaging in (for instance, trying to define "wisdom" and arguing that machines will need practical wisdom to be ethical and helpful to humans). To be fair, however, I must note that the necessity of teaching machines how to be moral was emphasized by certain philosophers (such as Colin Allen) early on, in the late 2000s.

Moreover, as of now, AI ethics seems to be directed towards questions that are important, but that might give the impression that misalignment merely consists of biased algorithms that discriminate among humans based on their physical characteristics, as often happens, for example, with AI image creation systems that don't take into account racial or gender diversity in the human population. To be clear, I'm downplaying neither the seriousness of this as a technical problem nor its importance for the social domain. I just want to make sure that the AI ethics agenda prioritizes the different risks and dangers in a straightforward way, i.e., existential risks that accompany the development of this technology deserve more effort and attention than non-existential risks.

From Kerry Vaughan's Twitter account @KerryLVaughan.

In conclusion, here I list some reasons why I think AI safety is so difficult to talk about:

It's actually weird to think about these issues and it can mess with one's intuitions about how science and technology progress; the amount of time each of us spends on earth is too short to be able to grasp how fast or slow technological advancements took place in the past and to project that into the not-so-distant future.
This weirdness entails significant uncertainties at different levels (practical, epistemic, moral). Uncertainty is by itself an uncomfortable feeling that people usually try to repel in various ways (e.g., by rationalizing).
It's a very new area of organized research; many of the AI safety teams/organizations have been around for only a couple of years.
There isn't a lot of work published in peer-reviewed journals (which is a turn-off for many academics who don't give credibility to blog posts, despite the fact that many of them are well-researched and technically rigorous).
Narratives about future catastrophic scenarios exist in most cultures and eras. The belief that "our century is the most important one" isn't special to our time. If people were wrong about their importance in the past, why would our case be any different?
There's a widespread reluctance to take seriously what our best science takes seriously (which is not limited to issues in AI).
There's a broader crisis in science communication and journalism that creates skepticism towards research communities.
There are persistent open questions in the philosophy and theory of science such as the demarcation problem, what makes a claim scientific, how falsifiability works, and many more.

Tags: Distillation & Pedagogy, AI Risk, Practice & Philosophy of Science, AI

8 comments, sorted by top scoring

Phil Tanny (3y)

"The current 80,000 Hours list of the world's most pressing problems ranks AI safety as the number one cause in the highest priority area section."

AI safety is not the world's most pressing problem. It is a symptom of the world's most pressing problem, our unwillingness and/or inability to learn how to manage the pace of the knowledge explosion.

Our outdated relationship with knowledge is the problem. Nuclear weapons, AI, genetic engineering and other technological risks are symptoms of that problem. EA writers insist on continually confusing sources and symptoms.

To make this less abstract, consider a factory assembly line. The factory is the source. The products rolling off the end of the assembly line are the symptoms.

EA writers (and the rest of the culture) insist on focusing on each product as it comes off the end of the assembly line, while the assembly line keeps accelerating faster and faster. While you're focused on the latest shiny product to emerge off the assembly line, the assembly line is ramping up to overwhelm you with a tsunami of other new products.

Duncan Sabien (Inactive) (3y)

(I clicked through to see your other comments after disagreeing with one. Generally, I like your comments!)

I think that EA writers and culture are less "lost" than you think, on this axis. I think that most EA/rationalist/ex-risk-focused people in this subculture would basically agree with you that the knowledge explosion/recursive acceleration of technological development is the core problem, and when they talk about "AI safety" and so forth, they're somewhat shorthanding this.

Like, I think most of the people around here are, in fact, worried about some of the products rolling off the end of the assembly line, but would also pretty much immediately concur with you that the assembly line itself is the root problem, or at least equally important.

I can't actually speak for everybody, of course, but I think you might be docking people more points than you should.

Phil Tanny (3y)

Hi Duncan, thanks for engaging.

"I think that EA writers and culture are less "lost" than you think, on this axis. I think that most EA/rationalist/ex-risk-focused people in this subculture would basically agree with you that the knowledge explosion/recursive acceleration of technological development is the core problem"

Ok, where are their articles on the subject? What I see so far are a ton of articles about AI, and nothing about the knowledge explosion unless I wrote it. I spent almost all day every day for a couple weeks on the EA forum, and observed the same thing there.

That said, I'm here because the EA community is far more interested in X risk than the general culture and the vast majority of intellectual elites, and I think that's great. I'm hoping to contribute by directing some attention away from symptoms and towards sources. This is obviously a debatable proposition and I'm happy to see it debated, no problem.

nem (3y)

You captured this in your post, but for me it really comes down to people dismissing existential fears as sci-fi. It's not more complicated than "Oh, you've watched one too many Terminator movies". What we need is for several well-respected, smart figureheads to say "Hey, this sounds crazy, but it really is the biggest threat of our time. Bigger than climate change, bigger than biodiversity loss. We really might all die if we get this wrong. And it really might happen in our lifetimes."

If I could appeal to authority when explaining this to friends, it would go over much better.

DialecticEel (3y)

"The EA consensus is roughly that being blunt about AI risks in the broader public would cause social havoc."

I find this odd and patronising to the general public. Why would this not also apply to climate change? Climate change is also a not-initially-obvious threat, yet the bulk of the public now has a reasonable understanding and it's driven a lot of change.

Or would nuclear weapons be a better analogy? Then at least nuclear weapons being publicly understood brought gravity to the conversation. Or could part of the reason to avoid public awareness be avoiding having to bear the weight of that kind of responsibility on our consciences? If the public is clueless, we appear pro-active. If the public is knowledgeable, we appear unprepared and the field of AI reckless, which we are and it is.

Also, LessWrong is a public forum. Eliezer's "dying with dignity" post was definitely newsworthy, for example. Is it even accurate to suggest that we have significant control over the spread of these ideas in the public consciousness at the moment, given that there is so little attention on it and we don't control the sorting functions of these media platforms?

Raemon (3y)

"I find this odd and patronising to the general public. Why would this not also apply to climate change? Climate change is also a not-initially-obvious threat, yet the bulk of the public now has a reasonable understanding and it's driven a lot of change."

One of the specific worries is that climate change is precisely an example of something that got politicized, and now... half of politicians (at least in the US) sort of "have" to be opposed to doing anything about it, because that's what The Other Team made into their talking point.

DialecticEel (3y)

I see, that's a great point, thanks for your response. It does seem realistic that it would become political, and it's clear that a co-ordinated response is needed.

On that note, I think it's a mistake to neglect that our epistemic infrastructure optimises for profit, which is an obvious misalignment now. Facebook and Google are already optimising for profit at the expense of civil discourse; they are already misaligned and causing harm. Only focusing on the singularity allows tech companies to become even more harmful, with the vague promise that they'll play nice once they are about to create superintelligence.

Both are clearly important and the control problem specifically deserves a tonne of dedicated resources, but in addition it would be good to have some effort on getting approximate alignment now or at least better than profit maximisation. This obviously wouldn't make progress on the control problem, but it might help society move to a state where it is more likely to do so.

Kaj_Sotala (3y)

"And yet, it's a topic never discussed in the news. [...] The EA consensus is roughly that being blunt about AI risks in the broader public would cause social havoc."

I'm confused by this. Haven't people been blunt about AI risks in public ever since Eliezer first woke up to the topic, and then Bostrom had his book (which got a lot of media attention) and so on? And I don't think there's any EA consensus about these things having been bad?

It seems to me that AI risk does get discussed in the news when something happens that's in the kind of form that's newsworthy; e.g. "Oxford philosopher writes a book warning about superintelligence destroying humanity, endorsed by Bill Gates and Elon Musk" was the kind of thing that the broader public could easily understand and which is usually covered by news, and therefore it got a lot of publicity. I think that the problem is mostly that to be covered by the news, it has to actually look new; and since the fact that there are academics who worry about AI risk has already been covered, there would need to be something that the common person would experience as a significant update to that previous state of affairs.
