Like Eliezer Yudkowsky, I am afraid that AGI will by default end humanity. I also agree that there is only a point in building an aligned AGI if that AGI is powerful enough to block all subsequent AGIs (the pivotal act). If not, an unaligned AGI will sooner or later emerge, and that AGI is likely to win out over aligned AGIs because it is more efficient; evolutionary pressure works against us. I also agree that the chance we succeed at this is small.

However, unlike Yudkowsky, I think there might be a non-vanishing chance of carrying out what he calls the pivotal act, and what I would call AGI regulation, before the first AGI rather than after. This regulation should be decided democratically and be robust, while doing as little economic damage as possible (which might still be a significant amount). The public will likely accept such regulation if it wholeheartedly believes that the alternative might be human extinction. A deal between the US and China could be sufficient, and would be achievable as long as elites and the public in both countries believe that the alternative is likely to be extinction. Note also that the outcome would likely be much the same as after Yudkowsky's pivotal act, since an AGI might not be able to do anything beyond that one act anyway (aligning even that single act is already difficult enough).

The big problem here is not coordination, but communication. For this route to work, we need elites and ordinary people, at least roughly 50% of them, to wholeheartedly believe that AI could cause human extinction. Convincing people of this is hard, for a few reasons. First, science fiction has made fun of the idea for decades, meaning there is a significant social penalty for anyone who brings it up in a serious context. Few things are more effective at making someone shut up than the fear of looking ridiculous, and therefore the information spreads far more slowly than it otherwise would. Second, there is no scientific consensus: the majority of AI academics (though not the AI safety researchers) dismiss the idea, with the sci-fi effect, conflicts of interest, and undue conservatism likely all playing a role. And third, there are the cognitive biases that Yudkowsky and others have described before.

Basically every form of AI xrisk communication has been tried, including newspaper articles, books, scary YouTube videos, TED talks, and documentaries. No single communication method is going to convince everyone in one go. But real progress can still be made.

Over the last year, the organization I founded has piloted communication to a general audience in the Netherlands. With a self-funded team of only about 3 FTE, we succeeded in getting four pieces published in major newspapers, increasing total coverage of the topic by more than 25%. Also, after our intervention, a leading opinion maker who had previously been skeptical about the topic started writing about it (text in Dutch). I think that within a few years we could genuinely shift public opinion on the topic. In my model, achieving this in the US and China would give us a real shot at globally effective AGI regulation. And that could mean saving us.

If a small self-funded organization like ours can do this, imagine what ten or fifty large AI xrisk communication organizations could do: well funded, staffed with people who are great at communication, and active in the most impactful regions.

What we need are people who want to help. Not by staffing our org; we have no trouble attracting talent. Not primarily by funding us either (although we could productively spend extra money). But mostly by starting great AI xrisk communication initiatives and organizations themselves, much better ones than ours. Because if we all work on this in earnest, we have a real chance.

Comments

I do put a fair amount of my hope in something-like-this working, but I feel a bunch of anxiety about this post, because I think a lot of efforts in this space will end up counterproductive.

Perhaps all the more reason for great people to start doing it?

If you want college professors to discuss AI risks (as I do in my economics of future tech class at Smith College), you should put together packets of material that professors in different disciplines, teaching at different levels, could use.

Thanks for the suggestion! I'm not sure we'll have time for this, since it doesn't align completely with informing the general public, but someone should clearly do it. It's also great that you're already teaching this to your students!

The regulation you mention sounds very drastic and clumsy to my ears. I'd suggest starting by proposing something more widely acceptable, such as regulating highly effective self-modifying software that lacks security safeguards.

I've made an edit and removed the specific regulation proposal. I think it's more accurate to just state that it needs to be robust, do as little harm as possible, and that we don't know yet what it should look like precisely.

I agree that it's drastic and clumsy. It's not an actual proposal, but a lower bound of what would likely work. More research into this is urgently needed.

Aren't you afraid that people could easily circumvent the regulation you mention? It would require every researcher and hacker, everywhere, forever, to comply. Also, many researchers are probably unaware that their models could start self-improving. And I'd say the security safeguards you mention amount to AI safety, which is of course currently an unsolved problem.

But honestly, I'm interested in regulation proposals that would be sufficiently robust while minimizing damage. If you have those, I'm all ears.