I'd say your first assumption is off. We actually researched something related. We asked people the question: "List three events, in order of probability (from most to least probable) that you believe could potentially cause human extinction within the next 100 years". I would say that if your assumption were correct, they would say "robot takeover" or something similar as part of that top 3. However, more than 90% of respondents don't mention AI, robots, or anything similar. Instead, they typically say things like climate change, asteroid strike, or pandemic. So based on this research, either people don't see a robot takeover scenario as likely at all, or they think timelines are very long (>100 yrs).
I do support informing the public more about the existence of the AI Safety community, though; I think that would be good.
I see your point, but I think this is unavoidable. Also, I haven't heard of anyone who was stressing out much after receiving our information.
Personally, I was informed (or convinced perhaps) a few years ago at a talk by Anders Sandberg of FHI. That did cause stress and negative feelings for me at times, but it also allowed me to work on something I think is really meaningful. I never for a moment regretted being informed. How many people do you know who say, "I wish I hadn't been informed about climate change back in the nineties"? For me, zero. I do know a lot of people who would be very angry if someone had deliberately not informed them back then.
I think people can handle emotions pretty well. I also think they have a right to know. In my opinion, we shouldn't decide for others what is good or bad to be aware of.
AI safety researcher Roman Yampolskiy did research into this question and came to the conclusion that AI cannot be controlled or aligned. What do you think of his work?
Thank you for writing this post! I agree completely, which is perhaps unsurprising given my position stated back in 2020. Essentially, I think we should apply the precautionary principle for existentially risky technologies: do not build unless safety is proven.
A few words on where that position has brought me since then.
First, I concluded back then that there was little support for this position in rationalist or EA circles. I concluded as you did, that this had mostly to do with what people wanted (subjective techno-futurist desires), and less with what was possible or the best way to reduce human extinction risk. So I went ahead and started the Existential Risk Observatory anyway, a nonprofit aiming to reduce human extinction risk by informing the public debate. We think public awareness is essentially the bottleneck for effective risk reduction, and we hope more awareness will lead to increased amounts of talent, funding, institutes, diversity, and robustness for AI Safety, and increased support for constructive regulation. This can be in the form of software, research, data, or hardware regulation, with each having their own advantages and disadvantages. Our intuition is that with 50% awareness, countries should be able to implement some combination of the above that would effectively reduce AI existential risk, while trying to keep economic damage to a minimum (an international treaty may be needed, or a US-China deal, or using supply chain leverage, or some smarter idea). To our knowledge, no-one has worked out a detailed regulation proposal for this (perhaps this comes kind of close). If true, we think that's embarrassing and regulation proposals should be worked out (and this work should be funded) with urgency. If there are regulation proposals which are not shared, we think people should share them and be less infohazardy about it.
So how did informing the societal debate go so far?
We started from a super crappy position: self-funded, hardly any connection to the xrisk space (that was also partially hostile to our concept), no media network to speak of, located in Amsterdam, far from everything. I had only some founding experience with a previous start-up. Still, I have to say that on balance, things went better than expected:
We think that if we can do this, many more people can. Raising awareness is constrained by many things, but most of all by manpower. Although there are definitely qualities that make you better at this job (xrisk expertise, motivation, intelligence, writing and communication skills, management skills, network), you don't need to be a super genius or have a very specific background to do communication. Many in the EA and rationalist communities who would love to do something about AI xrisk but aren't machine learning experts could work in this field. With only about 3 FTE, I'm positive our org can inform millions of people. Imagine what dozens, hundreds, or thousands of people working in this field could achieve.
If we all agreed that AI xrisk comms is a good idea, I think humanity would have a good chance of making it through this century.
Thanks for the post and especially for the peer-reviewed paper! Without disregarding the great non-peer-reviewed work that many others are doing, I do think it is really important to get the most important points peer-reviewed as well, preferably as explicitly as possible (e.g. also mentioning human extinction, timelines, lower bound estimates, etc). Thanks as well for spelling out your lower bound probabilities. I think we should have this discussion more often, more structurally, and more widely (also with people outside of the AI xrisk community). I guess I'm also in the same ballpark regarding the options and numbers (perhaps a bit more optimistic).
"3.1.1. Practical laws exist which would, if followed, preclude dangerous AI. 100% (recall this is optimistically-biased, but I do tentatively think this is likely, having drafted such a law)."
Can you share (a link to) the law you drafted?
This is what we are doing with the Existential Risk Observatory. I agree with many of the things you're saying.
I think it's helpful to debunk a few myths:
- No one has communicated AI xrisk to the public debate yet. In reality, Elon Musk, Nick Bostrom, Stephen Hawking, Sam Harris, Stuart Russell, Toby Ord and recently William MacAskill have all sought publicity with this message. There are op-eds in the NY Times, Economist articles, YouTube videos and TED talks with millions of views, a CNN item, at least a dozen books (including for a general audience), and a documentary (incomplete overview here). AI xrisk communication to the public debate is not new. However, the public debate is a big place and when compared to e.g. climate, coverage of AI xrisk is still minimal (perhaps a few articles per year in a typical news outlet, compared to dozens to hundreds for climate).
- AI xrisk communication to the public debate is easy, we could just 'tell people'. If you actually try this, you will quickly find out public communication, especially of this message, is a craft. If you make a poor quality contribution or your network is insufficient, it will probably never make it out. If your message does make it out, it will probably not be convincing enough to make most media consumers believe AI xrisk is an actual thing. It's not necessarily easier to convince a member of the general public of this idea than it is to convince an expert, and we can see from the case of Carmack and many others how difficult this can be. Arguably, LW and EA are the only places where this has really been successful so far.
- AI xrisk communication is really dangerous and it's easy to irreversibly break things. As can easily be seen from the wealth of existing communication and how little that did, it's really hard to move the needle significantly on the topic. That cuts both ways: it's, fortunately, not easy to really break something with your first book or article, simply because it won't convince enough people. That means there's some room to experiment. However, it's also, unfortunately, fairly hard to make significant progress here without a lot of time, effort, and budget.
We think communication to the public debate is net positive and important, and a lot of people could work on this who could not work on AI alignment. There is an increasing amount of funding available as well. Also, despite the existing corpus, the area is still neglected (we are to our knowledge the only institute that specifically aims to work on this issue).
If you want to work on this, we're always available for a chat to exchange views. EA is also starting to move in this direction, so it would be good to compare notes with them as well.
I've made an edit and removed the specific regulation proposal. I think it's more accurate to just state that it needs to be robust, do as little harm as possible, and that we don't know yet what it should look like precisely.
I agree that it's drastic and clumsy. It's not an actual proposal, but a lower bound of what would likely work. More research into this is urgently needed.
Aren't you afraid that people could easily circumvent the regulation you mention? This would require every researcher and hacker, everywhere, forever, to comply. Also, many researchers are probably unaware that their models could start self-improving. Also, I'd say the security safeguards that you mention amount to AI Safety, which is of course currently an unsolved problem.
But honestly, I'm interested in regulation proposals that would be sufficiently robust while minimizing damage. If you have those, I'm all ears.
Thanks for the suggestion! Not sure we are going to have time for this, as it doesn't align completely with informing the public, but someone should clearly do this. Also great you're teaching this already to your students!
It's an interesting letter. If I were an AI, I would find a heartfelt appeal from every human that doesn't want to die much more convincing, though.
Hereby: I don't want to die, AI.