Roman Yampolskiy (@Roman_Yampolskiy) and I published a piece on AI xrisk in a Chinese academic newspaper: http://www.cssn.cn/skgz/bwyc/202303/t20230306_5601326.shtml
We were approached after our piece in Time and asked to write for them (we also gave quotes to another provincial newspaper). My impression (I've lived and worked in China) is that leading Chinese decision makers and intellectuals (or perhaps their children) read Western news sources such as Time, the NYTimes, and the Economist. AI xrisk is probably still mostly unknown in China, and those who stumble upon it may have trouble believing it (as people have in the West). But if and when we have a real conversation about AI xrisk in the West, I think the information will seep into China as well, and I'm somewhat hopeful that, if that happens, it could prepare China for cooperation to reduce xrisk. In the end, no one wants to die.
Curious about your takes though, I'm of course not Chinese. Thanks for the write-up!
I agree that raising awareness of AI xrisk is really important. Many people have already worked on this: Nick Bostrom, Elon Musk, Stephen Hawking, Sam Harris, Tristan Harris, Stuart Russell, Gary Marcus, Roman Yampolskiy (with whom I coauthored a piece in Time), and Eliezer Yudkowsky.
I think a sensible place to start is to measure how well they did using surveys. That's what we've done here: https://www.lesswrong.com/posts/werC3aynFD92PEAh9/paper-summary-the-effectiveness-of-ai-existential-risk
More comms research from us is coming up, and I know a few others are now doing the same.
You could pick corporations as an example of coordinated humans, but also, e.g., Genghis Khan's hordes, which did actually take over. If you do want to use corporations, look at the East India companies, which also took over parts of the world.
Funny, I had exactly the same thought and was just considering writing a short post on it. So I agree and I do think it's a very relevant model update. Some people probably already updated before. I also agree though with your second point about Auto-GPT and similar peripherals. So it looks like we're in a not-too-fast take-off with humans pretty solidly in the loop for now?
As long as there's no autonomous self-improvement of the core model, maybe an existential event could look like this: GPT-X gets trained and released, open-source developers build unsafe peripherals around it, and one of these peripherals turns out to be sufficiently capable (perhaps by self-improving its peripheral capabilities) to take over the world. Or: GPT-X itself turns out to be powerful enough to take over the world, and it does so without self-improvement, right after training.
I'm curious whether, in the medium term, AI progress is talent-constrained (intelligence matters, so self-improvement matters) or compute/data-constrained (AI doesn't add much, because additional intelligence doesn't add much, so there's no self-improvement loop).
As co-author of one of the pieces mentioned, I'd say it's really great to see the AGI xrisk message going mainstream. It isn't happening nearly fast enough, though. Some (Hawking, Bostrom, Musk) have been speaking out about the topic for close to a decade, and so far that hasn't been enough to change common understanding. Those of us who hope that some form of coordination could save us should give everything we have to make this go faster. Additionally, those who think regulation could work should develop robust regulation proposals, which are currently lacking. And those who can should work on international coordination, which is also currently lacking.
A lot of work to be done. But the good news is that the window of opportunity is opening, and a lot of people who currently aren't working on this could be. That could be a path to victory.
It's an interesting letter. If I were an AI, I would find a heartfelt appeal from every human that doesn't want to die much more convincing, though.
Hereby: I don't want to die, AI.
I'd say your first assumption is off. We actually researched something related. We asked people: "List three events, in order of probability (from most to least probable), that you believe could potentially cause human extinction within the next 100 years." If your assumption were correct, they would list "robot takeover" or something similar in that top three. However, more than 90% didn't mention AI, robots, or anything similar; instead, they typically named things like climate change, an asteroid strike, or a pandemic. So based on this research, either people don't see a robot takeover scenario as likely at all, or they think timelines are very long (>100 years).
I do support informing the public more about the existence of the AI safety community, though; I think that would be good.
Thank you!
I see your point, but I think this is unavoidable. Also, I haven't heard of anyone who was stressed out much after receiving our information.
Personally, I was informed (or perhaps convinced) a few years ago by a talk by Anders Sandberg of FHI. That did cause me stress and negative feelings at times, but it also allowed me to work on something I find really meaningful. I never for a moment regretted being informed. How many people do you know who say, "I wish I hadn't been informed about climate change back in the nineties"? For me, zero. I do know a lot of people who would be very angry if someone had deliberately not informed them back then.
I think people can handle emotions pretty well. I also think they have a right to know. In my opinion, we shouldn't decide for others what is good or bad to be aware of.
AI safety researcher Roman Yampolskiy investigated this question and concluded that AI cannot be controlled or aligned. What do you think of his work?
https://www.researchgate.net/publication/343812745_Uncontrollability_of_AI
Thanks for writing the post! I strongly agree that there should be more research into how solvable the alignment problem, the control problem, and related problems are. I haven't studied the uncontrollability research by, e.g., Yampolskiy in detail, but if technical uncontrollability were firmly established, it seems to me that this would significantly change the whole AI xrisk space, and later the societal debate and potentially our trajectory. So it seems very important.
I would also like to see more research into the nontechnical side of alignment: how aggregatable are the values of different humans in principle? How could we democratically control AI? How could we create a realistic power-sharing mechanism for controlling superintelligence? Do we have enough wisdom for it to be a good idea if a superintelligence does exactly what we want, even assuming aggregatability? Could CEV ever fundamentally work, and according to which ethical systems? These are questions that I'd say should be solved, together with technical alignment, before developing AI with potential takeover capacity. My intuition is that they might be at least as hard.