@Roman_Yampolskiy and I published a piece on AI xrisk in a Chinese academic newspaper: http://www.cssn.cn/skgz/bwyc/202303/t20230306_5601326.shtml
We were approached after our piece in Time and asked to write for them (we also gave quotes for another provincial newspaper). I have the impression (I've also lived and worked in China) that leading Chinese decision makers and intellectuals (or perhaps their children) read Western news sources like Time, NYTimes, Economist, etc. AI xrisk is currently probably mostly unknown in China, and if stumbled upon people...
I agree that raising awareness about AI xrisk is really important. Many people have already done this: Nick Bostrom, Elon Musk, Stephen Hawking, Sam Harris, Tristan Harris, Stuart Russell, Gary Marcus, Roman Yampolskiy (I coauthored one piece with him in Time), and Eliezer Yudkowsky.
I think a sensible place to start is to measure how well they did using surveys. That's what we've done here: https://www.lesswrong.com/posts/werC3aynFD92PEAh9/paper-summary-the-effectiveness-of-ai-existential-risk
More comms research from us is coming up, and I know a few others are doing the same now.
You could pick corporations as an example of coordinated humans, but also e.g. Genghis Khan's hordes. And they did actually take over. If you do want to pick corporations, look at e.g. the East India companies, which also took over parts of the world.
Funny, I had exactly the same thought and was just considering writing a short post on it. So I agree and I do think it's a very relevant model update. Some people probably already updated before. I also agree though with your second point about Auto-GPT and similar peripherals. So it looks like we're in a not-too-fast take-off with humans pretty solidly in the loop for now?
As long as there's no autonomous self-improvement of the core model, maybe an existential event could look like this: GPT-X gets trained and released, open sourcers build unsafe periphe...
As co-author of one of the mentioned pieces, I'd say it's really great to see the AGI xrisk message going mainstream. It isn't going nearly fast enough, though. Some (Hawking, Bostrom, Musk) have already spoken out about the topic for close to a decade. So far, that hasn't been enough to change common understanding. Those, such as myself, who hope that some form of coordination could save us, should give all they have to make this go faster. Additionally, those who think regulation could work should work on robust regulation proposals which are currently lacki...
It's an interesting letter. If I were an AI, I would find a heartfelt appeal from every human that doesn't want to die much more convincing, though.
Hereby: I don't want to die, AI.
I'd say your first assumption is off. We actually researched something related. We asked people the question: "List three events, in order of probability (from most to least probable), that you believe could potentially cause human extinction within the next 100 years." I would say that if your assumption were correct, they would name "robot takeover" or something similar as part of that top three. However, >90% don't mention AI, robots, or anything similar. Instead, they typically name things like climate change, an asteroid strike, or a pandemic. So based o...
I see your point, but I think this is unavoidable. Also, I haven't heard of anyone who was stressing out much after being informed by us.
Personally, I was informed (or convinced, perhaps) a few years ago at a talk by Anders Sandberg from FHI. That did cause stress and negative feelings for me at times, but it also allowed me to work on something I think is really meaningful. I never for a moment regretted being informed. How many people do you know who say, "I wish I hadn't been informed about climate change back in the nineties"? For me, zero. I do kn...
AI safety researcher Roman Yampolskiy did research into this question and came to the conclusion that AI cannot be controlled or aligned. What do you think of his work?
Thank you for writing this post! I agree completely, which is perhaps unsurprising given my position stated back in 2020. Essentially, I think we should apply the precautionary principle for existentially risky technologies: do not build unless safety is proven.
A few words on where that position has brought me since then.
First, I concluded back then that there was little support for this position in rationalist or EA circles. I concluded as you did, that this had mostly to do with what people wanted (subjective techno-futurist desires), and less with what ...
Thanks for the post and especially for the peer-reviewed paper! Without disregarding the great non-peer-reviewed work that many others are doing, I do think it is really important to get the most important points peer-reviewed as well, preferably as explicitly as possible (e.g. also mentioning human extinction, timelines, lower bound estimates, etc.). Thanks as well for spelling out your lower bound probabilities; I think we should have this discussion more often, more systematically, and more widely (also with people outside of the AI xrisk community). I guess...
This is what we are doing with the Existential Risk Observatory. I agree with many of the things you're saying.
I think it's helpful to debunk a few myths:
- No one has communicated AI xrisk to the public debate yet. In reality, Elon Musk, Nick Bostrom, Stephen Hawking, Sam Harris, Stuart Russell, Toby Ord and recently William MacAskill have all sought publicity with this message. There are op-eds in the NY Times, Economist articles, YouTube videos and TED Talks with millions of views, a CNN item, at least a dozen books (including for a general audience), an...
I've made an edit and removed the specific regulation proposal. I think it's more accurate to just state that it needs to be robust, do as little harm as possible, and that we don't know yet what it should look like precisely.
I agree that it's drastic and clumsy. It's not an actual proposal, but a lower bound of what would likely work. More research into this is urgently needed.
Aren't you afraid that people could easily circumvent the regulation you mention? This would require every researcher and hacker, everywhere, forever, to comply. Also, many researchers are probably unaware that their models could start self-improving. Also, I'd say the security safeguards that you mention amount to AI Safety, which is of course currently an unsolved problem.
But honestly, I'm interested in regulation proposals that would be sufficiently robust while minimizing damage. If you have those, I'm all ears.
Thanks for the suggestion! Not sure we are going to have time for this, as it doesn't align completely with informing the public, but someone should clearly do this. Also great that you're already teaching this to your students!
Perhaps all the more reason for great people to start doing it?
(4): I think regulation should get much more thought than this. I don't think you can defend the point that regulation would have a 0% probability of working. It really depends on how many people are scared, and how scared they are. And that's something we could quite possibly change, if we would actually try (LW and EA haven't tried).
In terms of implementation: I agree that software/research regulation might not work. But hardware regulation seems much more robust to me. Data regulation might also be an option. As a lower bound: globally ban hardware development beyond 1990 lev...
I think you have to specify which policy you mean. First, let's for now focus on regulation that's really aiming to stop AGI, at least until safety is proven (if possible), not on regulation that's only focusing on slowing down (incremental progress).
I see roughly three options: software/research, hardware, and data. All of these options would likely need to be global to be effective (that's complicating things, but perhaps a few powerful states can enforce regulation on others - not necessarily unrealistic).
Most people who talk about AGI regulation seem t...
First, if there were a widely known argument about the dangers of AI, on which most public intellectuals agreed.
This is exactly what we have piloted at the Existential Risk Observatory, a Dutch nonprofit founded last year. I'd say we're fairly successful so far. Our aim is to reduce human extinction risk (especially from AGI) by informing the public debate. Concretely, what we've done in the past year in the Netherlands is (I'm including the detailed description so others can copy our approach - I think they should):
Richard, thanks for your reply. Just for reference, I think this goes under argument 5, right?
It's a powerful argument, but I think it's not watertight. I would counter it as follows:
Minimum hardware leads to maximum security. As a lab or a regulatory body, one can increase the safety of AI prototypes by reducing the hardware or the amount of data researchers have access to.
My response to counterargument 3 is summarized in this plot, for reference: https://ibb.co/250Qgc9
Basically, this would only be an issue if postponement cannot be maintained until risks are sufficiently low, and if take-off were slow without a postponement intervention.
Interesting line of thought. I don't know who and how, but I still think we should already think about whether it would be a good idea in principle.
Can I restate your idea as 'we have a certain amount of convinced manpower, we should use it for the best purpose, which is AI safety'? I like the way of thinking, but I still think we should use some of them for looking into postponement. Arguments:
- The vast majority of people are unable to contribute meaningfully to AI safety research. Of course all these people could theoretically do whatever makes most money and...
Thanks for that comment! I didn't know Bill McKibben, but I read up on his 2019 book 'Falter: Has the Human Game Begun to Play Itself Out?' I'll post a review as a post later. I appreciate your description of what the scene was like back in the 90s or so, that's really insightful. Also interesting to read about nanotech, I never knew these concerns were historically so coupled.
But having read McKibben's book, I still can't find others on my side of the debate. McKibben is indeed the first one I know who both recognizes AGI danger, and does not believe in a...
Thanks for your insights, Adam. If every AGI researcher is in some sense in favor of halting AGI research, I'd like to get more confirmation on that. What are their arguments? Would they also work for non-AGI researchers?
I can imagine the combination of Daniel's point 1 and 2 stops AGI researchers from speaking out on this. But for non-AGI researchers, why not explore something that looks difficult, but may have existential benefits?
I agree, and thanks for bringing some nuance into the debate. I think that would be a useful path to explore.
I'm imagining an international treaty, national laws, and enforcement from police. That's a serious proposal.
I think a small group of thoughtful, committed citizens can change the world. Indeed, it is the only thing that ever has.
I appreciate the effort you took in writing a detailed response. There's one thing you say in which I'm particularly interested, for personal reasons. You say 'I've been in or near this debate since the 1990s'. That suggests there are many people with my opinion. Who? I would honestly love to know, because frankly it feels lonely. All people I've met, so far without a single exception, are either not afraid of AI existential risk at all, or believe in a tech fix and are against regulation. I don't believe in the tech fix, because as an engineer, I've seen ...
That goes under Daniel's point 2 I guess?
Not to speak for Dagon, but I think point 2 as you write it is way, way too narrow and optimistic. Saying "it would be rather difficult to get useful regulation" is sort of like saying "it would be rather difficult to invent time travel".
I mean, yes, it would be incredibly hard, way beyond "rather difficult", and maybe into "flat-out impossible", to get any given government to put useful regulations in place... assuming anybody could present a workable approach to begin with.
It's not a matter of going to a government and making an argument. For one thing,...
Why do you think we need to find out 1 before trying? I would say, if it is indeed a good idea to postpone, then we can just start trying to postpone. Why would we need to know beforehand how effective that will be? Can't we find that out by trial and error if needed? Worst case, we would be postponing less. That is of course, as long as the flavor of postponement does not have serious negative side effects.
Or rephrased, why do these brakes need to be carefully calibrated?
Thanks for your comments, these are interesting points. I agree that these are hard questions and that it's not clear that policymakers will be good at answering them. However, I don't think AI researchers themselves are any better, which you seem to imply. I've worked as an engineer myself and I've seen that when engineers or scientists are close to their own topic, their judgement of any risks/downsides of this topic does not become more reliable, but less. AGI safety researchers will be convinced about AGI risk, but I'm afraid their judgement of their o...
Thanks for your thoughts. Of course we don't know whether AGI will harm or help us. However I'm making the judgement that the harm could plausibly be so big (existential), that it outweighs the help (reduction in suffering for the time until safe AGI, and perhaps reduction of other existential risks). You seem to be on board with this, is that right?
Why exactly do you think interference would fail? How certain are you? I'm acknowledging it would be hard, but I'm not sure how optimistic or pessimistic to be on this.
I have kind of a strong opinion in favor of policy intervention because I don't think it's optional. I think it's necessary. My main argument is as follows:
I think we have two options to reduce AI extinction risk:
1) Fixing it technically and ethically (I'll call the combination of both working out the 'tech fix'). Don't delay.
2) Delay until we can work out 1. After the delay, AGI may or may not still be carried out, depending mainly on the outcome of 1.
If option 1 does not work, of which there is a reasonable chance (it hasn't worked so far and we're not n...
I wouldn't say less rational, but more two-party, yes. But you're right, I guess, that European politics is less important in this case. Also don't forget Chinese politics, which has entirely different dynamics of course.
I think you have a good point as well that wonkery, think tankery, and lobbying are also promising options. I think they, and starting a movement, should be on a little list of policy intervention options. I think each will have its own merits and issues. But still, we should have a group of people actually starting to work on this, whatever the optimal path turns out to be.
It's funny, I've heard that opinion a number of times before, mostly from Americans. Maybe it has to do with your two-party flavor of democracy. I think Americans are also much more skeptical of states in general. You tend to look to companies for solving problems; Europeans tend to look to states (generalizing). In The Netherlands we have a host of parties, and although there are still a lot of pointless debates, I wouldn't say it's nearly as bad as what you describe. I can't imagine e.g. climate change solved without state intervention (the situation here i...
Don't get me wrong, I think institutes like FHI are doing very useful research. I think there should be a lot more of them, at many different universities. I just think what's missing in the whole X-risk scene is a way to take things out of this still fairly marginal scene and into the mainstream. As long as the mainstream is not convinced that this is an actual problem, efforts are always going to lag enormously behind mainstream AI efforts, with predictable results.
Maybe. But I actually currently think that the longer these issues stay out of the mainstream, the better. Mainstream political discourse is so corrupted; when something becomes politicized, that means it's harder for anything to be done about it and a LOT harder for the truth to win out. You don't see nuanced, balancing-risks-and-benefits solutions come out of politicized debates. Instead you see two one-sided, extreme agendas bashing on each other and then occasionally one of them wins.
(That said, now that I put it that way, maybe that's what we want for AI risk--but only if we get to dictate the content of one of the extreme agendas and only if we are likely to win. Those are two very big ifs.)
I know their work, and I'm pretty sure there's no list of how to convince governments and corporations that AI risk is an actual thing. PhDs are not the kind of people inclined to take concrete action, I think.
I agree, and I think books such as Superintelligence have definitely decreased x-risk. I think convincing governments and corporations that this is a real risk would be a great step forward. What I haven't seen anywhere is a coherent list of options for how to achieve that, preferably ranked by impact. A protest might be up there, but probably there are better ways. I think making that list would be a great first step. Can't we do that here somewhere?
That makes sense and I think it's important that this point gets made. I'm particularly interested by the political movement that you refer to. Could you explain this concept in more detail? Is there anything like such a political movement already being built at the moment? If not, how would you see this starting?
I think actually 1+1 = ? is not really an easy enough goal, since it's not 100% certain that the answer is 2. Getting to 100% certainty (including about what I actually meant by that question) could still be nontrivial. But let's say the goal is 'delete filename.txt'? Maybe the trick is in the language.
Thanks again for your reply. I see your point that the world is complicated and a utility maximizer would be dangerous, even if the maximization is supposedly trivial. However, I don't see how an achievable goal has the same problem. If my AI finds the answer of 2 before a meteor hits it, I would say it has solidly landed at 100% and stops doing anything. Your argument would be true if it decides to rule out all possible risks first, before actually starting to look for the answer of the question, which would otherwise quickly be found. But since ruling ou...
Thanks for your insights. I don't really understand 'setting [easy] goals is an unsolved problem'. If you set a goal: "tell me what 1+1 is", isn't that possible? And once it's completed ("2!"), the AI would stop self-improving, right?
I think this may contribute to just a tiny piece of the puzzle, however, because there will always be someone setting a complex or, worse, non-achievable goal ("make the world a happy place!"), and boom there you have your existential risk again. But in a hypothetical situation where you have your AGI in the lab, no-one else has, ...
Thanks Charlie! :)
They are asking for only one proposal, so I will have to choose one and am planning to work out that one. So I'm mostly asking about which idea you find most interesting, rather than about which one is the strongest proposal now - that will be worked out. But thanks a lot for your feedback so far - that helps!
AGI is unnecessary for an intelligence explosion
Many arguments assume that an intelligence explosion requires an AGI. However, it seems to me that the critical point for achieving this explosion is that an AI can self-improve. Which skills are needed for that? If we have a hardware overhang, it probably comes down to the type of skills an AI researcher uses: reading papers, combining insights, doing computer experiments until new insights emerge, writing papers about them. Perhaps an AI PhD can weigh in on the actual skills needed. I'm ho...
Tune AGI intelligence by easy goals
If an AGI is provided an easily solvable utility function ("fetch a coffee"), it will lack the incentive to self-improve indefinitely. The fetch-a-coffee AGI will only need to become as smart as a hypothetical simple-minded waiter. By tuning how easy a utility function is to satisfy, we can therefore tune the intelligence level we want an AGI to achieve using self-improvement. The only way to achieve an indefinite intelligence explosion (up to e.g. physical limits) would be to program a utility function ma...
This is the exact topic I'm thinking a lot about, thanks for the link! I've written my own essay for a general audience, but it seems ineffective. I knew about the Wait But Why blog post, but there must be better approaches possible. What I find hard to understand is that there have been multiple best-selling books about the topic, but still no general alarm is raised and the topic is not discussed in e.g. politics. I would be interested in why this paradox exists, and also in how to fix it.
Is there any more information about reaching out to a general ...
Thanks for writing the post! I strongly agree that there should be more research into how solvable the alignment problem, control problem, and related problems are. I didn't study uncontrollability research by e.g. Yampolskiy in detail. But if technical uncontrollability were firmly established, it seems to me that this would significantly change the whole AI xrisk space, and later the societal debate and potentially our trajectory, so it seems very important.
I would also like to see more research into the nontechnical side of alignment: how aggregatable...