Thanks Nicholas for raising this issue. I think your framing overcomplicates the crux:
the driving force behind an inspiring future with AI won't be international coordination, but national self-interest.
Once the US and Chinese leaderships serve their self-interest by preventing uncontrollable AGI at home, they have a shared incentive to coordinate to do the same globally. The reason this self-interest hasn't yet played out is that US and Chinese leaders still haven't fully understood the game-theoretic payoff matrix: the well-funded and wishful-thinking-fueled disinformation campaign arguing that Turing, Hinton, Bengio, Russell, Yudkowsky et al. are wrong (that we're likely to figure out how to control AGI in time if we "scale quickly") has been massively successful. That success is unsurprising, given how successful the disinformation campaigns were for, e.g., tobacco, asbestos and leaded gasoline – the only difference is that the stakes are much higher now.
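To spell out that payoff matrix, here is a deliberately stylized sketch – the numbers are illustrative placeholders, assuming only that loss of control is catastrophic for both sides no matter who builds uncontrollable AGI first, while controllable tool AI is highly beneficial to both:

```latex
% Stylized payoff matrix (illustrative numbers only, not empirical estimates).
% Strategies: "Tool AI" = build only controllable tool AI; "Race" = race to uncontrollable AGI.
% Entries are (US payoff, China payoff). Key assumption: loss of control is
% catastrophic for both sides regardless of who builds uncontrollable AGI first.
\[
\begin{array}{c|cc}
 & \text{China: Tool AI} & \text{China: Race} \\
\hline
\text{US: Tool AI} & (+10,\; +10)   & (-100,\; -100) \\
\text{US: Race}    & (-100,\; -100) & (-100,\; -100) \\
\end{array}
\]
```

Under these assumptions, "Race" is weakly dominated for each player, so refraining from uncontrollable AGI is each nation's unilateral best response – no treaty is needed to get there, only an accurate view of the payoffs; a treaty then simply locks that outcome in globally.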
Wow – I'd never seen that chillingly prophetic passage! Moloch for the win.
"The only winning move is not to play."
A military-AGI-industrial-complex suicide race has been my worst nightmare since my teens.
But I didn't expect "the good guys" in the Anthropic leadership to pour gasoline on it.
Salut Bogdan!
I'm not sure this line of reasoning has the force some people seem to assume. What would you expect the results of hypothetical, similar referendums to have been, e.g., before the industrial revolution and before the agricultural revolution, on those changes?
I'm somewhat horrified by this comment. This hypothetical referendum is about replacing all biological humans with machines, whereas the agricultural and industrial revolutions did no such thing. If you believe in democracy, then why would you allow a tiny minority to decide to kill off everyone else against their will? I find such lackadaisical support for democratic ideals particularly hypocritical from people who say we should rush to AGI to defend democracy against authoritarian governments.
Right, Tamsin: so reasonable safety standards would presumably ban fully unrestricted superassistants too, but allow more limited assistants that could still be incredibly helpful. I'm curious what AI safety standards you'd propose – it's not a hypothetical question, since many politicians would like to know.
Thanks Noosphere89 for your long and thoughtful comment! I don't have time to respond to everything before putting my 1-year-old to bed, but here are some brief comments.
1) Although I appreciate that you wrote out a proposed AGI alignment plan, I think you'll agree that it contains no theorems or proofs, or even quantitative risk bounds. Since we insist on quantitative risk bounds before allowing much less dangerous technologies such as airplanes and nuclear reactors, my view is that it would be crazy to launch AGI without quantitative risk bounds – especially when you're dealing with a super-human mind that might actively optimize against vulnerabilities of the alignment system. As you know, rigorously ensuring retained alignment under recursive self-improvement is extremely difficult. For example, MIRI had highly talented researchers work on this for many years without completing the task.
2) Regarding the point you make about fear of 1984 vs. fear of extinction: if someone assigns P(1984) >> P(extinction) and there's no convincing plan for preventing AGI loss of control, then I'd argue that it's still crazy for them (or for China) to build AGI. So they'd both forge ahead with increasingly powerful yet controllable tool AI, presumably remaining in today's mutually-assured-destruction paradigm, where neither has an incentive to try to conquer the other.
I have yet to hear a version of the "but China!" argument that makes any sense if you believe that the AGI race is a suicide race rather than a traditional arms race. Those I hear making it are usually people who also dismiss the AGI extinction risk. If anything, the current Chinese leadership seems more concerned about AI x-risk than Western leaders.
Excellent question, Gordon! I defined tool AI specifically as controllable, so AI without a quantitative guarantee that it's controllable (or "safe", as you write) wouldn't meet the safety standards and its release would be prohibited. I think it's crucial that, just as for aviation and pharma, the onus is on the companies rather than the regulators to demonstrate that products meet the safety standards. For controllable tools with great potential for harm (say plastic explosives), we already have regulatory approaches for limiting who can use them and how. Analogously, there's discussion at the UNGA this week about creating a treaty on lethal autonomous weapons, which I support.
I indeed meant only "worst so far", in the sense that it would probably kill more people than any previous disaster.
I'm typing this from New Zealand.
Important clarification: Neither here nor in the Twitter post did I advocate appeasement or giving in to blackmail. In the Venn diagram of possible actions, there's certainly a non-empty intersection of "de-escalation" and "appeasement", but they're not the same set, and there are de-escalation strategies that don't involve appeasement but might nonetheless reduce nuclear war risk. I'm curious: do you agree that halting (and condemning) the following strategies can reduce escalation and help cool things down without giving in to blackmail?
I think it would reduce nuclear war risk if the international community strongly condemned 1-7 regardless of which side did it, and I'd like to see this type of de-escalation immediately.
Thanks Akash! As I mentioned in my reply to Nicholas, I view it as flawed to think that China or the US would only abstain from AGI because of a Sino-US agreement. Rather, they'd each unilaterally do it out of national self-interest.
Once the US and Chinese leaderships serve their self-interest by preventing uncontrollable AGI at home, they have a shared incentive to coordinate to do the same globally. The reason this self-interest hasn't yet played out is that US and Chinese leaders still haven't fully understood the game-theoretic payoff matrix: the well-funded and wishful-thinking-fueled disinformation campaign arguing that Turing, Hinton, Bengio, Russell, Yudkowsky et al. are wrong (that we're likely to figure out how to control AGI in time if we "scale quickly") has been massively successful. That success is unsurprising, given how successful the disinformation campaigns were for, e.g., tobacco, asbestos and leaded gasoline – the only difference is that the stakes are much higher now.