Epistemic Status: I know basically nothing about any of this, or at least no more than any other LessWronger, I just happened to listen to some podcasts about AI strategy recently and decided to try my hand.

Epistemic Effort: Probably about an hour of thought cumulatively, plus maybe two hours to write this up

Global Coordination

AI, like many other x-risks, is in part a global coordination problem. As I see it, there are two main subproblems here: the problem of "where to go" (i.e. how to coordinate and what to coordinate on), and the problem of "how to get there from here" (taking into account the inadequacies present in our actual, real-life political systems).

Putting this another way: there is a top-down way of looking at the problem, which is: if we imagine what the world would be like if we managed to globally coordinate on stopping AI risk, what would that look like? What coordination mechanisms would we have used, etc. And then there is a bottom-up way of looking at it, which is: there are certain people in the world who are already, right now, concerned about AI risk. What series of actions could those people perform that would ensure (or as close to "ensure" as we can muster) that AI risk would be mitigated? (Though see this post from Benquo for a forceful objection to this line of thinking that I don't yet know how to take into account.)

As far as I can tell, there are three classes of solutions to the bottom-up version of the problem:

  1. Work unilaterally, outside of states
  2. Get people who are already in positions of power within states to care about the problem (by which I mean be fully willing to implement their end of a solution. Merely believing it's important in the abstract doesn't count).
  3. Get people who already care about the problem into positions of power within states.

Solutions in class 1 may, of course, not be sufficient, if state coordination ends up being unavoidably necessary to solve the problem. Solutions in class 2 and 3 run into various problems described in Inadequate Equilibria; in particular, class 2 solutions face the "lemons problem." I don't (yet) have anything especially original to say about how to solve these problems (I'd be highly grateful for reading suggestions of places where people have proposed solutions/workarounds to the problems in Inadequate Equilibria, outside the book itself of course).

As for the top-down part of the problem, I see the following solutions:

  1. International treaty
  2. Weak supranational organization (the UN or another in a similar vein)
  3. Strong supranational organization (the EU but on a world scale) i.e. a supranational confederation
  4. One world nation e.g. supranational federalism

in rough order of ease of implementation. Solutions 3 or 4 (ignoring their political infeasibility) would be especially useful because they could solve not just AI risk, but also other x-risks that require global coordination, whereas if we solve AI risk by treaty, that makes little or no progress on other x-risks.

Actually, "fostering global coordination" seems to me like a good candidate for a high-impact cause area in its own right, since it attacks multiple x-risks at once. A lack of ability to coordinate internationally is a major factor increasing the likelihood of most x-risks (AI, climate change, bioterrorism, nuclear war, and maybe asteroid risk, though probably not non-anthropogenic pandemic risk or solar flares knocking out the power grid), so working directly on methods of fostering global coordination is probably high-impact in itself, separate from work on particular x-risks. Though I should note that either Bryan Caplan or Robin Hanson (or maybe both; I don't have time to find the reference at the moment, I'll edit it in if I find it) has argued that pushing to increase global coordination carries some risk of ending up with a global tyranny, an x-risk in itself.

Avoiding politicization

Maybe this is super obvious, but I don't think I've seen anyone else come out and say it: it's important that AI safety not become a politically polarized issue the way e.g. climate change has. It doesn't seem to be much of one yet, as neither party (sorry for U.S.-centrism) is talking about it basically at all (maybe this is why nobody has made this point), though I see Democrats talking about technology more than Republicans.

So, we need to make sure AI safety doesn't become politically polarized. How can we ensure this? Well, that depends on whether you think AI safety needs to be widely discussed or not. On the one hand, I've seen it argued (I don't remember where or by whom right now) that it might be dangerous to try to promote AI safety to a lay audience, because the message will almost certainly get distorted; if you think this is the way to go, then of course it's rather easy to make sure AI safety doesn't get polarized--just don't talk about it much to lay audiences or try to do public outreach about it.

On the other hand, it seems likely that politicians will be a necessary component of any effort to globally coordinate around AI safety, and politicians need to focus to a large extent on the issues the public is concerned with in order to get reelected (I suspect this model is way too naive, but I don't know enough about the nitty-gritty of political science at the moment to make it better), so one way to make politicians care about AI safety is to get the public to care about AI safety. If this is the strategy you favor, then you have to balance making AI safety part of the national conversation with making sure it doesn't get politically polarized.

But it seems like most issues in the public consciousness are also polarized to some extent. (This is another claim I'm highly unconfident about. Feel free to suggest counterexamples or models that contradict it; one counterexample I can think of off the top of my head is social media--it's quite salient in the national conversation, but not particularly partisan. I also believe the recent book Uncivil Agreement is largely an argument against my claim, but I haven't read it yet so I don't know for sure.) So this is an instance of a more general problem: getting politics to pay attention to issues that aren't polarized.

[After writing this section, a friend raised a ton of complications about whether polarization might be a good or bad thing, which I plan to write up in a separate post at some point.]

These are preliminary thoughts, posted in an exploratory spirit. Please do point out places where you think I've gotten something wrong; e.g. if you think one of my taxonomies is incomplete, it probably is, so I'd love if you'd point it out. I'd be especially grateful for reading suggestions, as I'm sure there are tons of places where I'm simply ignorant of relevant literature or of entire fields of study related to what I'm talking about (I have already gone through Allan Dafoe's AI strategy reading list and research agenda and extracted the readings I want to start with, although I'd be grateful if anyone has a copy of Huntington, Samuel P. “Arms Races: Prerequisites and Results.” Public Policy 8.1 (1958): 41–86. Can't find it online anywhere.)



Maybe this is super obvious, but I don’t think I’ve seen anyone else come out and say it: it’s important that AI safety not become a politically polarized issue the way e.g. climate change has.

I would be interested to know if you have any further thoughts on this since writing this post. Forecasting/influencing the social/political dynamics around AI risk seems to be a very important topic that I haven't been able to find much discussion about. (Yours seems to be the only post on this topic on both LW and EA Forum?)

My current thinking is that unless there is very fast takeoff or AI safety is really easy (neither of which I think we can be confident about), it seems inevitable that AI risk/safety will become politically polarized, because we'll need to stop or slow down AI capabilities development to give safety researchers enough time to do their work, which will hurt many people's (e.g., AI companies, people who derive benefits from existing AI and are naturally optimistic about technology, etc.) perceived interests, and they'll use political means to try to prevent that, thereby replaying the same dynamics that led climate change to become politicised.

So I wonder if you still think political polarization can be avoided, and how. Also, do you think climate scientists or activists could have prevented climate change from being politically polarized (and better achieved their goals as a result), and if so what mistake(s) did they make?

I had not thought about this again since writing the post, until your comment. (Yeah, that seems worrying if mine really is the only post. Though in my [limited] experience with the LW/EA forum search it's not easy to tell how reliable/comprehensive it is, so there may be related posts that aren't easy to find.)

I actually had a somewhat different model in mind for how polarization happens: something like "the parties tend to take opposite stances on issues. So if one party takes up an issue, this causes the other party to take up the opposite stance on that issue. So if one party starts to talk more about AI safety than the other, this would cause the other party to take the anti-AI-safety stance, therefore polarization." (Not saying it's a good model, but it's the model I had in the back of my mind.)

Your model of climate polarization seems mostly right to me. I was wondering, though, why it would lead to polarization in particular, rather than, say, everybody just not caring about climate change, or being climate skeptics. I guess the idea is something like: Some climate activists/scientists/etc got concerned about climate change, started spreading the word. Oil corps got concerned this would affect them negatively, started spreading a countermessage. It makes sense that this would lead to a split, where some people care about climate change and some people anti-care about it. But why would the split be along party lines (or even: along ideological lines)?

Couple things to say here. First, maybe my model kicks in here: the parties tend to take opposite stances on issues. Maybe the dems picked up the climate-activist side, so the republicans picked up the big-oil side. But was it random which side picked up which? I guess not: the climate-activist case is quite caring-focused, which on Jon Haidt's model makes it a left issue, while the big-oil case is big-business, which is a republican-flavored issue. (Though the climate-activist case also seemingly has, or at least used to have, a pretty sizeable purity component, which is puzzling on Haidt's model.)

Applying some of this to the AI case: the activist stuff has already happened. However, the AI corporations (the equiv of big-oil in our climate story) haven't reacted in the same way big-oil did. At least public-facingly, they've actually recognized and embraced the concerns to a sizeable degree (see Google DeepMind, OpenAI, to some degree Facebook).

Though perhaps you don't think the current AI corps are the equivalent of big-oil; there will be some future AI companies that react more like big oil did.

Either way, this doesn't totally block polarization from happening: it could still happen via "one party happens to start discussing the issue before the other, the other party takes the opposite stance, voters take on the stances of their party, therefore polarization."


Hadn't thought of this till seeing your comment, but this might be an argument against Andrew Yang (though he's just dropped out)---if he had gotten the dem nomination, he might have caused Trump to take up the contrarian stance on AI, causing Trump's base to become skeptical of AI risk, therefore polarization (or some other variant on the basic "the dems take up the issue first, so the republicans take the opposite stance" story). This may still happen, though with him out it seems less likely.


I don't know if climate activists could have done anything differently in the climate case; don't know enough about the history of climate activism and how specifically it got as polarized as it is (though as I said, your model seems good at least from the armchair). This may be something worth looking into as a historical case study (though time is of the essence I suppose, since now is probably the time to be doing things to prevent AI polarization).

Thanks for prompting me to think about this again! No promises (pretty busy with school right now) but I may go back and write up the conversation with my friend that I mentioned in the OP, I probably still have the notes from it. And if it really is as neglected as you think, I may take up thinking about it again a bit more seriously.

But why would the split be along party lines

I think what you said is right, but there's a more fundamental dynamic behind it. Parties are coalitions, and when you join a coalition, you get support from others in that coalition for your interests, in exchange for your support for their interests. When I said "use political means to try to prevent that", that includes either building or joining a coalition to increase the political power behind your agenda, and it's often much easier to join/ally with an existing party than to build a new coalition from scratch. This naturally causes your opposition to join/ally with the other party/coalition.

Applying some of this to the AI case: the activist stuff has already happened. However, the AI corporations (the equiv of big-oil in our climate story) haven’t reacted in the same way big-oil did. At least public-facingly, they’ve actually recognized and embraced the concerns to a sizeable degree (see Google DeepMind, OpenAI, to some degree Facebook).

Another way to look at it though, is that the AI companies have co-opted some of the people concerned with AI risk (those on the more optimistic end of the spectrum) and cowed the rest (the more pessimistic ones, who think humanity should stop or slow down AI development) into silence (or at least only talking quietly amongst themselves). The more pessimistic researchers/activists know that they don't have nearly enough political power to win any kind of open conflict now, so they are biding their time, trying to better understand AI risk (in part to build a stronger public case for it), doing what they can around the edges, and looking for strategic openings. (The truth is probably somewhere between these two interpretations.)

And if it really is as neglected as you think, I may take up thinking about it again a bit more seriously.

Sounds good to me. It looks like your background is in philosophy, and I never thought I'd be seriously thinking about politics myself, but comparative advantage can be counter-intuitive. BTW, please check out Problems in AI Alignment that philosophers could potentially contribute to, in case you haven't come across it already.

Another way to look at it though, is that the AI companies have co-opted some of the people concerned with AI risk (those on the more optimistic end of the spectrum) and cowed the rest...

Huh, that's an interesting point.

I'm not sure where I stand on the question of "should we be pulling the brakes now," but I definitely think it would be good if we had the ability to pull the brakes should it become necessary. It hadn't really occurred to me that those who think we should be pulling the brakes now would feel quasi-political pressure not to speak out. I assumed the reason there's not much talk of that option is because it's so clearly unrealistic at this point; but I'm all in favor of building the capacity to do so (modulo Caplan-style worries about this accidentally going too far and leading to totalitarianism), and it never really occurred to me that this would be a controversial opinion.

It looks like your background is in philosophy


check out Problems in AI Alignment that philosophers could potentially contribute to, in case you haven't come across it already.

I had come across it before, but it was a while ago, so I took another look. I was already planning on working on some stuff in the vicinity of the "Normativity for AI / AI designers" and "Metaethical policing" bullets (namely the problem raised in these posts by gworley), but looking at it again, the other stuff under those bullets, as well as the metaphilosophy bullet, sounds quite interesting. I'm also planning on doing some work on moral uncertainty (which, in addition to its relevance to global priorities research, also has some relevance for AI; based on my cursory understanding, CIRL seems to incorporate the idea of moral uncertainty to some extent), and perhaps other GPI-style topics. AI-strategy/governance stuff, including the topics in the OP, is also interesting, and I'm actually inclined to think it may be more important than technical AI safety (though not far more important). But three disparate areas, all calling for disparate areas of expertise outside philosophy (AI: compsci; GPR: econ etc; strategy: international relations), feels a bit like too much, and I'm not certain which I ultimately should settle on (though I have a bit of time, I'm at the beginning of my PhD atm). I guess relevant factors are mostly the standard ones: which do I find most motivating/fun to work on, which can I skill-up in fastest/easiest, which is most important/tractable/neglected? And which ones lead to a reasonable back-up plan/off-ramp in case high-risk jobs like academia/EA-org don't work out?

Forgot one other thing I intend to work on: I've seen several people (perhaps even you?) say that the case for AI risk needs to be made more carefully than it has, that's another project I may potentially work on.

In what circumstances do we need international coordination, and what exactly is needed? I put substantial probability on a world that goes straight from a few researchers, ignored by the world, to super-intelligent AI with nanotech. In what world model do we have serious discussions about ASI in the UN, and in what worlds do we need them?

In your world, a treaty might make everyone keep the researchers in check until alignment is solved.

This sort of thing is really difficult to regulate for several reasons.

1) What are we banning? Potentially dangerous computer programs. What program designs are dangerous? Experts disagree on what kinds of algorithms might lead to AGI. If someone hands you an arbitrary program, how do you tell if it's allowed? (Assuming you don't want to ban all programming.) If we can't say what we are banning, how can we ban it?

2) We can't limit access. Enriched uranium is rare, hard to make, and easy to track. This lets governments easily stop people from making nukes. Some recreational drugs can be produced with seeds and a room full of plants. The substances needed are accessible to most people. Drugs are easy to detect in minuscule quantities, and some of the equipment needed to produce them is hard to hide. Law enforcement has largely failed to stop them. If we knew that AGI required a GPU farm, limiting access somewhat might be doable. If serious processing power and memory are not required, you are trying to ban something that can be encrypted, can be sent anywhere in the world in moments, and can be copied indefinitely. Look at how successful law enforcement agencies have been at stopping people from making malware, or at censoring anything.

3) Unlike in nearly every other circumstance in law enforcement, incentives don't work. As soon as someone actually thinks they have a shot at AGI, making it illegal won't stop them. If they succeed in making a safe AGI, programmed to do whatever they want, your threats are meaningless. If they make a paperclipper, the threat of jail is even more meaningless.

4) Unlike with most crimes, the people likely to make AGI are exceptionally smart, and so less likely to be caught.

5) You have to succeed every time.

Arresting everyone with "deep learning" on their CV would slow progress, and banning all computer chips from being made would have worked in 1950, but if an old smartphone is enough, don't expect even an authoritarian world government to be able to round up every last one. So stopping new chips from being made and smashing GPUs would slow work, but not stop it. Legislation can slow the arrival of AGI, but it can't be confident of stopping it without a LOT of collateral damage.

The treaty can constrain the largest projects, like DeepMind. Identifying them can be done by an international court. We don't need to defend against the smartest humans, only the crackpots who still think we're in an arms race instead of a bomb-defusal operation. Imagine that the Manhattan Project mathematicians had computed that a nuke had a 90% chance of igniting the atmosphere. Would their generals still have been table-thumping that they should develop the nukes before the Nazis do? I think the more pressing concern is to make every potential researcher, including the Nazis, aware of the 90% chance. Legislation that helps to slow things down is all your top-level comment requires.
