Where are people thinking and talking about global coordination for AI safety?

by Wei_Dai, 22nd May 2019, 11 comments


Many AI safety researchers these days are not aiming for a full solution to AI safety (e.g., the classic Friendly AI), but just trying to find good enough partial solutions that would buy time for or otherwise help improve global coordination on AI research (which in turn would buy more time for AI safety work), or trying to obtain partial solutions that would only make a difference if the world had a higher level of global coordination than it does today.

My question is, who is thinking directly about how to achieve such coordination (aside from FHI's Center for the Governance of AI, which I'm aware of) and where are they talking about it? I personally have a bunch of questions related to this topic (see below) and I'm not sure what's a good place to ask them. If there's not an existing online forum, it seems a good idea to start thinking about building one (which could perhaps be modeled after the AI Alignment Forum, or follow some other model).

  1. What are the implications of the current US-China trade war?
  2. Human coordination ability seems within an order of magnitude of what's needed for AI safety. Why the coincidence? (Why isn’t it much higher or lower?)
  3. When humans made advances in coordination ability in the past, how was that accomplished? What are the best places to apply leverage today?
  4. Information technology has massively increased certain kinds of coordination (e.g., email, eBay, Facebook, Uber), but at the international relations level, IT seems to have made very little impact. Why?
  5. Certain kinds of AI safety work could seemingly make global coordination harder, by reducing perceived risks or increasing perceived gains from non-cooperation. Is this a realistic concern?
  6. What are the best intellectual tools for thinking about this stuff? Just study massive amounts of history and let one's brain's learning algorithms build what models it can?


7 Answers

I want to focus on your second question: "Human coordination ability seems within an order of magnitude of what's needed for AI safety. Why the coincidence? (Why isn’t it much higher or lower?)"

Bottom line up front: Humanity has faced a few potentially existential crises in the past: world wars, nuclear standoffs, and the threat of biological warfare. The fact that we survived those, plus selection bias, seems like a sufficient explanation of why we are near the threshold for our current crises.

I think this is a straightforward argument. At the same time, I'm not going to get deep into the anthropic reasoning, which is critical here but which I don't understand well enough to discuss clearly. (Side note: Stuart Armstrong recently mentioned to me that there are reasons I'm not yet familiar with for why anthropic shadows aren't large, which is assumed in the below model.)

If we assume that large-scale risks are distributed in some manner, such as from Bostrom's urn of technologies (see: Vulnerable World Hypothesis - PDF), we should expect that the attributes of the problems, including the coordination needed to withstand / avoid them, are distributed with some mean and variance. Whatever that mean and variance are, we expect that there should be more "easy" risks (near or below the mean) than "hard" ones. Unless the tail is very, very fat, this means that we are likely to see several moderate risks before we see more extreme ones. For a toy model, let's assume risks show up at random yearly, and follow a standard normal distribution in terms of capability needed. If we had capability in the low single digits, we would be wiped out already with high probability. Given that we've come worryingly close, however, it seems clear that we aren't in the high double digits either.

Given all of that, and the selection bias of asking the question when faced with larger risks, I think it's a posteriori likely that most salient risks we face are close to our level of ability to overcome.
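The toy model above can be sketched in a few lines of Python. This is only an illustration under assumptions that are mine, not part of the original argument: capability is a fixed number measured in standard deviations of the risk distribution, exactly one risk arrives per year, a civilization is wiped out the first time a risk exceeds its capability, and the 100-year horizon is arbitrary. The function names are hypothetical.

```python
import math
import random


def normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def survival_probability(capability: float, years: int) -> float:
    """Chance that every one of `years` independent yearly risks,
    each drawn from N(0, 1), stays at or below `capability`."""
    return normal_cdf(capability) ** years


def simulate_survival(capability: float, years: int,
                      trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity (assumed setup:
    one risk per year, fixed capability, no learning over time)."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        # A world survives only if no yearly risk exceeds its capability.
        if all(rng.gauss(0.0, 1.0) <= capability for _ in range(years)):
            survived += 1
    return survived / trials


if __name__ == "__main__":
    years = 100  # arbitrary horizon, chosen only for illustration
    for capability in (1.0, 2.0, 3.0, 4.0):
        exact = survival_probability(capability, years)
        estimate = simulate_survival(capability, years, trials=2000)
        print(f"capability={capability}: exact={exact:.4f}, simulated={estimate:.4f}")
```

Under these assumptions, survival over a century goes from nearly impossible to nearly certain across a fairly narrow band of capability, which is the shape the selection argument relies on. It says nothing about the anthropic questions set aside above.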

> My question is, who is thinking directly about how to achieve such coordination (aside from FHI's Center for the Governance of AI, which I'm aware of) and where are they talking about it?

OpenAI has a policy team (this 80,000 Hours podcast episode is an interview with three people from that team), and I think their research areas include models for coordination between top AI labs, and improving publication norms in AI (e.g. maybe striving for norms that are more like those in computer security, where people are expected to follow some responsible disclosure process when publishing about new vulnerabilities). For example, the way OpenAI is releasing their new language model GPT-2 seems like a useful way to learn about the usefulness/feasibility of new publication norms in AI (see the "Release Strategy" section here).

I think related work is also being done at the Centre for the Study of Existential Risk (CSER).

A source tells me there's a fair bit of non-public discussion of AGI-safety-relevant strategy/policy/governance issues, but it often takes a while for those discussions to cohere into a form that is released publicly (e.g. in a book or paper), and some of it is kept under wraps due to worries about infohazards (and worries about the unilateralist's curse w.r.t. infohazards).

Last year there was a prize for papers and the authors spoke on a panel about this subject at HLAI 2018.

RE the title, a quick list:

  • FHI (and associated orgs)
  • CSER
  • OpenAI
  • OpenPhil
  • FLI
  • FRI
  • GovAI
  • PAI

I think a lot of orgs that are more focused on social issues which can or do arise from present-day AI / ADM (automated decision making) technology should be thinking more about global coordination, but seem focused on national (or subnational, or EU) level policy. It seems valuable to make the most compelling case for stronger international coordination efforts to these actors. Examples of this kind of org that I have in mind are AI Now and the Montreal AI Ethics Institute (MAIEI).

As mentioned in other comments, there are many private conversations among people concerned about AI-Xrisk, and (IMO, legitimate) info-hazards / unilateralist curse concerns loom large. It seems prudent to make progress on those meta-level issues (i.e. how to engage the public and policymakers on AI(-Xrisk) coordination efforts) as a community as quickly as possible, because:

  • Getting effective AI governance in place seems like it will be challenging and take a long time.
  • There is a rapidly growing number of organizations seeking to shape AI policy, which may have objectives that are counter-productive from the point of view of AI-Xrisk. And there may be a significant first-mover advantage (e.g. via setting important legal or cultural precedents, and framing the issue for the public and policymakers).
  • There is massive untapped potential for people who are not currently involved in reducing AI-Xrisk to contribute (consider the raw number of people who haven't been exposed to serious thought on the subject).
  • Info-hazard-y ideas are becoming public knowledge anyways, on the timescale of years. There may be a significant advantage to getting ahead of the "natural" diffusion of these memes and seeking to control the framing / narrative.

My answers to your 6 questions:

1. Hopefully the effect will be transient and minimal.

2. I strongly disagree. I think we (ultimately) need much better coordination.

3. Good question. As an incomplete answer, I think personal connections and trust play a significant (possibly indispensable) role.

4. I don't know. Speculating/musing/rambling: the kinds of coordination where IT has made a big difference (recently, i.e. starting with the internet) are primarily economic and consumer-facing. For international coordination, the stakes are higher; it's geopolitics, not economics; you need effective international institutions to provide enforcement mechanisms.

5. Yes, but this doesn't seem like a crucial consideration (for the most part). Do you have specific examples in mind?

6. Social science and economics seem really valuable to me. Game theory, mechanism design, behavioral game theory. I imagine there's probably a lot of really valuable stuff on how people/orgs make collective decisions that the stakeholders are satisfied with in some other fields as well (psychology? sociology? anthropology?). We need experts in these fields (especially the softer fields, which I think are underrepresented) to inform the AI-Xrisk community about existing findings and create research agendas.

For question 2, I think the human-initiated nature of AI risk could partially explain the small distance between ability and need. If we were completely incapable of working as a civilization, other civilizations might be a threat, but we wouldn’t have any AIs of our own, let alone general AIs.

> When humans made advances in coordination ability in the past, how was that accomplished? What are the best places to apply leverage today?

I am confused by the general lack of interest I've encountered in how joint stock corporations came to be and underwent selection to get us to where we are now. It may be that I'm not looking in the right places. I know the founders of McKinsey are quite interested in this.