This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far, see the announcement post. For the schedule of future topics, see MIRI's reading guide.

Welcome. This week we discuss the twenty-eighth section in the reading guide: Collaboration.

This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.

There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable and I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).

Reading: “Collaboration” from Chapter 14

Summary


  1. The degree of collaboration among those building AI might affect the outcome a lot. (p246)
  2. If multiple projects are close to developing AI, and the first will reap substantial benefits, there might be a 'race dynamic' where safety is sacrificed on all sides for a greater chance of winning. (p247-8)
  3. Averting such a race dynamic with collaboration should have these benefits:
    1. More safety
    2. Slower AI progress (allowing more considered responses)
    3. Less other damage from conflict over the race
    4. More sharing of ideas for safety
    5. More equitable outcomes (for a variety of reasons)
  4. Equitable outcomes are good for various moral and prudential reasons. They may also be easier to compromise over than expected, because humans have diminishing returns to resources. However, in the future, their returns may be less diminishing (e.g. if resources can buy more time instead of entertainments one has no time for).
  5. Collaboration before a transition to an AI economy might affect how much collaboration there is afterwards. This might not be straightforward. For instance, if a singleton is the default outcome, then low collaboration before a transition might lead to a singleton (i.e. high collaboration) afterwards, and vice versa. (p252)
  6. An international collaborative AI project might deserve nearly infeasible levels of security, such as being almost completely isolated from the world. (p253)
  7. It is good to start collaboration early, to benefit from being ignorant about who will benefit more from it, but hard because the project is not yet recognized as important. Perhaps the appropriate collaboration at this point is to propound something like 'the common good principle'. (p253) 
  8. 'The common good principle': Superintelligence should be developed only for the benefit of all of humanity and in the service of widely shared ethical ideals. (p254)
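
Point 4 above can be made concrete with a toy calculation. The sketch below is only an illustration and is not taken from the book: it assumes a hypothetical logarithmic utility function to stand in for diminishing returns, and compares a guaranteed equal share of some total resources with a winner-take-all gamble in which each of n parties has an equal chance of getting everything. The function names and the numbers are assumptions chosen for illustration.

```python
import math
from typing import Callable


def sure_share_value(utility: Callable[[float], float], total: float, n: int) -> float:
    """Utility of receiving a guaranteed 1/n share of the total resources."""
    return utility(total / n)


def winner_take_all_value(utility: Callable[[float], float], total: float, n: int) -> float:
    """Expected utility of a 1/n chance of everything and otherwise nothing."""
    return utility(total) / n + (1.0 - 1.0 / n) * utility(0.0)


def diminishing(x: float) -> float:
    """Hypothetical utility with diminishing returns: log(1 + x)."""
    return math.log1p(x)


def linear(x: float) -> float:
    """Hypothetical utility with constant returns."""
    return x


if __name__ == "__main__":
    total, n = 1_000_000.0, 10
    for name, u in [("diminishing", diminishing), ("linear", linear)]:
        share = sure_share_value(u, total, n)
        gamble = winner_take_all_value(u, total, n)
        print(f"{name}: sure share = {share:.3f}, gamble = {gamble:.3f}")
```

Under the diminishing-returns assumption the guaranteed share is worth far more than the gamble, so agreeing to split looks cheap. Under the linear assumption the two are worth the same, which is one way to read the caveat that compromise may get harder if future returns diminish less.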

Another view

Miles Brundage on the Collaboration section:

This is an important topic, and Bostrom says many things I agree with. A few places where I think the issues are less clear:

  • Many of Bostrom’s proposals depend on AI recalcitrance being low. For instance, a highly secretive international effort makes less sense if building AI is a long and incremental slog. Recalcitrance may well be low, but this isn’t obvious, and it is good to recognize this dependency and consider what proposals would be appropriate for other recalcitrance levels. 
  • Arms races are ubiquitous in our global capitalist economy, and AI is already in one. Arms races can stem from market competition by firms or state-driven national security-oriented R&D efforts as well as complex combinations of these, suggesting the need for further research on the relationship between AI development, national security, and global capitalist market dynamics. It's unclear how well the simple arms race model here matches the reality of the current AI arms race or future variations of it. The model's main value is probably in probing assumptions and inspiring the development of richer models, as it's probably too simple to fit reality well as-is. For instance, it is unclear that safety and capability are close to orthogonal in practice today. If many AI people genuinely care about safety (which the quantity and quality of signatories to the FLI open letter suggest is plausible), or work on economically relevant near-term safety issues at each point is important, or consumers reward ethical companies with their purchases, then better AI firms might invest a lot in safety for self-interested as well as altruistic reasons. Also, if the AI field shifts to focus more on human-complementary intelligence that requires and benefits from long-term, high-frequency interaction with humans, then safety and capability may be synergistic rather than trading off against each other. Incentives related to research priorities should also be considered in a strategic analysis of AI governance (e.g. are AI researchers currently incentivized only to demonstrate capability advances in the papers they write, and could incentives be changed or the aims and scope of the field redefined so that more progress is made on safety issues?).
  • ‘AI’ is too coarse-grained a unit for a strategic analysis of collaboration. The nature and urgency of collaboration depends on the details of what is being developed. An enormous variety of artificial intelligence research is possible and the goals of the field are underconstrained by nature (e.g. we can model systems based on approximations of rationality, or on humans, or animals, or something else entirely, based on curiosity, social impact, and other considerations that could be more explicitly evaluated), and are thus open to change in the future. We need to think more about differential technology development within the domain of AI. This too will affect the urgency and nature of cooperation.

Notes


1. In Bostrom's description of his model, it is a bit unclear how safety precautions affect performance. He says 'one can model each team's performance as a function of its capability (measuring its raw ability and luck) and a penalty term corresponding to the cost of its safety precautions' (p247), which sounds like they are purely a negative. However, this wouldn't make sense: if safety precautions were just a cost, then regardless of competition, nobody would invest in safety. In reality, whoever wins control over the world benefits a lot from whatever safety precautions have been taken. If the world is destroyed in the process of an AI transition, they have lost everything! I think this is the model Bostrom means to refer to. While he says it may lead to minimum precautions, note that in many models it would merely lead to less safety than one would want. If you are spending nothing on safety, and thus going to take over a world that is worth nothing, you would often prefer to move to a lower probability of winning a more valuable world. Armstrong, Bostrom and Shulman discuss this kind of model in more depth.

2. If you are interested in the game theory of conflicts like this, The Strategy of Conflict is a great book. 

3. Given the gains to competitors cooperating to not destroy the world that they are trying to take over, research on how to arrange cooperation seems helpful for all sides. The situation is much like a tragedy of the commons, except for the winner-takes-all aspect: each person gains from neglecting safety, while exerting a small cost on everyone. Academia seems to be pretty interested in resolving tragedies of the commons, so perhaps that literature is worth trying to apply here.

4. The most famous arms race is arguably the nuclear one. I wonder to what extent this was a major arms race because nuclear weapons were destined to be an unusually massive jump in progress. If this was important, it leads to the question of whether we have reason to expect anything similar in AI.
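
The reasoning in note 1 can be sketched with a small toy model. This is a rough illustration under stated assumptions, not Bostrom's model or the one in the Armstrong, Bostrom and Shulman paper. Assumptions: two teams; each team's capability is an independent uniform draw on [0, 1]; performance is capability minus a penalty proportional to the team's safety level; the higher performer wins; the winner's prize is worth 1 with probability equal to its own safety level and 0 otherwise; the loser gets nothing. The function names and the cost parameter are hypothetical.

```python
def win_probability(handicap: float) -> float:
    """P(c_own - c_rival > handicap) for independent Uniform(0, 1) capabilities.

    The difference of two such draws has a triangular density on [-1, 1],
    which gives the piecewise formula below.
    """
    if handicap <= -1.0:
        return 1.0
    if handicap <= 0.0:
        return 1.0 - (1.0 + handicap) ** 2 / 2.0
    if handicap < 1.0:
        return (1.0 - handicap) ** 2 / 2.0
    return 0.0


def expected_payoff(own_safety: float, rival_safety: float, cost: float = 1.0) -> float:
    """Chance of winning times the chance that the world won is worth anything.

    Extra safety handicaps a team by cost * (own_safety - rival_safety),
    but the prize only has value with probability own_safety.
    """
    handicap = cost * (own_safety - rival_safety)
    return win_probability(handicap) * own_safety


def best_response(rival_safety: float, cost: float = 1.0, steps: int = 100) -> float:
    """Grid search for the safety level in [0, 1] maximizing expected payoff."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda s: expected_payoff(s, rival_safety, cost))


if __name__ == "__main__":
    for rival in (0.0, 0.5, 1.0):
        print(f"rival safety {rival:.2f} -> best response {best_response(rival):.2f}")
```

In this toy setup a team with no rival would choose full safety, since its payoff is then just its safety level. With a rival, the best response is below full safety but stays above zero even against a completely reckless opponent, which matches the point that competition here produces less safety than one would want without producing none.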

In-depth investigations

If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.

  1. Explore other models of competitive AI development.
  2. What policy interventions help in promoting collaboration?
  3. What kinds of situations produce arms races?
  4. Examine international collaboration on major innovative technology. How often does it happen? What blocks it from happening more? What are the necessary conditions? Examples: the Concorde jet, the LHC, the International Space Station, etc.
  5. Conduct a broad survey of past and current civilizational competence. In what ways, and under what conditions, do human civilizations show competence vs. incompetence? Which kinds of problems do they handle well or poorly? Similar in scope and ambition to, say, Perrow’s Normal Accidents and Sagan’s The Limits of Safety. The aim is to get some insight into the likelihood of our civilization handling various aspects of the superintelligence challenge well or poorly. Some initial steps were taken here and here.
  6. What happens when governments ban or restrict certain kinds of technological development? What happens when a certain kind of technological development is banned or restricted in one country but not in other countries where technological development sees heavy investment?
  7. What kinds of innovative technology projects do governments monitor, shut down, or nationalize? How likely are major governments to monitor, shut down, or nationalize serious AGI projects?
  8. How likely is it that AGI will be a surprise to most policy-makers and industry leaders? How much advance warning are they likely to have? Some notes on this here.

If you are interested in anything like this, you might want to mention it in the comments, and see whether other people have useful thoughts.

How to proceed

This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

Next week, we will talk about what to do in this 'crunch time'. To prepare, read Chapter 15. The discussion will go live at 6pm Pacific time next Monday 30 March. Sign up to be notified here.

Comments

The idea of an international collaboration reminds me of an article I read a while ago about the difficulties of coordinating international efforts to create nuclear fusion. As a software developer, I tend to think that the best software is produced by small teams of elite software developers who know each other well, work together well, have been working together for a long time, work out of a single office, and are all native or extremely fluent speakers of the same language (English being the best language by a wide margin, because almost all programming languages are based on it and the majority of tool documentation is written in it, especially for the most cutting edge development tools and libraries). This is the rough model that you see used in Silicon Valley, and it seems to have won out over other models like outsourcing half your team to a foreign country where developers are not extremely fluent in English and hiring managers aren’t ruthlessly obsessed with finding the most brilliant and qualified people possible. (There are a few differences, such as the fact that Silicon Valley workers change jobs rather often and that Silicon Valley companies are now being forced to hire people who aren’t quite as brilliant or fluent as they would like. But I think I’ve described the type of team that many or most of the best CTOs in the valley would like to have.)

An international collaboration pattern-matches to one of those horror stories you read about in a book like The Mythical Man-Month about a project that takes way longer than expected, goes way over budget, and might succeed in delivering a poorly designed, bug-ridden piece of software if it isn’t cancelled or started over from scratch first. Writing great software is a big topic that I don’t feel very qualified to speak on, but it does worry me that Bostrom’s plan doesn’t pass my sniff test; it makes me worry that he spent too much time theorizing from first principles and not enough having discussions with domain experts.

Either way, I think this discussion might benefit from surveying the literature on software development best practices, international research collaborations, safety-critical software development, etc. There might be some strategy besides an international collaboration that accomplishes the same thing, e.g. a core development team in a single location writing all of the software, with external teams monitoring its development, taking the time to understand it, and checking for flaws. This would both give those external teams domain expertise in producing AGIs if it turns out they’re only very powerful rather than extremely powerful, and serve as an additional layer of safety checks. (To provide proper incentives, perhaps any monitoring team that succeeded in identifying a bug in the work of the main team would have the prestige of writing the AI revert to it. Apparently something like this adversarial structure works for a company writing safety-critical space shuttle software.)

Another idea I’ve been toying with recently is the idea that some people who are concerned with AI safety should go off and start a company that writes safety-critical AI software now, say for piloting killer drones. That would give them the opportunity to develop the soft skills and expertise necessary to write really high quality, bug-free AI software. In the ideal case they might spend half their time writing code and the other half of their time improving processes to reduce the incidence of bugs. Then we’d have a team in place to build FAI when it became possible.

The greatest danger is that an arms race will lead to the creation of a superintelligence which will immediately be used to dominate all others. Speculative threats by an autonomous superintelligence are plausible but are less certain than the first-strike logic inherent in such an arms race. Here's what we know from recent history: a) the instinct for domination is alive and well in the human species, and where circumstances allow an intelligent psychopath to reach the pinnacle of power, all available means will be deployed to maintain his (usually his) power. Cf. Stalin, Hitler, Mao, Kim (x3), Saddam, Putin, etc., and b) the logic of this kind of arms race dictates that if you've got it, you must use it. Multiple centers of power would almost certainly lead to a cyberwar or perhaps outright war. It only makes sense that the first to gain power must use it to suppress all other pretenders. Collaboration on a precursor project, similar perhaps to the Human Genome Project, might at least point us in the right direction. Perhaps it could focus on the use of AI to build an Internet immune system that might mitigate today's threats and constrain future ones. Still, better ideas are needed to thwart the various catastrophic scenarios ahead.

What did you find most interesting in this week's reading?

Is AI more likely than other technologies to produce a race dynamic?

Hard-coded AI is less likely than ems, since ems which are copies or modified copies of other ems would instantly be aware that the race is happening, whereas most of the later stages of hard-coded AI could be concealed from strategic opponents for part of the period in which they would have made hasty decisions, if only they knew.

What do you think of Miles' views?

None of Miles's arguments resonates with me, basically because one counterargument could erase the pragmatic relevance of his points in one fell swoop:

The vast majority of expected value is on changing policies where the incentives are not aligned with ours. Cases where the world would be destroyed no matter what happened, or cases where something is providing a helping hand - such as the incentives he suggests - don't change where our focus should be. Bostrom knows that, and focuses throughout on cases where more consequences derive from our actions. It's ok to mention when a helping hand is available, but it doesn't seem ok to argue that given a helping hand is available we should be less focused on the things that are separating us from a desirable future.

What do you think of the 'Common Good Principle'?

There is no doubt that given the concept of the Common Good Principle, everyone would be FOR it prior to complete development of ASI. But once any party gains an advantage they are not likely to share, particularly with those they see as their competitors or enemies. This is an unfortunate fact of human nature that has little chance of evolving toward greater altruism in the necessary timescale. In both Bostrom's and Brundage's arguments there are a lot of "ifs". Yes, it would be great if we could develop AI for the Greater Good, but human nature seems to indicate that our only hope of doing so would be through an early and inextricably intertwined collaboration, so that no party would have the capability of seizing the golden ring of domination by cheating during development.

Do you think the model Bostrom presents of the race dynamic captures basically what will happen if there are not big efforts to coordinate?

If AI is likely to cause a 'race dynamic', do you think this could be averted by a plausible degree of effort?

Is there anything particular you would like to do by the end of this reading group, other than read and discuss the last chapter?

It would be interesting to me to read others’ more free-ranging impressions of where Bostrom gets it right in Superintelligence – and what he may have missed or not emphasized enough.

Does anyone have suggested instances of this? I actually don't know of many.

I long to hear a discussion of the overarching issues of the prospects of AI as seen from the widest perspective. Much as the details covered in this discussion are fascinating and compelling, it also deserves an approach from the perspective not only of the future of this civilization and humanity at large, but of our relationship with the rest of nature and the cosmos. ASI would essentially trump earthly "nature" as we know it (through evolution, geo-engineering, nanotech, etc., though certainly not the laws of nature). Thereby will be raised all kinds of new problems that have yet to occur to us in this slice of time.

I think it would be fruitful to discuss ultimate issues, like how does the purpose of humanity intersect with nature? Is the desire for more and more just a precursor to suicide or is there some utopian vision that is actually better than the natural world we've been born into? Why do we think we will enjoy being under the control of ASI any more than we do that of our parents, an authoritarian government, fate or God? Is intelligence a non-survivable mutation? Regardless of what is achieved in the end, it seems to me that almost all the issues we've been discussing pale in comparison to these larger questions... I look forward to more!

What did you find least persuasive in this week's reading?

The overcomplications, the suppositions in all areas, the assumptions of certain outcomes, the complex logic of spaghetti-minded mental convulsions to make a point. It all misses the essence of AI in whatever form. I ran into this at uni doing philosophy-logic and couldn't be philosophical about the proposed propositions. I cheated to pass. It is the same - more erudite - here and in the book in general. Creating more forests and hiding the trees. Still, it's a learning curve.

There is a gender difference in resource constraint satisfaction worth mentioning: males in most primate species, including humans, are less resource constrained than females. The main reason why females require fewer resources to be emotionally satisfied is that there is a limited upper bound on how many resources are required to attract the males with the best genes, acquire their genes and parenting resources, have nearly as many children as possible, and take good care of these children and their children. For males, however, because there is competitive bargaining with females where many males compete for reproductive access and mate-guarding, and because males can generate more offspring, there are many more ways in which resources can be fungible with reproductive prowess, such as fathering children without much interacting with their mother, but still providing resources for the kid, as well as paying some signaling cost to mate with as many apparently fertile and healthy females as possible. Accordingly, men are hard- and soft-wired to seek fungible resources more frequently and more intensely than women.

Human satisfaction shows diminishing marginal returns to resource quantity, but there are two clearly distinct clusters in how quickly those returns diminish.

Since men are wired to mate diversely, then obviously the recipient must feel the same, not different. I mean, it takes two to tango. I've met women who wanted to ** with me, and I once told one of them that I had a lover and she said: so what? Lesson over.

What are some more recent papers or books on the topic of Strategy and Conflict that take a Schellingian approach to the dynamics of conflict?

I find it hard to believe that the best book on any topic of relevance was written in 1981.