This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the eighth section in the reading guide: Cognitive Superpowers. This corresponds to Chapter 6.
This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable and I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: Chapter 6
- AI agents might have very different skill profiles.
- AI with some narrow skills could produce a variety of other skills. e.g. strong AI research skills might allow an AI to build its own social skills.
- 'Superpowers' that might be particularly important for an AI that wants to take control of the world include:
- Intelligence amplification: for bootstrapping its own intelligence
- Strategizing: for achieving distant goals and overcoming opposition
- Social manipulation: for escaping human control, getting support, and encouraging desired courses of action
- Hacking: for stealing hardware, money and infrastructure; for escaping human control
- Technology research: for creating military force, surveillance, or space transport
- Economic productivity: for making money to spend on taking over the world
- These 'superpowers' are relative to other nearby agents; Bostrom means them to be super only if they substantially exceed the combined capabilities of the rest of the global civilization.
- A takeover scenario might go like this:
- Pre-criticality: researchers make a seed-AI, which becomes increasingly helpful at improving itself
- Recursive self-improvement: seed-AI becomes main force for improving itself and brings about an intelligence explosion. It perhaps develops all of the superpowers it didn't already have.
- Covert preparation: the AI makes up a robust long term plan, pretends to be nice, and escapes from human control if need be.
- Overt implementation: the AI goes ahead with its plan, perhaps killing the humans at the outset to remove opposition.
- Wise Singleton Sustainability Threshold (WSST): a capability set exceeds this iff a wise singleton with that capability set would be able to take over much of the accessible universe. 'Wise' here means being patient and savvy about existential risks, 'singleton' means being internally coordinated and having no opponents.
- The WSST appears to be low. e.g. our own intelligence is sufficient, as would some skill sets be that were strong in only a few narrow areas.
- The cosmic endowment (what we could do with the matter and energy that might ultimately be available if we colonized space) is at least about 10^85 computational operations. This is equivalent to 10^58 emulated human lives.
Bostrom starts the chapter claiming that humans' dominant position comes from their slightly expanded set of cognitive functions relative to other animals. Computer scientist Ernest Davis criticizes this claim in a recent review of Superintelligence:
The assumption that a large gain in intelligence would necessarily entail a correspondingly large increase in power. Bostrom points out that what he calls a comparatively small increase in brain size and complexity resulted in mankind’s spectacular gain in physical power. But he ignores the fact that the much larger increase in brain size and complexity that preceded the appearance in man had no such effect. He says that the relation of a supercomputer to man will be like the relation of a man to a mouse, rather than like the relation of Einstein to the rest of us; but what if it is like the relation of an elephant to a mouse?
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, almost entirely taken from Luke Muehlhauser's list, without my looking into them further.
- Try to develop metrics for specific important cognitive abilities, including general intelligence. Build on the ideas of Legg, Yudkowsky, Goertzel, Hernandez-Orallo & Dowe, etc.
- What is the construct validity of non-anthropomorphic intelligence measures? In other words, are there convergently instrumental prediction and planning algorithms? E.g. can one tend to get agents that are good at predicting economies but not astronomical events? Or do self-modifying agents in a competitive environment tend to converge toward a specific stable attractor in general intelligence space?
- Scenario analysis: What are some concrete AI paths to influence over world affairs? See project guide here.
- How much of humanity’s cosmic endowment can we plausibly make productive use of given AGI? One way to explore this question is via various follow-ups to Armstrong & Sandberg (2013). Sandberg lists several potential follow-up studies in this interview, for example (1) get more precise measurements of the distribution of large particles in interstellar and intergalactic space, and (2) analyze how well different long-term storable energy sources scale. See Beckstead (2014).
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about the orthogonality of intelligence and goals, section 9. To prepare, read The relation between intelligence and motivation from Chapter 7. The discussion will go live at 6pm Pacific time next Monday November 10. Sign up to be notified here.