Here’s an intuitively compelling argument: only a few million years after diverging from chimpanzees, humans became much more capable, at a rate that was very rapid compared with previous progress. This supports the idea that AIs will, at some point, also start becoming more capable at a very rapid rate. Paul Christiano has made an influential response; the goal of this post is to evaluate and critique it. Note that the arguments discussed in this post are quite speculative and uncertain, and also cover only a small proportion of the factors which should influence our views on takeoff speeds - so in the process of writing it I’ve made only a small update towards very fast takeoff. Also, given that Paul’s vision of a continuous takeoff occurs much faster than any mainstream view, I expect that even totally resolving this debate would have relatively few implications for AI safety work. So it’s probably more useful to compare both Paul’s and Eliezer’s scenarios against more mainstream views than against each other. Nevertheless, it’s disappointing that such an influential argument has received so little engagement, so I wanted to use this post to explore some of the uncertainties around the issue.
I’ll call Paul’s argument the changing selection pressures argument, and quote it here at length:
Chimpanzee evolution is not primarily selecting for making and using technology, for doing science, or for facilitating cultural accumulation. The task faced by a chimp is largely independent of the abilities that give humans such a huge fitness advantage. It’s not completely independent - the overlap is the only reason that evolution eventually produces humans - but it’s different enough that we should not be surprised if there are simple changes to chimps that would make them much better at designing technology or doing science or accumulating culture.
Relatedly, evolution changes what it is optimizing for over evolutionary time: as a creature and its environment change, the returns to different skills can change, and they can potentially change very quickly. So it seems easy for evolution to shift from “not caring about X” to “caring about X,” but nothing analogous will happen for AI projects. (In fact a similar thing often does happen while optimizing something with SGD, but it doesn’t happen at the level of the ML community as a whole.)
If we step back from skills and instead look at outcomes we could say: “Evolution is always optimizing for fitness, and humans have now taken over the world.” On this perspective, I’m making a claim about the limits of evolution. First, evolution is theoretically optimizing for fitness, but it isn’t able to look ahead and identify which skills will be most important for your children’s children’s children’s fitness. Second, human intelligence is incredibly good for the fitness of groups of humans, but evolution acts on individual humans for whom the effect size is much smaller (who barely benefit at all from passing knowledge on to the next generation). Evolution really is optimizing something quite different than “humanity dominates the world.”
So I don’t think the example of evolution tells us much about whether the continuous change story applies to intelligence. This case is potentially missing the key element that drives the continuous change story: optimization for performance. Evolution changes continuously on the narrow metric it is optimizing, but can change extremely rapidly on other metrics. For human technology, features of the technology that aren’t being optimized change rapidly all the time. When humans build AI, they will be optimizing for usefulness, and so progress in usefulness is much more likely to be linear.
In other words, Paul argues firstly that human progress would have been much less abrupt if evolution had been optimising for cultural ability all along; and secondly that, unlike evolution, humans will continually optimise for whatever makes our AIs more capable. (I focus on “accumulating culture” rather than “designing technology or doing science”, because absorbing and building on other people’s knowledge is such an integral part of intellectual work, and it’s much clearer what proto-culture looks like than proto-science.) In this post I’ll evaluate:
- Are there simple changes to chimps (or other animals) that would make them much better at accumulating culture?
- Will humans continually pursue all simple yet powerful changes to our AIs?
Although it feels very difficult to operationalise these in any meaningful way, I’ve put them down as Elicit distributions with 50% and 30% confidence respectively. The rest of this post will explore why.
How easily could animals evolve culture?
Let’s distinguish between three sets of skills which contribute to human intelligence: general cognitive skills (e.g. memory, abstraction, and so on); social skills (e.g. recognising faces, interpreting others’ emotions); and cultural skills (e.g. language, imitation, and teaching). I expect Paul to agree with me that chimps have pretty good general cognitive skills, and pretty good social skills, but they seriously lack the cultural skills that precipitated the human “fast takeoff”. In particular, there’s a conspicuous lack of proto-languages in all nonhuman animals, including some (like parrots) which have no physiological difficulties in forming words. Yet humans were able to acquire advanced cultural skills relatively quickly after diverging from chimpanzees. So why haven’t nonhuman animals, particularly chimpanzees, developed cultural skills that are anywhere near as advanced as ours? Here are three possible explanations:
- Advanced cultural skills are not very useful for species with sub-human levels of general cognitive skills and social skills.
- Advanced cultural skills are not directly selected for in species with sub-human levels of general cognitive skills and social skills.
- Advanced cultural skills are too complex for species with sub-human levels of general cognitive skills and social skills to acquire.
I’ve assigned 40%, 45% and 15% credence respectively to each of these being the most important explanation for the lack of cultural skills in other species, although again these are very rough estimates.
What reasons do we have to believe or disbelieve in each? The first one is consistent with Lewis and Laland’s experiments, which suggest that the usefulness of culture increases exponentially with fidelity of cultural transmission. For example, moving from a 90% chance to a 95% chance of copying a skill correctly doubles the expected length of any given transmission chain, allowing much faster cultural accumulation. This suggests that there’s a naturally abrupt increase in the usefulness of culture as species gain other skills (such as general cognitive skills and social skills) which decrease their error rate. As an alternative possibility, Dunbar’s work on human evolution suggests that increases in our brain size were driven by the need to handle larger social groups. It seems plausible that culture becomes much more useful when interacting with a bigger group. Either of these hypotheses supports the idea that AI capabilities might quickly increase.
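To make the arithmetic behind the 90% versus 95% example concrete, here is a minimal sketch. It assumes a deliberately simple model which is my own simplification rather than anything taken from Lewis and Laland's experiments: each copy attempt succeeds independently with a fixed probability, and a chain is counted as the number of attempts up to and including the first failed copy. The function names are hypothetical.

```python
import random


def expected_chain_length(fidelity: float) -> float:
    """Expected number of copy attempts up to and including the first failure.

    Assumes (as an illustration only) that each attempt succeeds independently
    with probability `fidelity`, so the count is geometrically distributed
    with mean 1 / (1 - fidelity).
    """
    if not 0.0 <= fidelity < 1.0:
        raise ValueError("fidelity must be in [0, 1)")
    return 1.0 / (1.0 - fidelity)


def simulate_chain_length(fidelity: float, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        attempts = 1  # the attempt that will eventually fail
        while rng.random() < fidelity:  # this attempt succeeded, so try another
            attempts += 1
        total += attempts
    return total / trials


if __name__ == "__main__":
    for p in (0.90, 0.95, 0.99):
        print(p, expected_chain_length(p), simulate_chain_length(p))
```

Under this toy model the expected chain length is about 10 at 90% fidelity and about 20 at 95%, which is where the doubling comes from. The steepness near perfect fidelity (about 100 at 99%) is the feature that makes an abrupt increase in the usefulness of culture seem plausible.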
The second possibility is the most consistent with the changing selection pressures argument. The core issue is that culture requires the involvement of several parties - for example, language isn’t useful without both a speaker and a listener. This makes it harder for evolution to select for advanced language use, since it primarily operates on an individual level. Consider also the problem of trust: what prevents speakers from deceiving listeners? Or, if the information is honest and useful, what ensures that listeners will reciprocate later? These problems might significantly reduce the short-term selection for cultural skills. However, it seems to me that many altruistic behaviours have overcome these barriers, for example by starting within kin groups and spreading from there. In Darwin’s Unfinished Symphony, Laland hypothesises that language started the same way. It seems hard to reconcile observations of altruistic behaviour in chimps and other animals with the claim that proto-culture would have been even more useful, but failed to emerge. However, I've given this possibility relatively high credence anyway because if I imagine putting chimps through strong artificial selection for a few thousand years, it seems pretty plausible that they could acquire useful cultural skills. (Although see the next section for why this might not be the most useful analogy.)
The third possibility is the trickiest to evaluate, because it’s hard to reason about the complexity of cognitive skills. For example, is the recursive syntax of language something that humans needed complex adaptation to acquire, or does it reflect our pre-existing thought patterns? One skill that does seem very sophisticated is the ability of human infants to acquire language - if this relied on previous selection for general cognitive skills, then it might have been very difficult for chimps to acquire. This possibility implies that developing strong non-cultural skills makes it much easier to develop cultural skills. This would also be evidence in favour of fast takeoffs, since it means that even if humans are always trying to build increasingly useful AIs, our ability to add some important skills might advance rapidly once our AIs possess other prerequisite skills.
How well can humans avoid comparable oversights?
Even assuming that evolution did miss something simple and important for a long time, though, the changing selection pressures argument fails if humans are likely to also spend a long time overlooking some simple way to make our AIs much more useful. This could be because nobody thinks of it, or merely because the idea is dismissed by the academic mainstream. See, for example, the way that the field of AI dismissed the potential of neural networks after Minsky and Papert’s Perceptrons was released. And there are comparably large oversights in many other scientific domains. When we think about how easy it would be for AI researchers to do better than evolution, we should be asking: “would we have predicted huge fitness gains from cultural learning in chimpanzees, before we’d ever seen any examples of cultural learning?” I suspect not.
Paul would likely respond by pointing to AI Impacts’ evidence that discontinuities are rare in other technological domains - suggesting that, even when fields have been overlooking big ideas, their discovery rarely cashes out in sharp changes to important metrics. But I think there is an important disanalogy between AI and other technologies: modern machine learning systems are mostly “designed” by their optimisers, with human insights only contributing at a high level. This has three important implications.
Firstly, it means that attempts to predict discontinuities should consider growth in compute as well as intellectual progress. Exactly how we do so depends on whether compute and insights are better modeled as substitutes or complements to each other - that is, whether insights have less or more impact when more compute becomes available. If they’re substitutes, then we should expect continuous compute growth to “smooth out” the lumpiness in human insight. But if they’re complements, then compute growth exacerbates that lumpiness - an insight which would have led to a big jump with a certain amount of compute available could lead to a much bigger jump if it’s only discovered when there’s much more compute available.
I think there’s much more to be said on this question, which I’m currently very uncertain about. My best guess is that we used to be in a regime where compute and insight were substitutes, because domain-specific knowledge played a large role. But now that researchers are taking the bitter lesson more seriously, and working on tasks where it’s harder to encode domain-specific knowledge, it seems more plausible that we’re in a complementary regime, where insights are mainly used to leverage compute rather than replace it.
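As a rough illustration of the substitutes versus complements distinction, here is a toy sketch. The functional forms are assumptions of mine, chosen only because they are the simplest ones with the relevant property: capability is the sum of compute and insight in the substitutes regime, and their product in the complements regime. Nothing here is meant as a realistic model of ML progress, and all names are hypothetical.

```python
def capability(compute: float, insight: float, regime: str) -> float:
    """Toy capability measure (an assumed functional form, not an empirical one)."""
    if regime == "substitutes":
        return compute + insight  # either resource can stand in for the other
    if regime == "complements":
        return compute * insight  # insight is a multiplier on available compute
    raise ValueError(f"unknown regime: {regime!r}")


def jump_from_insight(compute: float, insight_before: float,
                      insight_after: float, regime: str) -> tuple[float, float]:
    """Return (absolute jump, jump relative to prior capability) from one insight."""
    before = capability(compute, insight_before, regime)
    after = capability(compute, insight_after, regime)
    return after - before, (after - before) / before


if __name__ == "__main__":
    for regime in ("substitutes", "complements"):
        for compute in (10.0, 1000.0):
            absolute, relative = jump_from_insight(compute, 1.0, 2.0, regime)
            print(f"{regime:12s} compute={compute:7.1f} "
                  f"absolute={absolute:8.1f} relative={relative:.4f}")
```

In this sketch, the same insight arriving later (with more compute) produces an ever smaller relative jump in the substitutes regime, which is the "smoothing out" effect. In the complements regime its absolute jump scales with the compute available when it is discovered, while its relative jump does not shrink. Whether absolute or relative jumps are the right thing to measure is itself part of what's uncertain here.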
Either way, this argument suggests that the comparison to other technological domains in general is a little misleading. Instead, we should look at fields in which an important underlying resource was becoming exponentially cheaper - for instance, fields which rely on DNA sequencing. One could perhaps argue that all scientific fields depend on the economy as a whole, which is growing exponentially - but I’d be more convinced by examples in which the dependency is direct, as it is in ML.
Secondly, our reliance on optimisers means that we don’t understand the low-level design details of neural networks as well as we understand the low-level design details in other domains. Not only are the parameters of our neural networks largely opaque to us, we also don’t have a good understanding of what our optimisers are doing when they update those parameters. This makes it more likely that we miss an important high-level insight, since our high-level intuitions aren’t very well linked to whatever low-level features make our neural networks actually function.
Thirdly, even if we can identify all the relevant traits that we’d like to aim for at a high level, we may be unable to specify them to our optimisers, for all the reasons explained in the AI safety literature. That is, by default we should expect our optimisers to develop AIs with capabilities that aren't quite what we wanted (which I'll call capabilities misspecification). Perhaps that comes about because it’s hard to provide high-quality feedback, or hard to set up the right environments, or hard to make multiple AIs interact with each other in the right way (I discuss such possibilities in more depth in this post). If so, then our optimisers might make the same types of mistakes as evolution did, for many of the same reasons. For example, it’s not implausible to me that we build AGI by optimising for the most easily measurable tasks that seem to require high intelligence, and hoping that these skills generalise - as was the case with GPT-3. But in that case the fact that humans are “aiming towards” useful AIs doesn’t help very much in preventing discontinuities.
Paul claims that, even if this argument applies at the level of individual optimisers, it hasn't previously been relevant at the level of the ML community as a whole. This seems plausible, but note that the same could be said for alignment problems in general. So far they've only occurred in isolated contexts, yet many of us expect that alignment problems will get more serious as we build more sophisticated systems that generalise widely in ways we don't understand very well. So I'm inclined to believe that capabilities misspecification will also be more of a problem in the future, for roughly the same reasons. One could also argue against the likelihood of capabilities misspecification by postulating that in order to build AGIs we’ll only need to optimise them to achieve relatively straightforward tasks in relatively simple environments. In practice, though, it’s difficult to make such arguments compelling given the uncertainties involved.
Overall, I think that the changing selection pressures argument is a plausible consideration, but far from fully convincing; and that evaluating it thoroughly will require much more scrutiny. However, I'd be more excited about future work which classifies both Paul's and Eliezer's positions as "fast takeoff", and then evaluates those against the view that AGI will "merely" bump us up to a steeper exponential growth curve - e.g. as defended by Hanson.
- As further support for this argument it’d be nice to have more examples of cases where evolution plausibly missed an important leap, in addition to the development of human intelligence. Are there other big evolutionary discontinuities? Plausibly multicellularity and the Cambrian explosion qualify. On a smaller scale, two striking types of biological discontinuities (for which I credit Amanda Askell and Beth Barnes) are invasive species, and runaway sexual selection. But in both cases I think this is more reasonably described as a change in the objective, rather than a species quickly getting much fitter within a given environment.
- In practice we can take inspiration from humans in order to figure out which traits will be necessary in AGIs - we don’t need to invent all the ideas from scratch. But on the other hand, even given the example of humans, we haven’t made much progress in understanding how or why our intelligence works, which suggests that we’re reasonably likely to overlook some high-level insights.
- One natural reason to think that economic usefulness of AIs will be relatively continuous even if we overlook big insights is that humans can fill in gaps in the missing capabilities of our AIs, so that they can provide a lot of value without being good at every aspect of a given job. By contrast, in nature each organism has to be very well-rounded in order to survive.
- Perhaps the strongest hypothesis along these lines is that language is the key ingredient - yet it seems like language models will become data-constrained relatively soon.