This post is part 4 in our sequence on Modeling Transformative AI Risk. We are building a model to understand debates around existential risks from advanced AI. The model is made with Analytica software, and consists of nodes (representing key hypotheses and cruxes) and edges (representing the relationships between these cruxes), with final outputs corresponding to the likelihood of various potential failure scenarios. You can read more about the motivation for our project and how the model works in the Introduction post. The previous post in the sequence, Paths to High-Level Machine Intelligence, investigated how and when HLMI will be developed.
We are interested in feedback on this post, especially in places where the model does not capture your views or fails to include an uncertainty that you think could be an important crux. Similarly, if an explanation seems confused or confusing, flagging this is useful – both to help us clarify, and to ensure it doesn’t reflect an actual disagreement.
The goal of this part of the model is to describe the different potential characteristics of a transition from (pre-)HLMI¹ to superintelligent AI (i.e., “AI takeoff”). We also aim to clarify the relationships between these characteristics, and explain what assumptions they are sensitive to.
As shown in the above image, the relevant sections of the model (circled in light blue) take inputs primarily from these modules:
The outputs of the sections of concern in this post, corresponding to the above circled modules, are:
These outputs provide a rough way of characterizing AI takeoff scenarios. While they are non-exhaustive, we believe they are a simple way of characterizing the range of outcomes which those who have seriously considered AI takeoff tend to find plausible. For instance, we can summarize (our understanding of) the publicly espoused views of Eliezer Yudkowsky, Paul Christiano, and Robin Hanson (along with a sceptic position) as follows:
(No significant intermediate doublings)
(with complete intermediate doublings on the order of ~1 year)
These outputs – in addition to outputs from other sections of our model, such as those covering misalignment – impact further downstream sections of our model relevant to failure modes.
In a previous post we explored the Analogies and General Priors on Intelligence module, which includes very important input for the modules in this post. That module outputs answers to four key questions (which are used throughout this later module):
In the next section of this post, we discuss the Intelligence Explosion and Discontinuity around HLMI without self-improvement modules, which are upstream of the other modules covered in this post. After that, we explore these other modules (which are influenced by the earlier modules): HLMI is Distributed and Takeoff Speed.
We now examine the modules Discontinuity around HLMI without self-improvement and Intelligence Explosion, which affect the two later modules – HLMI is Distributed and Takeoff Speed.
This module aims to answer the question: will the first HLMI (or a very early HLMI) represent a discontinuity in AI capabilities from what came before? We define a discontinuity as a very large and very sudden jump in AI capabilities, not necessarily a mathematical discontinuity, but a phase change caused by a significantly quicker rate of improvement than what projecting the previous trend would imply. Note that this module is NOT considering rapid self-improvement from quick feedback loops (that would be considered instead in the module on Intelligence Explosion), but is instead concerned with large jumps occurring around the time of the HLMI generation of AI systems.
Such a discontinuity could be from a jump in capabilities to HLMI or a jump from HLMI to significantly higher capabilities (or both). A jump in capabilities from HLMI could occur either if the first HLMI “overshoots” the HLMI level and is far more capable than that level requires (which likely depends on both the type of HLMI and whether marginal intelligence improvements are difficult around HLMI), or if a large hardware overhang allows the first HLMI to scale (in quality or quantity) such that its capabilities are far beyond the HLMI level.
The image below shows a zoomed-in view of the two routes (either a large capability gain from HLMI, or a large capability gain to HLMI):
Regarding capability jumps to HLMI, we see a few pathways, which can be broken down by whether HLMI will ultimately be bottlenecked on hardware or software (i.e., which of hardware or software will be last to fall into place – as determined by considerations in our post on Paths to HLMI). If HLMI will be bottlenecked on hardware (circled in green on the graph below), then the question reduces to whether pre-HLMI with almost enough compute has almost as strong capabilities as HLMI. To get a discontinuity from hardware-limited HLMI, the relationship between increasing abilities and increasing compute has to diverge from an existing trend to reach HLMI (i.e., hardware-limited, pre-HLMI AI with somewhat less compute is much less capable than HLMI with the required compute). We suspect that whether this crux is true may depend on the type of HLMI in question (e.g. statistical methods might be more likely to gain capabilities if scaled up and run with more compute).
If HLMI is software limited, on the other hand, then instead of hardware, we want to know whether the last software step(s) will result in a large jump in capabilities. This could happen either if there are very few remaining breakthroughs needed for HLMI (circled in magenta above) such that the last step(s) correspond to a large portion of the problem, or if the last step(s) act as “missing gears” putting the rest of the system in place (circled in black above).
We suspect that whether the last step(s) present a “missing gears” situation is likely to depend on the type of HLMI realized. A (likely) example of “missing gears” would be whole brain emulation (WBE), where 99% of the way towards WBE presumably doesn’t get you anything like 99% of the capabilities of WBE. (See here for an extended discussion of the relationship between “fundamental breakthroughs” and “missing gears”.) If the “missing gears” crux resolves negatively, however, then determining whether there will be a large capability gain to HLMI is modeled as depending on the number of remaining breakthroughs needed for HLMI.
We make the simplifying assumption that the remaining fundamental breakthroughs are all of roughly comparable size, such that the more breakthroughs needed, the less of a step any individual breakthrough represents. This means that the last breakthrough – the one that gives us HLMI – might either take us from AI greatly inferior to HLMI all the way to HLMI (if there are ‘few’ key breakthroughs needed), or just be an incremental improvement on pre-HLMI AI that is already almost as useful as HLMI (if there are an ‘intermediate’ or ‘huge number’ of key breakthroughs needed).
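Under this equal-size assumption, the share of the remaining gap closed by the final breakthrough is simply 1/N for N remaining breakthroughs. The sketch below is only an illustration: the function name and the example counts are hypothetical, since the model does not assign exact numbers to the ‘few’, ‘intermediate’ or ‘huge number’ cases.

```python
def last_breakthrough_share(n_remaining: int) -> float:
    """Fraction of the remaining gap to HLMI closed by the final breakthrough,
    assuming all remaining breakthroughs are of roughly equal size."""
    if n_remaining < 1:
        # The zero-breakthrough case (no new breakthroughs before HLMI)
        # is outside the scope of this helper.
        raise ValueError("need at least one remaining breakthrough")
    return 1.0 / n_remaining


# Hypothetical counts, chosen only to show the trend: the more breakthroughs
# remain, the smaller the step that any single one represents.
for n in (2, 10, 100):
    print(f"{n:>3} remaining -> last step closes {last_breakthrough_share(n):.0%} of the gap")
```

With only a couple of breakthroughs remaining, the last one covers a large fraction of the distance to HLMI; with many remaining, it looks like an incremental improvement.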
We consider several lines of evidence to estimate whether HLMI requires few or many key breakthroughs.
To start, the type of HLMI being developed influences the number of breakthroughs we expect will be needed. For instance, if HLMI is achieved with current deep learning methods plus business-as-usual advancements, then, ceteris paribus, we’d expect fewer breakthroughs needed to reach HLMI than if HLMI is achieved via WBE.
As well as depending on the type of HLMI developed, our model assumes the expected number of breakthroughs needed for HLMI is influenced significantly by the Difficulty of Marginal Intelligence Improvements at HLMI (in the Analogies and General Priors on Intelligence module). If marginal intelligence improvements are difficult at HLMI, then more separate breakthroughs are probably required for HLMI.
Lastly, the hard paths hypothesis and timeline to HLMI (as determined in the Paths to HLMI modules) each influence our estimate of how many breakthroughs are needed to reach HLMI. The hard paths hypothesis claims that it’s rare for environments to straightforwardly select for general intelligence – if this hypothesis is true, then we’d expect more steps to be necessary (e.g., for crafting the key features of such environments). Additionally, short timelines would imply that there are very few breakthroughs remaining, while longer timelines may imply more breakthroughs needing to be found (remember, it’s fine if this logic feels “backwards”, as our model is not a causal model per se, and instead arrows represent probabilistic influence).
We don’t place exact values on the number of breakthroughs needed for HLMI in either the ‘few’, ‘intermediate’ or ‘huge number’ cases. This is because we have not yet settled on a way of defining “fundamental breakthroughs”, nor of estimating how many would be needed to shift the balance on whether there would be a large capability gain to HLMI.
Our current plan for characterizing the number of breakthroughs is to anchor the ‘intermediate’ answer to ‘number of remaining breakthroughs’ as similar to the number of breakthroughs that have so far occurred in the history of AI. If we identify that there have been 3 major paradigms in AI so far (knowledge engineering, deep search, and deep learning), and maybe ten times as many decently-sized breakthroughs (within deep learning, this means things like CNNs, the transformer architecture, DQNs) to get to our current level of AI capability, then an ‘intermediate’ case would imply similar numbers to come. From this, we have:
Another way to estimate the number of remaining breakthroughs is to use expert opinion. For example, Stuart Russell identifies 4 remaining fundamental breakthroughs needed for HLMI (although these ‘breakthroughs’ seem more fundamental than those listed above, and might correspond to a series of breakthroughs as defined above):
"We will need several conceptual breakthroughs, for example in language or common sense understanding, cumulative learning (the analog of cultural accumulation for humans), discovering hierarchy, and managing mental activity (that is, the metacognition needed to prioritize what to think about next)"
Note that the ‘Few Breakthroughs’ case includes situations where 0 breakthroughs are needed (and we actually don't make any new breakthroughs before HLMI) – i.e., cases where current deep learning with only minor algorithmic improvements and somewhat increased compute gives us HLMI.
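The structure of this module can be summarized as a small decision function. This is a deliberately simplified sketch: the actual model is probabilistic (nodes carry probabilities, not booleans), and all names below are hypothetical labels for the cruxes described above.

```python
from dataclasses import dataclass


@dataclass
class DiscontinuityInputs:
    hardware_limited: bool      # True if hardware is the last piece to fall into place
    compute_trend_breaks: bool  # pre-HLMI with slightly less compute is much less capable
    missing_gears: bool         # last software step(s) put the rest of the system in place
    few_breakthroughs: bool     # only a 'few' fundamental breakthroughs remain
    overshoot: bool             # the first HLMI overshoots the HLMI level
    hardware_overhang: bool     # a large overhang lets the first HLMI scale far beyond HLMI


def large_gain_to_hlmi(x: DiscontinuityInputs) -> bool:
    if x.hardware_limited:
        # Hardware-limited route: capabilities must diverge from the compute trend.
        return x.compute_trend_breaks
    # Software-limited route: missing gears, or few remaining breakthroughs.
    return x.missing_gears or x.few_breakthroughs


def large_gain_from_hlmi(x: DiscontinuityInputs) -> bool:
    return x.overshoot or x.hardware_overhang


def discontinuity_around_hlmi(x: DiscontinuityInputs) -> bool:
    # Either route (or both) counts as a discontinuity.
    return large_gain_to_hlmi(x) or large_gain_from_hlmi(x)
```

For example, a software-limited world with a “missing gears” final step yields a discontinuity even if no overshoot or hardware overhang occurs.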
This module aims to answer the question: will there eventually be an intelligence explosion? We define an “intelligence explosion” as a process by which HLMI successfully accelerates the rate at which HLMI hardware or software advances, to such a degree that the rate of progress approaches vertical over a large range.
In our model, whether an intelligence explosion eventually occurs does not directly depend on what type of HLMI is first developed. We assume that if one type of HLMI could not achieve an intelligence explosion while another could, then even if the former is achieved first, the latter (if it is possible to build) will eventually be built and cause an intelligence explosion at that point. Our model considers two paths to an intelligence explosion – a software-mediated path and a hardware-mediated path.
Under the software-mediated path, our model assumes there will be explosive growth in intelligence, due to AI accelerating the rate of AI progress, if HLMI is developed and:
In such a scenario, the positive feedback loop from “more intelligent AI” to “better AI research (performed by the more intelligent AI)” to “even more intelligent AI still” would be explosive – in effect, HLMI could achieve sustained returns on cognitive reinvestment, at least over a sufficiently large range.
In addition to the software-mediated path, an intelligence explosion could occur due to a hardware-mediated path.
In this scenario, HLMI (doing hardware R&D) would cause an explosion in the amount of hardware, and thus an explosion in the “population” of HLMI (implying more HLMIs to perform hardware research, faster hardware gains, and so on). This phenomenon would require hardware improvements to scale with the number of hardware researchers (with this work being performed by HLMIs), for hardware improvements not to become strongly-increasingly difficult, and for there to be plenty of room for more hardware improvements. Such a pathway would allow, at least in principle, the capabilities of AI to explode even if the capability of any given AI (with a fixed amount of hardware) did not explode.
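As a rough sketch of how the two paths combine, the hardware-mediated path can be read as a conjunction of the three requirements just described, with either path being sufficient for an intelligence explosion. The function and parameter names are hypothetical, the software-mediated path is treated as a single opaque input, and the real model works with probabilities rather than booleans.

```python
def hardware_mediated_path(
    improvements_scale_with_researchers: bool,
    difficulty_not_strongly_increasing: bool,
    plenty_of_room_for_improvement: bool,
) -> bool:
    """All three conditions must hold for the hardware-mediated path."""
    return (
        improvements_scale_with_researchers
        and difficulty_not_strongly_increasing
        and plenty_of_room_for_improvement
    )


def intelligence_explosion(software_path: bool, hardware_path: bool) -> bool:
    """Either path alone is enough for an (eventual) intelligence explosion."""
    return software_path or hardware_path


# Example: the software path fails, but all hardware conditions hold.
hw = hardware_mediated_path(True, True, True)
print(intelligence_explosion(software_path=False, hardware_path=hw))
```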
Note that Intelligence Explosion as defined in this model does not necessarily refer to an instantaneous switch to an intelligence explosion immediately upon reaching HLMI – an intelligence explosion could occur after a period of slower post-HLMI growth with intermediate doubling times. Questions about immediate jumps in capabilities upon reaching HLMI are handled by the Discontinuity module.
The distinction between a discontinuity and an intelligence explosion in our model can be understood from the following graphs, which show rough features of how AI capabilities might advance over time given different resolutions of these cruxes. Note that, while these graphs show the main range of views our model can express, they are not exhaustive of these views (e.g., the graphs show the discontinuity going through HLMI, while it’s possible that a discontinuity would instead simply go to or from HLMI).
Additionally, the model is a simplification, and we do not mean to imply that progress will be quite as smooth as the graphs imply – we’re simply modeling what we expect to be the most important and crux-y features. Take these graphs as qualitative descriptions of possible scenarios as opposed to quantitative predictions – note that the y-axis (AI “intelligence”) is an inherently fuzzy concept (which perhaps is better thought of as increasing on a log scale) and that the dotted blue line for “HLMI” might not occupy a specific point as implied here but instead a rough range. Further remember that we’re not talking about economic growth, but AI capabilities (which feed into economic growth).
The earlier modules from the previous section act as inputs to the modules in this section: Takeoff Speed and HLMI is Distributed.
We have seen how the model estimates the factors which are important for assessing the takeoff speed:
This module aims to combine these results to answer the question – what will happen to economic growth post-HLMI? Will there be a new, faster economic doubling time (or equivalent) and if so, how fast will it be? Alternatively, will growth be roughly hyperbolic (before running into physical limits)? To clarify, while previously we were considering changes in AI capabilities, here we are examining the resultant effects for economic growth (or similar). This discussion is not per se premised on considerations of GDP measurement.
If the Intelligence Explosion module indicates an intelligence explosion, then we assume economic growth becomes roughly hyperbolic, along with AI capabilities (i.e., increasingly short economic doubling times).
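To see why hyperbolic growth implies increasingly short doubling times, it helps to compare the simplest exponential form, dx/dt = k·x, with the simplest hyperbolic form, dx/dt = k·x². The sketch below assumes these textbook forms; the constant `K` and starting level `X0` are hypothetical and are not taken from the model.

```python
import math


def exponential_doubling_time(k: float) -> float:
    """Doubling time under dx/dt = k*x: constant, independent of the level x."""
    return math.log(2) / k


def hyperbolic_doubling_time(k: float, x: float) -> float:
    """Time to go from x to 2x under dx/dt = k*x**2 (from 1/x - 1/(2x) = k*t)."""
    return 1.0 / (2.0 * k * x)


def successive_hyperbolic_doublings(k: float, x0: float, n: int) -> list:
    """Durations of the first n doublings, starting from level x0."""
    times, x = [], x0
    for _ in range(n):
        times.append(hyperbolic_doubling_time(k, x))
        x *= 2.0
    return times


K, X0 = 0.03, 1.0  # hypothetical values, for illustration only
print("exponential doubling time:", exponential_doubling_time(K))
print("hyperbolic doubling times:", successive_hyperbolic_doublings(K, X0, 5))
```

Each hyperbolic doubling takes half as long as the previous one, so the doubling times sum to a finite value, 1/(k·x0). In an idealized model, growth diverges at that point; in practice it would run into physical limits first.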
If there is not an intelligence explosion, however, then we assume that there will be a switch to a new mode of exponential economic growth, and we estimate the speed-up factor for this new growth rate compared to the current growth rate based largely on outside-view estimates of previous transitions in growth rates (the Agricultural Revolution and the Industrial Revolution – on this outside view alone, we would expect the next transition to bring us an economic doubling time of perhaps days to weeks). This estimate is then updated based on an assessment of the overall ease of marginal improvements post-HLMI.
then we conclude that the transition to more-powerful HLMI looks faster, all else equal, and update our outside-view estimate regarding the economic impact accordingly (and similarly we update towards slower growth if these conditions do not apply). We plan to use these considerations to create a lognormally-distributed estimate of the final growth rate, given that we are uncertain over multiple orders of magnitude regarding the post-HLMI growth rate, even in a world without an intelligence explosion.
The connection between economic doubling time and the overall intelligence/capability of HLMI is not precise. We think our fuzzy assessment is appropriate, however, since we’re only looking for a ballpark estimate here (and due to the lognormal uncertainty, the results of our model should be robust to small differences in these parameters).
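A lognormal estimate of this kind could be sketched as follows. Every number here is a placeholder rather than a value from the model: the median (about two weeks, picked from within the “days to weeks” outside-view range), the spread of one order of magnitude, and the multiplicative `update_factor`, which is a hypothetical stand-in for the update based on the ease of marginal improvements.

```python
import math
import random
import statistics


def sample_doubling_times(median_years, sigma_orders_of_magnitude,
                          update_factor=1.0, n=10_000, seed=0):
    """Draw post-HLMI economic doubling times (in years) from a lognormal.

    median_years: outside-view median doubling time (placeholder).
    sigma_orders_of_magnitude: spread, in powers of ten (placeholder).
    update_factor: below 1 shifts towards faster growth, above 1 towards slower.
    """
    rng = random.Random(seed)
    mu = math.log(median_years * update_factor)          # log of the median
    sigma = sigma_orders_of_magnitude * math.log(10.0)   # convert to natural-log units
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


samples = sample_doubling_times(median_years=14 / 365, sigma_orders_of_magnitude=1.0)
print("sample median (years):", statistics.median(samples))
```

Because the uncertainty spans orders of magnitude, modest changes to the median or to the update factor shift the distribution only slightly relative to its width, which is the sense in which the results are robust to small parameter differences.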
This module aims to answer the question, is HLMI ‘distributed by default’? That is, do we expect (ignoring the possibility of a Manhattan Project-style endeavour that concentrates most of the world’s initial research effort) to see HLMI capability distributed throughout the world, or highly localized into one or a few leading projects?
Later in this sequence of posts we will synthesise predictions about the two routes to highly localised HLMI: the route explored in this post i.e., HLMI not being distributed by default; and an alternative route, explored in a later module, where most of the world's research effort is concentrated into one project. We expect that if HLMI is distributed by default and research effort is not strongly concentrated into a few projects, many powerful HLMIs will be around at the same time.
Several considerations in this section are taken from Intelligence Explosion Microeconomics (in section 3.9 “Local versus Distributed Intelligence Explosions”).
Arguments about the degree to which HLMI will be distributed by default can be further broken up into two main categories: those heavily influenced by the economic takeoff speed/possibility of an intelligence explosion (mostly social factors, circled in green); and those not heavily influenced by the economic takeoff speed (mostly technical factors, circled in red). We should note that, while takeoff speed indirectly affects the likelihood of HLMI distribution through intermediate factors, it does not directly affect whether HLMI will be distributed; even in the case of an intelligence (and therefore economic) explosion, it’s still possible that progress could accelerate uniformly, such that no single project has a chance to pull ahead.
Here, we will first examine the factors not tied to takeoff speed, before turning to the ones that are.
A significant consideration is whether there will be a discontinuity in AI capabilities around HLMI. If there is a discontinuity, then it is highly likely that HLMI will not initially be distributed by default, because one project will presumably reach the discontinuity first. We model this as there being only one leading AI project to begin with. Even if progress around HLMI is continuous, however, there could still be only a few leading projects going into HLMI, especially if fixed costs are a large portion of the total costs for HLMI (presumably affected by the kind of HLMI), since high fixed costs may present a barrier to many competitor projects.
HLMI is also more likely to be distributed if AIs can easily trade cognitive content, code tweaks, and so on (this likelihood is also presumably influenced by the type of HLMI), since in that case the advantages that leading projects hold are more likely to spread to other projects.
Finally, if HLMI can achieve large gains from scaling onto increasing hardware, then we might expect leading projects to increase their leads over competitors, as profits could be reinvested in more hardware (or compute may be seized by other means), and thus HLMI may be expected to be less distributed. We consider that the likelihood of large gains from further hardware is dependent on both the type of HLMI, and the difficulty of marginal improvements in intelligence around HLMI (with lower difficulty implying a greater chance of large gains from increasing hardware).
Then, there are the aforementioned factors which are heavily influenced by the takeoff speed (which is influenced by whether there will be an intelligence explosion):
If catch-up innovation based on imitating successful HLMI projects is easier than discovering methods of improving AI in the first place, then we would expect more distribution of HLMI, as laggards may successfully play catch-up. A faster doubling time – and in particular an intelligence explosion – may push against this, as projects that begin to gather a lead may “pull ahead” more easily than others can play catch-up (we expect a faster takeoff to accelerate cutting-edge AI development by more than it accelerates the rest of the economy).
If major AI innovations tend to be kept secret, then this also pushes against HLMI being distributed. A race to HLMI may encourage more secrecy between competitors. Additionally, secrecy may be more likely if AI projects can derive larger benefits from using their innovations locally than from selling those innovations to other projects. The benefits of local use are more likely to be larger if there are shorter economic doubling times or an intelligence explosion, as such scenarios imply large returns from cognitive reinvestment.
Finally, we consider that distributed HLMI is less likely if leading projects eliminate or agglomerate laggards. Again, a race dynamic probably makes this scenario more likely. Additionally, if HLMI is incorrigible, it might be more likely to “psychopathically” eliminate laggards via actions that projects with corrigible HLMI might opt to avoid.
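The factors discussed in this module, and the direction in which each pushes, can be collected in a small lookup table. This is only a bookkeeping sketch with hypothetical identifiers: the model itself combines these factors through probabilistic influence, not by sorting or counting them.

```python
from typing import Dict, List, Tuple

# +1: pushes towards distributed HLMI; -1: pushes towards localized HLMI.
FACTOR_DIRECTION: Dict[str, int] = {
    "discontinuity_around_hlmi": -1,
    "fixed_costs_large_share_of_total": -1,
    "ais_can_easily_trade_cognitive_content": +1,
    "large_gains_from_more_hardware": -1,
    "catch_up_easier_than_frontier_innovation": +1,
    "major_innovations_kept_secret": -1,
    "leaders_eliminate_or_agglomerate_laggards": -1,
}


def split_by_direction(answers: Dict[str, bool]) -> Tuple[List[str], List[str]]:
    """Return (factors pushing towards distribution, factors pushing against),
    considering only the factors that are judged to hold."""
    towards = [f for f, holds in answers.items() if holds and FACTOR_DIRECTION[f] > 0]
    against = [f for f, holds in answers.items() if holds and FACTOR_DIRECTION[f] < 0]
    return towards, against


# Hypothetical scenario: easy catch-up, but secrecy and strong hardware scaling.
scenario = {
    "catch_up_easier_than_frontier_innovation": True,
    "major_innovations_kept_secret": True,
    "large_gains_from_more_hardware": True,
    "discontinuity_around_hlmi": False,
}
print(split_by_direction(scenario))
```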
To summarize, we have examined key questions related to AI takeoff: whether there will be a discontinuity in AI capabilities to and/or from HLMI, whether there will be an intelligence explosion due to feedback loops post-HLMI, the growth rate of the global economy post-AI takeoff, and whether HLMI will be distributed by default. These estimates use a mixture of inside-view and outside-view considerations.
In building our model, we have made several assumptions and simplifications, as we’re only attempting to model the main cruxes. Naturally, this does not leave space for every possible variation in how the future of AI might play out.
In the next post in this series, we will discuss risks from mesa-optimization.
This post was edited by Issa Rice. We would like to thank the rest of the MTAIR project team, as well as the following individuals, for valuable feedback on this post: Ozzie Gooen, Daniel Kokotajlo, and Rohin Shah.