Takeoff Speed: Simple Asymptotics in a Toy Model.

Thanks for writing this Aaron! (And for engaging with some of the common arguments for/against AI safety work.)

I personally am very uncertain about whether to expect a singularity/fast take-off (I think it is plausible but far from certain). Some reasons that I am still very interested in AI safety are the following:

I think AI safety likely involves solving a number of difficult conceptual problems, such that it would take >5 years (I would guess something like 10-30 years, with very wide error bars) of research to have solutions that we are happy with. Moreover, many of the relevant problems have short-term analogues that can be worked on today. (Indeed, some of these align with your own research interests, e.g. imputing value functions of agents from actions/decisions; although I am particularly interested in the agnostic case where the value function might lie outside of the given model family, which I think makes things much harder.)
I suppose the summary point of the above is that even if you think AI is a ways off (my median estimate is ~50 years, again with high error bars) research is not something that can happen instantaneously, and conceptual research in particular can move slowly due to being harder to work on / parallelize.
While I have uncertainty about fast take-off, that still leaves some probability that fast take-off will happen, and in that world it is an important enough problem that it is worth thinking about. (It is also very worthwhile to think about the probability of fast take-off, as better estimates would help to better direct resources even within the AI safety space.)
Finally, I think there are a number of important safety problems even from sub-human AI systems. Tech-driven unemployment is I guess the standard one here, although I spend more time thinking about cyber-warfare/autonomous weapons, as well as changes in the balance of power between nation-states and corporations. These are not as clearly an existential risk as unfriendly AI, but I think in some forms would qualify as a global catastrophic risk; on the other hand I would guess that most people who care about AI safety (at least on this website) do not care about it for this reason, so this is more idiosyncratic to me.

Happy to expand on/discuss any of the above points if you are interested.

Best,

Jacob

[-]Aaron Roth8y20

Good points all; these are good reasons to work on AI safety (and of course as a theorist I'm very happy to think about interesting problems even if they don't have immediate impact :-) I'm definitely interested in the short-term issues, and have been spending a lot of my research time lately thinking about fairness/privacy in ML. Inverse-RL/revealed preferences learning is also quite interesting, and I'd love to see some more theory results in the agnostic case.

[-]paulfchristiano8y130

As far as I can tell, this possibility of an exponentially-paced intelligence explosion is the main argument for folks devoting time to worrying about super-intelligent AI now, even though current technology doesn't give us anything even close. So in the rest of this post, I want to push a little bit on the claim that the feedback loop induced by a self-improving AI would lead to exponential growth, and see what assumptions underlie it.

I think few AI safety advocates believe this. It's much more common to expect growth to be faster than exponential. As you point out, exponential growth is a knife-edge phenomenon.

As far as I can tell, very few people actually think that intelligence growth would exhibit an actual mathematical singularity

This is actually a pretty common view---not a literal singularity, but rapid technological acceleration until natural resource limitations (e.g. on total available solar energy and raw minerals) start binding. If you look at the history of technological progress, it looks a whole lot more like a hyperbola than like an exponential curve, so the hyperbolic growth forecast isn't so insane. It's the person arguing that growth rates are going to stop at 3% who is arguing against the bulk of historical precedent (and whose predecessors would have been wrong if they'd expected growth to stop at 0.3% or 0.03% or 0.003%...).

this seems instead to be a metaphor for exponential growth.

I think "singularity" usually either follows Vinge's use (as the point beyond which you can't predict what will happen, because the future is guided by actors smarter than you are) or as a reference to the dynamic that would produce a mathematical singularity if left unchecked.

[-]paulfchristiano8y130

In a more typical endogenous growth model, output is the product of physical capital (e.g. how many computers you have) and a technology factor (e.g. how smart you are). You can either invest in producing more capital (building more computers) or doing research (becoming smarter). On these models, even returns of $x^{ϵ}$ still lead to a mathematical singularity (while constant technology leads to exponential growth).

From this perspective, you are investigating whether there is an intelligence explosion with finite capital. If productivity grows sublinearly with inputs, you need to build more machines (and ultimately extract more resources from nature) in order to grow really fast. This might suggest that getting to a singularity would take years rather than weeks, but doesn't much change the qualitative conclusion or substantially change the urgency (especially given that the early phase of takeoff would be driven by moving resources over from lower productivity areas into higher productivity areas).

[-]paulfchristiano8y100

I think it's a mistake to think of "productivity is linear in effort" as the "no diminishing returns" model, and to consider it a degenerate extreme case. Linear returns is the model where doubling inputs leads to doubled outputs. A priori, it's nearly as natural for constant additional effort to lead to a doubling of efficiency, so we need to actually look at the data to distinguish.

(It seems more theoretically natural---and more common in practice---for each clever trick to lead to a 10% increase in efficiency, than for each clever trick to lead to an absolute increase of 1 unit of efficiency.)
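To make the contrast concrete, here is a minimal sketch comparing the two pictures of a "clever trick". The function names, the starting efficiency of 1, and the specific gains (1 unit vs 10%) are illustrative assumptions taken from the example in the parenthetical, not measurements of anything.

```python
import math

def additive_efficiency(tricks, gain=1.0, start=1.0):
    """Each trick adds a fixed absolute amount of efficiency (hypothetical model)."""
    return start + gain * tricks

def multiplicative_efficiency(tricks, gain=0.10, start=1.0):
    """Each trick multiplies efficiency by (1 + gain) (hypothetical model)."""
    return start * (1.0 + gain) ** tricks

def tricks_per_doubling(gain=0.10):
    """Whole number of multiplicative tricks needed to at least double efficiency."""
    return math.ceil(math.log(2.0) / math.log(1.0 + gain))

if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, additive_efficiency(n), multiplicative_efficiency(n))
    print("tricks per doubling:", tricks_per_doubling())
```

Under the multiplicative picture, a constant number of tricks buys each doubling, so efficiency is exponential in cumulative effort; under the additive picture it is only linear.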

In semiconductors, as you point out, output has increased exponentially over time. Research investment has also increased exponentially, but with a significantly smaller exponent. So on your model the curve appears to be $x^{α}$ for $α > 1$ .

The performance curves database contains many interesting time series, and you'll note that the y-axis is typically exponential. They don't track inputs, so it's a bit hard to draw conclusions, but comparing to overall increases in R&D investment it looks like superlinear returns are probably quite common.

A few years ago Katja looked into the rate of algorithmic progress, and found that it was very approximately comparable to the rate of progress in hardware (though it's hard to know how much of that comes from realizing increasing economies of scale w.r.t. compute), across a range of domains. Algorithms seem like a particularly relevant domain to the current discussion.

[-]Aaron Roth8y80

Hi all,

Thanks for the very thoughtful comments; lots to chew on. As I hope was clear, I'm just an interested outside observer, and have not spent very long thinking about these issues, and don't know much of the literature. (My blog post ended up as a cross post here because I posted it to Facebook, and asked if anyone could point me to more serious literature thinking about this problem, and a commenter suggested that I should crosspost here for feedback.)

I agree that linear feedback is more plausible if we think of research breakthroughs as producing multiplicative gains, a simple point that I hadn't thought about.

[-]Qiaochu_Yuan8y60

Eliezer did exactly this calculation in an old LW post. Unfortunately I have no idea how to find it. Fortunately the calculation comes out the same no matter who does it!

[-]Vanessa Kosoy8y50

As far as I can tell, this possibility of an exponentially-paced intelligence explosion is the main argument for folks devoting time to worrying about super-intelligent AI now, even though current technology doesn't give us anything even close.

Not at all. The reasons we should work on AI alignment now are:

AI alignment is a hard problem
We don't know how long it will take us to solve it
We don't know how long it will be until superintelligent AI becomes possible
There is no strong reason to believe we will know superintelligent AI is coming far in advance

"Current technology doesn't give us anything even close" is not extremely informative since we don't know the metric w.r.t. which "close" should be measured. Heavier-than-air flight was believed impossible by many, until the Wright brothers did it. The technology of 1929 didn't give anything close to an atom bomb or a moon landing, and yet the atom bomb was made 16 years later, and the moon landing 40 years later.

Regarding the differential equations, I don't think it's a very meaningful analysis if you haven't even defined the scale on which you measure intelligence. If I(x) is some measure of intelligence that grows exponentially, then log I(x) is another measure of intelligence which grows linearly, and if I(x) grows linearly then exp I(x) grows exponentially.

Also, you might be interested in this paper by Yudkowsky.

[-]paulfchristiano8y100

if you do want to analyze the plausibility of an intelligence explosion then it seems worthwhile to respond in detail to previous work

If you replace "analyze the plausibility" with "convincingly demonstrate to skeptics" then this seems right.

The OP seems to be written more in the spirit of exploration than of conclusive argument though, which seems valuable and doesn't necessarily require responding in detail to prior work (in this case ~100 pages). Seems like kind of a soul-crushing way to respond to curiosity :)

(I hope my own comments didn't come across harshly.)

[-]Vanessa Kosoy8y30

You're right, sorry. Edited.

[-]daozaich8y40

(1) As Paul noted, the question of the exponent alpha is just the question of diminishing returns vs returns-to-scale.

Especially if you believe that the rate $f = f (R)$ is a product of multiple terms (like e.g. Paul's suggestion $f = R^{α_{t}} \cdot R^{α_{a}}$ with one exponent for computer tech advances and another for algorithmic advances) then you get returns-to-scale type dynamics (over certain regimes, i.e. until all fruit are picked) with finite-time blow-up.

(2) Also, an imho crucial aspect is the separation of time-scales between human-driven research and computation done by machines (transistors are faster than neurons and buying more hardware scales better than training a new person up to the bleeding edge of research, especially considering Scott's amusing parable of the alchemists).

Let's add a little flourish to your model: You had the rate of research $I$ and the cumulative research $R$ ; let's give a name $C$ to the capability of the AI system. Then, we can model $\partial_{t} R = I = f (R) = g (C) = g (h (R))$ . This is your model, just splitting terms into $h$ , which tells us how hard AI progress is, and $g$ which tells us how good we are at producing research.

Now denote by $q = q (C)$ the fraction of work that absolutely has to be done by humans, and by $ε$ the speed-up factor for silicon over biology. Amdahl's law gives you $g (C) = \frac{1}{q (C) + ε (1 - q (C)) C}$ , or somewhat simplified $g (C) \geq \frac{1}{q + ε C}$ . This predicts a rate of progress that first looks like $1 / q$ , as long as human researcher input is the limiting factor, then becomes $1 / (ε C)$ when we have AIs designing AIs (recursive self-improvement, aka explosion), and then probably saturates at something (when the AI approaches optimality).

The crucial argument for fast take-off (as far as I understood it) is that we can expect $q (C)$ to hit $q = 0$ at some cross-over $C^{*}$ , and we can expect this to happen with a nonzero derivative $\partial_{C} q (C^{*}) \neq 0$ . This is just the claim that human-level AI is possible, and that the intelligence of the human parts of the AI research project is not sitting at a magical point (aka: this is generic, you would need to fine-tune your model to get something else).

The change of the rate of research output from the $1 / q (C)$ regime to the $1 / (ε C)$ regime sure looks like a hard-take-off singularity to me! And I would like to note that the function $h$ , i.e. the hardness of AI research and the diminishing-returns vs returns-to-scale debate, does not enter this discussion at any point.
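A rough numerical sketch of the regime change, using the formula for $g (C)$ exactly as written above. Everything specific here is an assumption for illustration only: the linear shape of $q (C)$ , the cross-over at `c_star = 1.0`, and the small value `eps = 1e-3` (read as machine time being cheap relative to human time).

```python
def human_fraction(c, c_star=1.0):
    """Hypothetical q(C): falls linearly from 1 at C = 0 to 0 at the cross-over C*."""
    return max(0.0, 1.0 - c / c_star)

def research_rate(c, eps=1e-3, c_star=1.0):
    """g(C) = 1 / (q(C) + eps * (1 - q(C)) * C), the Amdahl-style rate from the comment."""
    q = human_fraction(c, c_star)
    return 1.0 / (q + eps * (1.0 - q) * c)

def simplified_bound(c, eps=1e-3, c_star=1.0):
    """The simplified lower bound 1 / (q + eps * C)."""
    q = human_fraction(c, c_star)
    return 1.0 / (q + eps * c)

if __name__ == "__main__":
    # Sample capabilities from 0 up to the assumed cross-over C* = 1.
    for c in (0.0, 0.5, 0.9, 0.99, 1.0):
        print(c, human_fraction(c), research_rate(c), simplified_bound(c))
```

With these made-up numbers the rate stays within a small factor of $1 / q$ for most of the range and then climbs by orders of magnitude in the last stretch before $q = 0$ . Note that the shape of $h$ never appears in the code, matching the point made above.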

In other words: If you model AI research as done by a team of humans and proto-AIs assisting the humans; and if you assert non-fungibility of humans vs proto-AI-assistants (even if you buy a thousand times more hardware, you still need the generally intelligent human researchers for some parts); and if you assert that better proto-AI-assistants can do a larger proportion of the work (at all); and if you assert that computers are faster than humans; then you get a possibly quite wild change at $q = 0$ .

I'd like to note that the cross-over is not "human-level AI", but rather " $q \approx 0$ ", i.e. an AI that needs (almost) no human assistance to progress the field of AI research.

On the opposing side (that's what Robin Hanson would probably say) you have the empirical argument that $q$ should decay like a power-law long before we reach $q = 0$ ("the last 10% take 90% of the work" is a folk formulation for "percentile 90-99 take nine times as much work as percentile 0-89" aka power law, and is borne out quite well, empirically).

This does not have any impact on whether we cross $q = 0$ with non-vanishing derivative, but would support Paul's view that the world will be unrecognizably crazy long before $q = 0$ .

PS. I am currently agnostic about the hard vs soft take-off debate. Yeah, I know, cowardly cop-out.

edit: In the above, C kinda encodes how fast / good our AI is and q encodes how general it is compared to humans. All AI singularity stuff tacitly assumes that human intelligence (assisted by stupid proto-AI) is sufficiently general to design an AI that exceeds or matches the generality of human intelligence. I consider this likely. The counterfactual world would have our AI capabilities saturate at some subhuman level for a long time, using terribly bad randomized/evolutionary algorithms, until it either stumbles onto an AI design that has better generality or we suffer unrelated extinction/heat-death. I consider it likely that human intelligence (assisted by proto-AI) is sufficiently general for a take-off. Heat-death is not an exaggeration: Algorithms with exponentially bad run-time are effectively useless.

Conversely, I consider it very well possible that human intelligence is insufficiently general to understand how human intelligence works! (we are really, really bad at understanding evolution/gradient-descent optimized anything, and that's what we are)

[-]ESRogs8y40

the Machine Intelligence Research Institute at Berkeley

Just wanted to clarify that MIRI is in Berkeley (the city), but is not affiliated with UC Berkeley (the university).

[-]jsteinhardt8y60

Very minor nitpick, but just to add, FLI is as far as I know not formally affiliated with MIT. (FHI is in fact a formal institute at Oxford.)

[-]Aaron Roth8y10

Thanks for the corrections. I changed the text to "in Berkeley". How should FLI be described? (I was just cribbing from Scott's FAQ when claiming it was at MIT)

[-]ESRogs8y20

You could say that it's in Cambridge, MA...

See more here: https://en.wikipedia.org/wiki/Future_of_Life_Institute

[-]abramdemski5y20

One thing which confused me momentarily -- I looked at your differential equations and mentally substituted $I$ with $f (R (t))$ , to get something just in terms of $R$ , for convenience. Then I was temporarily confused by your graphs in terms of $I$ , because I was getting very different graphs (graphs in $R$ ) working things out in my head.

This pointed me at the question: should we be graphing in terms of $I$ or $R$ ?

A large part of the analysis is to pinpoint where we get sublinear growth vs superlinear, and subexponential vs superexponential. This quantifies different meanings of explosive growth (ie the "explosion" in "intelligence explosion"). But perhaps we should be looking at the growth of $R$ , instead of $I$ .

It seems like, in this model, $R$ represents capabilities -- if you know a lot of concrete things, you can do a lot. $I$ represents the pace at which capabilities increase. An explosion in capabilities could be alarming despite a rather modest graph of intelligence increase.

Put simply, you're graphing the derivative of capabilities. What happens when we graph capabilities?

Considering the cases you look at:

$f (x) = x$ : This is the one case where the two graphs are just the same anyway. $R$ grows exponentially, just like $I$ .
$f (x) = \log (x)$ : capabilities see very nearly linear growth (since the derivative is very nearly constant).
$f (x) = x^{1 / 3}$ : capabilities grow like $x^{3 / 2}$ .
$f (x) = \sqrt{x}$ : capabilities grow like $x^{2}$ .
$f (x) = x^{2 / 3}$ : capabilities grow like $x^{3}$ .
$f (x) = x^{a}$ : capabilities grow polynomially for $a < 1$ , exponentially at $a = 1$ , and hyperbolically at $a > 1$ .
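For anyone who wants to check these exponents, here is a small sketch based on the closed-form solution of $\partial_{t} R = R^{a}$ for $a \neq 1$ , namely $R (t) = ((1 - a) t + R_{0}^{1 - a})^{1 / (1 - a)}$ , which follows from separating variables. The initial value `r0 = 1.0`, the sample times, and the function names are my own choices for illustration.

```python
import math

def capabilities(a, t, r0=1.0):
    """Closed-form solution of dR/dt = R**a with R(0) = r0, for a != 1."""
    base = (1.0 - a) * t + r0 ** (1.0 - a)
    if base <= 0.0:
        return math.inf  # only reachable for a > 1: at or past the finite-time blow-up
    return base ** (1.0 / (1.0 - a))

def growth_exponent(a, t1=1e6, t2=1e7):
    """Log-log slope of R(t) between t1 and t2; tends to 1 / (1 - a) for a < 1."""
    return math.log(capabilities(a, t2) / capabilities(a, t1)) / math.log(t2 / t1)

def blow_up_time(a, r0=1.0):
    """Time at which R diverges when a > 1 (the hyperbolic case)."""
    return r0 ** (1.0 - a) / (a - 1.0)

if __name__ == "__main__":
    for a in (1 / 3, 1 / 2, 2 / 3):
        print("a =", a, "late-time exponent ~", growth_exponent(a))
    print("a = 2 blows up at t =", blow_up_time(2.0))
```

The late-time slopes approach $1 / (1 - a)$ , which gives the $3 / 2$ , $2$ and $3$ listed above, and for $a > 1$ the solution diverges at the finite time $R_{0}^{1 - a} / (a - 1)$ .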

This gives a very different picture: some sort of superlinear growth seems almost inevitable. We get an explosion unless returns are extremely diminishing. On the other hand, the crossover from subexponential to superexponential happens at exactly the same point.

Of course, "capabilities" is a rather ambiguous notion. What does it really entail? Perhaps the salient feature of the world ends up being the log of capabilities.

[-]habryka8y20

Are you open to me copying over the complete content of the post? This makes it easier for people to reference and read over here.

[-]Aaron Roth8y30

Sure

[-]habryka8y40

Done! (with proper LaTeX rendering!)

[-]Aaron Roth8y10

Thanks!

[-]Charlie Steiner8y10

I agree with this, but I think you have to remember that many things with diminishing returns also have accelerating returns earlier on.

That is to say, logistic curves are all over the place. Business growth, practicing a new instrument, functionality of a software project over time, learning a language through immersion...
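As a quick illustration of what I mean, here is a sketch of a standard logistic curve and its per-step gains. The parameters (`ceiling`, `rate`, `midpoint`) and the sampled range are arbitrary choices for the example, not fitted to any of the cases listed above.

```python
import math

def logistic(t, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Standard logistic curve: slow start, rapid middle, saturation at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))

def step_gains(start=-6, stop=6):
    """Gain from each unit step of effort, L(t + 1) - L(t), for t in [start, stop)."""
    return [logistic(t + 1) - logistic(t) for t in range(start, stop)]

if __name__ == "__main__":
    for t, gain in zip(range(-6, 6), step_gains()):
        print(t, round(gain, 4))
```

The gains grow step over step until the midpoint (accelerating returns) and shrink afterwards (diminishing returns), so which regime you observe depends on where on the curve you happen to be.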

It's absolutely plausible for intelligence self-improvement to work for a few IQ points and then peter out, for some architecture. Humans, for example, are horrible at improving their own brains - but also see EURISKO. But I'm skeptical that returns are always going to be so sharply diminishing, and if everyone else is improving slowly, whatever system "goes critical" first is going to be the one that matters.
