Review of Soft Takeoff Can Still Lead to DSA

by Daniel Kokotajlo · 5 min read · 10th Jan 2021 · 10 comments

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

A few months after writing this post I realized that one of the key arguments was importantly flawed. I therefore recommend against inclusion in the 2019 review. This post presents an improved version of the original argument, explains the flaw, and then updates my all-things-considered view accordingly.

Improved version of my original argument

  1. Definitions:
    1. “Soft takeoff” is roughly “AI will be like the Industrial Revolution but 10x-100x faster”
    2. “Decisive Strategic Advantage” (DSA) is “a level of technological and other advantages sufficient to enable it to achieve complete world domination.” In other words, DSA is roughly when one faction or entity has the capability to “take over the world.” (What taking over the world means is an interesting question which we won’t explore here. Nowadays I’d reframe things in terms of potential PONRs.)
    3. We ask how likely it is that DSA arises, conditional on soft takeoff. Note that DSA does not mean the world is actually taken over, only that one faction at some point has the ability to do so. They might be too cautious or too ethical to try. Or they might try and fail due to bad luck.
  2. In a soft takeoff scenario, a 0.3 - 3 year technological lead over your competitors probably gives you a DSA.
    1. It seems plausible that for much of human history, a 30-year technological lead over your competitors was not enough to give you a DSA.
    2. It also seems plausible that during and after the Industrial Revolution, a 30-year technological lead was enough. (For more arguments on this key point, see my original post.)
    3. This supports a plausible conjecture that when the pace of technological progress speeds up, the length (in clock time) of technological lead needed for DSA shrinks proportionally.
  3. So a soft takeoff could lead to a DSA insofar as there is a 0.3 - 3 year lead at the beginning which is maintained for a few years.
  4. 0.3 - 3 year technological leads are reasonably common today, and in particular it’s plausible that there could be one in the field of AI research.
  5. There’s a reasonable chance of such a lead being maintained for a few years.
    1. This is a messy question, but judging by the table below, it seems that if anything the lead of the front-runner in this scenario is more likely to lengthen than shorten!
    2. If this is so, why did no one achieve DSA during the Industrial Revolution? My answer is that spies/hacking/leaks/etc. were much more powerful during the Industrial Revolution than they would be during a soft takeoff, because back then there was an entire economy to steal from and decades to do it, whereas in a soft takeoff ideas can be hoarded within a single corporation and there are only a few years (or months!) in which to steal them.
  6. Therefore, there’s a reasonable chance of DSA conditional on soft takeoff.
Factors that might shorten the lead:

  • If you don’t sell your innovations to the rest of the world, you’ll lose out on opportunities to make money, and then possibly be outcompeted by projects that didn’t hoard their innovations.
  • Spies, hacking, leaks, defections, etc.

Factors that might lengthen the lead:

  • Hoarding innovations gives you an advantage over the rest of the world, because only you can make use of them.
  • Big corporations with tech leads often find ways to slow down their competition, e.g. by lobbying to raise regulatory barriers to entry.
  • Being known to be the leading project makes it easier to attract talent and investment.
  • There might be additional snowball effects (e.g. network effects as more people use your product, providing you with more data).

I take it that 2, 4, and 5 are the controversial bits. I still stand by 2, and the arguments made for it in my original post. I also stand by 4. (To be clear, it’s not like I’ve investigated these things in detail. I’ve just thought about them for a bit and convinced myself that they are probably right, and I haven’t encountered any convincing counterarguments so far.)

5 is where I made a big mistake. 

(Comments on my original post also attacked 5 a lot, but none of them caught the mistake as far as I can tell.)

My big mistake

Basically, my mistake was to conflate leads measured in number-of-hoarded-ideas with leads measured in clock time. Clock-time leads shrink automatically as the pace of innovation speeds up, because if everyone is innovating 10x faster, then you need 10x as many hoarded ideas to have an N-year lead. 

Here’s a toy model, based on the one I gave in the original post:

There are some projects/factions. There are many ideas. Projects can have access to ideas. Projects make progress, in the form of discovering (gaining access to) ideas. For each idea they access, they can decide to hoard or not-hoard it. If they don’t hoard it, it becomes accessible to all. Hoarded ideas are only accessible by the project that discovered them (though other projects can independently rediscover them). The rate of progress of a project is proportional to how many ideas they can access.

Let’s distinguish two ways to operationalize the technological lead of a project. One is to measure it in ideas, e.g. “Project X has 100 hoarded ideas and project Y has only 10, so Project X is 90 ideas ahead.” But another way is to measure it in clock time, e.g. “It’ll take 3 years for project Y to have access to as many ideas as project X has now.” 

Suppose that all projects hoard all their ideas. Then the ideas-lead of the leading project will tend to lengthen: the project begins with more ideas, so it makes faster progress, so it adds new ideas to its hoard faster than others can add new ideas to theirs. However, the clock-time lead of the leading project will remain fixed. It’s like two identical cars accelerating one after the other on an on-ramp to a highway: the distance between them increases, but if one entered the ramp three seconds ahead, it will still be three seconds ahead when they are on the highway.

But realistically not all projects will hoard all their ideas. Suppose instead that for the leading project, 10% of their new ideas are discovered in-house, and 90% come from publicly available discoveries accessible to all. Then, to continue the car analogy, it’s as if 90% of the lead car’s acceleration comes from a strong wind that blows on both cars equally. The lead of the first car/project will lengthen slightly when measured by distance/ideas, but shrink dramatically when measured by clock time.
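A quick simulation makes the contrast concrete. This is only a sketch of the toy model above; the growth rate, sharing fraction, and starting idea counts are my own illustrative choices, not numbers from the post:

```python
import math

def simulate(share_frac, years=10.0, dt=0.001, growth=1.0,
             a0=110.0, b0=100.0):
    """Two projects whose rate of progress is proportional to the ideas
    they can access. A fraction `share_frac` of each project's new ideas
    becomes public, i.e. accessible to the other project as well.

    Returns the leader's clock-time lead at the start and at the end,
    approximated as the time project B would need, at its current
    exponential growth rate, to reach project A's current idea count.
    """
    def clock_lead(a, b):
        b_rate = growth * (1 + share_frac * a / b)  # d(ln b)/dt
        return math.log(a / b) / b_rate

    a, b = a0, b0
    start = clock_lead(a, b)
    for _ in range(int(years / dt)):
        da = growth * (a + share_frac * b) * dt  # own ideas + public ideas
        db = growth * (b + share_frac * a) * dt
        a, b = a + da, b + db
    return start, clock_lead(a, b)

# All ideas hoarded: the two-identical-cars case.
hoard_start, hoard_end = simulate(share_frac=0.0)

# 90% of new ideas public: the strong-wind case.
open_start, open_end = simulate(share_frac=0.9)
```

With full hoarding the clock-time lead is unchanged after ten years; with 90% of progress shared it collapses toward zero, even though the leader’s ideas-lead keeps growing in both cases.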

The upshot is that we should return to that table of factors and add a big one to the left-hand column: Leads shorten automatically as general progress speeds up, so if the lead project produces only a small fraction of the general progress, maintaining a 3-year lead throughout a soft takeoff is (all else equal) almost as hard as growing a 3-year lead into a 30-year lead during the 20th century. In order to overcome this, the factors on the right would need to be very strong indeed.

Conclusions

My original argument was wrong. I stand by points 2 and 4 though, and by the subsequent posts I made in this sequence. I notice I am confused, perhaps by a seeming contradiction between my explicit model here and my take on history, which is that rapid takeovers and upsets in the balance of power have happened many times, that power has become more and more concentrated over time, and that there are not-so-distant possible worlds in which a single man rules the whole world sometime in the 20th century. Some threads to pull on:

  1. To the surprise of my past self, Paul agreed DSA is plausible for major nations, just not for smaller entities like corporations: “I totally agree that it wouldn't be crazy for a major world power to pull ahead of others technologically and eventually be able to win a war handily, and that will tend happen over shorter and shorter timescales if economic and technological progress accelerate.” Perhaps we’ve been talking past each other, because I think a very important point is that it’s common for small entities to gain control of large entities. I’m not imagining a corporation fighting a war against the US government; I’m imagining it taking over the US government via tech-enhanced lobbying, activism, and maybe some skullduggery. (And to be clear, I’m usually imagining that the corporation was previously taken over by AIs it built or bought.)
  2. Even if takeoff takes several years it could be unevenly distributed such that (for example) 30% of the strategically relevant research progress happens in a single corporation. I think 30% of the strategically relevant research happening in a single corporation at the beginning of a multi-year takeoff would probably be enough for DSA.
  3. Since writing this post my thinking has shifted to focus less on DSA and more on potential AI-induced PONRs. I also now prefer a different definition of slow/fast takeoff. Thus, perhaps this old discussion simply isn’t very relevant anymore.
  4. Currently the most plausible doom scenario in my mind is maybe a version of Paul’s Type II failure. (If this is surprising to you, reread it while asking yourself what terms like “correlated automation failure” are euphemisms for.) I’m not sure how to classify it, but this suggests that we may disagree less than I thought.

Thanks to Jacob Lagerros for nudging me to review my post and finally get all this off my chest. And double thanks to all the people who commented on the original post!


Comments

Meta: Strong upvote for pulling a specific mistake out and correcting it; this is a good method because in such a high-activity post it would be easy for the discussion to get lost in the comments (especially in the presence of other wrong criticisms).

That being said, I disagree with your recommendation against inclusion in the 2019 review for two reasons:

  1. The flaw doesn't invalidate the core claim of the essay. More detailed mechanisms for understanding how technical leads are established and sustained at most adjust the time margins of the model; the updated argument does not call into question whether slow takeoff is a thing, nor does it weigh against DSA being achievable.
  2. This kind of change clearly falls within the boundary of reasonable edits to essays which are being included.

I am heartened to hear this. I do agree that the core claim of the essay is not invalidated -- "Soft takeoff can still lead to DSA." However, I do think the core argument of the essay has been overturned, such that it leads to something close to the opposite conclusion: There is a strong automatic force that works to make DSA unlikely to the extent that takeoff is distributed (and distributed = a big part of what it means to be soft, I think).

Basically, I think that if I were to rewrite this post to fix what I now think are errors and give what I now think is the correct view, including uncertainties, it would be a completely different post. In fact, it would be basically this review post that I just wrote! (Well, that plus the arguments for steps 2 and 4 from the original post, which I still stand by.) I guess I'd be happy to do that if that's what people want.

the opposite conclusion: There is a strong automatic force that works to make DSA unlikely to the extent that takeoff is distributed

By the standards of inclusion, I feel like this is an even better contribution! My mastery of our corpus is hardly complete, but it appears to me that until you picked up this line of inquiry, deeper interrogation of the circumstances surrounding DSA was sorely lacking on LessWrong. Being able to make more specific claims about causal mechanisms is huge.

I propose a different framing than ‘opposite conclusion’: rather, you are suggesting some causal mechanism for why a slow-takeoff DSA is different in character from FOOM with fewer gigahertz.

I am going to add a review on the original post so this conversation doesn't get missed in the voting phase.

Clock-time leads shrink automatically as the pace of innovation speeds up, because if everyone is innovating 10x faster, then you need 10x as many hoarded ideas to have an N-year lead. 

Have you considered how this model changes under the low hanging fruit hypothesis, by which I mean that more advanced ideas in a domain are more difficult and time-consuming to discover than the less advanced ones? My reasoning for why it matters:

  • DSA relies on one or more capability advantages.
  • Each capability depends on one or more domains of expertise to develop.
  • A certain amount of domain expertise is required to develop the capability.
  • Ideas become more difficult in terms of resources and time to discover as they approach the capability threshold.

Now this doesn't actually change the underlying intuition of a time advantage very much; mostly I just expect that the '10x faster innovation' component of the example will be deeply discontinuous. This leads naturally to thinking about things like a broad DSA, which might consist of a systematic advantage across capabilities, versus a tall DSA, which would be more like an overwhelming advantage in a single, high-import capability.

I haven't specifically tried to model the low hanging fruit hypothesis, but I do believe the hypothesis and so it probably doesn't contradict the model strongly. I don't quite follow your reasoning though--how does the hypothesis make discontinuities more likely? Can you elaborate?

Sure!

I have a few implicit assumptions that affect my thinking:

  • A soft takeoff starts from something resembling our world, distributed
  • There is at least one layer above ideas (capability)
  • Low hanging fruit hypothesis

The real work is being done by an additional two assumptions:

  • The capability layer grows in a way similar to the idea layer, and competes for the same resources
  • Innovation consists of at least one capability

So under my model, the core mechanism of differentiation is that developing an insurmountable single capability advantage competes with rapid gains in a different capability (or line of ideas), which includes innovation capacity. Further, different lines of ideas and capabilities will have different development speeds.

Now a lot of this differentiation collapses when we get more specific about what we are comparing, like if we choose Google, Facebook and Microsoft on the single capability of Deep Learning. It is worth considering that software has an unusually cheap transfer of ideas to capability, which is the crux of why AI weighs so heavily as a concern. But this is unique to software for now, and in order to be a strategic threat it has to cash out in non-software capability eventually, so keeping the others in mind feels important.

OK, so if I'm getting this correctly, the idea is that there are different capabilities, and the low hanging fruit hypothesis applies separately to each one, and not all capabilities are being pursued successfully at all times, so when a new capability starts being pursued successfully there is a burst of rapid progress as low-hanging fruit is picked. Thus, progress should proceed jumpily, with some capabilities stagnant or nonexistent for a while and then quickly becoming great and then levelling off. Is this what you have in mind?

That is correct. And since different players start with different capabilities and are in different local environments under the soft takeoff assumption, I really can't imagine a scenario where everyone winds up in the same place (or even tries to get there - I strongly expect optimizing for different capabilities depending on the environment, too).

OK, I think I agree with this picture to some extent. It's just that if things like taking over the world require lots of different capabilities, maybe jumpy progress in specific capabilities distributed unevenly across factions all sorta averages out thanks to law of large numbers into smooth progress in world-takeover-ability distributed mostly evenly across factions.

Or not. Idk. I think this is an important variable to model and forecast, thanks for bringing it up!
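The "averages out" intuition above can be sketched with a quick toy simulation. The jump probability, jump size, and capability counts below are my own illustrative choices:

```python
import random

random.seed(0)  # deterministic illustration

def jumpy_capability(steps, jump_prob=0.05, jump_size=1.0):
    """One capability: long stretches of stagnation, occasional bursts."""
    level, path = 0.0, []
    for _ in range(steps):
        if random.random() < jump_prob:
            level += jump_size
        path.append(level)
    return path

def overall_ability(n_capabilities, steps=1000):
    """World-takeover ability as the average of many jumpy capabilities."""
    paths = [jumpy_capability(steps) for _ in range(n_capabilities)]
    return [sum(p[t] for p in paths) / n_capabilities for t in range(steps)]

def largest_step(path):
    """Size of the single biggest one-step jump in a trajectory."""
    return max(b - a for a, b in zip(path, path[1:]))

# With only a couple of capabilities, overall progress is dominated by
# big jumps; with many, the jumps average out into smoother progress.
few = overall_ability(n_capabilities=2)
many = overall_ability(n_capabilities=200)
```

Averaging over 200 capabilities makes the largest single step in overall ability much smaller than with 2, which is the law-of-large-numbers smoothing described above.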

Currently the most plausible doom scenario in my mind is maybe a version of Paul’s Type II failure. (If this is surprising to you, reread it while asking yourself what terms like “correlated automation failure” are euphemisms for.) 

This is interesting, and I'd like to see you expand on this. Incidentally I agree with the statement, but I can imagine both more and less explosive, catastrophic versions of 'correlated automation failure'. On the one hand it makes me think of things like transportation and electricity going haywire, on the other it could fit a scenario where a collection of powerful AI systems simultaneously intentionally wipe out humanity.

Clock-time leads shrink automatically as the pace of innovation speeds up, because if everyone is innovating 10x faster, then you need 10x as many hoarded ideas to have an N-year lead. 

What if, as a general fact, some kinds of progress (the technological kinds more closely correlated with AI) are just much more susceptible to speed-up? I.e., what if 'the economic doubling time' stops being so meaningful: technological progress speeds up abruptly, but the other kinds of progress that adapt to tech progress lag behind it. In that case, if the parts of overall progress that affect the likelihood of leaks, theft and spying aren't sped up by as much as the rate of actual technological progress, the likelihood of DSA could rise to be quite high compared to previous accelerations, where the speed-up was gradual enough for society to 'speed up' along with it.

In other words - it becomes easier to hoard more and more ideas if the ability to hoard ideas is roughly constant but the pace of progress increases. Since a lot of these 'technologies' for facilitating leaks and spying are more in the social realm, this seems plausible.

But if you need to generate more ideas, this might just mean that if you have a very large initial lead, you can turn it into a DSA, which you still seem to agree with:

  • Even if takeoff takes several years it could be unevenly distributed such that (for example) 30% of the strategically relevant research progress happens in a single corporation. I think 30% of the strategically relevant research happening in a single corporation at the beginning of a multi-year takeoff would probably be enough for DSA.