Against easy superintelligence: the unforeseen friction argument

by Stuart_Armstrong · 7 min read · 10th Jul 2013 · 48 comments


Futurism · Forecasting & Prediction · AI · Personal Blog

In 1932, Stanley Baldwin, prime minister of the largest empire the world had ever seen, proclaimed that "The bomber will always get through". Backed up by most of the professional military opinion of the time, by the experience of the first world war, and by reasonable extrapolations and arguments, he laid out a vision of the future where the unstoppable heavy bomber would utterly devastate countries if a war started. Deterrence - building more bombers yourself to threaten complete retaliation - seemed the only counter.

And yet, things didn't turn out that way. Against all past trends, the light fighter plane surpassed the heavily armed bomber in aerial combat, the development of radar changed the strategic balance, and cities and industry proved much more resilient to bombing than anyone had a right to suspect.

Could anyone have predicted these changes ahead of time? Most probably, no. All of these ran counter to what was known and understood (and radar was a completely new and unexpected development). What could and should have been predicted, though, was that something would happen to weaken the impact of the all-conquering bomber. The extreme predictions would be unrealistic; frictions, technological changes, changes in military doctrine, and hidden, unknown factors would undermine them.

This is what I call the "generalised friction" argument. Simple predictions, built on strong models or on our current understanding, will likely not succeed as well as expected: there will likely be delays, obstacles, and unexpected difficulties along the way.

I am, of course, thinking of AI predictions here, specifically of the Omohundro-Yudkowsky model of AI recursive self-improvements that rapidly reach great power, with convergent instrumental goals that make the AI into a power-hungry expected utility maximiser. This model I see as the "supply and demand curve" of AI prediction: too simple to be true in the form described.

But the supply and demand curves are generally approximately true, especially over the long term. So this isn't an argument that the Omohundro-Yudkowsky model is wrong, but that it will likely not happen as flawlessly as described. Ultimately, "the bomber will always get through" turned out to be true: but only in the form of the ICBM. If you take the old arguments and replace "bomber" with "ICBM", you end up with strong and accurate predictions. So "the AI may not foom in the manner and on the timescales described" is not saying "the AI won't foom".

Also, it should be emphasised that this argument is strictly about our predictive ability, and does not say anything about the capacity or difficulty of AI per se.

Why frictions?

An analogy often used for AI is that of the nuclear chain reaction: here is a perfect example of a recursive improvement, as the chain reaction grows and grows indefinitely. Scepticism about the chain reaction was unjustified, though experts were far too willing to rule it out ahead of time, based on unsound heuristics.

In contrast, many examples of simple models were slowed or derailed by events. The examples that came immediately to mind, for me, were the bomber example, the failure of expansion into space after the first moon landing, and the failure of early AI predictions. To be fair, there are also examples of unanticipated success, often in economic policy, and sometimes even in government interventions. But generally, dramatic predictions fail, either by being wrong or by being too optimistic on the timeline. Why is this?

Beware the opposition

One reason that predictions fail is that they underestimate human opposition. The bomber fleets may have seemed invincible, but that didn't take into account that large numbers of smart people were working away to try and counter them. The solution turned out to be improved fighters and radar; but even without knowing that, it should have been obvious that some new methods or technologies were going to be invented or developed. Since the strength of the bomber depended on a certain strategic landscape, it should have been seen that deliberate attempts to modify that landscape would likely result in a reduction of the bomber's efficacy.

Opposition is much harder to model, especially in such a wide area as modern warfare and technology. Still, theorisers should have realised that there would be some opposition, and that, historically, ways have been found to counter most weapons - ways that were not obvious at the time of the weapon's creation. It is easier to change the strategic landscape than to preserve it, so anything that depends on the current strategic landscape will most likely be blunted by human effort.

This kind of friction is less relevant to AI (though see the last section), and not relevant at all to the chain reaction example: there are no fiendish atoms plotting how to fight against human efforts to disintegrate them.

If noise is expected, expect reduced impact

The second, more general, friction argument is just a rephrasing of the truism that things are rarely as easy as they seem. This is related to the "first step fallacy": the argument that just because we can start climbing a hill doesn't mean we can reach the sky.

Another way of phrasing it is in terms of entropy or noise: adding noise to a process rarely improves it, and almost always makes it worse. Here the "noise" is all the unforeseen and unpredictable details that we didn't model, didn't (couldn't) account for, but that would have their bearing on our prediction. These details may make our prediction more certain or faster, but they are unlikely to do so.

The sci-fi authors of 1960 didn't expect that we would give up on space: they saw the first steps into space, and extrapolated to space stations and Martian colonies. But this was a fragile model, dependent on continued investment in space exploration, and assuming there would be no setbacks. Yet changes in government investment and unexpected setbacks were not unheard of: indeed, they were practically certain, and would have messed up any simplistic model.

Let us return to chain reactions. Imagine that an alien had appeared and told us that our theory of fission was very wrong, that there were completely new and unexpected phenomena that happened at these energies that we hadn't yet modelled. Would this have increased or decreased the likelihood of a chain reaction? This feels like it can only decrease it: the chain reaction depended on a feedback loop, and random changes are more likely to break the loop than reinforce it. Now imagine that the first chain reaction suffered not from an incomplete theory, but from very sloppy experimental procedure: now we're nearly certain we won't see the chain reaction, as this kind of noise degrades things strongly towards the status quo.

So why, then, were the doubters wrong to claim that the chain reaction wouldn't work? Because we were pretty certain at the time that these kinds of noise wouldn't materialise. We didn't only have a theory that said we should expect to see a chain reaction, barring unexpected phenomena; we had a well-tested theory that said we should not expect to see unexpected phenomena. We had an anti-noise theory: any behaviour that potentially broke the chain reaction would have been a great surprise. Assuming a minimum of competence on the part of the experimenters (a supposition backed up by history), success was the most likely outcome.
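The feedback loop in question can be sketched with a toy (deliberately crude) criticality model: each generation, every neutron produces on average k new neutrons, so the population multiplies by k per step, and "noise" can be modelled as a random perturbation to k. This is an illustration of the argument, not physics; the function name and parameters are invented for the sketch.

```python
import random

def chain_reaction(k, generations, noise=0.0, seed=0):
    """Toy neutron population after `generations` steps, starting from 1.

    k: average neutrons produced per neutron (multiplication factor).
    noise: std-dev of a random perturbation to k each generation,
           standing in for unmodelled phenomena ("generalised friction").
    """
    rng = random.Random(seed)
    population = 1.0
    for _ in range(generations):
        population *= max(0.0, k + rng.gauss(0, noise))
    return population

# Slightly supercritical (k > 1) and noise-free: the reaction runs away.
print(chain_reaction(1.1, 50) > 100)    # True: 1.1**50 is about 117
# Slightly subcritical (k < 1): it fizzles.
print(chain_reaction(0.9, 50) < 0.01)   # True: 0.9**50 is about 0.005
```

Adding noise to k tends to hurt rather than help: the long-run growth rate is the average of log(k + noise), and since log is concave, symmetric noise drags that average down.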

Contrast that with AI predictions: here, we expect noise. We expect AI to be different from our current models, we expect developments to go in unpredictable directions, we expect to see problems that are not evident from our current vantage point. All this noise is likely to press against our current model, increasing its uncertainty, extending its timeline. Even if our theory were much more developed than it is now, even if we thought about it for a thousand years and had accounted for every eventuality we could think of, if we expect that there is still noise, we should caveat our prediction.

Who cares?

Right, so we may be justified in increasing our uncertainty about the impact of AI foom, and in questioning the timeline. But what difference does it make in practice? Even with all the caveats, there is still a worryingly high probability of a fast, deadly foom, well worth putting all our efforts into preventing. And slow, deadly fooms aren't much better, either! So how is the argument relevant?

It becomes relevant in assessing the relative worth of different interventions. For instance, one way of containing an AI would be to build a community of fast uploads around it: with the uploads matching the AI in reasoning speed, they have a higher chance of controlling it. Or we could try and build capacity for adaptation at a later date: if the AIs have a slow takeoff, it might be better to equip the people of the time with the tools to contain them (since they will have a much better understanding of the situation), rather than do it all ahead of time. Or we could try and build Oracles or reduced-impact AIs, hoping that we haven't left out anything important.

All these interventions share a common feature: they are stupid to attempt in the case of a strong, fast foom. They have practically no chance of working, and are just a waste of time and effort. If, however, we increase the chances of weaker, slower fooms, then they start to seem more attractive - possibly worth putting some effort into, in case the friendly AI approach doesn't bear fruit in time.


48 comments

One thing that has always bothered me is that people confuse predictions of technological capability with predictions of policy. In terms of the former, the predictions of moon colonies and such were quite accurate. We could indeed have had cities on the moon before 2000 if we wished. We had the technological capability. In terms of the latter, though, they were not accurate at all. Ultimately, space policy turned away from such high-cost endeavours. The problem is that the trajectory of technology is much easier to predict than the trajectory of policy. Technology usually follows a steady curve upward. Human policy, on the other hand, is chaotic and full of free variables.

In short, we could indeed have had moon colonies, flying cars, and jetpacks, but we chose not to, for a variety of (good) reasons.

The 'bomber will always get through' prediction was inaccurate because it underestimated technological progress: namely, the development of radar and, later, of ICBMs. Similarly, people dramatically failed to predict the rise of the internet or computing technology in general, due to 1. underestimating technological progress and 2. underestimating human creativity. In fact, failures of technological prediction tend more frequently to be failures of underestimation than overestimation.

This is why it's always good to draw a line between these two concepts, and I fear that in your post you have not done so. We could, in principle, have a FOOM scenario. There is nothing we know about computing and intelligence that suggests this is not possible. On the other hand, whether we will choose to create a FOOM-capable AI is a matter of human policy. It could be that we unilaterally agree to put a ban on AI above a certain level (a sad and unenlightened decision, but it could happen). The point to all of this is that trying to foresee public policy is a fool's errand. It's better to assume the worst and plan accordingly.

Two other examples that I'm familiar with:

In the mid 1990s, cryptographers at Microsoft were talking (at least privately, to each other) about how DRM technology is hopeless, which has turned out to be the case as every copy protection for mass market products (e.g., DVDs, Blu-Rays, video games, productivity software) has been quickly broken.

A bit more than 10 years ago I saw that the economics of computer security greatly favored the offense (i.e., the cyberweapon will always get through) and shifted my attention away from that field as a result. This still seems to be the case today, maybe to an even greater extent.

Maybe not. DRM does not prevent copying. It does, however, enable the control of who is allowed to produce which devices. E.g. DRM makes it much harder to market a DVR, DVD player, cable box, or software that can connect to the iTunes Music Store. It's not a significant technical challenge, but it is a legal one. HTML 5 editor Ian Hickson has made this really clear.

A bit more than 10 years ago I saw that the economics of computer security greatly favored the offense (i.e., the cyberweapon will always get through) and shifted my attention away from that field as a result. This still seems to be the case today, maybe to an even greater extent.

When do you foresee that changing to an advantage for the defense? Presumably sometime before FAI needs to be invulnerable to remote exploits. All of the technological pieces are in place (proof carrying code, proof-generating compilers) but simply aren't used by many in the industry and importantly by none of the operating systems I'm aware of.

When do you foresee that changing to an advantage for the defense? Presumably sometime before FAI needs to be invulnerable to remote exploits.

I don't currently foresee the economics of computer security changing to an advantage for the defense. The FAI, as well as the FAI team while it's working on the FAI, will probably have to achieve security by having more resources than the offense, which is another reason why I'm against trying to build an FAI in a basement.

All of the technological pieces are in place (proof carrying code, proof-generating compilers) but simply aren't used by many in the industry and importantly by none of the operating systems I'm aware of.

I'm not an expert in this area, but the lack of large scale deployments makes me suspect that the technology isn't truly ready. Maybe proof carrying code is too slow or otherwise too resource intensive, or it's too hard to formalize the security requirements correctly? Can you explain what convinced you that "all of the technological pieces are in place"?

Speaking as somebody who works in computer systems research:

I agree with Pentashagon's impression: we could engineer a compiler and operating system with proof-carrying code tomorrow, without needing any major research breakthroughs. Things very similar to proof-carrying code are in routine deployment. (In particular, Java bytecode comes with proofs of type safety that are checked at load time and researchers have built statically-verified kernels and compilers.)

I believe the real barrier at this point is that any sort of verification effort has to go bottom-up, and that means building new libraries, operating systems, etc. ad nauseam before anything else runs. And that's just a huge expense and means losing a lot of legacy code.

My impression is that it's not a performance problem. In the schemes I've seen, PCC is checked at load or link time, not at run-time, so I wouldn't expect a big performance hit.

Separately, I'm not sure PCC gets you quite as much security as you might need. Users make mistakes -- grant too many permissions, put their password where they shouldn't, etc. That's not a problem you can solve with PCC.

I don't currently foresee the economics of computer security changing to an advantage for the defense. The FAI, as well as the FAI team while it's working on the FAI, will probably have to achieve security by having more resources than the offense, which is another reason why I'm against trying to build an FAI in a basement.

If that's true then I'm worried about the ability of the FAI developers to protect the hardware from the FAI as it learns. What safeguards the FAI from accidentally triggering a bug that turns it into UFAI as it explores and tests its environment? The period between when the initial self-improving FAI is turned on and the point that it is confident enough in the correctness of the system it runs on seems to be unnecessarily risky. I'd prefer that the FAI along with its operating system and libraries are formally proven to be type-safe at a minimum.

Hardware is potentially even harder. How does the FAI ensure that a bit flip or hardware bug hasn't turned it into UFAI? Presumably running multiple instances in voting lock-step with as much error correction as possible on as many different architectures as possible would help, but I think an even more reliable hardware design process will probably be necessary.

I'm not an expert in this area, but the lack of large scale deployments makes me suspect that the technology isn't truly ready. Maybe proof carrying code is too slow or otherwise too resource intensive, or it's too hard to formalize the security requirements correctly? Can you explain what convinced you that "all of the technological pieces are in place"?

As asr points out, economics is probably the biggest reason. It's cost-prohibitive to formally prove the correctness of every component of a computer system and there's a break-even point for the overall system where hardware reliability drops below software reliability. The security model will be the most difficult piece to get right in complex software that has to interact with humans, but type-safety and memory-safety are probably within our grasp now. To the best of my knowledge the bugs in Java are not type errors in the byte-code but in the implementation of the JVM and native library implementations which are not proven to be type-safe. Again, the economic cost of type-safe bytecode versus fast C/C++ routines.

DRM technology is quite widely deployed. It also stops lots of copying. So: your comments about it being "hopeless" seem a bit strange to me.

Well, hopeless relative to the hopes that some people had at that time. For example, from Wikipedia:

BD+ played a pivotal role in the format war of Blu-ray and HD DVD. Several studios cited Blu-ray Disc's adoption of the BD+ anti-copying system as the reason they supported Blu-ray Disc over HD DVD. The copy protection scheme was to take "10 years" to crack, according to Richard Doherty, an analyst with Envisioneering Group.

and

The first titles using BD+ were released in October 2007. Since November 2007, versions of BD+ protection have been circumvented by various versions of the AnyDVD HD program.

DRM is not very effective at protecting static targets - such as a large installed base of identical DVD players - where one crack can compromise all the content. It's rather better at protecting content which is more dynamic - such as software - where each game can ship with its own type of polymorphic DRM.

Despite a massive base of installed readers, Kindle DRM has been somewhat effective - despite being cracked. Much content that people are prepared to pay for has not, in practice, been ripped yet.

Much content that people are prepared to pay for has not, in practice, been ripped yet.

Evidence, numbers? (This is my second request for evidence and numbers.) There's a long tail of books available for Kindle that have approximately no readers.

People buy stuff because they think they should and it's easy to, not because of DRM. (This was the surprise for the record industry that the iTunes model actually worked - they had previously been creating terrible music stores that didn't work just for the purpose of creating evidence that filesharing was costing them actual money.)

It also stops lots of copying.

(a) Numbers?

(b) What's your evidence that it makes a damn bit of difference? What people want to copy, they do copy.

DRM is sold as security from copying. It has failed utterly, because such security is impossible in theory, and has turned out impossible in practice.

"In theory" is a bit of a slippery term, since all encryption can be cracked in theory. Apart from that, DRM is possible in practice, if you can completely control the hardware. Once you're allowed to hook any TV you want into your DVD player, uncrackable DRM goes out the window, because the player has to supply the TV with unencrypted video. The other way DRM can work is if users aren't viewing all of the content, and there's a way to require external credentials. For instance, people can be forced to buy separate copies of Diablo III if they want to play on BattleNet.

all encryption can be cracked in theory

Is it too pedantic to mention one-time pads?

Is it too pedantic to mention one-time pads?

No, that's an entirely valid point and I even suggest you were in error when you conceded. If two individuals have enough private mutual information theory allows them encryption that can not be cracked.


A one-time pad has to be transmitted, too. MITM will crack it.

A one-time pad has to be transmitted, too. MITM will crack it.

A one-time pad that needs to be transmitted can be violated by MITM. But if the relevant private mutual information is already shared or is shared directly without encryption then the encryption they use to communicate is not (in theory required to be) crackable. Since the claim was that "all encryption can be cracked in theory" it is not enough for some cases to be crackable, all must be.
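For concreteness: a one-time pad XORs the message with a truly random key of the same length, used once. Every plaintext of that length is then equally consistent with the ciphertext, which is why no amount of computation cracks it. A minimal sketch (the function name is my own):

```python
import secrets

def otp(data: bytes, key: bytes) -> bytes:
    """XOR `data` with `key`; encryption and decryption are the same
    operation, since (m ^ k) ^ k == m."""
    assert len(key) == len(data), "the pad must be as long as the message"
    return bytes(a ^ b for a, b in zip(data, key))

message = b"the bomber will always get through"
pad = secrets.token_bytes(len(message))  # truly random, never reused
ciphertext = otp(message, pad)
assert otp(ciphertext, pad) == message   # round-trips exactly
```

The catch, as the thread notes, is the pad itself: it must reach the other party over some channel that the scheme cannot protect.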

Fair enough - I was out-pedanted!

"In theory" is a bit of a slippery term, since all encryption can be cracked in theory.

This is what we call The Fallacy of Gray. There is a rather clear difference between the possibility of brute-forcing 1024-bit encryption and the utter absurdity of considering a DRMed multimedia file 'secure' when I could violate it using a smartphone with a video camera (and lossless proof-of-concept violations are as simple as realising that VMware exists).

it should have been obvious some new methods or technologies were going to be invented or developed. Since the strength of the bomber depended on a certain strategic landscape, it should have been seen that deliberate attempts to modify that landscape would likely result in a reduction of the bomber's efficacy.

This sounds like hindsight bias to me. The opposite rationalization: it should have been obvious that since bombers were the best way to wage war, people like Stanley Baldwin would recognize this and put resources into making them better.

Or to put it another way, according to your way of thinking, after the invention of the ICBM, it "should have been obvious" that humans would soon modify the strategic landscape to make ICBMs less effective. (They're a heck of a lot worse than regular bombs, so the incentives are much stronger. Obviously we would come up with a reliable way to defend against them, right?)

Anyway, I think people are too confident in their knowledge/predictions in general. I suspect that a well-calibrated predictioneer would waffle a lot with their predictions, but waffling a lot looks bad and people want to look good. If looking good is a contributing factor for being high status, we'd expect many peoples' heroes to be high-status folks who make incorrectly confident predictions (in the course of optimizing for high status or just because there was some other selection effect where confident predictors ended up becoming high status). So this is one area where everyone would just have to individually figure out for themselves that making confident predictions was a bad idea. (Maybe--it's just an idea I'm not especially confident in :P)

Or to put it another way, according to your way of thinking, after the invention of the ICBM, it "should have been obvious" that humans would soon modify the strategic landscape to make ICBMs less effective.

Didn’t that happen, in a way? I don’t remember an ICBM (even non-nuclear) actually being used, despite a relatively large number of wars fought since their invention.

Or to put it another way, according to your way of thinking, after the invention of the ICBM, it "should have been obvious" that humans would soon modify the strategic landscape to make ICBMs less effective.

The logic of MAD, public opinion, and the lack of direct great power conflicts, seems to have precluded them from seriously trying - which is an interesting development.

A key point about intelligent agency is that it can produce positive noise as well as negative noise. Imagine someone who'd watched the slow evolution of life on Earth over the last billion years, looking at brainy hominids and thinking, "Well, these brains seem mighty efficient, maybe even a thousand times as efficient as evolution, but surely not everything will go as expected." They would be correct to widen their confidence intervals based on this "not everything will go as I currently expect" heuristic. They would be wrong to widen their confidence intervals only downward. Human intelligence selectively sought out and exploited the most positive opportunities.

Yes, but extra positive noise in the strong FOOM scenarios doesn't make much difference. If a bad AI fooms in 30 minutes rather than an hour, or even in 5 minutes, we're still equally dead.

Positive noise might mean being able to FOOM from a lower starting base.

Point taken, though I've already increased my probability of early FOOM ( http://www.youtube.com/watch?v=ad4bHtSXiFE )

And I stand by the point that most noise will be negative. Starting to change random things in, say, the Earth's ecosystem may open up great new opportunities, but is more likely to cost us than benefit us.

The bomber will always get through

And yet, things didn't turn out that way.

No? It's true that we don't have literal bombers devastating cities, as Douhet and Harris envisioned; but the nuclear-tipped ICBM occupies the same strategic role. It seems to me that Baldwin was correct about the strategic principle, that there is no defense except retaliation, and wrong only about the implementation and timescale. For that matter, if you used atomic bombs, even literal bombers would be quite sufficient; ten percent was considered heavy casualties in a WWII raid, but would presumably have been acceptable if the enemy really lost a city in exchange. With nuclear bombs you could achieve that - the problem wasn't so much the defenses, it was that the throw weight wasn't high enough. The bomber did always get through; it just didn't do as much damage as people thought. That was a problem with bomb technology, not with the strategic conception.

It's also worth noting that Douhet envisioned a mix of high explosive, incendiary, and chemical bombardment. Clearly this would have given even 1940 bombers a much higher effective throw weight, and both sides contemplated the use of gas, and took precautions - witness the amount of gas masks handed out in London during the Blitz. Why didn't they use it? Because they were afraid of retaliation in kind!

In the bomber analogy, it seems to me that friction is an argument for increasing our timescale and perhaps looking to unexpected implementations; it does not obviously decrease the overall probability.

I point you to the paragraph where I talk of ICBMs :-)

Agh. I don't understand how I missed that.

It was one point in a lot of them :-)

I concur that almost nothing is ever quite that easy.

That said, it would be interesting if anyone has any examples of when they were. I think of Moore's Law holding for decades, such that if you made outlandish predictions based on a doubling of everything every two years you would mostly be correct.
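As a toy illustration of how strongly a fixed doubling period compounds (the numbers are illustrative, not historical transistor counts):

```python
def doubling_growth(years, doubling_period=2.0):
    """Growth factor after `years` of doubling every `doubling_period` years."""
    return 2 ** (years / doubling_period)

# Four decades at a two-year doubling compounds to about a million-fold:
print(int(doubling_growth(40)))  # 1048576, i.e. 2**20

# And the prediction is sensitive: stretch the doubling time by just 10%,
# to 2.2 years, and the same forty-year extrapolation comes out roughly
# 3.5 times lower.
print(doubling_growth(40) / doubling_growth(40, 2.2))
```

Which is part of what made Moore's Law remarkable: it was one of the rare outlandish extrapolations where the doubling period actually held for decades.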

The bomber will always get through

Did the British military ever really believe this? When did they start to build air defenses and how much money did they put into these defenses?

Grrr... very hard to find figures. But it seems that Bomber Command lost 55,573 killed out of a total of 125,000 aircrew: http://en.wikipedia.org/wiki/RAF_Bomber_Command#Casualties

Whereas Fighter Command lost 507 out of 2,945 during the Battle of Britain: http://www.oocities.org/mchirnside/fcac.htm Let's multiply this by 6 to cover the entire war.

Now, crew and casualties are weak proxies for actual expenditure, which I couldn't find, but it does seem certain that even when caught in the middle of an actual war, the UK put a gargantuan effort into bombing as compared to anti-bombing.

Whereas fighter command lost 507 out of 2945 during the battle of Britain. http://www.oocities.org/mchirnside/fcac.htm Let's multiply this by 6 to cover the entire war.

This is a bit low, as the Battle of Britain was a relatively easy fight for the RAF. The full losses were 3,690 killed, 1,215 wounded and 601 POW.

Thanks! where did you find those figures?

Still confirms the main point, though.

Wikipedia

It cites: Bowyer, Chaz. RAF Fighter Command, 1936-1968. BCA/J.M. Dent, 1980. ISBN 0-460-04388-9.

Duh, thanks! I read that article, but missed the useful summary...

But one of the reasons why the "bomber doesn't always get through" is that offensive air missions (over enemy territory and antiaircraft guns) are much more dangerous than defensive ones (over friendly territory).

Indeed. But the total servicemen also points to the same conclusion.

Do bombers require comparatively greater crew numbers or anything along those lines?

Yes; fighters are generally 1-2 crewmember planes, bombers in WWII had up to 11. Their prices were similarly higher, though, so his point stands.

cities and industry proved much more resilient to bombing than anyone had a right to suspect.

What information was unavailable about the damage that would be caused by a given amount of bombing?

My guess is that people overreacted to the complacency that led to WW1, and thought it safer to overstate the harm done by war in order to motivate efforts to avoid it.

I think it was partially the effect of the (relatively tiny) Zeppelin/Gotha raids on the UK in WW1, and the somewhat exaggerated stories from Guernica.

cities and industry proved much more resilient to bombing than anyone had a right to suspect.

While this may be true, I don’t think this is relevant for ICBMs replacing bombers. ICBMs are effective because of nukes (compact high power). Compare how many bombers were used in the last half century with how many ICBMs were used. Without nukes (and other game-breaking things like that), ICBMs would not be at all useful. With nukes, the resilience of cities is mostly irrelevant (once a single nuke gets there, it doesn’t matter how).

(It’s hard to build a good example scenario, since the nukes changed strategy completely and we’d probably have many more and different kinds of wars than we do now without them. But I think if nukes weren't discovered/possible the bomber would still be "the king".)

(ETA: I’m not arguing against your post, just pointing out that it would be a bit stronger without that line.)