Context: This is a linkpost for https://aisafety.info/questions/NM3O/8:-AI-can-win-a-conflict-against-us
This is an article in the new intro to AI safety series from AISafety.info. We'd appreciate any feedback. The most up-to-date version of this article is on our website.
Suppose an AI has realized that controlling the world would let it achieve its aims. If it’s weak, it might just obey humans. If it’s roughly at our power level, it might make deals or trades with us. But if it’s sufficiently capable, it might make a play to seize control from us. That is, it might attempt a takeover. Could it succeed?
Intelligence confers a kind of power. (Keep in mind that by intelligence, we don’t just mean clever puzzle-solving, but any traits that go into making effective decisions.) Humans have power over chimpanzees because we’re smarter. If the chimpanzees of the world banded together against us, we’d defeat them despite being physically weaker. Even if they were equal in number to us, we’d still win.
For similar reasons, a superintelligence could win against us. We don’t know exactly how. Compare playing chess against a grandmaster such as Magnus Carlsen: you can’t predict the exact moves he’ll use to beat you. If you could reliably do that, you’d be just as good at chess as he is. But you can still say, in the abstract, that he’s better than you at finding winning moves.
That type of abstract argument only holds if there’s a large and complex space of strategies to search through, so that there’s room to find surprisingly effective ones. But the real world is plenty complex. In the course of a takeover attempt, a superintelligent AI could:
With strategies like these, a misaligned superintelligence may well put together a winning plan. And the result wouldn’t be a one-off, recoverable disaster, or a technologically advanced utopia. It would probably mean the loss of most of what we’d consider valuable about the future, for example because the AI kills everyone on Earth.