I very much agree. Historical analogies:

To a tiger, human hunter-gatherers must be frustrating and bewildering in their ability to coordinate. "What the hell? Why are they all pouncing on me when I jumped on the little one? The little one is already dead anyway, and they are risking their own lives now for nothing! Dammit, gotta run!"

To a tribe of hunter-gatherers, farmers must be frustrating and bewildering in their ability to coordinate. "What the hell? We pillaged and slew that one village real good, they sure didn't have enough warr... (read more)

2SoerenMind4moAlso interesting to see that all of these groups were able to coordinate to the disadvantage of less coordinates groups, but not able to reach peace among themselves. One explanation might be that the more coordinated groups also have harder coordination problems to solve because their world is bigger and more complicated. Might be the same with AI?

I wonder also if the conflicts that remain are nevertheless more peaceful. When hunter-gatherer tribes fight each other, they often murder all the men and enslave the women, or so I hear. Similar things happened with farmer societies sometimes, but also sometimes they just become new territories and have to pay tribute and levy conscripts and endure the occasional pillage. And then industrialized modern nations even have rules about how you can't rape and pillage and genocide and sell into slavery the citizens of your enemy. Perhaps AI conflicts would... (read more)

Strategic implications of AIs' ability to coordinate at low cost, for example by merging

by Wei_Dai 1 min read25th Apr 201942 comments

57

Ω 18


Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

It seems likely to me that AIs will be able to coordinate with each other much more easily (i.e., at lower cost and greater scale) than humans currently can, for example by merging into coherent unified agents by combining their utility functions. This has been discussed at least since 2009, but I'm not sure its implications have been widely recognized. In this post I talk about two such implications that occurred to me relatively recently.

I was recently reminded of this quote from Robin Hanson's Prefer Law To Values:

The later era when robots are vastly more capable than people should be much like the case of choosing a nation in which to retire. In this case we don’t expect to have much in the way of skills to offer, so we mostly care that they are law-abiding enough to respect our property rights. If they use the same law to keep the peace among themselves as they use to keep the peace with us, we could have a long and prosperous future in whatever weird world they conjure. In such a vast rich universe our “retirement income” should buy a comfortable if not central place for humans to watch it all in wonder.

Robin argued that this implies we should work to make it more likely that our current institutions like laws will survive into the AI era. But (aside from the problem that we're most likely still incurring astronomical waste even if many humans survive "in retirement"), assuming that AIs will have the ability to coordinate amongst themselves by doing something like merging their utility functions, there will be no reason to use laws (much less "the same laws") to keep peace among themselves. So the first implication is that to the extent that AIs are likely to have this ability, working in the direction Robin suggested would likely be futile.

The second implication is that AI safety/alignment approaches that aim to preserve an AI's competitiveness must also preserve its ability to coordinate with other AIs, since that is likely an important part of its competitiveness. For example, making an AI corrigible in the sense of allowing a human to shut it (and its successors/subagents) down or change how it functions would seemingly make it impossible for this AI to merge with another AI that is not corrigible, or not corrigible in the same way. (I've mentioned this a number of times in previous comments, as a reason why I'm pessimistic about specific approaches, but I'm not sure if others have picked up on it, or agree with it, as a general concern, which partly motivates this post.)

Questions: Do you agree AIs are likely to have the ability to coordinate with each other at low cost? What other implications does this have, especially for our strategies for reducing x-risk?

57

Ω 18