Singletons Rule OK


Eliezer Yudkowsky

Reply to: Total Tech Wars

How does one end up with a persistent disagreement between two rationalist-wannabes who are both aware of Aumann's Agreement Theorem and its implications?

Such a case is likely to turn around two axes: object-level incredulity ("no matter what AAT says, proposition X can't really be true") and meta-level distrust ("they're trying to be rational despite their emotional commitment, but are they really capable of that?").

So far, Robin and I have focused on the object level in trying to hash out our disagreement.  Technically, I can't speak for Robin; but at least in my own case, I've acted thus because I anticipate that a meta-level argument about trustworthiness wouldn't lead anywhere interesting.  Behind the scenes, I'm doing what I can to make sure my brain is actually capable of updating, and presumably Robin is doing the same.

(The linchpin of my own current effort in this area is to tell myself that I ought to be learning something while having this conversation, and that I shouldn't miss any scrap of original thought in it - the Incremental Update technique. Because I can genuinely believe that a conversation like this should produce new thoughts, I can turn that feeling into genuine attentiveness.)

Yesterday, Robin inveighed hard against what he called "total tech wars", and what I call "winner-take-all" scenarios:

Robin:  "If you believe the other side is totally committed to total victory, that surrender is unacceptable, and that all interactions are zero-sum, you may conclude your side must never cooperate with them, nor tolerate much internal dissent or luxury."

Robin and I both have emotional commitments and we both acknowledge the danger of that.  There's nothing irrational about feeling, per se; only failure to update is blameworthy.  But Robin seems to be very strongly against winner-take-all technological scenarios, and I don't understand why.

Among other things, I would like to ask if Robin has a Line of Retreat set up here - if, regardless of how he estimates the probabilities, he can visualize what he would do if a winner-take-all scenario were true.

Yesterday Robin wrote:

"Eliezer, if everything is at stake then 'winner take all' is 'total war'; it doesn't really matter if they shoot you or just starve you to death."

We both have our emotional commitments, but I don't quite understand this reaction.

First, to me it's obvious that a "winner-take-all" technology should be defined as one in which, ceteris paribus, a local entity tends to end up with the option of becoming one kind of Bostromian singleton - the decisionmaker of a global order in which there is a single decision-making entity at the highest level.  (A superintelligence with unshared nanotech would count as a singleton; a federated world government with its own military would be a different kind of singleton; or you can imagine something like a galactic operating system with a root account controllable by 80% majority vote of the populace, etcetera.)

The winner-take-all option is created by properties of the technology landscape, which is not a moral stance.  Nothing is said about an agent with that option actually becoming a singleton.  Nor about using that power to shoot people, or reuse their atoms for something else, or grab all resources and let them starve (though "all resources" should include their atoms anyway).

Nothing is yet said about various patches that could try to avert a technological scenario that contains upward cliffs of progress - e.g. binding agreements enforced by source code examination or continuous monitoring, in advance of the event.  (Or if you think that rational agents cooperate on the Prisoner's Dilemma, so much work might not be required to coordinate.)

Superintelligent agents not in a humanish moral reference frame - AIs that are just maximizing paperclips or sorting pebbles - who happen upon the option of becoming a Bostromian Singleton, and who have not previously executed any somehow-binding treaty, will ceteris paribus choose to grab all resources in service of their utility function, including the atoms now composing humanity.  I don't see how you could reasonably deny this!  It's a straightforward decision-theoretic choice between payoff 10 and payoff 1000!
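To make the decision-theoretic point concrete, here is a trivial sketch.  The payoffs 10 and 1000 are the ones from the text; the option names and everything else are illustrative assumptions, not anyone's actual model of a superintelligence:

```python
# A pure maximizer choosing between leaving humanity's resources alone
# (payoff 10) and grabbing all resources, atoms included (payoff 1000).
# Note that no humanish moral term appears anywhere in this utility
# function, so nothing penalizes the grab-everything option.

options = {
    "respect_existing_agents": 10,   # most matter left unconverted
    "grab_all_resources": 1000,      # all matter repurposed for the goal
}

# The maximizer simply takes the argmax of its utility function.
choice = max(options, key=options.get)
print(choice)  # -> grab_all_resources
```

The point of the sketch is only that, absent a term in the utility function that values humanity, the comparison is as mechanical as it looks.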

But conversely, there are possible agents in mind design space who, given the option of becoming a singleton, will not kill you, starve you, reprogram you, tell you how to live your life, or even meddle in your destiny unseen.  See Bostrom's (short) paper on the possibility of good and bad singletons of various types.

If Robin thinks it's impossible to have a Friendly AI or maybe even any sort of benevolent superintelligence at all, even the descendants of human uploads - if Robin is assuming that superintelligent agents will act according to roughly selfish motives, and that only economies of trade are necessary and sufficient to prevent holocaust - then Robin may have no Line of Retreat open, as I try to argue that AI has an upward cliff built in.

And in this case, it might be time well spent, to first address the question of whether Friendly AI is a reasonable thing to try to accomplish, so as to create that line of retreat.  Robin and I are both trying hard to be rational despite emotional commitments; but there's no particular reason to needlessly place oneself in the position of trying to persuade, or trying to accept, that everything of value in the universe is certainly doomed.

For me, it's particularly hard to understand Robin's position in this, because for me the non-singleton future is the one that is obviously abhorrent.

If you have lots of entities with root permissions on matter, any of whom has the physical capability to attack any other, then you have entities spending huge amounts of precious negentropy on defense and deterrence.  If there's no centralized system of property rights in place for selling off the universe to the highest bidder, then you have a race to burn the cosmic commons, and the degeneration of the vast majority of all agents into rapacious hardscrapple frontier replicators.

To me this is a vision of futility - one in which a future light cone that could have been full of happy, safe agents having complex fun, is mostly wasted by agents trying to seize resources and defend them so they can send out seeds to seize more resources.

And it should also be mentioned that any future in which slavery or child abuse is successfully prohibited, is a world that has some way of preventing agents from doing certain things with their computing power.  There are vastly worse possibilities than slavery or child abuse opened up by future technologies, which I flinch from referring to even as much as I did in the previous sentence.  There are things I don't want to happen to anyone - including a population of a septillion captive minds running on a star-powered Matrioshka Brain that is owned, and defended against all rescuers, by the mind-descendant of Lawrence Bittaker (serial killer, aka "Pliers").  I want to win against the horrors that exist in this world and the horrors that could exist in tomorrow's world - to have them never happen ever again, or, for the really awful stuff, never happen in the first place.  And that victory requires the Future to have certain global properties.

But there are other ways to get singletons besides falling up a technological cliff.  So that would be my Line of Retreat:  If minds can't self-improve quickly enough to take over, then try for the path of uploads setting up a centralized Constitutional operating system with a root account controlled by majority vote, or something like that, to prevent their descendants from having to burn the cosmic commons.

So for me, any satisfactory outcome seems to necessarily involve, if not a singleton, the existence of certain stable global properties upon the future - sufficient to prevent burning the cosmic commons, prevent life's degeneration into rapacious hardscrapple frontier replication, and prevent supersadists torturing septillions of helpless dolls in private, obscure star systems.

Robin has written about burning the cosmic commons and rapacious hardscrapple frontier existences.  This doesn't imply that Robin approves of these outcomes.  But Robin's strong rejection even of winner-take-all language and concepts, seems to suggest that our emotional commitments are something like 180 degrees opposed.  Robin seems to feel the same way about singletons as I feel about ¬singletons.

But why?  I don't think our real values are that strongly opposed - though we may have verbally-described and attention-prioritized those values in different ways.