"General Thud! General Thud! Wake up! The aliens have landed. We must surrender!" General Thud's assistant Fred turned on the lights and opened the curtains to help Thud wake up and confront the situation. Thud was groggy because he had stayed up late supervising an ultimately successful mission carried out by remotely piloted vehicles in some small country on the other side of the world. Thud mumbled, "Aliens? How many? Where are they? What are they doing?" General Thud looked out the window, expecting to see giant tripods walking around and destroying buildings with death rays. He saw his lawn, a bright blue sky, and hummingbirds hovering near his bird feeder.

Fred was trying to bring Thud up to speed as quickly as possible. "Thousands of them, General! 2376, to be precise. They gave us a map; we know where they all are. They aren't doing anything overt, but the problem is their computation! I have one here, if you'd like to look." Fred removed a black sphere two inches in diameter from his pocket and gave it to Thud.

Thud sat on his bed holding the small sphere and staring at it dumbfounded. "Okay, you think we should surrender to a few thousand small spheres. Why is that, exactly?" The sphere seemed a little flexible in Thud's hand. As he experimented for a few seconds to see just how flexible, it collapsed in his hand, converting itself into a loose clump of alien sand that landed in his lap and started to dribble onto his bed and the floor. Thud stood up, brushed the rest of the sand off his pyjamas and bed, and thought for a moment about where he had left his vacuum cleaner bags. He was not impressed with these aliens.

Fred said, "I don't think you wanted to do that, sir. Their ultimatum states that for every alien we destroy, they'll manufacture two in the outer reaches of the Solar System where we'll never find them!"

Thud said, "Okay, so now you think we should surrender to 2375 small spheres, and two or more small spheres that are out of the battlefield for the moment. Why is that?"

Fred said, "Well, you remember a few years back when some people copied their brain state into a computer and posted it to the Internet? Apparently somebody copied the data across an unencrypted wireless link, the aliens picked it up with their radio telescopes, and now they are simulating those poor people in these black spheres and torturing the simulations! They sent us videos!" Fred held up his cell phone, pushed a button, and showed the video to Thud.

Thud looked at the video for a moment and said, "Yep, that's torture. Do these people know anything potentially useful to the aliens?"

Fred said, "Well, they know how to break into a laboratory that has brain scanning tools and push some buttons. That was apparently the high point of their lives. But none of that matters; the aliens don't seem to be torturing them for information anyway."

Thud was still suffering from morning brain fog. He rubbed his eyes. "And why should we surrender?"

Fred said, "The aliens have made a trillion copies of these poor people and will run the torture simulations on the little black spheres until we march all of our citizens into the death camps they demand we build! We have analyzed these black spheres and the engineering diagrams the aliens gave us, and we know this to be true. We only have ten billion citizens, and this simulated torture is much worse than simulated death, so the total utility is much greater if we surrender!"

Thud yawned.  "Fred, you're fired. Get out of my house." As Fred left, Thud closed his curtains and tried to get back to sleep.


Michael said, "So I take it you no longer assist Thud. What are you doing now?"

Fred reclined comfortably on the analyst's couch. "I help out at the cafeteria as a short order cook. But I'm not worried about my career right now. I have nightmares about all these simulated people being tortured in the flimsy alien spheres."

"Thud surely knows the simulations are being tortured too. Do you think he has nightmares about this?"

"No, he doesn't seem to care."

"Have you always cared about the well-being of simulations?"

"No, when I was a teenager I was self-centered and conceited and didn't care about anybody else, including simulated people."

"So at some point you self-modified to care about simulations. If it helps you, you could self-modify again."

"But I don't want to!"

"Did you want to self-modify to care about simulations in the first place?"

"No, it just sort of happened as I grew up."

"Is there any logical inconsistency in Thud's position?"

Fred thought for a bit.  "Not that I can see.  The value one assigns to simulations seems to be an arbitrary choice.  Ignoring the alien invasion certainly hasn't harmed his career."

"Concern about simulations seems to give the aliens more influence over you than Thud would prefer. What would you prefer?"

"Well, I'd also prefer the aliens not to be able to jerk me around. I really don't have room in my life for it now.  In the grand scheme of things, it seems just wrong -- they shouldn't be able to genocide a species with a few thousand stupid spheres that just sit there converting sunlight to heat."

Michael passed Fred a piece of paper with a short list of bulleted items.  "This is the procedure I teach my clients who want to change their preferences.  After you've learned it, you can decide whether and how you want to use it..."

158 comments

The relevant question is: how does surrendering, or not surrendering, control the probability of the ultimatum having been given? If it doesn't, we should surrender. If the aliens are sufficiently less likely to make the ultimatum in the case where we wouldn't surrender if they did, we shouldn't surrender. Furthermore, we should look for third options whose choosing could also control the aliens' actions.

Since this information is not given in the story, and the only thing we can go on is the anthropomorphic intuition that we shouldn't give in to blackmail (since having the property of not giving in really does control the probability of getting blackmailed by humans), the correct answer wasn't suggested, which defeats part of the appeal of a puzzle like this and can create some unnecessary memetic hazard.

For the same reason, focusing on whether one "cares about simulations" in this context is misleading, a false dilemma, since this is not the most relevant consideration. It's like asking whether you should cross a road on prime-numbered minutes, and pointing out examples of people who did cross the road on prime-numbered minutes and were run over, instead of focusing on how you should respond to traffic lights.

Is this the correct interpretation of the first three sentences? If the aliens are sufficiently less likely to present the ultimatum when they believe that we would not surrender upon being presented with it, then we should not surrender. That is, our decision procedures should not return "surrender" in any situation where having a decision procedure that returns "surrender" increases the counterfactual prior probability of being presented the ultimatum, even after we have been given the ultimatum.

This correct decision of not surrendering when given the ultimatum (a decision which results in a loss of utility via the captives being tortured), given sufficient certainty that "the aliens are more likely to not present the ultimatum if they think we will not surrender upon being presented the ultimatum", is analogous to the correct decision of paying the counterfactual mugger when given the losing outcome of a bet (a decision which results in a loss of utility via paying money), given sufficient certainty that "the counterfactual mugger would pay us if we had won the bet, provided the counterfactual mugger thought that we would pay em upon losing the bet".

That is, in the same way that we act in accordance with how we would have precommitted to paying out on the losing counterfactual-mugging bet, since doing so would have maximized our counterfactual prior expected utility, we should now act in accordance with how we would have precommitted to not surrendering upon being presented the ultimatum, since doing so would have increased our counterfactual prior expected utility. In other words, the reflectively consistent algorithm to which a friendly AI would self-modify in advance of being presented with this situation is one that would choose to let the captives be tortured, in order to decrease the prior expectation of captives being tortured. If all of that was correct, would an FAI self-modify to such a reflectively consistent algorithm?

It seems you can set up a similar scenario without simulations: have the aliens physically kidnap 90% of the human race, then try to blackmail the other 10% into killing themselves. That would make for an interesting moral dilemma too.

And you can set up a scenario without dragging in torture and extinction. Aliens from Ganymede are about to ink a contract to trade us tons of Niobium in exchange for tons of Cobalt. But then the aliens reveal that they have billions of cloned humans working as an indentured proletariat in the mines of the Trojan asteroids. These humans are generally well treated, but the aliens offer to treat them even better - feed them ice cream - if we send the Cobalt without requiring payment in Niobium.

The central problem in all of these thought experiments is the crazy notion that we should give a shit about the welfare of other minds simply because they exist and experience things analogously to the way we experience things.

Is there a standard name for the logical fallacy where you attempt a reductio ad absurdum but fail to notice that you're deriving the absurdity from more than one assumption? Why conclude that it's the caring about far-away strangers that is crazy, as opposed to the decision algorithm that says you should give in to extortions like this?

I'm not sure words like "crazy" and "absurd" are even meaningful in this context. It's pretty easy to come up with internally consistent arguments generating both results, and the scenario's outlandish enough that it's not clear which one has more practical vulnerabilities; essentially we're dealing with dueling intuitions.
Good catch. Yes, I was deriving the absurdity from more than one assumption. Maybe with the right decision algorithm you wouldn't give in to extortions like this. However, this extortion attempt cost the aliens approximately nothing, so unless correctly inferring our decision algorithm cost them less than approximately nothing, the rational step for the aliens is to try the extortion regardless. Thus having a different decision algorithm probably wouldn't prevent the extortion attempt.

But then changing your values to not care about simulated torture won't prevent the extortion attempt either (since the aliens will think there's a small chance you haven't actually changed your values and it costs them nothing to try). Unless you already really just don't care about simulated torture, it seems like you'd want to have a decision algorithm that makes you go to war against such extortionists (and not just ignore them).

Wait, is this a variant on Newcomb's problem? (Am I just slow today? Nobody else seems to have mentioned it outright, at least.)
Paul Crowley (13y):
This sort of thing is really the motivating example behind Newcomb's problem.
I'm not seeing the analogy. Can you explain? The extortion attempt cost the aliens almost nothing, and would have given them a vacant solar system to move into if someone like Fred was in power, so it's rational for them to make the attempt almost regardless of the odds of succeeding. Nobody is reading anybody else's mind here, except the idiots who read their own minds and uploaded them to the Internet, and they don't seem to be making any of the choices.
This case looks most like the 'transparent boxes' version of the problem, which I haven't read much about. In Newcomb's problem, Omega offers a larger amount of utility if you will predictably do something that intuitively would give a smaller amount of utility. In this situation, being less open to blackmail probably gives you less disutility in the long run (fewer instances of people trying to blackmail you) than acceding to the blackmail, even though acceding intuitively gives you less disutility. The other interesting part of this particular scenario is how to define 'blackmail' and differentiate it from, say, someone accidentally doing something that's harmful to you and asking you to help fix it. We've approached that issue, too, but I'm not sure if it's been given a thorough treatment yet.
They had other choices though. It would have been similarly inexpensive to offer to simulate happy people. Even limiting the spheres to a single proof-of-concept would have been a start.
That 'costs them nothing' part makes a potentially big difference. That the aliens must pay to make their attempt is what gives your decision leverage. The war that you suggest is another way of ensuring that there is a cost. Even though you may actually lose the war and be exterminated. (Obviously there are whole other scenarios where becoming a 'protectorate' and tithing rather than going to war constitutes a mutually beneficial cooperation. When their BATNA is just to wipe you out but it is slightly better for them to just let you pay them.)
I really don't care about simulated torture, certainly not enough to prefer war over self-modification if simulated torture becomes an issue. War is very expensive and caring about simulated torture appears to be cost without benefit. The story is consistent with this. Fred has problems because he cares about simulated torture, and Thud doesn't care and doesn't have problems. Hmm, perhaps we agree that the story has only one source of absurdity now? No big deal either way. (UDT is still worth my time to understand. I owe you that, and I didn't get to it yet.)

If Fred cared about the aliens exterminating China, and Thud didn't care; then if the aliens instead threatened to exterminate China, Fred would again have problems and Thud again wouldn't have.

A rock doesn't care about anything, and therefore it has no problems at all.

This topic isn't really about simulation, it's about the fact that caring about anything permits you to possibly sacrifice something else for it. Anything that isn't our highest value may end up traded away, sure.

You can travel from here to China and back. Therefore, caring about China has at least a potential instrumental consequence on the rest of my life. You can't travel from here to the aliens' simulation and back, so caring about what happens there imposes costs on the rest of my life but no benefits. The analogy is not valid. Now, if the black spheres had decent I/O capabilities and you could outsource human intellectual labor tasks to the simulations, I suppose it would make sense to care about what happens there. People can't do useful work while they're being tortured, so that wasn't the scenario in the story.

You can travel from here to China and back. Therefore, caring about China has at least a potential instrumental consequence on the rest of my life.

That's the only sane reason you believe can exist for caring about distant people at all? That you can potentially travel to them?

So suppose you're a paraplegic who doesn't want to travel anywhere, can't travel anywhere, and knows you'll die in two weeks anyway. You get a choice to push a button or not push it. If you push it you get 1 dollar right now, but 1 billion Chinese people will die horrible deaths in two weeks, after your own death.

Are you saying that the ONLY "sane" choice is to push the button, because you can use the dollar to buy bubblegum or something, while there'll never be a consequence on you for having a billion Chinese die horrible deaths after your own death?

If so, your definition of sanity isn't the definition most people have. You're talking about the concept commonly called "selfishness", not "sanity".

Fine. Explain to me why Fred shouldn't exterminate his species, or tell me that he should. The extortion aspect isn't essential. Fred could have been manipulated by true claims about making simulated people super happy. ETA: At one point this comment had downvotes but no reply, but when I complained that that wasn't a rational discussion, someone actually replied. LessWrong is doing what it's supposed to do. Thanks people for making it and participating in it.
I would give in to the alien demands in that situation, assuming we 'least convenient possible world' away all externalities (the aliens might not keep their promise, there might be quadrillions of sentient beings in other species who we could save by stopping these aliens). The way the story is told makes it easy for us to put ourselves in the shoes of Fred, Thud or anyone else on earth, and hard to put ourselves in the shoes of the simulations, faceless masses with no salient personality traits beyond foolishness. This combination brings out the scope insensitivity in people. A better way to tell the story would be to spend 1000 times as many words describing the point of view of the simulations as that of the people on earth. I wonder how ridiculous giving in would seem then.
It's good to have a variety of opinions on the mass-suicide issue. Thanks for posting. IMO scope insensitivity is a good thing. If you don't have scope insensitivity, then you have an unbounded utility function, and in the event you actually have to compute your utility function, you can't effectively make decisions because the expected utility of your actions is a divergent sum. See Peter de Blanc's paper and subsequent discussion here. If your values incorporate the sum of a divergent series, what happens in practice is that your opinion at the moment varies depending on whichever part of the divergent sum you're paying attention to at the moment. Being vulnerable to Pascal's Wager, Pascal's Mugging, and having Fred's mass-suicide preference in the story are all symptomatic of having unbounded utility functions. Shut Up and Multiply is just wrong, if you start from the assumption that scope insensitivity is bad. ETA: I should probably have said "Shut Up and Multiply" is just wrong, if one of the things you're multiplying by is the number of individuals affected.
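The divergence problem mentioned above can be sketched with made-up numbers: suppose outcome k has probability 2^-k but utility 3^k, so each term of the expected-utility sum is (3/2)^k and the series diverges. The specific probabilities, utilities, and cap below are illustrative assumptions, not anything from de Blanc's paper:

```python
# Toy model of the divergence problem, with made-up numbers:
# outcome k has probability 2**-k but utility 3**k, so each term of the
# expected-utility sum is (3/2)**k and the series grows without bound.

def partial_expected_utility(n_terms):
    return sum((0.5 ** k) * (3.0 ** k) for k in range(1, n_terms + 1))

# The partial sums diverge, so the "expected utility" you see depends
# entirely on which part of the sum you are paying attention to.
assert partial_expected_utility(40) > 1000 * partial_expected_utility(20)

# A bounded ("scope insensitive") utility restores convergence:
def capped_expected_utility(n_terms, cap=100.0):
    return sum((0.5 ** k) * min(3.0 ** k, cap) for k in range(1, n_terms + 1))

# Adding more terms now changes the answer by a negligible amount.
assert abs(capped_expected_utility(100) - capped_expected_utility(50)) < 1e-9
```

The point of the cap is only that once utilities stop outgrowing the shrinking probabilities, the tail of the sum becomes negligible and decisions stabilize.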
If you truly believe this then I think we really do have no common ground. Scope insensitivity experiments have found that people will pay more to save one child than to save eight. Effectively, it will make you kill seven children and then make you pay for the privilege. It is, IMO, morally indefensible and probably responsible for a great deal of the pain and suffering going on as we speak. A central tenet of the theory of rationality, as I understand it, is that decision theory, the computational methods used and even method of updating, are all up for grabs if a more convenient alternative is found, but the utility function is not. My utility function is what I want. Rationality exists to help us achieve what we want, if it is 'rational' for me to change my utility function, then this 'rationality' is a spectacular failure and I will abandon it for an 'irrational alternative'. What you point out may be a problem, but the solution lies in decision theory or algorithmics, not in paying to kill children.
I agree that those people are confused in some way I do not understand. Agreed. We need some way to reconcile that with the aforementioned result about scope-insensitive people who would pay the wrong price for saving one child versus saving eight. True. I think your process for introspecting your utility function is wrong, and I think the procedure for inferring the utility function of the scope insensitive people who were thinking about saving one or eight children was flawed too, in a different way. Humans can't really have unbounded utility. The brain structures that represent those preferences have finite size, so they can't intuit unbounded quantities. I believe you care some finite amount for all the rest of humanity, and the total amount you care asymptotically approaches that limit as the number of people involved increases to infinity. The marginal utility to you of the happiness of the trillionth person is approximately zero. Seriously, what does he add to the universe that the previous 999,999,999,999 didn't already give enough of? Straw man argument. You're the only one who mentioned paying to kill children.
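The asymptotic picture described here can be sketched with a hypothetical bounded utility function; the bound B and the saturation scale K are made-up numbers chosen purely for illustration:

```python
import math

B = 100.0   # hypothetical cap on total caring (arbitrary units)
K = 1e9     # hypothetical scale at which caring starts to saturate

def bounded_utility(n_people):
    # Approaches B asymptotically as n_people grows; never exceeds it.
    return B * (1.0 - math.exp(-n_people / K))

# Total caring keeps rising toward the limit but stays under the bound.
assert bounded_utility(10 ** 8) < bounded_utility(10 ** 9) < B

# The marginal utility of the trillionth person is approximately zero,
# while the very first people carry almost all of the weight.
marginal_trillionth = bounded_utility(10 ** 12) - bounded_utility(10 ** 12 - 1)
marginal_first = bounded_utility(1) - bounded_utility(0)
assert marginal_trillionth < 1e-12 < marginal_first
```

Any saturating curve would do here; the exponential is just the simplest function whose marginal value shrinks toward zero as the population grows.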
I'm afraid that backing away from the whole "one child over eight" thing but standing by the rest of scope insensitivity doesn't save you from killing. For example, if you value ten million people only twice as much as a million, then you can be persuaded to prefer a 20% chance of death for 10 million over certain death for 1 million, which means, on average, condemning an extra 1 million people to death. Any utility function that does not assign utility to human life in direct proportion to the number of lives at stake is going to kill people in some scenarios. I go with the 'revealed preference' theory of utility myself. I don't think the human brain actually includes anything that looks like a utility function. Instead, it contains a bunch of pleasure/pain drives, a bunch of emotional reactions independent of those drives, and something capable of reflecting on the former two and if necessary overriding them. Put together, under sufficient reflection, these form an agent that may act as if it had a utility function, but there's no little counter in its brain that's actually tracking utility. Thus, the way to deduce things about my utility function is not to scan my brain, but to examine the choices I make and see what they reveal about my preferences. For example, I think that if faced with a choice between saving n people with certainty and a 99.9999% chance of saving 2n people, I would always pick the latter regardless of n (I may be wrong about this, I have never actually faced such a scenario for very large values of n). This proves mathematically that my utility function is unbounded in lives saved.
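The arithmetic behind the 20%-gamble example can be made explicit. The sublinear valuation below is a hypothetical function chosen only so that 10 million lives are valued at exactly twice 1 million lives, matching the comment's premise:

```python
import math

def linear_value(lives):
    return float(lives)  # value proportional to lives at stake

def sublinear_value(lives):
    # Hypothetical scope-insensitive valuation: log scale, normalized so
    # that sublinear_value(10_000_000) == 2 * sublinear_value(1_000_000).
    return math.log10(lives / 100_000)

CERTAIN_DEATHS = 1_000_000
GAMBLE_DEATHS = 10_000_000
P_GAMBLE = 0.2

# The gamble kills 2 million people on average, twice the certain option.
assert P_GAMBLE * GAMBLE_DEATHS == 2 * CERTAIN_DEATHS

# Linear valuation prefers the certain option (smaller expected loss)...
assert P_GAMBLE * linear_value(GAMBLE_DEATHS) > linear_value(CERTAIN_DEATHS)

# ...but the sublinear valuation prefers the gamble: expected loss
# 0.2 * 2 = 0.4 versus a certain loss of 1, as the comment claims.
assert P_GAMBLE * sublinear_value(GAMBLE_DEATHS) < sublinear_value(CERTAIN_DEATHS)
```

So both sides of the disagreement come out of the same expected-utility calculation; the dispute is entirely about the shape of the value function.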
The merit of those alternatives depends on how many people total there are. If there are only 10 million people, I'd much rather have 1 million certain deaths than 20% chance of 10 million deaths, since we can repopulate from 8 million but we can't repopulate from 0. Even if condemning 1 million to death on the average is wrong when all options involve the possible deaths of large numbers of people, deriving positive utility from condemning random children to death when there's no dilemma is an entirely different level of wrong. Utility as a function of lives should flatten out but not slope back down, assuming overpopulation isn't an issue. The analogy isn't valid. Let's give up on the killing children example. Yes! Agreed completely, in cases where doing the experiment is practical. All our scenarios seem to involve killing large numbers of people, so the experiment is not practical. I don't see any reliable path forward -- maybe we're just stuck with not knowing what people prefer in those situations any time soon. If 2n is the entire population, in one case we have 0 probability of ending up with 0 people and in the other case we have 0.0001% chance of losing the entire species all at once. So you seem to be more open to mass suicide than I'd like, even when there is no simulation or extortion involved. The other interpretation is that you're introspecting incorrectly, and I hope that's the case. Someone voted your comment down. I don't know why. I voted your comment up because it's worth talking about, even though I disagree.
Okay, I guess the long term existence of the species does count as quite a significant externality, so in the case where 2n made up the whole species I probably would (I generally assume, unless stated otherwise, that both populations are negligible proportions of the species as a whole). However, I don't think humanity is a priori valuable, and if humanity now consists of 99.9% simulations being tortured then I think we really are better off dead. It may be that, in a certain sense, one is more 'wrong' than the other. However, both amount to an intentional choice that more humans die, and I would say that if you value human life, they are equally poor choices. How can 10 million humans not be 10 times as valuable as 1 million humans? How does the value of a person's life depend on the number of other people in danger? I'm inclined to say that my intuitions are probably fairly good on these sorts of hypothetical scenarios, provided that the implications of my choices are quite distant and do not affect me personally (i.e. I would be more sceptical of my intuitions if I was in one of the groups).
I already answered that. The first few hundred survivors are much more valuable than the rest. Even if survival isn't an issue, the trillionth human adds much less to what I value about humanity than the 100th human does. I haven't seen any argument for total utility being proportional to total number of people other than bald assertions. Do you have anything better than that? It's your choice whether you count those simulations as human or not. Be sure to be aware of having the choice, and to take responsibility for the choice you make. You're human and you're saying that humanity is not a priori valuable? What?
I don't have an absolute binding argument, just some intuitions. Some of these intuitions are:

* It feels unfair to value humans differently based on something as arbitrary as the order in which they are presented to me, or the number of other humans they are standing next to.
* It seems probable to me that the humans in the group of 1 trillion would want to be treated equally to the humans in the group of 100.
* It does not seem like there is anything different about the individual members of a group of 100 humans and a group of a trillion, either physically or mentally. They all still have the same amount of subjective experience, and I have a strong intuition that subjective experience has something very important to do with the value of human life.
* It does not feel to me like I become less valuable when there are more other humans around, and it doesn't seem like there's anything special enough about me that this cannot be generalised.
* It feels elegant, as a solution. Why should they become less valuable? Why not more valuable? Perhaps oscillating between two different values depending on parity? Perhaps some other even weirder function? A constant function at least has a certain symmetry to it.

These are just intuitions; they are convincing to me, but not to all possible minds. Are any of them convincing to you?

Is it also my choice whether I count black people or women as human? In a trivial sense it is my choice, in that the laws of rationality do not forbid me from having any set of values I want. In a more realistic sense, it is not my choice: one option is obviously morally repugnant (to me at any rate) and I do not want it, I do not want to want it, I do not want to want to want it and so on ad infinitum (my values are in a state of reflective equilibrium on the question). Humans are valuable. Humanity is valuable because it consists of humans, and has the capacity to create more. There is no explicit term in my utility function for 'humanity' as distinct from the humans who compose it.
Odd, my intuitions are different. Taking the first example: If I'm doing something special nobody else is doing and it needs to be done, then I'd better damn well get it done. If I'm standing next to a bunch of other humans doing the same thing, then I'm free! I can leave and nothing especially important happens. I am much less important to the entire enterprise in that case. Be sure to watch the ongoing conversation at http://lesswrong.com/lw/5te/a_summary_of_savages_foundations_for_probability/ because there's a plausible axiomatic definition of probability and utility there from which one can apparently prove that utilities are bounded.
The instrumental value of a human may vary from one human to the next. It doesn't seem to me like this should always go down, though; for instance, if you have roughly one doctor per every 200 people in your group, then each doctor is roughly as instrumentally valuable whether the total number of people is 1 million or 1 billion. But this is all beside the point, since I personally assign terminal value to humans, independent of any practical use they have (you can't value everything only instrumentally; trying to do so leads to an infinite regress). I am also inclined to say that except in edge cases, this terminal value is significantly more important than any instrumental value a human may offer. Coming back to the original discussion we see the following:

* The simulations are doing no harm or good to anyone, so their only value is terminal.
* The humans on earth are causing untold pain to huge numbers of sentient beings simply by breathing, and may also be doing other things. They have a terminal value, plus a huge negative instrumental value, plus a variety of other positive and negative instrumental values, which average out at not very much.
Yup, you really are on the pro-mass-suicide side of the issue. Whatever. Be sure to pay attention to the proof about bounded utility and figure out which of the premises you disagree with.
For the record, allow me to say that under the vast majority of possible circumstances I am strongly anti-mass-suicide. To counter your comment, I accuse you of being pro-torture ;)
Well, it's good to hear that neither of us is against anything, and that we are fundamentally positive, up-beat people. :-) Sounds like a set-up for a debate: "Would you like to take the pro-mass-suicide point of view, or the pro-torture point of view?"
Heavily for most people - due to scope insensitivity. Saving 1 person makes you a hero. Saving a million people does not produce a million times the effect. Thus the size sensitivity.
I am aware that it happens. I'm just saying that it shouldn't. I'm making the case that this intuition does not fit in reflective equilibrium with our others, and should be scrapped.
Yeah, just realized that I've just written, in a place where anyone who wants can see, that I would be prepared to wipe out humanity. :O I'm just lucky that nobody really cares enough to search the entire internet for incriminating comments. :)
I'm not the person who downvoted you, but I suspect the reason was that when you said this: You implied that it's wrong or nonsensical to care about other people's happiness/absence of suffering as a terminal value. We are "allowed" to have whatever terminal values we want, except perhaps contradictory ones.
That's presumably why he said "my."
I don't know what it means for a person to be simulated. I don't know if the simulated people have consciousness. Are we talking about people whose existence feels as real to themselves as it would to us? This is NOT an assumption I ever make about simulations, but should I consider it so for the sake of the argument?

* If their experience doesn't feel real to themselves, then obviously there isn't any reason to care about what makes them happy or unhappy; that would be Fred being confused, as he conflates the experience of real people with the fundamentally different simulated people.
* If their internal experience is as real as ours, then obviously it wouldn't be the extermination of Fred's species; some of his species would survive in the simulations, if in eternal captivity. He should or shouldn't exterminate his flesh-and-blood species based on whether his utility function assigns a higher value to a free (and alive) humanity than to a trillion individual sentients being happy.

On my part, I'd choose a free and alive humanity still. But that's an issue that depends on what terminal values we each have.
Um, I never tried to define sanity. What are you responding to?
Apologies, I did misremember who it was talking about "crazy notions"; that was indeed Perplexed.
You seem to be collecting some downvotes that should have gone to me. To even things out, I have upvoted three of your comments. Feel free to downvote three of mine. I fully agree, by the way, on the distinction between the moral relevance of simulated humans, who have no ability to physically influence our world, and the moral relevance of distant people here on earth, who physically influence us daily (though indirectly through a chain of intermediary agents). Simulated persons do have the ability to influence us informationally, though, even if they are unaware of our existence and don't recognize their own status as simulations. I'm not sure what moral status I would assign to a simulated novelist - particularly if I liked his work. ETA: To Normal_Anomaly: I do not deny people the right to care about simulations in terms of their own terminal values. I only deny them the right to insist that I care about simulations. But I do claim the right to insist that other people care about Chinese, for reasons similar to those Tim has offered.
But where's the drama in that? :-)
Relevant to your interests, possibly.
Thanks! Cute story.

Err, the point of having a decision theory that makes you go to war against extortionists is not to have war, but to have no extortionists. Of course you only want to do that against potential extortionists who can be "dissuaded". Suffice it to say that the problem is not entirely solved, but the point is that it's too early to say "let's not care about simulated torture because otherwise we'll have to give in to extortion" given that we seem to have decision theory approaches that still show promise of solving such problems without having to change our values.

Generally the benefit of caring about any bad thing is that if you care about it there will be less of it, because you will work to stop it.
Well, Fred cared, and his reaction was to propose exterminating humanity. I assume you think his decision is wrong. Can you say why? If you care about simulated torture (or simulated pleasure), and you're willing to shut up and multiply, then anybody with a big enough computer can get you to do anything, even when that computer has no inputs or outputs and makes absolutely no difference to the real world. I think it's better to adjust oneself so one does not care. It's not like it's a well-tested human value that my ancestors on the savannah acted upon repeatedly.
Do your calculations and preferred choices change if instead of "simulations", we're talking about trillions of flesh-and-blood copies of human beings who are endlessly tortured to death and then revived to be tortured again? Even if they're locked in rooms without entrances or exits, and it makes absolutely no difference to the outside world? If you care about them, then anybody with a big enough copier-of-humans, and enough torture chambers, "can get you to do anything", as you say. So it's not really an issue that depends on caring for simulations. I wish the concept of "simulations" weren't needlessly dragged in where it doesn't belong. General Thud would possibly not care if it was the whole real-life population of China that got collected by the aliens, in exchange for a single village of Thud's own nation. The issue of how-to-deal-with-extortion is a hard one, but it's just made fuzzier by adding the concept of simulations into the mix.
I agree that it's a fuzzy mix, but not the one you have in mind. I intended to talk about the practical issues around simulations, not about extortion. Given that the aliens' extortion attempt cost them almost nothing, there's not much hope of gaming things to prevent it. Properly constructed, the black spheres would not have an audit trail leading back to the aliens' home, so a competent extortionist could prevent any counterattack. Extortion is not an interesting part of this situation. Right. It's an issue about caring about things that are provably irrelevant to your day-to-day activities.
If you don't want to be talking about extortion, we shouldn't be talking about simulations in the context of extortion. So far as I can tell, the points you've made about useless preferences only matter in the context of extortion, where it doesn't matter whether we're talking about simulations or real people who have been created. If it's about caring about things that are irrelevant to your everyday life, then the average random person on the other side of the world honestly doesn't matter much to you. They certainly wouldn't have mattered a few hundred years ago. If you were transported to the 1300s, would you care about Native Americans? If so, why? If not, why are you focusing on the "simulation" part? If it turns out that OUR universe is a simulation, I assume you do not consider our creators to have an obligation to consider our preferences?
Caring about those torturees feels a bit like being counterfactually mugged. Being the sort of person (or species) that doesn't care about things that are provably irrelevant to your day-to-day activities would avoid this case of extortion, but depending on the universe that you are in, you might be giving up bigger positive opportunities.
The primary similarity seems to only be that the logic in question gives results which clash with our moral intuition.
I don't understand yet. Can you give a more specific example? The counterfactual mugging example paid off in dollars, which are typically shorthand for utility around here. Both utility and dollars are relevant to your day-to-day-activities, so the most straightforward interpretation of what you said doesn't make sense to me.
Yes, it's definitely not strictly a case of counterfactual mugging; it just struck me as having that flavor. I'll see if I can be more specific. At the point in time when you are confronted by Omega in the counterfactual mugging scenario, there is provably no way in your day-to-day activities you will ever get anything for your $10 if you cough up. However, having the disposition to be counterfactually muggable is the winning move. The analogy is that when deciding what our disposition should be with regards to caring about people we will never interact with, it might be the winning move to care, even if some of the decision branches lead to bad outcomes. The OP has a story where the caring outcome is bad. What about the other stories? Like the one where everyone is living happily in their protected memory libertarian utopia until first contact day, when a delegate from the galactic tourism board arrives and announces that he is blacklisting Earth because "I've seen some of the things you guys virtually do to each other and there's no way I'm letting tourists transmit themselves over your networks". And by the way, he's also enforcing a no-fly zone around the Earth until we "clean up our act".
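The "muggable disposition wins" point can be made concrete with a toy expected-value calculation. The $10 figure comes from the comment above; the fair coin and the $10,000 prize are assumptions borrowed from the classic formulation of the problem, not from this thread:

```python
# Toy expected-value comparison for the counterfactual mugging discussed above.
# PAY is from the comment; P_HEADS and PRIZE are illustrative assumptions.
P_HEADS = 0.5
PAY = 10        # what Omega asks you to pay when the coin lands tails
PRIZE = 10000   # what Omega pays on heads, but only if you would have paid on tails

# Expected value of each disposition, evaluated before the coin is flipped:
ev_muggable = P_HEADS * PRIZE + (1 - P_HEADS) * (-PAY)
ev_unmuggable = 0.0

# After a tails flip, paying is provably a pure loss in your day-to-day life,
# yet the disposition to pay is the one that wins in expectation.
assert ev_muggable > ev_unmuggable
```

The analogy to caring about distant or simulated people is the same shape: the disposition is chosen before you know which branch you land in.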
That sounds like a flaw in the decision theory. What kind of broken decision theory achieves its values better by optimizing for different values?
What do you mean by "the real world"? Why does it matter if it's "real"?
The real world generally doesn't get turned off. Simulations generally do. That's why it matters. If there were a simulation that one might reasonably expect to run forever, it might make sense to debate the issue.
Imagine that, instead of simulations, the spheres contained actual people. They are much smaller, don't have bodies the same shape, and can only seem to move and sense individual electrons, but they nonetheless exist in this universe. It's still exactly the same sphere. In any case, you only answered the first question. Why must something exist forever for it to matter morally? It's pretty integral to any debate about what exactly counts as "real" for this purpose.
Fundamentally this is a discussion about preferences. My point is that having preferences unconnected to your own everyday life doesn't promote survival or other outcomes you may want that are connected to your everyday life. In the long term the people we interact with will be the ones that win their everyday life, and in the shorter term the people who have power to do things will be the ones that win their everyday life. To the extent you get to choose your preferences, you get to choose whether you'll be relevant to the game or not. To answer your question, if something stops existing, it stops pertaining to anybody's everyday life. But fundamentally this conversation is broken. I really don't care much about whether you like my preferences or whether you like me. Human preferences generally do not have guiding principles behind them, so asking me needling questions trying to find contradictions in my guiding principles is pointless. If, on the other hand, you proposed a different set of preferences, I might like them and consider adopting them. As you can tell, I don't much like preferences that can be gamed to motivate people to exterminate their own species.
I thought this post was an attempt to argue for your set of preferences. If not, what is it?
It was an attempt to answer the question you asked and to indicate a potentially useful thing to talk about instead.
To a degree, arguing about extortion is arguing about definitions. In the context of the heuristic "don't give in to extortion", we would like to know exactly what the heuristic shouldn't give in to, though, and why. In my opinion, the main problem is that the extortionist is making a no-downside trade: the thing it is trading is "not torturing simulated humans" or "not killing hostages" or whatever, which probably wasn't worth anything to the extortionist anyway. A lot of no-downside trades are obviously unfair, so a useful heuristic is not to agree to no-downside trades in general. In fact, extremely unfair trades in general are metaphorically labeled "extortion" (for instance, I'm sure I've heard the term applied to the price of a diamond ring). We can see cases besides straightforward extortion where people apply the no-downside heuristic. For instance, buying music from iTunes is a no-downside trade for iTunes at first glance: iTunes doesn't lose anything and gains 99 cents. In fact, iTunes has already spent money buying the rights to the music in expectation you'll download it, so this is something of an acausal trade: much like Omega, iTunes is very good at predicting what people will want, and if enough people aren't going to download a track, iTunes won't offer it. Acausal trades are counterintuitive, though, so it makes sense that some people are repelled by this offer and torrent the music instead.
How is offering to supply ice cream characterized as "extortion"? In any case, I was not using the scenario as a reductio against universal unreciprocated altruism. That notion fails under its own weight, due to complete absence of support.
Wei Dai (13y):
Sorry, I misread your comment and thought it was an extortion scenario similar to the OP. Now that I've read it more carefully, it's not clear to me that we shouldn't give up the Niobium in order to provide those human workers with ice cream. (ETA: why did you characterize those humans as indentured workers? It would have worked as well if they were just ordinary salaried workers.) Altruists certainly claim to have support for their stated preferences. Or one could argue that preferences don't need to have support. What kind of support do you have for liking ice cream, for example?
Your reading wasn't far off: "in all of these thought experiments" makes your reply remain relevant.
True enough. My main objection to calling my ice cream negotiating tactic 'extortion' is that I really don't like the "just say 'No' to extortion" heuristic. I see no way of definitionally distinguishing extortion from other, less objectionable negotiating stances. Nash's 1953 cooperative game theory model suggests that it is rational to yield to credible threats. I.e. saying 'no' to extortion doesn't win! An AI that begins with the "just say no" heuristic will self-modify to one that dispenses with that heuristic.
Well you don't want to signal that you give in to extortion. That would just increase the chances of people attempting extortion against you. Better to signal that you are on a vendetta to stamp out extortion - at your personal expense!!!
There is an idea, surprisingly prevalent on a rationality website, that costless signaling is an effective way to influence the behavior of rational agents. Or in other words, that it is rational to take signalling at face value. I personally doubt that this idea is correct. In any case, I reiterate that I suggest yielding only to credible threats. My own announcements do not change the credibility of any threats available to agents seeking to exploit me.
Perhaps what is really being expressed is the belief that social costs are real, and that mere pseudonymous posting has costs.
Maybe if you provided examples of people seeming to say that "costless signaling is an effective way to influence the behavior of rational agents," we could ask them what they meant, and they might say something like "no signaling is actually costless". Statements like "Someone going on record as having opinion X has given decent reason to suppose that person (believes he or she) actually holds opinion X" are interpretable as having either of the two meanings above. Since you didn't provide examples, I wasn't persuaded that you are describing people's ideas, and I suspect ambiguous statements like that are behind our clash of intuitions about what people think.
Ok. That makes some sense. Though I still don't have a clue as to why you mention "social costs" or "pseudonymous posting". So, for the example of people seeming to say that costless signaling is an effective way to influence the behavior of rational agents, I would direct you to the comment to which I was replying. Tim wrote: I interpreted that as advocating costless signaling as a way of influencing the behavior of would-be extortionists. My response to that advocacy: Announcing that I am on a vendetta is cheap talk, and influences no one. No rational agent will believe such self-serving puffery unless I actually experience a level of personal expense commensurate with what I hope to gain by convincing them. Which makes the signaling not just costly, but irrational.
You seem to be the only one talking about "costless signaling" here.
I think the hidden cost is that if the signaler is called on the bluff, the signaler will be shown to not be fully committed to his or her pronouncements (and it will be reasonable to infer a good deal more flexibility in them than that). Generally I think if someone has an intuition that a case of apparently costless signaling would be valuable, his or her intuition is usually correct, but the intellect hasn't found the cost of the signal yet. The intellect's claim that only signaling that has costs is valuable remains accurate, as you say. It seems like its irrationality would be contingent on some variables, so it would sometimes actually be rational costly signalling. Following through on a costly commitment clearly has costs, but why assume benefits to reputation aren't greater? If you say "I will be careful not to betray lessdazed so long as his costly seeking revenge would be worth it for his reputation," you run into the paradox that such cases might not exist, any more than "[t]he smallest positive integer not definable in under eleven words" exists (Berry's Paradox). So long as my actions are best interpretable as being of negative utility, they get a +3 stacking bonus to utility. Of course, I then run into the paradox: with the bonus, I no longer qualify for the bonus! A well-made RPG would state whether or not the bonus counts towards calculating whether one qualifies for it, but Azathoth is a blind idiot god, and for all its advanced graphics and immersive gameplay, RL is not a well-made RPG.
They influence the likelihood of threats being made in the first place - by influencing the attacker's expected payoffs. Especially if it appears as though you were being sincere. Your comment didn't look much like signalling. I mean, it doesn't seem terribly likely that someone would deliberately publicly signal that they are more likely than unnamed others to capitulate if threatened with an attempt at extortion. Credibly signalling resistance to extortion is non-trivial. Most compelling would be some kind of authenticated public track record of active resistance.
Wei Dai (13y):
I don't think anybody is suggesting building an explicit "just say 'No' to extortion" heuristic into an AI. (I agree we do not have a good definition of "extortion" so when I use the word I use it in an intuitive sense.) We're trying to find a general decision theory that naturally ends up saying no to extortion (when it makes sense to). Here's an argument that "saying 'no' to extortion doesn't win" can't be the full picture. Some people are more credibly resistant to extortion than others and as a result are less likely to be extorted. We want an AI that is credibly resistant to extortion, if such credibility is possible. Now if other players in the picture are intelligent enough, to the extent of being able to deduce our AI's decision algorithm, then isn't being "credibly resistant to extortion" the same as having a decision algorithm that actually says no to extortion? ETA: Of course the concept of "credibility" breaks down a bit when all agents are reasoning this way. Which is why the problem is still unsolved!
It does what? How so?
"Commit to just saying 'no' and proving that when just committing to just saying 'no' and proving that wins." Perhaps something like that.
That is pretty incoherent. If you are trying to come up with a general decision theory that wins and also says no to extortion, then you have overdetermined the problem (or will overdetermine it once you supply your definition). If you are predicting that a decision theory that wins will say no to extortion, then it is a rather pointless claim until you supply a definition. Perhaps what you really intend to do is to define 'extortion' as 'that which a winning decision theory says no to'. In which case, Nash has defined 'extortion' for you - as a threat which is not credible, in his technical sense. Why do you say the problem is still unsolved? What issues do you feel were not addressed by Nash in 1953? Where is the flaw in his argument? Part of the difficulty of discussing this here is that you have now started to use the word "credible" informally, when it also has a technical meaning in this context.
My objection to calling the ice cream negotiation tactic 'extortion' is it just totally isn't. It's an offer of a trade. Then it's a good thing we've made developments in our models in the last six decades!
Cute. But perhaps you should provide a link to what you think is the relevant development.
Well, the key concept underlying strong resistance to extortion is reputation management. Once you understand the long-term costs of becoming identified as a vulnerable "mark" by those in the criminal underground, giving in to extortion can start to look a lot less attractive.
Tim, we are completely talking past each other here. To restate my position: Nash in 1953 characterized rational 2-party bargaining with threats. Part of the solution was to make the quantitative distinction between 'non-credible' threats (which should be ignored because they cost the threatener so much to carry out that he would be irrational to do so), and 'credible' threats - threats which a threatener might rationally commit to carry out. Since Nash is modeling the rationality of both parties here, it is irrational to resist a credible threat - in fact, to promise to do so is to make a non-credible threat yourself. Hence, in Nash's model, costless signaling is pointless if both players are assumed to be rational. Such signaling does not change the dividing line between threats that are credible, and rationally should succeed, and those which are non-credible and should fail. As for the 'costly signalling' that takes place when non-credible threats are resisted - that is already built into the model. And a consequence of the model is that it is a net loss to attempt to resist threats that are credible. All of this is made very clear in any good textbook on game theory. It would save us all a great deal of time if you keep your amateur political theorizing to yourself until you read those textbooks.
I am kinda surprised that you are in such a muddle about this - and are willing to patronise me over the issue! "Don't negotiate with terrorists" and "don't give in to extortion" are well-known maxims. As this thread illustrates, you don't seem to understand why they exist. I do understand. It isn't terribly complicated. I expect I can explain it to you. If a government gives in to terrorist demands during a hijacking, it sends a signal to all the other terrorists in the world that the government is vulnerable to extortion. Subsequently the government is likely to face more hijackings. So... in addition to the obvious cost associated with the immediate demands of the terrorists, there is a hidden cost associated with gaining a reputation for giving in to terrorists. That hidden cost is often huge. Thus the strategy of not giving in to terrorist demands - even if doing so looks attractive on the basis of a naive cost-benefit analysis. Other forms of extortion exhibit similar dynamics...
So if Thud cooperated with some less drastic version of Fred's plan that left a future to care about, he would be causing humans to get a reputation for giving in to extortion, even if the particular extortion he was faced with would not have been prevented by the aliens knowing he probably would not have given in. This is a different argument from the backward causality UDT seems to use in this situation, and AIXI could get it right by simulating the behavior of the next extortionist. Good idea. Thanks for posting.
To elaborate a bit: I'll give you utility if you give me utility is a trade. I won't cause you disutility if you give me utility is extortion.

"I'll give you utility if you give me utility is a trade."

"I won't cause you disutility if you give me utility is extortion."

I don't think that's exactly the right distinction. Let's say you go to your neighbour because he's being noisy.

Scenario A: He says "I didn't mean to disturb you, I just love my music loud. But give me 10 dollars, and sure, I'll turn the volume down." I'd call that a trade, though it's still about him not giving you disutility.

Scenario B: He says "Yeah, I do that on purpose, so that I can make people pay me to turn the volume down. It'll be 10 bucks." I'd call that extortion.

The difference isn't between the results of the offer if you accept or reject -- the outcomes and their utility for you are the same in each (loud music, or silence - 10 dollars).

The difference is that in Scenario B, you wish the other person had never decided to make this offer. It's not the utility of your options that are to be compared with each other, but the utility of the timeline where the trade can be made vs the utility of the timeline where the trade can't be made...

In the Trade scenarios, if you can't make a trade with the person, he's still being noisy, and utility minimizes. In the Extortion scenarios, if you can't make a trade with the person, he has no reason to be noisy, and utility maximizes.

I'll probably let someone else transform the above description into equations containing utility functions.
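Taking up that invitation with a minimal sketch: the utility numbers (10 for quiet, 0 for loud music, a cost of 1 standing in for the ten dollars) and the `outcome` helper are my own illustrative assumptions, not anything from the scenarios:

```python
# A toy model of the trade/extortion distinction in the noisy-neighbour
# scenarios above. All numbers are illustrative assumptions.
QUIET, LOUD = 10, 0   # your utility with silence vs. loud music
PRICE = 1             # utility cost of paying the 10 dollars

def outcome(offer_exists: bool, accept: bool, noisy_by_default: bool) -> int:
    """Your utility, given whether the neighbour's offer can be made,
    whether you accept it, and whether he is noisy absent any deal."""
    if offer_exists and accept:
        return QUIET - PRICE
    return LOUD if noisy_by_default else QUIET

# Scenario A (trade): he loves loud music anyway; no offer means loud music.
trade_no_offer = outcome(False, False, noisy_by_default=True)
trade_accept = outcome(True, True, noisy_by_default=True)

# Scenario B (extortion): he is noisy only to sell silence; in the timeline
# where the offer can never be made, he has no reason to be noisy at all.
extort_no_offer = outcome(False, False, noisy_by_default=False)
extort_accept = outcome(True, True, noisy_by_default=False)

# The accept/reject payoffs are identical across scenarios; what differs
# is the utility of the timeline where the offer cannot be made.
assert trade_accept == extort_accept
assert extort_no_offer > extort_accept > trade_no_offer
```

The two assertions capture the comment's point: comparing options within an offer cannot distinguish the cases; comparing the offer-possible timeline against the offer-impossible timeline can.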

Yeah, I was being sloppy. The more important part for extortion is that they threaten to go out of their way to cause you harm. Schelling points and default states are probably relevant for the distinction. You can't read a payoff table and declare it extortion or trade.
Meh. I hope we can define extortion much more simply than that. How about "Extortion: Any offer of trade (t) by A to B, where A knows that the likely utility of B would be maximized if A had in advance treated (t) as certainly rejected." In short, extortion is any offer to you which you could rationally wish you had clearly precommitted to reject (and signalled such precommitment effectively), with A knowing that.
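That definition might be rendered symbolically as follows; the notation (t for the offer, U_B for B's expected utility) is my own sketch, not anything canonical:

```latex
% A sketch of the proposed definition in symbols (notation mine):
% t is A's offer to B; U_B is B's expected utility.
\[
\mathrm{Extortion}(t) \iff
\mathbb{E}\bigl[U_B \mid A \text{ treats } t \text{ as certainly rejected}\bigr]
>
\max\Bigl(\mathbb{E}\bigl[U_B \mid B \text{ accepts } t\bigr],\;
          \mathbb{E}\bigl[U_B \mid B \text{ rejects } t\bigr]\Bigr),
\]
\[
\text{with } A \text{ knowing this when making the offer.}
\]
```

The left-hand side is the "precommitted rejection" timeline; the right-hand side covers B's best option once the offer exists.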
Another example. A and B share an apartment, and so far A has been doing all that household chores even though both A and B care almost equally about a clean house. (Maybe A cares slightly more, so that A's cleanliness threshold is always reached slightly before B's threshold, so that A ends up doing the chore every time.) So one day A gives B an ultimatum: if they do not share household chores equally, A will simply go on strike. B realizes, too late, that B should have effectively and convincingly pre-committed earlier to never doing household chores, since this way A would never be tempted to offer the ultimatum. A is aware of all this and breathes a sigh of relief that he made his ultimatum before B made that pre-commitment. By the above definition, A is an extortionist.
I'm almost convinced my definition is faulty, but not completely yet. In this case, if the offer was sure to be rejected, Alice (A) may move out, or evict Bob (B), or react in a different way that minimizes Bob's utility, or Alice may just decide to stop chores anyway because she prefers a messy but just household to a clean but unjust one. So precommitment to reject the offer doesn't necessarily help Bob. But I need to think about this. Upvoting both examples.
B is threatening to kill his hostage unless a million dollars is deposited in B's offshore account and B safely arrives outside of legal jurisdiction. A tells B that if B kills the hostage then A will kill B, but if B lets the hostage go then, in trade, A will not kill B. B realizes, too late, that B should have set things up so that the hostage would automatically be killed if B didn't get what he wanted even if B got cold feet late in the game (this could be done by employing a third party whose professional reputation rests on doing as he is initially instructed regardless of later instructions). This would have greatly strengthened B's bargaining position. A is aware of all this and breathes a sigh of relief that B did not have sufficient foresight. Is A an extortionist? He is by the above definition.
A's actions read like textbook extortion to me, albeit for a good cause. About the only way I can think of to disqualify them would be to impose the requirement that extortion has to be aimed at procuring resources -- which might be consistent with its usual sense, but seems pretty tortured.
A is walking down the street minding their own business carrying a purse. B wants what's in the purse but is afraid that if B tries to snatch the purse, A might cause trouble for B (such as by scratching and kicking B and calling for help). It is implicit in this situation that if B does not bother A, then, in trade, A will not cause trouble for B. B realizes, too late, that B should have worn something really scary to signal to A that B was committed to being bad, very bad, so that neither kicking and scratching nor calling for help would be likely to be of any use to A. This would have strengthened B's bargaining position. A, not being an idiot, is aware of this as a general fact about people, including about B, and breathes a sigh of relief that there aren't any scary-looking people in sight. Is A an extortionist? Is A continually extorting good behavior from everyone around A, by being the sort of person who would kick and scratch and call for help if somebody tried to snatch A's purse, provided that the purse snatcher had not effectively signalled a pre-commitment to snatch the purse regardless of A's response? A is implicitly extending an offer to everyone: "don't try to take my purse and, in trade, I won't kick and scratch and call for help." A purse snatcher who effectively signals a pre-commitment to reject that offer (and thus to take the purse despite kicking and scratching and calling for help) places themselves in a stronger position in the implicit negotiation. This seems to follow all the rules of the offered definition of extortion, i.e.:
Hmm. Interesting edge case, but I think the fact the second extortion is retaliation aimed to disarm the first one with proportional retribution prevents our moral intuition from packaging it under the same label as "extortion". If A threatened to kill in retaliation B's mother, or B's child, or B's whole village -- then I don't think we would have trouble seeing both of them as extortionists. Or this scenario: Still, perhaps we can refine the definition further.
I offer a variant on the hostage negotiator here. In this variant, the hostage negotiator is replaced by somebody with a purse, and the hostage taker is replaced by a purse snatcher. As a point of comparison to the purse snatching scenario, consider the following toy-getting scenario: Whenever a certain parent takes a certain child shopping, the child throws a tantrum unless the child gets a toy. To map this to the purse snatching scenario (and to the other scenarios), the child is A and the parent is B. If the parent convincingly signals a pre-commitment not to get the child a toy, then the child will not bother throwing a tantrum, realizing that it would be futile. If the parent fails to convincingly signal such a pre-commitment, then the child may see an opportunity to get a toy by throwing a tantrum until he gets a toy. The child throwing the tantrum is in effect offering the parent the following trade: get me a toy, and I will stop throwing a tantrum. On future shopping trips, the child implicitly offers the parent the following trade: get me a toy, and I will refrain from throwing a tantrum. I would call the child an extortionist but I would not call the person with a purse an extortionist, and the main difference I see is that the child is using the threat of trouble to obtain something which was not already their right to have, while the person with the purse is using the threat of trouble to retain something which is their right to keep.
And what is the distinction between giving utility and not giving disutility? As consequentialists, I thought we were committed to the understanding that they are the same thing.
The distinction is that I can commit to not giving into extortion, and not also turn down possibly beneficial trades.
You seem to be assuming that committing to 'not giving in to extortion' will be effective in preventing rational threats from being made and carried out. Why do you assume that? Or, if you are not making that assumption, then how can you claim that you are not also turning down possibly beneficial trades?
Because then you don't get a reputation in the criminal underground for being vulnerable to extortion - and so don't face a circling pack of extortionists, each eager for a piece of you.
Well, a simple way would be to use the legal definition of extortion. That should at least help prevent house fires, kidnapping, broken windows and violence. ...but a better definition should not be too difficult - for instance: the set of "offers" which you would rather not be presented with.
None at all. But then I don't claim that it is a universal moral imperative that will be revealed to be 'my own imperative' once my brain is scanned, the results of the scan are extrapolated, and the results are weighted in accordance with how "muddled" my preferences are judged to be.
Wei Dai (13y):
I see, so you're saying that universal unreciprocated altruism fails as a universal moral imperative, not necessarily as a morality that some people might have. Given that you used the word "crazy" earlier I thought you were claiming that nobody should have that morality.
I think it is easily possible to imagine naturalists describing some kinds of maladaptive behaviour as being "crazy". The implication would be that the behaviour was being caused by some kind of psychological problem interfering with their brain's normal operation.
I do claim that. In two flavors:

1. Someone operating under that moral maxim will tend to dispense with that maxim as they approach reflective equilibrium.
2. Someone operating under that 'moral' maxim is acting immorally - this operationally means that good people should (i.e. are under a moral obligation to) shun such a moral idiot and make no agreements with him (since he proclaims that he cannot be trusted to keep his commitments).

Part of the confusion between us is that you seem to want the word 'morality' to encompass all preferences - whether a preference for chocolate over vanilla, or a preference for telling the truth over lying, or a preference for altruism over selfishness. It is the primary business of metaethics to make the distinction between moral opinions (i.e. opinions about moral issues) and mere personal preferences.
Wei Dai (2 points, 13y)
No, I don't want that. In fact I do not currently have a metaethical position beyond finding all existing metaethical theories (that I'm aware of) to be inadequate. In my earlier comment I offered two possible lines of defense for altruism, because I didn't know which metaethics you prefer. In your reply to that comment you chose to respond to only the second sentence, hence the "confusion". Anyway, why don't you make a post detailing your metaethics, as well as your arguments against "universal unreciprocated altruism"? It's not clear to me what you're trying to accomplish by calling people who believe such things (many of whom are very smart and have already seriously reflected on these issues) "crazy" without backing up your claims.
I'm not sure why you think I have called anyone crazy. What I said above is that a particular moral notion is crazy. Perhaps you instead meant to complain that (in the grandparent) I had referred to the persons in question as "moral idiots". I'm afraid I must plead guilty to that bit of hyperbole.

I am gradually coming to think that there is little agreement here as to what the word metaethics even means. My current understanding is that metaethics is what you do to prepare the linguistic ground so that people operating under different ethical theories and doctrines can talk to each other. Metaethics strives to be neutral and non-normative. There are no metaethical facts about the world - only definitions that permit discourse and disputation about the facts.

Given this interpretation of "metaethics", it would seem that what you mean to suggest is that I make a post detailing my normative ethics, which would include an argument against "universal unreciprocated altruism" (which I take to be a competing theory of normative ethics).

Luke and/or Eliezer and/or any trained philosopher here: I would appreciate feedback as to whether I finally have the correct understanding of the scope and purpose of metaethics.
Wei Dai (0 points, 13y)
I thought you might have certain metaethical views, which might be important for understanding your normative ethics. But yes, I'm mainly interested in hearing about your normative ethics.
Hidden assumptions play a role similar to the auxiliary hypotheses which undermine naive Popperianism. The fallacy of ignoring auxiliary assumptions seems like a special case of the fallacy of presenting an argument from ignorance.

No, I think the central "problem" is that having preferences that others can thwart with little effort is risky because it makes you more vulnerable to extortion.

For example, if you have a preference against non-prime heaps of pebbles existing, the aliens can try to extort you by building huge numbers of non-prime heaps on their home planet and sending you pictures of them, and therefore, the argument goes, it's crazy and stupid to care about non-prime heaps.
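A quick sketch of why that threat is so cheap to carry out: composite (non-prime) heap sizes are abundant, so the aliens can mass-produce "offensive" heaps at essentially no cost. The trial-division check below is just for illustration.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality check (fine for small heap sizes)."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

# Composite heap sizes are abundant: of the 99 heap sizes from 2 to 100,
# 74 are non-prime, so threat material is trivially cheap to generate.
composites = [n for n in range(2, 101) if not is_prime(n)]
print(len(composites))  # 74
```

The asymmetry is the point: producing a non-prime heap costs the aliens almost nothing, while the disutility to the pebble-carer is unbounded in the number of heaps.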

The argument also yields a heuristic that the farther away a thing is from you, the more stupid and crazy it is to care about it.

Right. What you are saying is related to the notion of "credible threats". If other agents can give you disutility with little disutility for themselves, then they have a credible threat against you. And unless you either change your utility function, or find a way of making it much more difficult and costly for them to harm you, the rational course is to give in to the extortion.

One way to make it costly for others to harm you is to join a large coalition which threatens massive retaliation against anyone practicing extortion against coalition members. But notice that if you join such a coalition, you must be willing to bear your share of the burden should such retaliation be necessary.

The alternative I suggested in the grandparent was to change your utility function so as to make you less vulnerable - only care about things you have control over. Unfortunately, this is advice that may be impossible to carry out. Preferences, as several commentators here have pointed out, tend to be incorrigible.
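The credible-threat logic above can be sketched as a toy expected-utility comparison. All payoff numbers here are made up purely for illustration; nothing in the thread specifies them.

```python
# Toy model of a credible threat, with made-up payoffs.
cost_to_threaten = 1    # extortionist's cost of carrying out the threat
harm_to_victim = 100    # victim's disutility if the threat is executed
demand = 10             # value of what the extortionist asks for

# The threat is "credible" when executing it costs the extortionist
# less than the demand is worth to them.
threat_is_credible = cost_to_threaten < demand

# A lone victim facing a credible threat prefers paying to being harmed.
payoff_give_in = -demand
payoff_resist = -harm_to_victim
print(threat_is_credible, payoff_give_in > payoff_resist)  # True True

# Joining a retaliating coalition changes the extortionist's calculus:
# if retaliation costs more than the demand gains, the threat is no
# longer worth making in the first place.
retaliation_cost = 50
print(demand - cost_to_threaten - retaliation_cost < 0)  # True
```

This is why the coalition option works by changing the extortionist's incentives rather than the victim's preferences.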
I took the obvious solution to that difficulty. I self modified to an agent that behaves exactly as if he had self modified to be an agent with preferences that make him less vulnerable. This is a coherent configuration for my atoms to be in terms of physics and is also one that benefits me.
Your variation is better than mine! Not sure about your solution though, it looks a little hurried.
However it is a different problem. An interesting problem in its own right but one for which many people's coherent preferences will produce a different answer for slightly different reasons.
Well, I see the central problem in the notion that we should care about something that happens to other people if we're not the ones doing it to them. Clearly, the aliens are sentient; they are morally responsible for what happens to these humans. While we certainly should pursue possible avenues to end the suffering, we shouldn't act as if we were the ones inflicting it.
Interesting. Though in the scenario I suggested there is no suffering. Only an opportunity to deploy pleasure (ice cream). I'm curious as to your reasons why you hold the aliens morally responsible for the human clones - I can imagine several reasons, but wonder what yours are. Also, I am curious as to whether you think that the existence of someone with greater moral responsibility than our own acts to decrease or eliminate the small amount of moral responsibility that we Earthlings have in this case.
Why would I not hold them responsible? They are the ones who are trying to make us responsible by giving us an opportunity to act, but their opportunities are much more direct - after all, they created the situation that exerts the pressure on us.

This line of thought is mainly meant to be argued in Fred's terms, who has a problem with feeling responsible for this suffering (or non-pleasure) - it offers him a way out of the conundrum without relinquishing his compassion for humanity (i.e. I feel the ending as written is illogical, and I certainly think "Michael" is acting very unprofessionally for a psychoanalyst). ["Relinquish the compassion" is also the conclusion you seem to have drawn, thus my response here.]

Of course, the alien strategy might not be directed at our sense of responsibility, but at some sort of game-theoretic utility function that proposes the greater good for the greater number - these utility functions are always somewhat arbitrary (most of them on lesswrong center around money, with no indication why money should be valuable), and the arbitrariness in this case consists in including the alien simulations, but not the aliens themselves. If the aliens are "rational agents", then not rewarding their behaviour will make them stop it if it has a cost, while rewarding it will make them continue. (Haven't you ever wondered how many non-rational entities are trying to pose conundrums to rational agents on here? ;)

I don't have a theory of quantifiable responsibility, and I don't have a definite answer for you. Let's just say there is only a limited amount of stuff we can do in the time that we have, so we have to make choices about what to do with our lives. I hope that Fred comes to feel that he can accomplish more with his life than to indirectly die for a tortured simulation that serves alien interests.

Doubtful. It's likely that losing 90% of earth's population would result in collapse of the world's cultural and economic institutions, including nation states, resulting in a very different test. Maybe if they had clone-farms and produced physical humans to torture offworld?

Which brings to mind we should expend effort on "rescue" attempts, such as hacking the little spheres or stopping the aliens through other means.

It's also likely that an advanced alien species with an interest in getting rid of humans would do it by creating some nasty nanotech or redirecting a few asteroids, rather than going through this elaborate and unreliable blackmail scenario. It's not a dilemma if you parse out all the practical considerations, so don't.
Suspension of disbelief is important for what-if scenarios. I find little black spheres to be weirdly possible given a particular world-state ("crazy aliens"); thinking of 90% of humans being kidnapped evokes a very different possibility-view of the result. I can't stop thinking about post-apocalypse movies in the latter case.
But that would require actually doing something. Sending in a bunch of otherwise inert computational nodes is much cheaper. Hell, they don't even need to send them in. If they can convince people remotely that they're torturing simulations, they can commit genocide by telecommuting, assuming people with Fred's preferences are in power.
He's saying that Fred would end up with essentially the same dilemma, not that the aliens should have done it that way.

Thud yawned. "Fred, you're fired. Get out of my house." As Fred left, Thud closed his curtains and tried to get back to sleep.

I loved this line. It gave me the same warm glow inside that I get when I read about a guy heroically saving the entire world - and Thud pulls it off just by going back to sleep. That's a whole new level above Chuck Norris. Even a level beyond The Doctor. Tennant went to all the trouble of challenging the alien leader to a duel to the death when the Sycorax pulled this stunt!

Would it make a difference if instead of simulation, they had gotten human dna and were speed-growing clones to torture?

It shouldn't, IMO. The reasonable job description for Thud should be some combination of protecting a particular set of people and a particular piece of land, and that doesn't cover rescuing synthesized humans.
If that is the extent of his job description, should he stop all humanitarian aid to foreign nations where it has no direct benefit to his own set of people? And would that allow him to eliminate the rest of humanity for some marginal benefit to his countrymen?
Along these lines, why do each of us individually do pretty much nothing, or at best something pretty minimal, to help the millions of people in the real world living in poverty or dying from preventable diseases? It seems to me our empathy has only a "limited range", something like the "monkeysphere" effect I suppose, whereby we only really care about those closest to us. We have some abstract empathy for the less fortunate, but not enough to really do much about it. I can imagine also that empathy asymptotes to almost zero at the emotional distance of simulated people (and that we can only think about large numbers of people logarithmically, so that there is hardly any difference between millions and trillions of people to the empathy portion of our utility function). As the scenario demonstrates, it seems like this behaviour has great survival value.

To answer your first question: the real world demonstrates the answer. The amount of foreign aid given to developing countries is abysmally small compared to the value we should place on the lives of those living in disease and poverty, if we indeed cared about them anything like as much as we care about those who affect our lives more directly. The current amount of foreign aid is more closely proportional to its UN and local PR value. It is politically, not morally, motivated.

For the second question, we are generally more averse to direct negative intervention than to a lack of positive intervention. Consider the runaway train that is about to kill a group of schoolchildren, where a big fat man either a) happens to fall into the path of the train (stopping it), but you could intervene and save him - though this would result in the deaths of the schoolchildren; or b) you are in a position where you could push the big fat man in front of the train, thus saving the children but killing the fat man. Most people would not save the fat man, but they would not push him in front of the train either.
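The "logarithmic" point above can be made concrete with a quick sketch. If felt concern scales roughly with the logarithm of the number of people affected, then a million-fold increase in victims barely moves the needle. The scaling assumption is the comment's own speculation, not an established result.

```python
import math

# If empathy scales roughly with log(number of people), the felt
# difference between millions and trillions of victims is tiny
# compared to the raw ratio. Numbers here are purely illustrative.
millions = 1_000_000
trillions = 1_000_000_000_000

raw_ratio = trillions / millions                            # 1,000,000x more people
felt_ratio = math.log10(trillions) / math.log10(millions)   # only 2x the "felt" size

print(raw_ratio, felt_ratio)
```

On this model, a trillion simulated victims feels only about twice as bad as a million, which is exactly the insensitivity the comment describes.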
"And would that allow him to eliminate the rest of humanity for some marginal benefit to his countrymen?"

That doesn't sound too far from the principle which many militaries throughout history and in the present follow, including what some very large factions within American politics want the military to follow. For example, 50,000 civilians were killed in the war in Afghanistan (and that estimate is a few years old); if the Afghanistan war is justified because of 9/11, that gives a value of at least 16 Afghans per American. If we assume that the war wasn't just barely justified - that nobody would switch their opinion due to it being 60,000 rather than 50,000 casualties - a ratio of 20 Afghans to 1 American, or probably higher, seems quite reasonable. Extrapolating from that, we can see that killing 6-7 billion Afghans to save ~300 million Americans would be quite reasonable. (If we count based on the Iraq War, it would be to save 30 million Americans.)

The actual utilitarian preventative calculations are complicated here, namely because the war in Afghanistan most likely made further terrorist attacks more likely rather than less, so I'm just assuming revenge is our utility function. I'm also not counting the American soldiers lost in Afghanistan.

Now, to clarify, if the war in Afghanistan is justified because even more Afghans would have died under Taliban rule otherwise, then that wouldn't apply. But it seems to me that the main argument presented in favor of the war, even by liberals who support it, is about "preventing terrorism" rather than humanitarian interests, which seems to be widely appealing to the vast majority of the American public. Obviously that's to actually avenge or prevent the deaths of a large portion of the country, not "some marginal benefit". Still, I don't think it's too far off to say that quite a few people do hold that view and would not be at all inconsistent, in that regard, in wanting General Thud to control their armed forces.
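The arithmetic behind the implied-exchange-rate argument above checks out. The 50,000 figure comes from the comment itself; the ~3,000 figure for 9/11 deaths is the commonly cited estimate and is an assumption of this sketch.

```python
# Rough arithmetic behind the comment's implied-exchange-rate argument.
afghan_civilian_deaths = 50_000   # figure from the comment
sept_11_deaths = 3_000            # commonly cited estimate, assumed here

implied_ratio = afghan_civilian_deaths / sept_11_deaths
print(implied_ratio)  # ~16.7, i.e. "at least 16 Afghans per American"

# If the war would still be judged justified at a 20:1 ratio, then
# extrapolating to the whole US population of ~300 million:
assumed_ratio = 20
us_population = 300_000_000
print(assumed_ratio * us_population)  # 6,000,000,000 - the "6-7 billion"
```

The extrapolation step is of course the contentious one; the code only confirms that the numbers in the comment are internally consistent.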
You failed to consider a combination. For instance, the main objective of the war is preventing terrorism, but the fact that the Taliban would have killed people if left in power changes it from "lots of lives lost to prevent terrorism" to "few lives lost to prevent terrorism". Just because preventing the Taliban from killing wasn't our main goal doesn't mean that it can't affect the balance in favor of the goal that we did have.

Stuart Armstrong posted a similar scenario earlier. A lot of the discussion there is relevant here.

Some of the responses on that thread are hilarious, when did LW become so serious?

(I guess the other thing to do is to look at other similarly old threads and see if it has equally irreverent comments)

This story made me smile because it was an enjoyable, well-written piece of fiction :)


Don't give in to blackmail, duh.

This would be more interesting if they were offering some sort of simulated reward instead. For sufficiently large, otherwise unreachable rewards giving in might even be the correct answer.

What's "blackmail", especially given that these are aliens, and there is no understanding of their motive?
Understanding is not zero. If someone expends resources in a way that has a good chance of influencing a person similar to you to do X, you can conclude that there is a good chance they want you to do X. If this expenditure also harms you, then it may make sense to not do X.
I already answered in that thread.

If the aliens care so much about us building death camps and marching our citizens into those, why don't they simulate that directly instead of simulating torture in an indirect attempt to achieve the primary goal?

They want real estate. That is, real real estate, not simulated real estate. If they persuade us to kill ourselves they can occupy the solar system without any opposition.
What's so much better about the "real" substrate, as opposed to the simulated one, that makes such an attempt worth investing the massive resources it must take to build all these computers - and commit to using them for torture?

The "trade" that the aliens are proposing seems to be as follows: you have something we want, namely a rather inefficient world capable of supporting ten billion. We have a substrate capable of convincingly simulating trillions of people. Give us your inefficient world or we will devote our efficient substrate to making many copies of your folk miserable.

This trade has strange payouts. The "no deal" case (putatively) creates massive loss for us, but also for the aliens - they lose the benefit of using the substrate for their own purposes AND they don't gain our world. The "deal" case disposes of all humanity and gains the aliens a tiny slice of elbow room.

If you do care about what happens to simulations, a better trade would be to let us use the substrate to support a population in the trillions, in exchange for the use of our smaller world. Me, I'd rather be alive in a simulation (barring torture) than dead in the real world.
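The strange payout structure can be laid out as a toy payoff table. All utility numbers below are invented just to exhibit the point: both sides do better under the counter-offer than under either branch of the original ultimatum.

```python
# Toy payoff table for the aliens' proposed "trade", with made-up
# utilities as (humans_utility, aliens_utility) per outcome.
outcomes = {
    "no_deal": (-1000, -50),   # mass simulated torture; aliens waste their substrate
    "deal":    (-10000, 10),   # humanity gone; aliens gain a sliver of elbow room
    "counter": (5000, 100),    # humans move into the substrate; aliens get Earth
}

# Each side's best outcome is the counter-offer, making the original
# ultimatum Pareto-dominated (under these assumed numbers).
best_for_humans = max(outcomes, key=lambda k: outcomes[k][0])
best_for_aliens = max(outcomes, key=lambda k: outcomes[k][1])
print(best_for_humans, best_for_aliens)  # counter counter
```

That the threatened outcome is bad for the threatener too is exactly what makes the ultimatum's credibility questionable.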

What happens if we surrender? Do they start treating the tortured simulations well? What kind of torture do they subject us to?

I assumed it would be obvious that the non-simulated humans would be killed if they were marched into the death camps, although I didn't actually say so. The torture is worse than death, so the simplest detail that makes the story work is that the aliens delete the simulated humans in that case.
The terms of the ultimatum weren't clear from the story. These are aliens, but we have to fill in the blanks using anthropomorphic intuitions.

I think we have to take this as motivation for war with the aliens, rather than giving into or ignoring their demands. Assume that the torturers try this sort of thing with more than one set of victims. If any of the victims successfully defeats the torturers, that spares the would-be subsequent victims from the dilemma. Assume that there are multiple species of aliens that would try this torture technique, and some will concoct more persuasive versions of their threat. If none are stopped, we receive a series of alien torture threats that continues un...

Why exactly is Fred changing his preferences? If he's trying to make it so his future preferences are fulfilled, he should change them to preferring the universe to be exactly the way it is. If he's trying to make it so his current preferences are fulfilled, he probably shouldn't change them.

And what's he changing them to? Don't care about cheaply made people?

As he said, his concerns about the simulated people interfere with his sleep. More abstractly, he has preferences that attract his attention even though he cannot act upon them, so he is thinking about getting rid of them.
He can try to raise awareness. He can try to do something to keep the remaining spheres from breaking. Also, I got the impression that you were implying we should change our preferences like that. I haven't lost any sleep to this issue, so it wouldn't apply to me.
It seems like people often change their preferences in the real world. Do you think these instances fit your model, or do you disagree that it occurs (for example, are we using different definitions of 'preference')?

This has been said below but in two parts after a fashion.

Thud is in a position of protecting the people of earth. Thus it is his job to do whatever is best for that group.

If Thud makes a policy of giving in to simulated torture, it seems likely (although our experience with inhuman intelligence is a limiting factor here) that aliens or whatnot would be more likely to simulate and torture people.

So someone in Thud's position has a specific obligation to stop people like Fred from making the universe see earth as a pushover planet that will give up their resources to simulators.

What about the reverse, then? The aliens promise that, if we give them all our stuff, they will simulate a bunch of super happy copies of us. Do we give them stuff?
If it were up to Thud, he'd say "no", since he doesn't care about simulations. However, he's a soldier, not a politician, so that decision wouldn't be part of his job. In retrospect, perhaps I should have done the story with that premise and "President Thud" instead. It would have allowed me to show an example about inappropriate care about simulations, but without the confusing issue of how to deal with extortion situations.
already suggested that. Depends on how many super happy copies, how super happy, whether we expect to be able to do that ourselves, and possibly whose copies and how representative/diverse they are. Depending on those answers possibly yes.
This. Basically this becomes an economic transaction; we are buying simulations. If it's a good price, buy them. If not, don't.

Fred is already a simulation trapped in a small sphere and just doesn’t know it.

Why would we be willing to believe that they will stop the torture if we killed ourselves? Further, if we killed ourselves why wouldn't they delete or kill the people being tortured?

They gave us diagrams of themselves, if I understand correctly, and from the story we have the understanding to tell that they are telling the truth. What is to stop us from building massive supercomputers that can simulate some way of torturing trillions of them unless they stop simulating the torture of us? By communicating this possibility to the aliens and letting them know that if...

Destructive analysis of a few of the small spheres. Ah, but the torture is worse than death! Deletion would be an improvement. If I were going to continue the story in that vein, I'd give the aliens the same morality as Thud, and have them make an uninformed guess that someone like Fred might have power.

These aliens seem temptingly fragile. They would make perfect targets to aim at when cane toad whacking. There is something appealing about treating that kind of threat with complete and utter contempt. Take them out with projectile vermin!

Since there are only a couple of thousand aliens around, using such an inaccurate cane-toad launching mechanism will just serve to make the fun last longer. But if more of the aliens start floating around, upgrading the targeting system could become viable.