
Here are some variations on the original red-pill-blue-pill question, recently discussed here and here.

In all cases, everyone makes their decision in ignorance of everyone else's.

For the first four, who dies and who lives is the same function of the actions. That is, the problems are the same in every way that matters for the purpose of doing the best thing. The others are not isomorphic to these, nor to each other, but I think they do live in the same general area of the landscape of decision problems.
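To pin down that claim, here is a minimal sketch of the shared function (a rendering of my own, not part of the problems as stated below; the exact boundary, strictly more than 50% versus at least 50%, varies slightly between the tellings):

```python
def outcome(choices, threshold=0.5):
    """Who lives, as a function of everyone's choices.

    choices: a list of "blue" / "red" (equivalently: enter / stay out).
    Returns a parallel list of True (lives) / False (dies).
    """
    blue_fraction = choices.count("blue") / len(choices)
    blues_live = blue_fraction > threshold   # "more than 50% choose blue"
    return [True if c == "red" else blues_live for c in choices]
```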

They are presented here without comment. I have already written quite enough of my own views on some of these on the other threads. My only purpose here is to collect together these variations in a single place.

1. The original.

If more than 50% of people choose the blue pill, everyone lives.

If not, red pills live and blue pills die.

Blue or red?

2. arabaga's reformulation.

If you choose the red pill, you live.

If you choose the blue pill, you die unless more than 50% choose the blue pill.

Blue or red?

3. Roko's version.

There is a room-sized blender that kills everyone who steps into it.

But, if 50% or more of people step into the blender, there will be too much resistance and it will fail to start and everyone who steps in will be fine.

Those who don't step into it are always fine.

Enter or don't enter?

4. The monster on the mountain.

There's a monster at the top of that mountain. It doesn't hunt people, but it will eat anyone who walks into its mouth. Unless enough people walk in at once.

Enter or don't enter?

5. Kickstarter pills.

Those who take the red pill live.

Those who take the blue pill live, if the target number of blue-pillers is reached. Otherwise, all the blue-pillers can regurgitate their pills and live.

Blue or red?

6. Omega hands out the pills.

Omega appears to the whole world and infallibly proclaims, "If, for the next 24 hours, no-one commits any sin, then paradise will arrive."

(The small print: Omega is available to answer whatever questions you have about what constitutes "sin", but there will be few surprises for anyone familiar with the "natural law". No-one will be told to torture kittens.)

Choice 1: Commit no sin for the next 24 hours.

Choice 2: Ignore Omega's challenge.

1 or 2?

Note: It was once a widespread superstition of Christendom, back when Christendom was a thing, that if for a whole day no-one sinned, the Second Coming would arrive. It is also a widespread superstition of Christendom that no-one can refrain from sin for a single minute, let alone a whole day.

7. Havel's greengrocer is offered a pill.

The Party has decreed that every business must display a patriotic poster declaring allegiance to the Party. Those who refuse will be severely punished. Unless at least half refuse, because then it will be impractical to punish everyone and the whole matter will be quietly forgotten. In any case there will be no consequences for those who display the poster.

Refuse or display?

8. Insanity Wolf hands out the pills.

Choice 1: The Cause is everything. The Cause is the only thing. Nothing else matters. In every waking hour, in every waking second, you must work for the Cause. While you're sleeping, the Cause is faltering! Resting? Revealed preference -- you don't care about the Cause -- you are Evil! No rest for the good! No choice for the good! You have to always be doing the very best thing you can all the time forever! It's a theorem! You can't argue with a theorem! O feet happily fettered on the path to salvation!

Result: Burn out in a year, accomplish nothing, sink into depression, commit suicide. Unless enough people join the Cause! Then paradise will be created before anyone burns out and everything will be wonderful for ever!

Choice 2: Fuck that shit.

Result: A life well-lived, which may include some personal contribution to the repair of the world. Unless all of those contributions aren't enough, and all die.

1 or 2?

9. Mountain rescue.

Every year, hill-walkers and mountaineers venture into remote places, places where the landscape itself is trying to kill you. Some of them get into difficulties and need to be rescued, always a hazardous job. The mountain rescue service is entirely run by volunteers with charitable funding.

Do you volunteer?

10. Burning Man

Every year, tens of thousands of people venture into a single remote place, a place where the landscape itself is trying to kill you.

How do you deal with that, in regard to yourself and to all the other Burners?


A mad scientist has kidnapped you and lots of other people and presents you with four buttons: a yellow button, a green button, a cyan button and a purple button. He forces you to press either the yellow or the green button (but not both), and to press either the cyan or the purple button (but not both).

The yellow button is labelled "KILL" and each time you press it, it prints a "THANKS!" sticker and a counter increases. The green button is labelled "PEACE" and does nothing. The next day, if the counter has reached at least half the number of people the mad scientist kidnapped, then the mad scientist grabs an axe and uses it to murder everyone who won't give him a "THANKS!" sticker.

The cyan button is labelled "SHRUG" and does nothing. The purple button is labelled "DEATH", and when you press it, it injects a remote-activated poison capsule. If more than 50% of people press the purple button, a signal goes out to activate the capsules, killing everyone who pressed it. Otherwise, the capsules will eventually be expelled by the body, and all the pressers survive.

Which do you pick?
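A sketch of the intended rules, for concreteness (this uses the "less than 50%" correction from the follow-up comment below):

```python
def button_outcome(yg, cp):
    """Who survives the mad scientist's game.

    yg: each player's "yellow"/"green" pick; cp: their "cyan"/"purple" pick.
    Returns a list of True (lives) / False (dies).
    """
    n = len(yg)
    axe_falls = yg.count("yellow") >= n / 2    # counter reaches half: greens die
    poison_fires = cp.count("purple") < n / 2  # too few purples: pressers die
    return [not ((yg[i] == "green" and axe_falls)
                 or (cp[i] == "purple" and poison_fires))
            for i in range(n)]
```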

Trying to distill the essence of your first one:

A mad scientist kidnaps everyone and takes a secret ballot, where you can either vote for 'RELEASE EVERYONE' or 'KILL SOME PEOPLE'. He accepts the majority decision, and if it's the latter, he kills everyone who voted for the former.

Wait, I guess it should say "If less than 50% of people press the purple button".

Let's see. First choice: yellow=red, green=blue. An illustration of how different framings make this problem sound very different; this framing is probably the best argument for blue I've seen lol

Second choice: There's no reason to press purple. You're putting yourself at risk, and if anyone else pressed purple you're putting them even more at risk.


Assuming you meant "less than 50%" in the second question, they're both isomorphic to the original pill problem, but the first choice would actually be harder to make.

With the first choice, there is somewhat of a moral dilemma - if you press the yellow button, you'll be safe, but you will potentially be responsible for the deaths of many other people.

The second choice is closer to the original pill question, and the only reason to press the "DEATH" button is to "help the people who also pressed it" - but if you don't press it, it's the killer's and those people's own actions that led to their deaths, so you feel much less responsible for them.

IMO (Green, Cyan) is the correct response (modulo issues of "do you actually trust the mad scientist?") and people who pick Yellow have too many game theory brainworms.


Green is an obvious choice when this is a hypothetical situation, but if an actual mad scientist kidnapped you and other people and presented you with the choice, it wouldn't be as easy. You'll still probably pick green, but the most probable outcome is that the majority of people will pick it, and you'll very likely feel guilt for the deaths of those who didn't.

I didn't say it would be a hard choice; I just said it would be harder. You'll actually think about it for at least some time, unlike the second choice, where the correct response is immediately obvious.

Once I have seen the isomorphism of some of these puzzles, I know that the correct decision is the same for all of them. If, seeing this and knowing this, I let myself be influenced by the framing, I am failing to act properly. Are my feelings more important than getting the best outcome? What do my feelings weigh against lives and deaths? Am I to consider other people's fates merely as psychotherapy for my feelings?

Once I have seen the isomorphism of some of these puzzles, I know that the correct decision is the same for all of them.

This argument is only valid if the game-theoretic payoff matrix captures all the decision-relevant information about the problems. And since real-world payoff depends on not just your decision but also on the other people's decisions, the other players' distribution of choices should also be considered decision-relevant information. But since the other players' distribution of choices depends on the framing rather than just the game-theoretic payoff matrix, we can infer that the game-theoretic payoff matrix does not capture all decision-relevant information.

Applying this general logic to my game: you are going to live if you pick Green, because most other people will also pick Green, so if you care about getting the best outcome, you will achieve this even if you do pick Green. On the other hand, encouraging people to pick Yellow is bad because if you partially but not fully succeed then that is going to lead to Greeners dying. But picking Cyan is fine because Purple is bad and stupid.

Ah, so it's other people's feelings that I must pander to? I pick red and scoff at the emotional blackmail.

I don't think I appealed to others' feelings. I appealed to others' lives.

It’s their misguided feelings that got them into that scrape.

I stick to don’t let them go there; if they’re hell-bent on it, leave them to it.

My inclination would be to say that Green is just correct because it's best to keep a wide margin against bad stuff happening, and Green helps protect against that, even if technically nothing bad would happen if everyone picked Yellow instead of making "mistakes".

However, even if you don't accept that logic, your take seems to be something like "if people make mistakes on counterintuitive game theory questions, then they're worthless and not worth saving". I think this is probably materially false (they're probably fairly economically productive, and they're probably less likely than average to create an AI that destroys the world, and they're probably trying to help their friends and family), and even if you're not interested in/convinced by the material argument, it still seems kind of psychopathic to reject their worth. But you do you I guess.

How do you get from "I will not risk my life to save these people" to "I think these people are worthless"? As I'd said several times in other comments, the way to deal with them is to keep them away from the suicide pills.

But you do you I guess.

You do like that phrase. Way to be smug, condescending, patronising, and passive-aggressive all at once! Well, "if yer conscience acquits ye", as one of my father's older relatives liked to say, with similar import.

I guess strictly speaking you're right that that position wasn't part of your comment here and instead I'm inferring it from your position in an earlier comment: https://www.lesswrong.com/posts/ZdEhEeg9qnxwFgPMf/a-short-calculation-about-a-twitter-poll?commentId=BKYssioxuxevLEWCy

I do say there that I will go so far as trying to dissuade them. But unless I'm in some personal relationship of care to them, I do not see what more I could reasonably do. I don't consider it reasonable to walk into the blender with them.

Since I consider the framing relevant, but you don't consider the framing relevant, I assume you wouldn't mind stopping bringing up the blender framing (where I agree with you that of course you should not enter the blender), and instead would be willing to focus solely on the yellow/green button framing (where we do have a disagreement)? I.e.

I do say there that I will go so far as trying to dissuade them. But unless I'm in some personal relationship of care to them, I do not see what more I could reasonably do. I don't consider it reasonable to press the "PEACE" button with them.

The "PEACE" button sounds so warm and fuzzy, how could anyone object to pushing a button called "PEACE"? But "PEACE" is just a suggestively named token.

This is a core part of rationality, looking past the names to consider what things are when we are not looking at them. If you insist that the framing in your head and other people's should override the reality, then I really do not know how to continue here. Reality does not know about your made-up rules. It will not do to say, I know that, but what about all the other benighted souls? Educate them, rather than pandering to their folly — a founding principle of LessWrong.

My underlying aim is to explain behavior in terms that would still apply if I were not present to observe and characterize it.

— William T. Powers

Maybe "PEACE" is a suggestively-named LISP token, but certainly "KILL" is not. The "KILL" button is hooked up to a counter which the mad scientist uses to determine whether to kill people. One could also make the "PEACE" button more correctly named by following bideup's suggestion of making it a majority vote rather than having the "PEACE" button do nothing.

But also, ignoring labels means that you can't even solve games such as the Schelling point game.

(And like, if you were to make the real-world decision, it's the game-theoretic payoff matrix that is an abstraction in your head, not the buttons you could press. They're real.

What this cashes out to is that strictly speaking, Yellow and Green are not the only options. You could also do stuff like attempting to flee or to punch the mad scientist or to trick the mad scientist into thinking you've pressed a button when you really have not. Of course this is probably a bad idea, because the mad scientist has kidnapped you and is capable of killing you, so you'll probably just die if you do these things, and therefore we remove them from the payoff matrix to simplify things. (Similarly, when reasoning about e.g. MAD, you don't include actions such as "launch the nukes onto yourself" because they are stupidly self-destructive.))

The "KILL" and "PEACE" buttons are both connected to the mad scientist's decision, so "KILL" is indeed just another suggestively-named token.

The game-theoretic payoff matrix is an objective description of how the game works. It is as objectively real as, say, the laws of chess, and within its fictional world, it is as real as the laws of physics. If you sit down to play chess with a different set of rules in your head, you will either attempt moves that will not be allowed, or never even think about some moves which are perfectly legal. If you try to do engineering with different laws of physics, at best you will get nowhere, and at worst tout crystal healing as a cure-all.

Yes, sometimes you do have to take into account what the other players are thinking. Pretty much all sufficiently complicated games are like that, even chess. The metagame, as in Magic: The Gathering, may develop its own complex, rich culture, without which you will fare poorly in a tournament. But for this game, I have said how I take the other players' ideas into account: prevent the clueless from playing.

But for this game, I have said how I take the other players' ideas into account: prevent the clueless from playing.

You don't decide who is playing, the mad scientist does, so this is not a valid action.

(Unless you mean something like, you try to argue with the mad scientist about who should be included? Or try to force the mad scientist to exclude people who are clueless?)

If you sit down to play chess with a different set of rules in your head, you will either attempt moves that will not be allowed, or never even think about some moves which are perfectly legal.

That's not necessarily true. If it's a casual game between casual players on a physical chessboard and e.g. your opponent goes to the bathroom, there's a good chance you could get away with cheating, especially if you focus on a part of the board that your opponent isn't paying attention to.

This is going to be harder when the players are better (because then they remember the boards better and recognize when a position is not plausible), when the games are more serious (because then people would catch the cheating), and when they're played on computers (because then the computers can enforce the rules) - but even then the game theory is still an abstraction that doesn't take e.g. computational limitations or outside tools such as anal vibrators into account.

If you try to do engineering with different laws of physics, at best you will get nowhere, and at worst tout crystal healing as a cure-all.

I was under the impression that small buildings are frequently built on the assumption of a flat earth, and never built on quantum gravity.

The "KILL" and "PEACE" buttons are both connected to the mad scientist's decision, so "KILL" is indeed just another suggestively-named token.

They're connected to the mad scientist's decision about whether to kill or to be peaceful. Hence, the names are not just suggestively-named LISP tokens, but instead a map that corresponds to the territory.

You don't decide who is playing, the mad scientist does, so this is not a valid action.

It’s a bit late to play the “Don’t question the hypothetical” card, given that a lot of the discussion, and not just between us, has been about variations on the original. Hypotheticals do not exist in a vacuum. In the space of hypotheticals it can be illuminating to explore the neighbourhood of the proposed problem, and in the world in which the hypothetical is proposed, there is usually an unstated agenda behind the design of the puzzle that should be part of the discourse around it.

Or to put that more pithily:

“I didn’t give you that option!”

“That’s right, you didn’t. I took it.”

I was under the impression that small buildings are frequently built on the assumption of a flat earth, and never built on quantum gravity.

Oh, come on! Good enough approximation for a building site — but not for LIGO.

They’re [the KILL and PEACE labels] connected to the mad scientist's decision about whether to kill or to be peaceful.

You have connected them in exact parallelism to your description of the mad scientist’s decision, but all that does is shift the bump in the carpet to that description, which now does not correspond to the actual rules of the problem as you stated them. The rules of the mad scientist’s decision are that if half or more press button K he kills those who didn’t, and if fewer than half do, he kills no-one. An equivalent way of describing it is that if half or fewer press button P he kills them, and if more than half do, he kills no-one. The idea that the PEACE button does nothing is wrong, because everyone is required to press one or the other. Pressing one has exactly the same consequences as not pressing the other.
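That equivalence is mechanical enough to check (a sketch, with hypothetical names, for n players each pressing exactly one of the two buttons):

```python
def greens_die_by_kill_count(k, n):
    # Original statement: half or more of n players press KILL => greens die.
    return k >= n / 2

def greens_die_by_peace_count(p, n):
    # Restated: half or fewer press PEACE => the PEACE-pressers die.
    return p <= n / 2

# Everyone presses exactly one button, so p = n - k, and the two
# descriptions pick out exactly the same outcomes:
for n in range(1, 50):
    for k in range(n + 1):
        assert greens_die_by_kill_count(k, n) == greens_die_by_peace_count(n - k, n)
```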

You are still deciding what to do on the basis of what things are called and how they are conceptualised. I anticipate that you will say that how other people conceptualise things is, from your point of view, an objective fact of substantial import that you have to deal with, and indeed sometimes it is, but that does not justify adopting their conceptualisations yourself, let alone imagining that reality will take any notice of the knots that you or anyone else ties their brains into.

I recall a story (but not its author or title) depicting a society where the inhabitants are divided into respectable people and outcasts, who are both socially forbidden (but not in any other way) to so much as acknowledge the others’ existence. Then space pirates invade who don’t care about these strange local rules.

BTW, I’m about to go away on holiday for a couple of weeks, so I may be reading and posting somewhat less frequently. That might come as welcome news :)

It’s a bit late to play the “Don’t question the hypothetical” card, given that a lot of the discussion, and not just between us, has been about variations on the original. Hypotheticals do not exist in a vacuum. In the space of hypotheticals it can be illuminating to explore the neighbourhood of the proposed problem, and in the world in which the hypothetical is proposed, there is usually an unstated agenda behind the design of the puzzle that should be part of the discourse around it.

Or to put that more pithily:

“I didn’t give you that option!”

“That’s right, you didn’t. I took it.”

I suggested some valid ways of fighting the hypothetical within my framing. If you want to take additional ways not compatible with the framing, feel free to suggest a different framing to use. We might just not disagree on the appropriate answer within that framing.

You have connected them in exact parallelism to your description of the mad scientist’s decision, but all that does is shift the bump in the carpet to that description, which now does not correspond to the actual rules of the problem as you stated them. The rules of the mad scientist’s decision are that if half or more press button K he kills those who didn’t, and if fewer than half do, he kills no-one. An equivalent way of describing it is that if half or fewer press button P he kills them, and if more than half do, he kills no-one. The idea that the PEACE button does nothing is wrong, because everyone is required to press one or the other. Pressing one has exactly the same consequences as not pressing the other.

"You have to pick either yellow or green" is a mathematical idealization of an underlying reality. I see no reason to believe that the most robust decision-making algorithm would ignore the deeper mechanistic factors that get idealized.

The mad scientist is presumably using some means to force you (e.g. maybe threatening your family), and there's always some risk of other disturbances (e.g. electrical wiring errors) whose effects would differ depending on the specifics of the problem.

"You have to pick either yellow or green" is a mathematical idealization of an underlying reality. I see no reason to believe that the most robust decision-making algorithm would ignore the deeper mechanistic factors that get idealized.

If no variation on the hypothetical is allowed, the problem is isomorphic to the original red-blue question, and the story about the mad scientist is epiphenomenal, mere flavourtext, not a “deeper mechanistic factor”.

If you allow variation (the only way in which the presence of the mad scientist can make any difference), then I think my variation is as good as yours.

You are trying to maintain the isomorphism while making the flavourtext have real import. This is not possible.

Are we talking about transgenderism yet? (I have been wondering this for the last few exchanges.)

I don't know what you mean by "variation" in this comment.

By “variation” I mean things like preventing the mad scientist from carrying out his dastardly plan, keeping people away from the misleadingly named PEACE button, and so on. Things that are excluded by the exact statement of the problem.

This sounds similar to what I was saying with

Unless you mean something like, you try to argue with the mad scientist about who should be included? Or try to force the mad scientist to exclude people who are clueless?

so I'm not sure why you are saying that I'm saying that you are not allowed to talk about that sort of stuff.

So OK I guess. Let's say you're all standing in a line, and he's holding a gun to threaten you. You're first in the line, and he explains the game to you and shows you the buttons.

If I understand correctly, you're then saying that you'd yell "everyone! press yellow!"? And that if e.g. he introduces a new rule of "no talking to each other!" and threatens you with his gun, you'd assault him to try to stop his mad experiment?

That is, by my logic, a valid answer. I don't know whether you'll survive or what would happen in such a case. I probably wouldn't do it, because it is too brave.

It’s your puzzle. You can make up whatever rules you like. I understood your purpose to be making a version of the red-blue puzzle that would have the same underlying structure but would persuade a different answer. But if isomorphism is maintained, the right answer must be the same. If isomorphism is not maintained, the right answer will be whatever it is designed to be, at the expense of not bearing on the original problem.

This circle cannot be squared.

Presumably this specific aspect is still isomorphic to the red-blue puzzle. With the red-blue puzzle, when you are standing in line for the pills, you could also yell out "take red!", or assault the scientist threatening you with his gun.

Of course there do seem to be other nonisomorphisms, such as if you press the buttons multiple times. I admit that it is reasonable to say that these nonisomorphisms distinguish my scenario, but I think that still disproves your claim that framing shouldn't matter, because the framing determines the nonisomorphisms and is the place where you'd actually end up making the decisions.

Games in decision theory are typically taken to be models of real-world decision problems, with the goal being to help you make better decisions. But real-world decision problems are open-ended in ways that games are not, so logically speaking the games must be an idealization that don't reflect your actual options.

I disagree. From the altruistic perspective these puzzles are fully co-operative co-ordination games with two equally good types of Nash equilibria (everyone chooses red, or at least half choose blue), where the strategy you should play depends on which equilibrium you decide to aim for. Players have to try to co-ordinate on choosing the same one, so it's just a classic case of Schelling point selection, and the framing will affect what the Schelling point is (assuming everyone gets told the same framing).

(What's really fun is that we now have two different framings to the meta-problem of "When different framings give different intuitions, should you let the framing influence your decision?" and they give different intuitions.)
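That equilibrium claim can be brute-force checked on a toy instance (a sketch of mine; it scores each profile by the shared, altruistic payoff of total survivors, and uses the "at least half" convention):

```python
from itertools import product

def survivors(profile):
    """Shared altruistic payoff: how many players live. Blues live iff at
    least half the players choose blue (the 'at least half' convention)."""
    n = len(profile)
    blues_live = profile.count("blue") >= n / 2
    return sum(1 for c in profile if c == "red" or blues_live)

def is_nash(profile):
    """True if no single player can raise the shared payoff by switching."""
    base = survivors(profile)
    for i, c in enumerate(profile):
        flipped = list(profile)
        flipped[i] = "blue" if c == "red" else "red"
        if survivors(flipped) > base:
            return False
    return True

# Brute-force check for a 5-player game: the equilibria found are exactly
# all-red plus every profile where at least half choose blue, i.e. the two
# types of everyone-lives outcomes.
equilibria = [p for p in product(["red", "blue"], repeat=5) if is_nash(p)]
```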


The original poll (and these variants) mostly deal with a combination of issues related to coordination and altruism, but I think a variant that reframes things entirely in terms of a coordination and counterparty-modeling problem (and removes the death element) is also informative.

Suppose you're playing a game with N other people and everyone has to choose a red or blue pill:

  • If a majority [alt: a supermajority, say >90%, to make it harder] choose blue, everyone gets $100
  • If everyone chooses red, everyone gets $100
  • If a majority (but not all) choose red:
    • reds get $90
    • blues get $0
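Concretely, the payoff rule might be sketched like this (my own rendering; the rules above don't say what happens on an exact 50/50 split, so the sketch leaves that case as an explicit error rather than guessing):

```python
def payoff(choices, blue_majority=0.5):
    """Per-player payoffs for the money game above.

    choices: a list of "red" / "blue". Setting blue_majority=0.9 gives
    the bracketed supermajority variant for the blue side.
    """
    n = len(choices)
    blues = choices.count("blue")
    reds = n - blues
    if blues > blue_majority * n:   # (super)majority blue
        return [100] * n
    if reds == n:                   # everyone red
        return [100] * n
    if reds > n / 2:                # majority red, but not all
        return [90 if c == "red" else 0 for c in choices]
    raise ValueError("no majority either way: not specified by the rules above")
```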

Of course, this is a different game than the original poll, but it has some of the same properties: red is the safe choice, in that if you choose it for yourself, you get most of the theoretical maximum payout, and you don't have to worry or think about what anyone else might do.

OTOH, if you're playing with a large enough group of N random earthlings, it is highly likely that someone is going to choose blue, so you won't get the maximum possible payout by choosing red. If you're in a setup where you're confident that most people will choose red regardless of what you do though, choosing red is still the best you can do - getting the full $100 may be simply out of reach for certain parameters of this game.

OTOOH, if you're playing with a bunch of friends or rationalists and / or you can discuss beforehand, you can all agree to choose blue, and likely there will be enough trust and sanity between all of you that everyone will get the $100, even if there are a few random troublemakers / trolls / risk-averse people who choose red.

For a given population, payout configuration, and majority threshold, under what circumstances should you choose red vs. blue? This is mainly a question of how well you can model the other players (and your risk tolerance), including how well you can model them modelling you (and them modelling you modelling them, etc.), rather than a question about game theory or altruism. If you can discuss as a group beforehand, the modelling problem will generally become much easier, unless you're in very unfavorable conditions (lots of trolls / low trust situation, etc.)
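One way to make the modelling question concrete: assume, crudely (real players are correlated rather than independent), that each other player picks blue with some probability p_blue you've estimated, and compare expected payoffs:

```python
import random

def my_payoff(my_choice, others_blue, n_others):
    """My payoff under the rules above. An exact tie is scored like the
    majority-red case here -- an assumption, since the rules leave it open."""
    n = n_others + 1
    blues = others_blue + (my_choice == "blue")
    if blues > n / 2:
        return 100                  # majority blue: everyone gets $100
    if blues == 0:
        return 100                  # everyone (including me) chose red
    return 90 if my_choice == "red" else 0

def expected_payoffs(n_others, p_blue, trials=10_000):
    """Crude model: each other player independently picks blue with
    probability p_blue. Returns estimated (E[red], E[blue]) for me."""
    red = blue = 0.0
    for _ in range(trials):
        k = sum(random.random() < p_blue for _ in range(n_others))
        red += my_payoff("red", k, n_others)
        blue += my_payoff("blue", k, n_others)
    return red / trials, blue / trials

# e.g. expected_payoffs(99, 0.4) comes out near (90, 0): under independent
# mixing, red's $90 floor dominates unilaterally, which is why reaching the
# all-$100 outcomes is a modelling-and-coordination problem, not a pure
# payoff-matrix one.
```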

Separately, it would be nice to live in a world where, for most parameter settings of this game (value of N, population the players are drawn from, specific payoff values / configuration, threshold of blue-coordination required, level of prior communication allowed, etc.), most people will choose blue in most circumstances, with little or no prior coordination.

Which leads to the question of what general lessons and ideas about game theory, decision theory, and coordination we can teach people and spread widely, in order to enable blue majorities to form even under difficult circumstances. (I'm not sure exactly what these lessons would look like, but my best guess is that most of them are currently found in relatively obscure web fiction.)

I think a lot of the controversy / confusion around the original twitter poll was because many people were getting these points mixed up and not distinguishing how people would answer from how (they thought) people should answer, based on their own understanding of game theory or decision theory or altruism or whatever.

11. Existentialism.

If you pick a blue pill, you die.

If you pick a red pill you live.

Blue or red?

Everyone is playing this game at every moment, always and everywhere.

12. Existenzialism.

If you pick a blue pill, you die.

If you pick a red pill you live.

Someone told you that the blue pill doesn't kill you, but breaks you out of the simulation.

Blue or red?

TL;DR: Red, Red, Red, Red, Red, Blue?, Depends, Red?, Depends, Depends

1,2: Both are the same; I pick red, since all the harm caused by this decision falls on people who have the option of picking red as well. Red is a way out of the bind, and it's a way out that everybody can take, and my taking red doesn't stop that. The only people you'd be saving by taking blue are the other people who thought they needed to save people by taking blue, which makes the blue-pillers' deaths an artificial and avoidable problem.

3,4: Same answer for the same reason, but even more so, since people are less likely to be bamboozled into taking the risk.

5: Still red, even more so since blue-pillers have a way out even after taking the pill.

6: LOL, this doesn't matter at all. I mean, you shouldn't sin, kind of by definition, but Omega's challenge won't be met, so it doesn't change anything from how things are now.

7: This is disanalogous because redpilling in this case (i.e. displaying) is not harmless if everyone does it: it allows the government to force this action. Whether to display or refuse would depend on further details, such as how bad submission to this government would actually be, and whether there are actually enough potential resisters to make a difference.

8: In the first option you accomplish nothing, as stated in the prompt. Burnout is just bad; it's not like it gets better if enough people do it lol. It's completely disanalogous, since option 2 (red?) is unambiguously better: it's better for you and makes it more likely for the world to be saved, unlike the original problem, where some people can die as a result of red winning.

9: This is disanalogous since the people you're potentially saving by volunteering are not other volunteers; they are people going for recreation. There is an actual good being served by making people who want to hike safer, and "just don't hike" doesn't work the way "just don't bluepill" does, since people hike for its own sake, knowing the risks. Weigh the risks and volunteer if you think decreasing risk to hikers is worth taking on some risk to yourself, and don't if you don't.

10: Disanalogous for the exact same reason. People go to Burning Man for fun; they know there might be some (minimal) risk. Go if you want to go enough to take on the risk, otherwise don't go. Except in this case going doesn't even decrease the risk for others who go, so it's even less analogous to the pill situation!

8: In the first option you accomplish nothing, as stated in the prompt. Burnout is just bad, it's not like it gets better if enough people do it lol.

The first option is stipulated to achieve paradise, if only enough people take it.