Summary: If you object to consequentialist ethical theories because you think they endorse horrible or catastrophic decisions, then you may instead be objecting to short-sighted utility functions or poor decision theories.

Recommended: Decision Theory Paradox: PD with Three Implies Chaos?

Related: The "Intuitions" Behind "Utilitarianism"

The simple idea that we ought to choose actions according to their probable consequences, ever since it was formulated, has garnered a rather shocking amount of dissent. Part of this may be due to causes other than philosophical objections, and some of the objections get into the metaphysics of metaethics. But there's a fair amount of opposition on rather simple grounds: that consequentialist reasoning appears to endorse bad decisions, either in the long run or as an effect of collective action.

Every so often, you'll hear someone offer a reductio ad absurdum of the following form: "Consider dilemma X. If we were consequentialists, then we would be forced to choose Y. But in the long run (or if widely adopted) the strategy of choosing Y leads to horrible consequence Z, and so consequentialism fails on its own terms."

There's something fishy about the argument when you lay it out like that: if it can be known that the strategy of choosing Y has horrible consequence Z, then why do we agree that consequentialists choose Y? In fact, there are two further unstated assumptions in every such argument I've heard, and it is those assumptions rather than consequentialism on which the absurdity really falls. But to discuss the assumptions, we need to delve into a bit of decision theory.

In my last post, I posed an apparent paradox: a case where it looked as if a simple rule could trump the most rational of decision theories in a fair fight. But there was a sleight of hand involved (which, to your credit, many of you spotted immediately). I judged Timeless Decision Theory on the basis of its long-term success, but each agent was stipulated to only care about its immediate children, not any further descendants! And indeed, the strategy of allowing free-riding defectors maximizes the number of an agent's immediate children, albeit at the price of hampering future generations by cluttering the field with defectors.1

If instead we let the TDT agents care about their distant descendants, then they'll crowd out the defectors by only cooperating when both other agents are TDT,2 and profit with a higher sustained growth rate once they form a supermajority. Not only do the TDTs with properly long-term decision theories beat out what I called DefectBots, but they get at least a fair fight against the carefully chosen simple algorithm I called CliqueBots. The paradox vanishes once you allow the agents to care about the long-term consequences of their choice.
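The shift from short-term to long-term success can be sketched with a toy model. Everything numeric here is an assumption made for illustration, not the setup from the earlier post: the payoffs (3, 2, 4 and 1 children), the 90/10 starting mix, and the mean-field treatment of random groups of three. The names `expected_children` and `simulate` are hypothetical helpers. A "tolerant" TDT cooperates whenever at least two of the three agents are TDT (so it lets one defector free-ride), while a "strict" TDT cooperates only when all three are.

```python
from itertools import product

# Hypothetical payoffs: children per agent in a group of three.
ALL_COOPERATE = 3            # three cooperators
COOPERATE_WITH_DEFECTOR = 2  # a cooperator when one member free-rides
FREE_RIDE = 4                # the defector among two cooperators
EVERYONE_DEFECTS = 1         # nobody cooperates


def expected_children(kind, t, tolerant):
    """Expected children for one agent of `kind` ("T" = TDT, "D" = DefectBot)
    when its two group mates are each TDT with probability t."""
    needed = 2 if tolerant else 3  # TDT members required before TDTs cooperate
    total = 0.0
    for others in product("TD", repeat=2):
        prob = 1.0
        for other in others:
            prob *= t if other == "T" else 1 - t
        n_tdt = others.count("T") + (kind == "T")
        if n_tdt < needed:
            payoff = EVERYONE_DEFECTS
        elif n_tdt == 3:
            payoff = ALL_COOPERATE
        elif kind == "T":
            payoff = COOPERATE_WITH_DEFECTOR
        else:
            payoff = FREE_RIDE
        total += prob * payoff
    return total


def simulate(generations, tolerant, tdt=0.9, defect=0.1):
    """Return (TDT population, DefectBot population) after some generations."""
    for _ in range(generations):
        t = tdt / (tdt + defect)
        tdt, defect = (
            tdt * expected_children("T", t, tolerant),
            defect * expected_children("D", t, tolerant),
        )
    return tdt, defect


if __name__ == "__main__":
    for gens in (1, 20):
        print(gens, "tolerant:", simulate(gens, True), "strict:", simulate(gens, False))
```

Under these assumed payoffs, the tolerant strategy leaves more TDT children after a single generation, but it also feeds the defectors; the strict strategy overtakes it within a few generations and ends with a population that is almost entirely TDT.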

Similarly, the purported reductios of consequentialism rely on the following two tricks: they implicitly assume that consequentialists must care only about the immediate consequences of an action, or they implicitly assume that consequentialists must be causal decision theorists.3

Let's consider one of the more famous examples, a dilemma posed by Judith Jarvis Thomson:

A brilliant transplant surgeon has five patients, each in need of a different organ, each of whom will die without that organ. Unfortunately, there are no organs available to perform any of these five transplant operations. A healthy young traveler, just passing through the city the doctor works in, comes in for a routine checkup. In the course of doing the checkup, the doctor discovers that his organs are compatible with all five of his dying patients. Suppose further that if the young man were to disappear, no one would suspect the doctor.

First, we can presume that the doctor cares about the welfare, not just of the five patients and the traveler, but of people more generally. If we drop the last supposition for a moment, it's clear that a consequentialist utilitarian doctor shouldn't kill the traveler for his organs; if word gets out that doctors do that sort of thing, then people will stay away from hospitals unless they're either exceptional altruists or at the edge of death, and this will result in people being less healthy overall.4

But what if the doctor is confident of keeping it a secret? Well, then causal decision theory would indeed tell her to harvest his organs, but TDT (and also UDT) would strongly advise her against it. Because if TDT endorsed the action, then other people would be able to deduce that TDT endorsed the action, and that (whether or not it had happened in any particular case) their lives would be in danger in any hospital run by a timeless decision theorist, and then we'd be in much the same boat. Therefore TDT calculates that the correct thing for TDT to output in order to maximize utility is "Don't kill the traveler,"5 and thus the doctor doesn't kill the traveler.
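The difference between the two calculations can be sketched in a few lines of Python. This is only a toy under stated assumptions: `AVOIDANCE_COST` is a made-up number standing in for the lives lost when people who can deduce the policy stay away from hospitals, and `act_value`, `policy_value` and `best` are hypothetical names, not part of any formal statement of CDT or TDT.

```python
# All numbers are hypothetical and only illustrate the shape of the argument.
LIVES_SAVED = 5       # the five transplant patients
LIVES_TAKEN = 1       # the traveler
AVOIDANCE_COST = 20   # assumed lives lost per dilemma once the policy is deducible

ACTIONS = ("harvest", "refrain")


def act_value(action):
    """CDT-style score: only what this one secret act causes."""
    return LIVES_SAVED - LIVES_TAKEN if action == "harvest" else 0


def policy_value(action):
    """TDT-style score: the value of the policy, given that others can
    deduce which policy the decision theory outputs."""
    value = act_value(action)
    if action == "harvest":
        value -= AVOIDANCE_COST
    return value


def best(score):
    """Pick the action with the highest score under the given scoring rule."""
    return max(ACTIONS, key=score)


if __name__ == "__main__":
    print("act-level choice:", best(act_value))
    print("policy-level choice:", best(policy_value))
```

The two scoring rules share the same utility function (lives) and differ only in what they count as a consequence of the choice, which is the point of the paragraph above. Whether the policy-level cost really exceeds the four net lives is an empirical assumption of this sketch.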

The question that a good consequentialist ought to be asking themselves is not "What happens in situation Y if I do X?", nor even "What happens in general if I do X whenever I'm in situation Y", but "What happens in general if everyone at least as smart as me deduces that I would do X whenever I'm in situation Y"? That, rather than the others, is the full exploration of the effects of choosing X in situation Y, and not coincidentally it's a colloquial version of Timeless Decision Theory. And as with Hofstadter's superrationality, TDT and UDT will avoid contributing to tragedies of the commons so long as enough people subscribe to them (or base their own decisions on the extrapolations of TDT and UDT).

In general, I'd like to offer (without proof) the following rationalist ethical inequality:

Your true valuation of all consequences + a good decision theory ≥ any particular deontology.

Now, a deontological rule might be easier to calculate, and work practically as well in the vast majority of circumstances (like approximating real physics with Newtonian mechanics). But if you have to deal with an edge case or something unfamiliar, you can get in trouble by persisting with the approximation; if you're programming a GPS, you need relativity. And as rule utilitarians can point out, you need to get your deontological rules from somewhere; if it's not from a careful consequentialist reckoning, then it might not be as trustworthy as it feels.6

Or it could be that particular deontological rules are much more reliable for running on corrupted hardware, and that no amount of caution will prevent people from shooting themselves in the foot if they're allowed to. That is a real concern, and it's beyond the scope of this post. But what's actually right probably doesn't include a component of making oneself stupid with regard to the actual circumstances in order to prevent other parts of one's mind from hijacking the decision. If we ever outgrow this hardware, we ought to leave the deontologies behind with it.


1. Note that the evolutionary setup is necessary to the "paradox": if Omega dished out utils instead of children, then the short-term strategy is optimal in the long run too.

2. This is only right in a heuristic sense. If the agents suspect Omega will be ending the game soon, or they have too high a temporal discount rate, this won't work quite that way. Also, there's an entire gamut of other decision theories that TDT could include in its circle of cooperators. That's a good feature to have: the CliqueBots from the last post, by contrast, declare war on every other decision theory, and this costs them relative to TDT in a more mixed population (thanks to Jack for the example).

3. One more implicit assumption about consequentialism is the false dichotomy that consequentialists must choose either to be perfectly altruistic utilitarians or perfectly selfish hedonists, with no middle ground for caring about oneself and others to different positive degrees. Oddly enough, few people object to the deontological rules we've developed to avoid helping distant others without incurring guilt.

4. I'm assuming that in the world of the thought experiment, it's good for your health to see a doctor for check-ups and when you're ill. It's a different question whether that hypothetical holds in the real world. Also, while my reply is vulnerable to a least convenient possible world objection, I honestly have no idea how my moral intuitions should translate to a world where (say) people genuinely didn't mind knowing that doctors might do this as long as it maximized the lives saved.

5. The sort of epistemic advantage that would be necessary for TDT to conclude otherwise is implausible for a human being, and even in that case, there are decision theories like UDT that would refuse nonetheless (for the sake of other worlds where people suspected doctors of having such an epistemic advantage).

6. The reason that morality feels like deontology to us is an evolutionary one: if you haven't yet built an excellent consequentialist with a proper decision theory, then hard-coded rules are much more reliable than explicit reasoning.


Summary: If you object to consequentialist ethical theories because you think they (have bad consequences)...

The surgeon situation happens to lots of professionals who have other people's lives or secrets in their hands.

I know a social worker who is gay and works with gay men. It's a small community, and in some cases he's been at a club and seen friends go home with clients who he knows are HIV positive. Even though he knows his friends are about to risk their lives, he can't say anything that reveals the client's HIV status. Because if people believed professionals were breaking the code of confidentiality, even to save lives, they wouldn't get tested for HIV in the first place.

A good code of ethics, professional or otherwise, takes that kind of long view.

What if, instead of deciding whether the doctor murders the patient in secret when she comes to the hospital, we have to decide whether the government (perhaps armed with genetic screening results seized from police databases and companies like 23andMe) passes a law allowing police to openly kill and confiscate organs from anyone whose organs could presumably save five or more transplant patients?

As far as I can tell, this would have no bad effects beyond the obvious one of killing the people involved - it wouldn't make people less likely to go to hospitals or anything - but it keeps most of the creepiness of the original. Which makes me think although everything you say in this post is both true and important (and I've upvoted it) it doesn't get to the heart of why most people are creeped out by the transplant example.

It would have quite a few bad knockon effects:

1) you have handed the government the ability to decide, at any point, to kill anyone they consider undesirable, provided they can find five compatible transplant recipients; this is a massive increase in their power, and a big step towards totalitarian society.

2) You are discouraging people from undergoing genetic screening

3) you are discouraging people from living healthily. If you are unhealthy, your organs are of less use to the government, and hence you are more likely to survive.

4) you are encouraging people to go off the grid; as people who are off the grid are less likely to be found for the purposes of harvesting.

Yes, these logical reasons are not directly the reason people are creeped out; but were you to find a less harmful scenario, you would also likely find the scenario less creepy.

For instance, most people would find it less creepy if the harvesting was limited only to those who are already in prison, on long (20 year+) sentences; and it also seems that that policy would have less indirect harms.

There’s an incentive here to raise the frequency and length of prison sentences, though. I think I actually saw a “death sentence for repeated jaywalking” scenario in some TV show, and IIRC it was actually caused by some medically-better-for-everyone “reason”, and it was pretty creepy.

As far as I can tell, this would have no bad effects beyond the obvious one of killing the people involved - it wouldn't make people less likely to go to hospitals or anything

No, but it would make them afraid to go outside, or at least within the vicinity of police. This law might encourage people to walk around with weapons to deter police from nabbing them, and/or to fight back. People would be afraid to get genetic screening lest they make their organs a target. They would be afraid to go to police stations to report crimes lest they come out minus a kidney.

People with good organs would start bribing the police to defer their harvesting, and corruption would become rampant. Law and order would break down.

This sounds like an excellent plot for a science fiction movie about a dystopia, which indicates that it fails on consequentialist grounds unless our utility function is so warped that we are willing to create a police state to give organ transplants.

Not to mention an incentive to self mutilate. That is, to do damage to oneself such that the organs are no longer desirable but which leaves you better off than if you'd been harvested. Give yourself HIV for example.

Fourth reply: people deeply value autonomy.

Fifth reply: While in this case I don't think that the policy is the right consequentialist thing to do, in general I expect consequentialism to endorse some decisions that violate our current commonsense morality. Such decisions are usually seen as moral progress in retrospect.

Upvoted because fourth reply seems much closer to a true objection.

The probability of being killed in such a way would be tiny and wouldn't significantly alter expected lifespan. However, people are bad at intuitive risk evaluation, and even if any given person were at least twice as likely to have their life saved as destroyed by the policy, people would feel endangered and unhappy, which may outweigh the benefit. But if this concern didn't apply (e.g. if most people learned to evaluate risks correctly on the intuitive level), I'd bite the bullet and vote for the policy.

By the way, upvoted for correct application of least convenient possible world technique.

Good point. My first objection is the same as prase's, my second is that a government that depends on popular support shouldn't enforce policies that creep out the citizens (either because they'd lose the next election in a landslide to a party with different values or decision theory, or because in a Parfit's Hitchhiker way, if it were clear they would do such things then they'd have lost the previous election). My third is that the creepiness here comes from our very real (and very understandable in consequentialist terms) fear of allowing the government too much power over citizens' life and death. If instead you asked, should we make it legal for people to voluntarily off themselves when by so doing they can save several other lives, and should we make it an honorable thing to do, then I'm not nearly as disturbed by the idea. (There are myriad variations to try out, but generally whenever it gets creepy I can identify some sort of awful incentive structure being set up.)
Scott Alexander:
I probably shouldn't have said "government". We get the same issues if the doctor just wanders around town, spots a genetically compatible person with his Super-Doctor-Vision, and kills them at their house. Or really through any means other than waiting for compatible donors to show up at the hospital. Your point five is well taken, though.
This doesn't fix the problem; it only changes the location. Giving your national medical association the power of citizens' life and death is almost as bad as giving it to the government. People won't be afraid in hospitals, instead they'll be afraid in their homes. They will have an incentive to try to hide from anyone who might be a doctor, or to kill them preemptively. This policy would be a declaration of war between doctors and citizens. I can't see the consequences going well.
Then I have an "easy" patch: let the entity that does the spotting and killing be incorruptible and infallible. Like, an AI, an army of robots, or something. With that, I don't see any obvious flaw beyond the fact that, with this level of technology, there are very probably better alternatives than transplantation. But the idea of creating a machine for the explicit purpose of killing people might be even more creepy than the police state we're vaguely familiar with. Compare with the comment I saw somewhere with this dilemma:
* (a) Let the aliens that happen to visit us cure cancer, except for 1 random patient out of 100, that they will let die, then eat.
* (b) Just let the aliens go, never to be heard of again.
I suspect that the space of killer organ-transplanting AIs programmed by humans has far more negative outcomes than positive ones. Even if we stipulate that the AI is incorruptible and infallible, there are still potential negative consequences:
* People still have an incentive to sabotage their own health and to not get tested (see wedrifid's argument)
* People would live in fear of getting randomly killed, even if the chance was small
* Allowing an AI to be built that can kill people might be a bad precedent
* Allowing the practice of killing innocent people might be a bad precedent
You could get rid of the self-sabotage and fear if you make the AI a secret. But a world where people can make secret AIs to kill innocents without any kind of consent or vote still seems like a bad place. Part of the reason it's creepy is because, just like a police state, the outcome is probably going to be bad in the vast majority of cases.
This is an interesting case. My initial reaction was to let the aliens cure cancer and eat 1/100 cancer patients (after they died). Yet as I thought about it more, and why people might find the scenario creepy, I became more and more worried. In a one-shot negotiation, it would make sense on consequentialist grounds to accept their deal. The 1% of patients that the aliens eat won't have a change in outcome: they would have died of cancer anyway. Yet, as usual with these thought experiments designed to test consequentialism, the answer changes when you consider the full range of possible consequences. This question hinges on why the aliens want to eat humans. If the aliens had some good reason why they need to eat humans to cure them, then that might be OK (like ingesting cancerous humans will let them figure out how to stop all the different types of cancer). Yet there are many ways that allowing aliens to get their hands on human specimens might be a bad idea. Maybe they could clone us, or engineer pathogens. Then there is the as
Plenty of people survive cancer. The specific cancer patients the aliens eat might have lived if not for the aliens.
If I was reading the thought experiment correctly, the aliens are only allowed to let people die and then eat them. So the aliens wouldn't be causing any patients to die who wouldn't have died anyway. If the aliens were allowed to eat humans before they died, then that would change the whole example and make consequentialists even more pessimistic.
It wasn't specified that they died of cancer, but yeah, my misreading, thanks.
Let die? As in wait until they die by natural(ish) causes then eat the corpse? This sounds like a no-brainer. (Cryonicists might have some concerns about the meaning of death, however. Hopefully the patients agree that they are dead at the time!)
Just in case you humans decide on this method, I volunteer to be the incorruptible and infallible machine. Oh, wait, I already did offer something like that, and you humans invented contorted rationalizations about how you wouldn't trust me, because I'd just use that power to impregnate female humans or something.
I think the first hurdle is whether this is a worthwhile policy at all. We'd need to weigh the expected cost in donor lives against the improvement in recipient lives; I would think that the factors which make for the best donors like youth and health would tend to militate against the policy given the profile of recipients, often older and already known to be unhealthy. Secondly, even if the program would be positive sum, we'd need to weigh it against alternatives (like forced donation at death) to ensure that it was actually the best policy possible. Obviously, you could restate the hypothetical until all the factors which must be weighed demand Policy 145. But this is almost certainly a trivial exercise provided by the rule set governing consequentialism. At that point, however, I think there are a few responses available: (1) the world is not recognizable to me and I cannot, even with difficulty, really imagine what the balance of worlds close to it would be like; (2) while this world sounds great, I think I'm better off in this one and so I can safely say that I do not prefer it for myself; (3) the world is so different from the actual world that it is difficult to say whether such a world would be internally consistent if usefully similar to our own. I think response (1) allows us to "bite the bullet" on paper, knowing it will never be fired; response (2) seems like it may usefully encapsulate our problems with the hypothetical and generate the correct response "good for them, then"; (3) this response allows a denial or agnosticism about the world and "out there" hypotheticals in general. I think the proper response to this process is all three: I should agree were it so; I should properly recognize that I don't like it (and don't have to); and I can deny that the hypothetical reveals any important information about theory. I think these responses could be elided, though, simply by noting what was suggested earlier: given a static rule set and a fully malleabl
Why not just label the organ collection a "tax", and say "Even if the tax burdens some people disproportionately, it helps people more than it harms people, and is therefore justified"?
The idea of taxation is usually that people get taxed equally. You can't kill everyone by 10%; you could only kill 10% of people by 100%.
Also, when you tax someone (especially someone with lots of money), they're usually capable of living afterwards.
I'd bet that if you look at the effects of ordinary taxes, and you count the benefits separately from the harms you'd find that statistically, the tax kills at least one person to help more than one person, just like the "organ tax". Of course, the organ tax vs. normal tax comparison is a comparison of seen versus unseen--you can't tell who the people are who were killed by the taxes since they are a statistical increase in deaths with nobody getting their hands bloody--but I hope we've learned that seen vs. unseen is a bias.
The claim that ordinary taxation directly causes any deaths is actually a fairly bold one, whatever your opinion of them. Maybe I'm missing something. What leads you to believe that?
In progressive tax regimes it's rather hard for people to literally be taxed into starvation, but that doesn't mean that no deaths occur on the margins. Consider for example the case where a person needs expensive medical treatment that's not covered by insurance, they (or their family) can't afford it, but it's close enough to their means that they would have been able to if it wasn't for their taxes. Or consider a semi-skilled laborer that's making enough money that their taxes are nontrivial, but not enough to support their family on base pay once taxes are factored in. In order to make ends meet they take a more dangerous position to collect hazard pay, and a year later they die in an industrial accident. And so forth. Looking at the margins often means looking at unusual cases, but that doesn't mean there aren't any cases where the extra money would have made a difference. That's not to say that dropping those taxes (and thus the stuff they fund) would necessarily be a utilitarian good, of course -- only that there's stuff we can put in the minus column, even if we're just looking at deaths.
Ah, the hazardous profession case is one that I definitely hadn't thought of. It's possible that Jiro's assertion is true for cases like that, but it's also difficult to reason about, given that the hypothetical world in which said worker was not taxed may have a very different kind of economy as a result of this same change.
I can think of a hypothetical person who has a 99.9% chance of living without the tax, and a 99.8% chance with it. And I can also think of there being more than 1000 such hypothetical people. "Can afford to live without it but not with it" implies going all the way down to a 0% chance. You don't need to go down to a 0% chance for there statistically to be deaths.
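The arithmetic in the comment above can be sketched in Python. The figures are the comment's own hypothetical ones (1000 people, 99.9% versus 99.8% survival); the function name `expected_extra_deaths` is an illustrative assumption, and exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction


def expected_extra_deaths(people, p_survive_without, p_survive_with):
    """Expected additional deaths when each person's survival chance
    drops from p_survive_without to p_survive_with."""
    return people * (p_survive_without - p_survive_with)


if __name__ == "__main__":
    extra = expected_extra_deaths(1000, Fraction(999, 1000), Fraction(998, 1000))
    print("expected extra deaths:", extra)
```

With exactly 1000 such people the expectation is one death, so any larger group gives more than one in expectation even though nobody's survival chance comes anywhere near zero.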
But how does that work? What mechanism actually accounts for that difference? Is this hypothetical single person we could have individually exempted from taxes just barely unable to afford enough food, for example? I don't yet buy the argument that any taxes I'm aware of impose enough of a financial burden on anyone to pose an existential risk, even a small one (Like a .1% difference in their survival odds). This is not entirely a random chance, since levels of taxation are generally calibrated to income, presumably at least partially for the purpose of specifically not endangering anyone's ability to survive. Also, while I realize that your entire premise here is that we're counting the benefits and the harms separately, doing so isn't particularly helpful in demonstrating that a normal tax burden is comparable to a random chance of being killed, since the whole point of taxation is that the collective benefits are cheaper when bought in bulk than if they had to be approximated on an individual level. While you may be in the camp of people who claim that citizenship in (insert specific state, or even states in general) is not a net benefit to a given individual's viability, saying "any benefits don't count" and then saying "it's plausible that this tax burden is a minor existential risk to any given individual given that" is not particularly convincing.
There are all sorts of random possibilities that could reduce someone's life expectancy by a tiny amount but which statistically over large numbers of people would result in more than one extra death. Imagine that someone has to work one extra hour per month and there's a tiny chance of dying associated with it, or that they delay a visit to the doctor by one week, etc. Or all the other mechanisms which cause poorer people to have lower life expectancies (I highly doubt you can't think of any), which mean that someone who gets marginally poorer by a tiny amount would on the average not live as long.
In Italy quite a few entrepreneurs have committed suicide since the time the tax rates were raised, which may or may not count depending on what you mean by “directly”.
Would that eliminate much of the remaining creepiness from Yvain's scenario? (It scarcely makes a difference to me, though others may differ.)
It does for me. "Tax" implies social acceptance in a way that "secret police" does not.
But people still die. I think a major part of how our instinctive morality works (and a reason humans, as a species, have been so successful) is that we don't go for cheap solutions. The most moral thing is to save everyone. The solution here is a stopgap that just diminishes the urgency of technology to grow organ replacements, and even if short-term consequentially it leaves more people alive, it in fact worsens our long-term life expectancy by not addressing the problem (which is that people's organs get damaged or wear out). If a train is heading for 5 people, and you can press a switch to make it hit 1 person, the best moral decision is "I will find a way to save them all!" Even if you don't find that solution, at least you were looking!
[parody mode] [/parody mode] Have you ever heard the saying "the perfect is the enemy of the good"? By insisting that only perfect solutions are worthwhile, you are arguing against any measure that doesn't make humans immortal.
My point was meant in the sense that random culling for organs is not the best solution available to us. Organ growth is not that far in the future, and it's held back primarily because of moral concerns. This is not analogous to your parody, which more closely resembles something like: "any action that does not work towards achieving immortality is wrong". The point is that people always try to find better solutions. If we lived in a world where, as a matter of fact, there is no way whatsoever to get organs for transplant victims except from living donors, then from a consequentialist standpoint some sort of random culling would in fact be the best solution. And I'm saying, that is not the world we live in.
Related, here is Eliezer's answer to the railroad switch dilemma from the ends don't justify the means (among humans):
Why stop there? Why not say that the moral thing is to save even more people than are present, or will ever be born, etc.?

"If you object to consequentialist metaethical theories"

There is no such thing as a 'consequentialist metaethical theory'.

Consequentialism is a first-order ethical theory.

While most people here despise philosophy (see here), I do wonder how much people actually understand philosophy.

I think strictly speaking consequentialism is a property of first-order ethical theories. That is, they either are consequentialist or are not. But it is not by itself a first-order theory.

Ah, that was a mistake- thanks for catching it.

This post really brings to light an inkling I had a while ago: TDT feels vaguely Kantian.


  1. "Choose as though controlling the logical output of the abstract computation you implement, including the output of all other instantiations and simulations of that computation."

  2. "Act only according to that maxim whereby you can at the same time will that it should become a universal law without contradiction."

Now, they're clearly not the same but they are similar enough that we shouldn't be surprised that consequentialism under TDT alleviates some of our concerns about traditional consequentialism. I find this exciting-- but it also makes me suspicious. Kantian moral theory has some serious problems and so I wonder if there might be analogous issues in CON+TDT. And I see some. I'll leave out the Kantian equivalent unless someone is interested:

"What happens in general if everyone at least as smart as me deduces that I would do X whenever I'm in situation Y"?

The problem is that no two situations are strictly speaking identical (putting aside exact simulations and other universes). That means CON+TDT doesn't prohibit a decision to carve up a vagran...

You've identified a subtle problem with implementing decision theories, but the answer that the commonsense version of TDT "should" give is pretty clear: if the differences between two situations don't massively affect the utilities involved (from the perspective of the deciding agent), then they should belong to the same reference class. If you shouldn't kill the patient in a house, then you still shouldn't kill him with a mouse, or on a boat, with a goat, on a train, in the rain, here or there, or anywhere.
This is somewhat circular. It works for the example, but not knowing how similar two situations have to be before similar decisions produce similar utility is part of the problem.
TDT seems to sometimes fail this, depending on whether the other scenario counts as the 'same computation', whatever that means. UDT postulates the existence of a 'mathematical intuition module' that can tell which computations influence which others, but it is not known how such a module could be created. Developing one would probably constitute a large fraction of the difficulty of creating AGI.
Provided that the unique feature is relevant, no it does not. For example, if the vagrant's parts were capable of saving 1,000 lives (a very unlikely situation, and not one anyone needs to worry about finding themselves in) that would be a relevant unique feature. However, merely noticing that the vagrant is wearing a red baseball cap, made in 1953, and has $1.94 in their left pants pocket, while unique, is irrelevant. And as such it is easily modelled by using the protocol "insert random, irrelevant, unique aspect".
No disagreement about relevance of baseball caps for organ transplantations, but if TDT is defined using "all other instantiations and simulations of that computation", any small difference, however irrelevant, may exclude the agent from the category of instantiations of the same computation. The obvious countermeasure would be to ask TDT to include outputs not only of other instantiations of itself, but of a broader class of agents which behave similarly in all relevant aspects (in given situation). Which leads to the question how to precisely define "relevant", which, as far as I understand, is the parent comment asking.

But what if the doctor is confident of keeping it a secret? Well, then causal decision theory would indeed tell her to harvest his organs, but TDT (and also UDT) would strongly advise her against it. Because if TDT endorsed the action, then other people would be able to deduce that TDT endorsed the action, and that (whether or not it had happened in any particular case) their lives would be in danger in any hospital run by a timeless decision theorist, and then we'd be in much the same boat. Therefore TDT calculates that the correct thing for TDT to outpu

... (read more)
That's a good point, and why I like Gary Drescher's handling of such dilemmas in Good and Real (ch. 7), which allows a kind of consequentialist but acausal (or extracausal) reasoning. His decision theory is the TDT-like "subjunctive reciprocity" which says that "You should do what you self-interestedly wish all similarly-situated beings would do, because if you would regard it as optimal, so would they". That is, as a matter of the structure of reality, beings that are similar to you have similar input/output relationships -- so make sure you instantiate "good" I/O behavior. The upshot is that subjunctive reciprocity can justify not harvesting in this case, and similar cases where we can assume no negative causal fallout from the doctor's decision. Why? Because if the doctor reasons that way (kill/harvest the patient "for the greater good") it logically follows that other people in similar situations would do the same thing (and thereby expect and adapt for it), even though there would be (by supposition) no causal influence between the doctor's decision and those other cases. On the contrary: like in Newcomb-like problems, the relevant effect is not one your decision causes; instead, your decision and the effect are common causal descendants of the same parent -- the parent being "the extent to which humans instantiate a decision theory like this". Similarly, Drescher reasons e.g. that "I should not cheat business partners, even if I could secretly get away with it, because if I regarded this as optimal, it's more likely that my business partner regards it as optimal to do to me." More importantly, "If I make my decision depend on whether I'm likely to get caught, so will my partner's decision, a dependency I do not wish existed." Consider the case of a 1000x run of Prisoner's Dilemma between 100 "similar" people, where they get randomly paired and play the game many times, but don't know anyone else's decision until it's all over. In that case, if you always c
On the other hand, other people knowing that you're a TDT agent is the reason TDT agents are able to cooperate. TDT on a large scale kind of depends on other people suspecting it. Please note that Alicorn's point is correct. If you are using this as a thought experiment, take "no one will ever suspect the doctor is a TDT agent" as an axiom. If you are looking at it to analyze decision theory, you do have to consider other factors.
Is there anything formalized, or even informalized, showing the boundaries of the circumstances wherein TDT agents should reveal their TDTness, and the subset of those where they should reveal their source code? I'm not aware of any; only of regions within that space where we're pretty sure they should be open about it.
This least convenient world basically requires the doctor (or perhaps all doctors) to have a massive epistemic advantage over every other human being, such that the idea won't even cross any patient's mind. In general, even if you're that much smarter than 99.9% of the world, you need to take into account that other people in the 0.1% can communicate their suspicions to everyone else.
I don't think that follows. The LCPW can allow that people could imagine others being TDT agents, but that the scenario does not provide sufficient evidence that the doctor is killing patients. Furthermore, if only 0.1% of the population can detect the doctor's decision theory, it's unlikely that their arguments will be comprehensible to the rest of the population, at least at a cost low enough to have the purported "people stop going to doctors" effect.
Yeah, but at that point you're not really talking medical ethics, you're playing against the rest of the world in a game of "God dicks you over in mysterious ways."
If someone is known, by their friends and family, to be relatively aware when it comes to such issues; and warns said friends and family of this danger, they will not need to give a comprehensible argument. Their statement is, in itself, evidence to those who trust them.
That that specific doctor runs TDT, perhaps; but it is implausible to the point of irrelevance that no-one would ever suspect that any doctor anywhere runs on a TDT-esque thought process. And people suspecting that any doctor might run such processes is sufficient harm.
I wonder what fraction of the world's population has the necessary concepts in their heads to believe (or disbelieve) anything even slightly like "some doctors use TDT or something like it". I'd have thought well below 0.1%.
I think they would probably frame a question with similar content as "how trustworthy are doctors?"
I don't think the content of that question is similar at all.
The relevant thing to a TDT person is "how likely is it that there's someone simulating my mind sufficiently accurately?" "how trustworthy are doctors?" is a question that results in a simulation of a doctors mind. It seems, to me, that many people simulating that doctors mind will be capable of simulating it sufficiently accurately; even if they don't understand (on a conscious level) all the necessary jargon to explain what they are doing.
I was aware of, and practising, timeless decision theory before ever stumbling across Lesswrong, and, while I know this may just be the "typical mind fallacy", I would be surprised if only 0.1% of people had similar thoughts. Sure, I didn't call it TDT, because that is a piece of jargon only present in this community, but the basic principle is certainly not unique, or unknown, and I would expect that even many who don't understand it would use it subconsciously.

In general, I'd like to offer (without proof) the following rationalist ethical inequality:

Your true valuation of all consequences + a good decision theory ≥ any particular deontology.

Where '≥' is defined, of course, as "gives better expected consequences than". It's the obvious tautology but bizarrely enough people do get it wrong enough that it's worth making a post on!

Huh. I took '≥' to mean "better describes human morality than"
In which case the claim is just false. That does not better describe human morality.
It seemed a normative '≥' to me.
Aren't they the same thing? If A describes morality better than B, clearly you should_morality follow A more than you should_morality follow B. And vice versa.
"Morality" != "human morality". If you say "A better describes human morality than B", you're just making an anthropologist-like statement about the behavior code of a certain tribe ("humans").
Well, you make a good point. They're not strictly identical since "better describes human morality than" has an interpretation which has different truth values in counterfactual worlds full of murderers, whereas "better describes morality than" is explicitly independent of the counterfactual values of humans. So... I retract my statement, I guess? I mean, I assumed that Jack meant "better describes morality than", which is a statement about what is right and wrong, and not a statement about what humans in any particular counterfactual world might think is "right" and "wrong". Just because in this world "human morality" and "morality" have the same referent, and because assuming that orthonormal was talking anthropologically about humans would be a weird thing to do. (I blame language, for having no built-in support for making this "dereferencing" explicit, except by actually pasting the referent itself in, as I did in 'I assumed that Jack meant "better describes morality than"'.)

I think this essay is about important topics, but too many of them and they would be better covered if they were separated. I'm just going to focus on the summary: typical objections to consequentialism.* The vast majority of objections to consequentialism are consequentialist, and thus incoherent. This essay explains this well, but I don't think the previous post is a good example of this and I don't think that TDT is useful here, as a practical matter. Yes, there are examples where the problem is CDT-consequentialism, but that isn't the usual problem. Even w... (read more)

Thanks for the critique; the post you were looking for was the post I wanted to write, but I had too much difficulty writing it, and I chose to post what I did rather than risk getting burnt out and posting nothing at all. That's also, in fact, what happened with my qualia sequence. So I've now had two sequences of posts that were received tepidly despite most readers seeming to agree with them and finding them insightful; the most obvious conclusion is that I really need to improve as a writer. (Further evidence: people get much more excited about my ideas when I explain them face-to-face than when they read my writing.) I'm now publicly committing to relearn writing, with the help of Less Wrong... in two weeks, after the initial rush of the academic year subsides.
All your decisions sound correct. Good luck with the writing!

I think the point can be made more simply as follows:

Consequentialism is a theory about which states of the world (and thus implicitly actions) are preferable. For it to be true does not require or imply that worlds in which some particular bounded agent (or even all humans) believe in consequentialism or attempt to act according to its dictates are preferable to the alternatives.

The obvious and simple counterexample is to have an alien race who tortures all humanity if they use their mind reading technology on anyone and ascertain they believe in consequentialism. In such a situation consequentialism obviously entails that it is better for everyone to believe in some non-consequentialist view of morality.

It's a stupid dilemma, since the optimal move is obviously for the patients to play russian roulette. The doctor doesn't even have any decisions to make, and should optimally be kept ignorant until he or she hears the loud bang. In this highly artificial situation - five tissue matches each needing a different organ, all of whom will die without it and with no likely donors. (What is this, clone club health issues?) Well, they are all dead if they do nothing. So, russian roulette among the already doomed. Upsides clearly outweigh the downsides, and the re... (read more)

The transplant dilemma is framed in a way that personalizes the healthy young traveler while keeping the other five patients anonymous. This activates the part of our brains that treats people as individuals rather than numbers. There's nothing wrong with the math. We object to the unfairness to the only individual in the story.

This dilemma is usually paired with an example of triage. Here an emergency-room doctor has to choose between saving one severely injured patient or five moderately injured patients. Five lives or one, the numbers are the same, but as long as all six patients are anonymous, it remains a numeric problem, and no one has a problem with the math.

Interesting; your phrasing (one severely injured vs 5 moderately injured) seems to pass on both deontological and utilitarian grounds for me, but if it was saving 5 severely injured people by letting one moderately injured person expire of their injuries, it feels like the trolley problem again. Maybe it is less "unfairness to individuals" and more "unfairness to people better off than the subjects".
I put in the different degrees of injury to set the context for the doctor's choice... maybe it takes 5 times as long to save the severely injured person. I didn't mean to imply that the severity of the injury affects the moral calculation. You're right, this is like the trolley problem. When all 6 people are anonymous, we do the calculation and kill 1 to save 5. When the trolley problem is framed as "push the fat man off the bridge", that's enough personalization to trigger the other part of the brain. Moral philosophy in general tries to find universal principles whose logical consequences agree with our moral intuition. The OP is saying that we can fix consequentialism by making the moral calculations more complicated. Good luck with that! If moral intuition comes from two different parts of the brain that don't agree with each other, then we can always construct moral dilemmas by framing situations so that they activate one part or another of our brains.
"philosophy tries... to agree with our ...intuition..."? Bravo! See, I think that's crazy. Or if it's right, it means we're stipulating the intuition in the first place. Surely that's wrong? Or at least, we can look back in time to see "obvious" moral postulates we no longer agree with. In science we come up with a theory and then test it in the wind tunnel or something. In philosophy, is our reference standard kilogram just an intuition? That's unsatisfying!
I had fun with friends recently considering the trolley problem from a perspective of INaction. When it was an act of volition, even (say) just a warning shout, they (we) felt less compelled to let the fat man live. (He was already on the track and would have to be warned off, get it?) It seems we are responsible for what we do, not so much for what we elect NOT to do. Since the consequences are the same, it seems wrong that there is a perceptive difference. This highlights, I suppose the author's presumed contention (consequentialism generally) that the correct ethical choice is obviously one of carefully (perhaps expensively!) calculated long term outcomes and equal to what feels right only coincidentally. I think in the limit, we would (consequentialists all) just walk into the hospital and ask for vivisection, since we'd save 5 lives. The reason I don't isn't JUST altruism, because I wouldn't ask you to either, instead it's a step closer to Kant's absolutism: as humans we're worth something more than ants (who I submit are all consequentialists?) and have individual value. I need to work on expressing this better...

"But what's actually right probably doesn't include a component of making oneself stupid with regard to the actual circumstances in order to prevent other parts of one's mind from hijacking the decision."

What you probably meant: "Rational minds should have a rational theory of ethics; this leads to better consequences."

My late-night reading: "A deontological theory of ethics is not actually right. It is wrong. Morally wrong."

I am not sure what caused me to read it this way, but it cracked me up.

I wonder if it's a coincidence that it's currently late at night and I find myself agreeing with both those readings. "Deontological ethics is morally wrong" sounds about accurate.
The fact that it sounds accurate is what makes it a funny category error, rather than a boring category error. "2 + 2 = 3 is morally wrong" is not funny. "Deontological ethics is morally wrong" is funny. It calls to mind a scenario of a consequentialist saying: "True, Deontologist Dan rescued that family from a fire, which was definitely a good thing... but he did it on the basis of a morally wrong system of ethics." That's how I reacted to it, anyway. It's been a day, I've had more sleep, and I STILL find the idea funny. Every time I seriously try to embrace consequentialist ethics, it's because I think that deontological ethics depend on self-deception. And lying is bad. ---------------------------------------- EDIT: I am in no way implying that other consequentialists arrive at consequentialism by this reasoning. I am simply noting that the idea that consequentialist principles are better and more rational, so we should be rational consequentialists (regardless of the results), is very attractive to my own mental hardware, and also very funny.
Cracked you up? Rather than just seeming like a straightforward implication of conflicting moral systems?
I think it is not a straightforward implication at all. Maybe this rephrasing would make the joke clearer: "A deontological theory of ethics is not actually right. It is morally wrong, in principle." If that doesn't help: "It is morally wrong to make decisions for deontological reasons." What makes it funny is that moment wherein the reader (or at least, this reader) briefly agrees with it before the punchline hits.
I understand what you intended to be the joke. I just don't think you get it. This is a straightforward implication of having a consequentialist morality. Making moral decisions for deontological reasons and not because of the consequences according to the same value function is, by that standard, immoral. It's not a big deal. Mind you, I have made jokes along the same lines myself at times. With the implication "it's funny 'cos it's true!". I begin to see why it had a mixed reception.

Similarly, the purported reductios of consequentialism rely on the following two tricks: they implicitly assume that consequentialists must care only about the immediate consequences of an action, or they implicitly assume that consequentialists must be causal decision theorists.

"TDT + consequentialism" seems like it isn't a consequentialist theory any more -- it's taking into account things that are not consequences. ("Acausal consequence" seems like an oxymoron, and if not, I would like to know what sort of 'acausal consequences' a TDT-consequentialist should consider.) This feels much more like the Kantian categorical imperative, all dressed up with decision theory.

As a sidenote to my previous comment: I do wonder to what extent deontological concepts of "Acting With Honor" and "Just Vengeance" evolved in human societies as an effective approximation of a TDT, which encouraged both initial cooperation and punishment of defections by making societal members into the sort of being that would instinctively cooperate and punish those people with accurate enough models of themselves. On the other hand, attitudes of vengeance towards non-intelligent beings (beings that can't model you) are seen as much more... insane. Captain Ahab is perceived as a more insane figure than the Count of Monte Cristo; though both are driven by vengeance, because the former seeks vengeance against a whale, not against people. Though mind you, even against animals, vengeance is rather useful; because even animals can model humans to some extent. The wolves in The Jungle Book learned to "seven times never kill Man", after learning that to hurt one man, means many other men with guns coming to kill wolves in return.

The wolves in The Jungle Book learned to "seven times never kill Man", after learning that to hurt one man, means many other men with guns coming to kill wolves in return.

Using this to support your statement lowered my credence therein.

Upvoted for reminding me that some evidence is so weak that offering it actually counts as evidence against. :-)

Though mind you, even against animals, vengeance is rather useful; because even animals can model humans to some extent. The wolves in The Jungle Book learned to "seven times never kill Man", after learning that to hurt one man, means many other men with guns coming to kill wolves in return.

Beware fictional evidence. I suspect that wolves might be smart enough in individual cases to recognize humans are a big nasty threat they don't want to mess with. But that makes sense in a context without any understanding of vengeance.

Uh, yes, tongue-in-cheek about what poetry-using wolves in a fiction book "learned" from human vengeance. Still qualifies slightly as evidence in telling us how humans model animals to model humans.
Well, they could EVOLVE that reticence for perfectly good reasons. I'll dare in this context to suggest that evolution IS intelligence. Have you heard of thought as an act of simulating action and forecasting the results? Is that not what evolution does, only the simulations are real, and the best chess moves "selected"? A species thereby exhibits meta-intelligence, no?
That's a waste of a word. Call evolution an optimisation process (which is only a slight stretch). Then you can use the word 'intelligence' to refer to what you refer to as 'meta-intelligence'. Keeping distinct concepts distinct while also acknowledging the relationship tends to be the best policy. No, it really isn't and using that model encourages bad predictions about the evolution of a species. Species don't 'forecast and select'. Species evolve to extinction with as much enthusiasm as they evolve to new heights of adaptive performance. Saying that evolution 'learns from the past' would be slightly less of an error but I wouldn't even go there.
Hmm, I agree, except for the last part. Blindly trying (what genetic mixing & mutating does) is like poorly guided forecasting. (Good simulation engineers or chess players somehow "see" the space of likely moves; bad ones just try a lot.) And the species doesn't select, but the environment does. I need to go read "evolve to extinction." Thanks.
I'm not completely sure what you are trying to say. I agree they could potentially evolve such an attitude if the selection pressure was high enough. But evolution doesn't work like a chess player. Evolution does what works in the short term, blindly having the most successful alleles push forward to the next generation. If there were a chess analogy, evolution would be like a massive chess board with millions of players and each player making whatever move looks best at a quick glance, and then there are a few hundred thousand players who just move randomly.
Good point. Easy to imagine a lot of biologically good designs getting left unexpressed because the first move is less optimal.
For an example of acausal consequences: getting a million dollars as a result of one-boxing in Newcomb's. Or getting a hundred dollars as a result of two-boxing. I would argue that TDT (or UDT) is actually a more consequentialist theory than CDT. The qualitative difference between consequentialism and deontology is that for consequentialists the most important thing is a good outcome, whereas deontology means following the correct rules, regardless of the outcome. But it's causal decision theorists, after all, that continue to adhere to their decision ritual that two-boxes, and loses, in the face of all the empirical evidence (well, hypothetical empirical evidence, anyway :p) that it's the wrong thing to do!
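To put rough numbers on the one-boxing versus two-boxing comparison, here is a minimal sketch. It assumes the payoffs named above (a million dollars in the opaque box, a hundred in the transparent one) and adds one thing the comment does not specify: a predictor with a fixed accuracy. The function names and the `accuracy` parameter are hypothetical, introduced only for illustration.

```python
def expected_payoffs(accuracy, big=1_000_000, small=100):
    """Expected dollars for each choice, given a predictor that is right
    with probability `accuracy` (an assumed parameter, not from the thread)."""
    # One-boxer: the opaque box is full only if the predictor foresaw one-boxing.
    one_box = accuracy * big
    # Two-boxer: always gets the small box, plus the big one
    # only when the predictor wrongly expected one-boxing.
    two_box = small + (1 - accuracy) * big
    return one_box, two_box


def break_even_accuracy(big=1_000_000, small=100):
    """Predictor accuracy above which one-boxing has the higher expectation."""
    # Solve accuracy * big = small + (1 - accuracy) * big for accuracy.
    return (big + small) / (2 * big)


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.99):
        one, two = expected_payoffs(p)
        print(f"accuracy={p}: one-box={one:,.0f}, two-box={two:,.0f}")
    print("break-even accuracy:", break_even_accuracy())
```

Under these assumptions the predictor only needs to be slightly better than a coin flip for one-boxing to come out ahead in expectation, which is one way to read the claim that the two-boxing ritual "loses".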
TDT basically takes into consideration the consequences of itself -- not just each particular action it endorses, but the consequences of you following a specific logic towards that action, and the consequences of other people knowing that you would follow such a logic. It's a consequentialist theory because it seeks to maximize the utility of consequent states of the world -- it doesn't have deontological instructions like "cooperate because it's the nice thing to do" -- it says things like "cooperate if and only if the other guy would be able to predict and punish your non-cooperation, because that leads to an optimal-utility state for you" All that having been said, I think some people are misusing TDT when they say people would know your non-cooperation. Omega would know your non-cooperation, but other people you may be able to trick. And TDT orders cooperation only in the cases of those you wouldn't be able to trick.
But then people you would (otherwise) be able to trick have the incentive to defect, making it harder to trick them, making (D,D) more likely than (C,C), which is bad for you. Having an intention to trick those you can trick can itself be a bad idea (for some categories of trickable opponents that respond to your having this intention).
Yes, it can be a bad idea -- I'm just saying TDT doesn't say it's always a bad idea.
(DefectBot is sufficient to demonstrate it's not always a bad idea to defect. In other cases, it can be much more subtle.)
TDT can't reason about such things, it gets its causal graphs by magic, and this reasoning involves details of construction of the causal graphs (it can still make the right decisions, provided the magic comes through). UDT is closer to the mark, but we don't have a good picture of how that works. See in particular this thought experiment.

I wonder if most of the responses to JJT's thought experiment consider the least convenient possible world. (Recall Yvain's insightful discussion about Pascal's wager?)

Most of the responses that I have read try to argue that if the act of killing a healthy person to steal his organs for organ-missing people were generalized, this would make things worse.

By the way, this worry about generalizing one's individual act feels so close to thoughts of Kant - oh the irony! - whose "first formulation of the CI states that you are to 'act only in accordance wit... (read more)

Your doctor with 5 organs strikes me as Vizzini's princess bride dilemma, "I am not a great fool, so I can clearly not choose the wine in front of you."

So it goes, calculating I know you know I know unto silliness. Consequentialists I've recently heard lecturing went to great lengths, as you did, to rationalize what they "knew" to be right. Can you deny it? The GOAL of the example was to show that "right thinking" consequentialists would come up with the same thing all our reptile brains are telling us to do.

When you throw a ball... (read more)

People's moral intuitions are incoherent, and will tend to return different answers for the same dilemma phrased in different terms. Our evolved heuristics have their uses, among them is not turning into a social pariah in a group that relies on the same heuristics, but they're certainly not isomorphic to strict consequentialism.

Consequentialism and deontologism can be encoded in terms of one another (please allow me to temporarily mischaracterize the discussion as if there were only two options consequentialism and deontologism). Both theories have "free parameters"; consequentialism has preferences over states, and deontologism has precepts (should-rules). By carefully setting the free parameters, you can turn one into the other. The deontologist can say "You should make decisions by considering their consequences according to this utility function", and the ... (read more)

How would this be different than inserting epicycles to express geocentrism as heliocentrism? From consequentialist and deontological perspectives, I don't think this is an insight that dissolves the question.
The difference is that this is an 'ought' problem rather than an 'is' problem and we have no reason to believe our values should be simple.
There are various distinguishing points between geocentrism and heliocentrism - one is that heliocentrism had fewer free parameters. More elaborately: Certain parameters of the epicycles had to be delicately balanced, and heliocentrism "predicted" that those parameters would turn out to be balanced. This sense of prediction has nothing to do with previous-in-time, and everything to do with opportunities for falsification; if ever those parameters drifted from perfect balance, heliocentrism would be falsified. Heliocentrism was preferable (even though it was slightly less accurate at first) because it exposed itself to a more stringent test. Overall, what I'm saying is that those two theories are asymmetric.

If I understand correctly, you may also reach your position without using a non-causal decision theory if you mix utilitarianism with the deontological constraint of being honest (or at least meta-honest [see]) about the moral decisions you would make.

If people would ask you whether you would kill/did kill a patient, and you couldn't confidently say "No" (because of the deontological constraint of (meta-)honesty), that would be pretty bad, so you must... (read more)

But what if the doctor is confident of keeping it a secret? Well, then causal decision theory would indeed tell her to harvest his organs, but TDT (and also UDT) would strongly advise her against it. Because if TDT endorsed the action, then other people would be able to deduce that TDT endorsed the action, and that (whether or not it had happened in any particular case) their lives would be in danger in any hospital run by a timeless decision theorist, and then we'd be in much the same boat. Therefore TDT calculates that the correct thing for TDT to outpu

... (read more)
It might be better if doctors could make hard choices like this and keep it absolutely secret, but it's nearly impossible to contrive. As long as most people strongly disapprove of that sort of action, and people who want to become doctors do not have overwhelmingly different inclinations, and the training itself does not explicitly advocate for that sort of action, the vast majority of doctors will not take the organ harvester side in the dilemma, even in circumstances where they think they can get away with it (which will be rare,) and those are the basic minimum requirements to pull it off without people guessing. A society where the public didn't mind doctors harvesting the few to save the many would probably be considerably better off, but that would require the thoughts and actions of the entire society to be different, not just the doctors within it. Following consequentialist ethics doesn't mean that you should behave as you would in the highest possible utility world, if that doesn't increase utility in the world in which you actually find yourself.
The world we find ourselves in would never expect the doctor to cut the guy up. Few people are doing that consequentialist math. Well, maybe a few long thinkers on this site. So, the supposed long view as reason for not doing it is baloney. I think on that basis alone the experiment fails to come up recommending the conventional behavior it's trying to rationalize.
We would never expect the doctors to cut the guy up, but hardly any doctors would cut the guy up. Doctors are drawn from the same pool as the rest of society, so society's expectations of their behavior are pretty much on point. In a world where doctors were likely to cut the person up, the public would also be a lot more likely to expect doctors to cut the person up.

I'm not so convinced that the doctor should not harvest the organs.

In order for there to be a general rule against organ harvesting by that doctor there have to be enough other people who are TDT and who will make the same disinterested decision that the doctor did and who will be caught and scandalized by the media that people all over the place stop going to the doctor's office. I don't think it's very likely that all of those conditions are met sufficiently. Also, the impact of having some people stop going to the doctor's and get sick might arguably no... (read more)

Because if TDT endorsed the action, then other people would be able to deduce that TDT endorsed the action, and that (whether or not it had happened in any particular case) their lives would be in danger in any hospital run by a timeless decision theorist, and then we'd be in much the same boat. Therefore TDT calculates that the correct thing for TDT to output in order to maximize utility is "Don't kill the traveler," and thus the doctor doesn't kill the traveler.

TDT could deduce that people would deduce that TDT would not endorse the action, ... (read more)

I don't think that's right. A TDT agent wants people to deduce that TDT would not endorse the action, and therefore TDT would not endorse the action. If it did, it would be the equivalent of defecting in the Prisoner's Dilemma - the other guy would simulate you defecting even if he cooperated, and therefore defect himself, and you end up choosing a sub-optimal option. You can't say "the other guy's going to cooperate so I'll defect" - the other guy's only going to cooperate if he thinks you are (and he thinks you wouldn't if he defects), and if your decision theory is open to the consideration "the other guy's going to cooperate so I'll defect", the other won't think you'll cooperate if he does, and will therefore defect. You can't assume that you've thought it all through one more time than the other guy.
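A toy sketch of this argument, under loud assumptions: the payoff numbers are the conventional textbook Prisoner's Dilemma values (not taken from the thread), and "the other guy simulates you" is modelled crudely as an opponent who runs your policy and copies its output. All names here are hypothetical.

```python
# Assumed conventional payoffs for (my_move, their_move); only the ordering matters.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def cooperate():
    return "C"


def defect():
    return "D"


def mirror_opponent(my_policy):
    """Opponent who simulates my policy and cooperates only if I would."""
    return "C" if my_policy() == "C" else "D"


def fixed_opponent(move):
    """Opponent whose move does not depend on my policy at all."""
    return lambda my_policy: move


def my_payoff(my_policy, opponent):
    """My payoff when the opponent may inspect my policy before moving."""
    my_move = my_policy()
    their_move = opponent(my_policy)
    return PAYOFF[(my_move, their_move)]


if __name__ == "__main__":
    # Against an opponent whose move is fixed, defecting looks better.
    print(my_payoff(defect, fixed_opponent("C")), my_payoff(cooperate, fixed_opponent("C")))
    # Against an opponent who simulates me, "he'll cooperate so I'll defect"
    # is never available: my policy determines both moves.
    print(my_payoff(defect, mirror_opponent), my_payoff(cooperate, mirror_opponent))
```

The design choice worth noting is that the opponent receives the policy rather than the move. That is what rules out having thought it through one more time than the other guy: whatever the policy outputs, the simulator has already seen it.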