SotW: Check Consequentialism

by Eliezer Yudkowsky, 29th Mar 2012



(The Exercise Prize series of posts is the Center for Applied Rationality asking for help inventing exercises that can teach cognitive skills.  The difficulty is coming up with exercises interesting enough, with a high enough hedonic return, that people actually do them and remember them; this often involves standing up and performing actions, or interacting with other people, not just working alone with an exercise booklet and a pencil.  We offer prizes of $50 for any suggestion we decide to test, and $500 for any suggestion we decide to adopt.  This prize also extends to LW meetup activities and good ideas for verifying that a skill has been acquired.  See here for details.)

Exercise Prize:  Check Consequentialism

In philosophy, "consequentialism" is the belief that doing the right thing makes the world a better place, i.e., that actions should be chosen on the basis of their probable outcomes.  It seems like the mental habit of checking consequentialism, asking "What positive future events does this action cause?", would catch numerous cognitive fallacies.

For example, the mental habit of consequentialism would counter the sunk cost fallacy.  Suppose a PhD wouldn't really lead to much in the way of desirable job opportunities or a higher income, and the only reason you're still pursuing your PhD is that otherwise all your previous years of work will have been wasted.  Then you will find yourself encountering a blank screen at the point where you try to imagine a positive future outcome of spending another two years working toward your PhD - you will not be able to state what good future events happen as a result.

Or consider the problem of living in the should-universe; if you're thinking, I'm not going to talk to my boyfriend about X because he should know it already, you might be able to spot this as an instance of should-universe thinking (planning/choosing/acting/feeling as though within / by-comparison-to an image of an ideal perfect universe) by having done exercises specifically to sensitize you to should-ness.  Or, if you've practiced the more general skill of Checking Consequentialism, you might notice a problem on asking "What happens if I talk / don't talk to my boyfriend?" - provided that you're sufficiently adept to constrain your consequentialist visualization to what actually happens as opposed to what should happen.


The skill of Checking Consequentialism isn't quite as simple as telling people to ask, "What positive result do I get?"  By itself, this mental query is probably going to return any apparent justification - for example, in the sunk-cost-PhD example, asking "What good thing happens as a result?" will just return, "All my years of work won't have been wasted!  That's good!"  Any choice people are tempted by seems good for some reason, and executing a query about "good reasons" will just return this.

The novel part of Checking Consequentialism is the ability to discriminate "consequentialist reasons" from "non-consequentialist reasons" - being able to distinguish that "Because a PhD gets me a 50% higher salary" talks about future positive consequences, while "Because I don't want my years of work to have been wasted" doesn't.

It's possible that asking "At what time does the consequence occur and how long does it last?" would be useful for distinguishing future-consequences from non-future-consequences - if you take a bad-thing like "I don't want my work to have been wasted" and ask "When does it occur, where does it occur, and how long does it last?", you will with luck notice the error.

Learning to draw cause-and-effect directed graphs, a la Judea Pearl and Bayes nets, seems like it might be helpful - at least, Geoff was doing this while trying to teach strategicness and the class seemed to like it.
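
As a minimal sketch of what such a graph might look like for the PhD example: the node names, the arrows, and the helper `consequences` below are all made up for illustration, and this is plain reachability over a hand-drawn graph rather than anything like Pearl's full formalism.  The point it tries to show is that a consequence of an action is something an arrow path can reach from that action.

```python
# Hypothetical cause-and-effect graph for the PhD example.  The node names
# and arrows are illustrative assumptions, not taken from the original post.
causes = {
    "finish PhD": ["two more years of study", "degree in hand"],
    "degree in hand": ["job options change"],
    "two more years of study": [],
    "job options change": [],
    "years already spent": [],  # the past: no arrow from the action reaches it
}


def consequences(graph, action):
    """Return every node reachable from `action` by following arrows."""
    seen = set()
    stack = [action]
    while stack:
        node = stack.pop()
        for effect in graph.get(node, []):
            if effect not in seen:
                seen.add(effect)
                stack.append(effect)
    return seen


downstream = consequences(causes, "finish PhD")
print("job options change" in downstream)   # True
print("years already spent" in downstream)  # False
```

In this toy graph the already-spent years are present as a node, but nothing downstream of the decision points at them, which is one way of seeing why "so my work won't have been wasted" is not a future consequence.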

Sometimes non-consequentialist reasons can be rescued as consequentialist ones.  "You shouldn't kill because it's the wrong thing to do" can be rescued as "Because then a person will transition from 'alive' to 'dead' in the future, and this is a bad event" or "Because the interval between Outcome A and Outcome B includes the interval from Fred alive to Fred dead."

On a five-second level, the skill would have to include:

  • Being cued by some problem to try looking at the consequences;
  • Either directly having a mental procedure that only turns up consequences, like trying to visualize events out into the future, or
  • First asking 'Why am I doing this?' and then looking at the justifications to check if they're consequentialist, perhaps using techniques like asking 'How long does it last?', 'When does it happen?', or 'Where does it happen?'.
  • Expending a small amount of effort to see if a non-consequentialist reason can easily translate into a consequentialist one in a realistic way.
  • Making the decision whether or not to change your mind.
  • If necessary, detaching from the thing you were doing for non-consequentialist reasons.

In practice, it may be obvious that you're making a mistake as soon as you think to check consequences.  I have 'living in the should-universe' or 'sunk cost fallacy' cached to the point where as soon as I spot an error of that pattern, it's usually pretty obvious (without further deliberative thought) what the residual reasons are and whether I was doing it wrong.

Pain points & Pluses:

(When generating a candidate kata, almost the first question we ask - directly after the selection of a topic, like 'consequentialism' - is, "What are the pain points?  Or pleasure points?"  These can be errors you've made yourself and noticed afterward, or even cases where you've noticed someone else doing it wrong, but ideally cases where you use the skill in real life.  Since a lot of rationality is in fact about not screwing up, there may not always be pleasure points where the skill is used in a non-error-correcting, strictly positive context; but it's still worth asking each time.  We ask this question right at the beginning because it (a) checks to see how often the skill is actually important in real life and (b) provides concrete use-cases to focus discussion of the skill.)

Pain points:

Checking Consequentialism looks like it should be useful for countering:

  • Living in the should-universe (taking actions because of the consequences they ought to have, rather than the consequences they probably will have).  E.g., "I'm not going to talk to my girlfriend because she should already know X" or "I'm going to become a theoretical physicist because I ought to enjoy theoretical physics."
  • The sunk cost fallacy (choosing to prevent previously expended, non-recoverable resources from having been wasted in retrospect - i.e., avoiding the mental pain of reclassifying a past investment as a loss - rather than acting for the sake of future considerations).  E.g., "If I give up on my PhD, I'll have wasted the last three years."
  • Cached thoughts and habits; "But I usually shop at Whole Foods" or "I don't know, I've never tried an electric toothbrush before."  (These might have rescuable consequences, but as stated, they aren't talking about future events.)
  • Acting-out an emotion - one of the most useful pieces of advice I got from Anna Salamon was to find other ways to act out an emotion than strategic choices.  If you're feeling frustrated with a coworker, you might still want to Check Consequentialism on "Buy them dead flowers for their going-away party" even though it seems to express your frustration.
  • Indignation / acting-out of morals - "Drugs are bad, so drug use ought to be illegal", where it's much harder to make the case that countries which decriminalized marijuana experienced worse net outcomes.  (Though it should be noted that you also have to Use Empiricism to ask the question 'What happened to other countries that decriminalized marijuana?' instead of making up a gloomy consequentialist prediction to express your moral disapproval.)
  • Identity - "I'm the sort of person who belongs in academia."
  • "Trying to do things" for simply no reason at all, while your brain still generates activities and actions, because nobody ever told you that behaviors ought to have a purpose or that lack of purpose is a warning sign.  This habit can be inculcated by schoolwork, wanting to put in 8 hours before going home, etc.  E.g. you "try to write an essay", and you know that an essay has paragraphs; so you try to write a bunch of paragraphs but you don't have any functional role in mind for each paragraph.  "What is the positive consequence of this paragraph?" might come in handy here.

(This list is not intended to be exhaustive.)

Pleasure points:

  • Being able to state and then focus on a positive outcome seems like it should improve motivation, at least in cases where the positive outcome is realistically attainable to a non-frustrating degree and has not yet been subject to hedonic adaptation.  E.g., a $600 job may be more motivating if you visualize the $600 laptop you're going to buy with the proceeds.

Also, consequentialism is the foundation of expected utility, which is the foundation of instrumental rationality - this is why we're considering it as an early unit.  (This is not directly listed as a "pleasure point" because it is not directly a use-case.)

Constantly asking about consequences seems likely to improve overall strategicness - not just lead to the better of two choices being taken from a fixed decision-set, but also having goals in mind that can generate new perceived choices, i.e., improve the overall degree to which people do things for reasons, as opposed to not doing things or not having reasons.  (But this is a hopeful eventual positive consequence of practicing the skill, not a use-case where the skill is directly being applied.)

Teaching & exercises:

This is the part that's being thrown open to Less Wrong generally.  Hopefully I've described the skill in enough detail to convey what it is.  Now, how would you practice it?  How would you have an audience practice it, hopefully in activities carried out with each other?

The dumb thing I tried to do previously was to have exercises along the lines of, "Print up a booklet with little snippets of scenarios in them, and ask people to circle non-consequentialist reasoning, then try to either translate it to consequentialist reasons or say that no consequentialist reasons could be found."  I didn't do that for this exact session, but if you look at what I did with the sunk cost fallacy, it's the same sort of silly thing I tried to do.

This didn't work very well - maybe the exercises were too easy, or maybe it was that people were doing it alone, or maybe we did something else wrong, but the audience appeared to experience insufficient hedonic return.  They were, in lay terms, unenthusiastic.

At this point I should like to pause, and tell a recent and important story.  On Saturday I taught an 80-minute unit on Bayes's Rule to an audience of non-Sequence-reading experimental subjects, who were mostly either programmers or in other technical subjects, so I could go through the math fairly fast.  Afterward, though, I was worried that they hadn't really learned to apply Bayes's Rule and wished I had a small pamphlet of practice problems to hand out.  I still think this would've been a good idea, but...

On Wednesday, I attended Andrew Critch's course at Berkeley, which was roughly mostly-instrumental LW-style cognitive-improvement material aimed at math students; and in this particular session, Critch introduced Bayes's Theorem, not as advanced math, but with the aim of getting them to apply it to life.

Critch demonstrated using what he called the Really Getting Bayes game.  He had Nisan (a local LWer) touch an object to the back of Critch's neck, a cellphone as it happened, while Critch faced in the other direction; this was "prior experience".  Nisan said that the object was either a cellphone or a pen.  Critch gave prior odds of 60% : 40% that the object was a cellphone vs. pen, based on his prior experience.  Nisan then asked Critch how likely he thought it was that a cellphone or a pen would be RGB-colored, i.e., colored red, green, or blue.  Critch didn't give exact numbers here, but said he thought a cellphone was more likely to be primary-colored, and drew some rectangles on the blackboard to illustrate the likelihood ratio.  After being told that the object was in fact primary-colored (the cellphone was metallic blue), Critch gave posterior odds of 75% : 25% in favor of the cellphone, and then turned around to look.

Then Critch broke up the class into pairs and asked each pair to carry out a similar operation on each other:  Pick two plausible objects and make sure you're holding at least one of them, touch it to the other person while they face the other way, prior odds, additional fact, likelihood ratio, posterior odds.
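
For readers who want the arithmetic behind those four steps, here is a small sketch using the numbers from Critch's demonstration.  The function names are my own, and the 2 : 1 likelihood ratio is only what the stated prior (60% : 40%) and posterior (75% : 25%) imply; Critch did not give an exact ratio, so treat it as derived rather than reported.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes's Rule: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favor (as a single ratio) to a probability."""
    return odds / (1 + odds)


# Numbers from the demonstration: cellphone vs. pen.
prior = Fraction(60, 40)      # prior odds of 60% : 40%, i.e. 3 : 2
posterior = Fraction(75, 25)  # posterior odds of 75% : 25%, i.e. 3 : 1

# Likelihood ratio implied by those two figures (an inference, not a quote).
implied_lr = posterior / prior
print(implied_lr)  # 2

# Running the update forward recovers the stated posterior.
print(posterior_odds(prior, implied_lr))  # 3
print(odds_to_probability(posterior))     # 3/4
```

Exact fractions are used instead of floats so that the odds come out as clean ratios.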

This is the sort of in-person, hands-on, real-life, and social exercise that didn't occur to me, or Anna, or anyone else helping, while we were trying to design the Bayes's Theorem unit.  Our brains just didn't go in that direction, though we recognized it as embarrassingly obvious in retrospect.

So... how would you design an exercise to teach Checking Consequentialism?