SotW: Check Consequentialism

(The Exercise Prize series of posts is the Center for Applied Rationality asking for help inventing exercises that can teach cognitive skills.  The difficulty is coming up with exercises interesting enough, with a high enough hedonic return, that people actually do them and remember them; this often involves standing up and performing actions, or interacting with other people, not just working alone with an exercise booklet and a pencil.  We offer prizes of $50 for any suggestion we decide to test, and $500 for any suggestion we decide to adopt.  This prize also extends to LW meetup activities and good ideas for verifying that a skill has been acquired.  See here for details.)


Exercise Prize:  Check Consequentialism

In philosophy, "consequentialism" is the view that the moral value of an action is determined by its consequences, i.e., that actions should be chosen on the basis of their probable outcomes.  It seems like the mental habit of checking consequentialism, asking "What positive future events does this action cause?", would catch numerous cognitive fallacies.

For example, the mental habit of consequentialism would counter the sunk cost fallacy - if a PhD wouldn't really lead to much in the way of desirable job opportunities or a higher income, and the only reason you're still pursuing your PhD is that otherwise all your previous years of work will have been wasted, you will find yourself encountering a blank screen at the point where you try to imagine a positive future outcome of spending another two years working toward your PhD - you will not be able to state what good future events happen as a result.

Or consider the problem of living in the should-universe; if you're thinking, I'm not going to talk to my boyfriend about X because he should know it already, you might be able to spot this as an instance of should-universe thinking (planning/choosing/acting/feeling as though within / by-comparison-to an image of an ideal perfect universe) by having done exercises specifically to sensitize you to should-ness.  Or, if you've practiced the more general skill of Checking Consequentialism, you might notice a problem on asking "What happens if I talk / don't talk to my boyfriend?" - providing that you're sufficiently adept to constrain your consequentialist visualization to what actually happens as opposed to what should happen.

Discussion:

The skill of Checking Consequentialism isn't quite as simple as telling people to ask, "What positive result do I get?"  By itself, this mental query is probably going to return any apparent justification - for example, in the sunk-cost-PhD example, asking "What good thing happens as a result?" will just return, "All my years of work won't have been wasted!  That's good!"  Any choice people are tempted by seems good for some reason, and executing a query about "good reasons" will just return this.

The novel part of Checking Consequentialism is the ability to discriminate "consequentialist reasons" from "non-consequentialist reasons" - being able to distinguish that "Because a PhD gets me a 50% higher salary" talks about future positive consequences, while "Because I don't want my years of work to have been wasted" doesn't.

It's possible that asking "At what time does the consequence occur and how long does it last?" would be useful for distinguishing future-consequences from non-future-consequences - if you take a bad-thing like "I don't want my work to have been wasted" and ask "When does it occur, where does it occur, and how long does it last?", you will with luck notice the error.

Learning to draw cause-and-effect directed graphs, a la Judea Pearl and Bayes nets, seems like it might be helpful - at least, Geoff was doing this while trying to teach strategicness and the class seemed to like it.

Sometimes non-consequentialist reasons can be rescued as consequentialist ones.  "You shouldn't kill because it's the wrong thing to do" can be rescued as "Because then a person will transition from 'alive' to 'dead' in the future, and this is a bad event" or "Because the interval between Outcome A and Outcome B includes the interval from Fred alive to Fred dead."

On a five-second level, the skill would have to include:

  • Being cued by some problem to try looking at the consequences;
  • Either directly having a mental procedure that only turns up consequences, like trying to visualize events out into the future, or
  • First asking 'Why am I doing this?' and then looking at the justifications to check if they're consequentialist, perhaps using techniques like asking 'How long does it last?', 'When does it happen?', or 'Where does it happen?'.
  • Expending a small amount of effort to see if a non-consequentialist reason can easily translate into a consequentialist one in a realistic way.
  • Making the decision whether or not to change your mind.
  • If necessary, detaching from the thing you were doing for non-consequentialist reasons.
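The checklist above could be sketched as a short procedure. This is a toy illustration only; the reasons and the True/False labels below are assumptions for the example, not anything from the post:

```python
# A sketch of the five-second checklist as code.

def check_consequentialism(justifications):
    """Split stated reasons into consequentialist ones (they name a
    future event) and suspect ones (they don't), then decide whether
    any consequentialist reason survives."""
    consequentialist = [r for r, is_future in justifications.items() if is_future]
    suspect = [r for r, is_future in justifications.items() if not is_future]
    # A fuller pass would also expend a small effort to rescue each
    # suspect reason as a consequentialist one before deciding.
    return bool(consequentialist), suspect

keep, suspect = check_consequentialism({
    "a PhD gets me a 50% higher salary": True,         # future consequence
    "my years of work won't have been wasted": False,  # points at the past
})
print(keep, suspect)
```

The point of the sketch is only that the discrimination step, consequentialist versus non-consequentialist, is the part that has to be learned; the bookkeeping around it is trivial.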

In practice, it may be obvious that you're making a mistake as soon as you think to check consequences.  I have 'living in the should-universe' or 'sunk cost fallacy' cached to the point where as soon as I spot an error of that pattern, it's usually pretty obvious (without further deliberative thought) what the residual reasons are and whether I was doing it wrong.

Pain points & Pluses:

(When generating a candidate kata, almost the first question we ask - directly after the selection of a topic, like 'consequentialism' - is, "What are the pain points?  Or pleasure points?"  This can be errors you've made yourself and noticed afterward, or even cases where you've noticed someone else doing it wrong, but ideally cases where you use the skill in real life.  Since a lot of rationality is in fact about not screwing up, there may not always be pleasure points where the skill is used in a non-error-correcting, strictly positive context; but it's still worth asking each time.  We ask this question right at the beginning because it (a) checks to see how often the skill is actually important in real life and (b) provides concrete use-cases to focus discussion of the skill.)

Pain points:

Checking Consequentialism looks like it should be useful for countering:

  • Living in the should-universe (taking actions because of the consequences they ought to have, rather than the consequences they probably will have).  E.g., "I'm not going to talk to my girlfriend because she should already know X" or "I'm going to become a theoretical physicist because I ought to enjoy theoretical physics."
  • The sunk cost fallacy (choosing to prevent previously expended, non-recoverable resources from having been wasted in retrospect - i.e., avoiding the mental pain of reclassifying a past investment as a loss - rather than acting for the sake of future considerations).  E.g., "If I give up on my PhD, I'll have wasted the last three years."
  • Cached thoughts and habits; "But I usually shop at Whole Foods" or "I don't know, I've never tried an electric toothbrush before."  (These might have rescuable consequences, but as stated, they aren't talking about future events.)
  • Acting-out an emotion - one of the most useful pieces of advice I got from Anna Salamon was to find other ways to act out an emotion than strategic choices.  If you're feeling frustrated with a coworker, you might still want to Check Consequentialism on "Buy them dead flowers for their going-away party" even though it seems to express your frustration.
  • Indignation / acting-out of morals - "Drugs are bad, so drug use ought to be illegal", where it's much harder to make the case that countries which decriminalized marijuana experienced worse net outcomes.  (Though it should be noted that you also have to Use Empiricism to ask the question 'What happened to other countries that decriminalized marijuana?' instead of making up a gloomy consequentialist prediction to express your moral disapproval.)
  • Identity - "I'm the sort of person who belongs in academia."
  • "Trying to do things" for simply no reason at all, while your brain still generates activities and actions, because nobody ever told you that behaviors ought to have a purpose or that lack of purpose is a warning sign.  This habit can be inculcated by schoolwork, wanting to put in 8 hours before going home, etc.  E.g. you "try to write an essay", and you know that an essay has paragraphs; so you try to write a bunch of paragraphs but you don't have any functional role in mind for each paragraph.  "What is the positive consequence of this paragraph?" might come in handy here.

(This list is not intended to be exhaustive.)

Pleasure points:

  • Being able to state and then focus on a positive outcome seems like it should improve motivation, at least in cases where the positive outcome is realistically attainable to a non-frustrating degree and has not yet been subject to hedonic adaptation.  E.g., a $600 job may be more motivating if you visualize the $600 laptop you're going to buy with the proceeds.

Also, consequentialism is the foundation of expected utility, which is the foundation of instrumental rationality - this is why we're considering it as an early unit.  (This is not directly listed as a "pleasure point" because it is not directly a use-case.)

Constantly asking about consequences seems likely to improve overall strategicness - not just lead to the better of two choices being taken from a fixed decision-set, but also having goals in mind that can generate new perceived choices, i.e., improve the overall degree to which people do things for reasons, as opposed to not doing things or not having reasons.  (But this is a hopeful eventual positive consequence of practicing the skill, not a use-case where the skill is directly being applied.)

Teaching & exercises:

This is the part that's being thrown open to Less Wrong generally.  Hopefully I've described the skill in enough detail to convey what it is.  Now, how would you practice it?  How would you have an audience practice it, hopefully in activities carried out with each other?

The dumb thing I tried to do previously was to have exercises along the lines of, "Print up a booklet with little snippets of scenarios in them, and ask people to circle non-consequentialist reasoning, then try to either translate it to consequentialist reasons or say that no consequentialist reasons could be found."  I didn't do that for this exact session, but if you look at what I did with the sunk cost fallacy, it's the same sort of silly thing I tried to do.

This didn't work very well - maybe the exercises were too easy, or maybe it was that people were doing it alone, or maybe we did something else wrong, but the audience appeared to experience insufficient hedonic return.  They were, in lay terms, unenthusiastic.

At this point I should like to pause, and tell a recent and important story.  On Saturday I taught an 80-minute unit on Bayes's Rule to an audience of non-Sequence-reading experimental subjects, who were mostly either programmers or in other technical subjects, so I could go through the math fairly fast.  Afterward, though, I was worried that they hadn't really learned to apply Bayes's Rule and wished I had a small little pamphlet of practice problems to hand out.  I still think this would've been a good idea, but...

On Wednesday, I attended Andrew Critch's course at Berkeley, which was roughly mostly-instrumental LW-style cognitive-improvement material aimed at math students; and in this particular session, Critch introduced Bayes's Theorem, not as advanced math, but with the aim of getting them to apply it to life.

Critch demonstrated using what he called the Really Getting Bayes game.  He had Nisan (a local LWer) touch an object to the back of Critch's neck, a cellphone as it happened, while Critch faced in the other direction; this was "prior experience".  Nisan said that the object was either a cellphone or a pen.  Critch gave prior odds of 60% : 40% that the object was a cellphone vs. pen, based on his prior experience.  Nisan then asked Critch how likely he thought it was that a cellphone or a pen would be RGB-colored, i.e., colored red, green, or blue.  Critch didn't give exact numbers here, but said he thought a cellphone was more likely to be primary-colored, and drew some rectangles on the blackboard to illustrate the likelihood ratio.  After being told that the object was in fact primary-colored (the cellphone was metallic blue), Critch gave posterior odds of 75% : 25% in favor of the cellphone, and then turned around to look.

Then Critch broke up the class into pairs and asked each pair to carry out a similar operation on each other:  Pick two plausible objects and make sure you're holding at least one of them, touch it to the other person while they face the other way, prior odds, additional fact, likelihood ratio, posterior odds.
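The arithmetic behind Critch's update can be checked in odds form. Since he gave prior odds of 60% : 40% and posterior odds of 75% : 25% but no exact likelihoods, the 2 : 1 likelihood ratio below is inferred from those two endpoints:

```python
def update_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio

# Prior odds of cellphone vs. pen: 60% : 40% = 1.5 : 1.
prior = 0.60 / 0.40

# Inferred likelihood ratio: an RGB-colored object is taken to be
# twice as likely given a cellphone as given a pen.
posterior = update_odds(prior, 2.0)           # 3.0, i.e. 3 : 1
posterior_prob = posterior / (1 + posterior)  # 0.75, matching the stated 75% : 25%
print(posterior, posterior_prob)
```

This is exactly the operation the paired exercise drills: prior odds, likelihood ratio, multiply, read off the posterior.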

This is the sort of in-person, hands-on, real-life, and social exercise that didn't occur to me, or Anna, or anyone else helping, while we were trying to design the Bayes's Theorem unit.  Our brains just didn't go in that direction, though we recognized it as embarrassingly obvious in retrospect.

So... how would you design an exercise to teach Checking Consequentialism?

Comments (311):

So... how would you design an exercise to teach Checking Consequentialism?

I would check to see if such a thing already exists or if there are people who have experience designing such things. I know of a Belgian non-profit 'Center for Informative Games' that not only rents games designed to teach certain skills but will also help you create your own.

From their site: On request, C.I.S. develops games for others. The applicant provides the content of the game, while C.I.S. develops the conceptual and game-technical part. The applicant has the opportunity to attend some of the game tests and to redirect when necessary.

They also offer coaching if you want to work on your own: Do you want to create a game concept on your own but don't know where to start? No worries, C.I.S. can give you a hand. Over a number of concrete working sessions, C.I.S. facilitates your development; between sessions, the applicant continues the work, finally ending up with a solid game.

I have enjoyed their games in the past and can attest to their quality. The obvious problem is that it's a purely Belgian organization and the 'search' function only works with Dutch words. However, if you want to check them out, I'd be happy to act as a go-between. For the past couple of months there has even been a Brussels LW meetup, and I'm certain I could get a couple of members to help in the production process (again, if this seems interesting).

In a group, with a leader who knows the exercise:

Get a volunteer to act as a judge (or a few to act as a jury, in a large group). Have her leave the room. The leader presents the rest with a short set of Contrived Hypothetical Situations, each with finite options and either clearly-defined outcomes for each option, or a probabilistic distribution of outcomes for each option. The leader says, "Please write down your choice for each problem, sign your paper, and turn it in to me. Then I'll call in the judge, and have her decide on each problem. You get a point wherever her decision agrees with yours. The winner is the one with the most points." When the judge is called in, however, the leader doesn't tell her the actual problems. Rather, the leader just reports the outcomes (or distributions), and asks her to choose which outcome or distribution is best. The winners are announced based on that.

Example: One of the situations given is some variant of the trolley problem. When the judge comes in, she is just asked whether she'd prefer one person to get hit by a trolley, or five. Everybody laughs as she replies "...one?"

Example: The problem given to the group is "You drive 45 minutes away from home to go to a new restaurant for dinner. When you get there, you discover that you dislike the ambience and the selection is poor. You remember that you have decent leftovers at home. You're mildly hungry. Do you try the restaurant anyway (25-minute wait, 10% very enjoyable meal, 10% decent meal, 80% unenjoyable meal) or just head back home (5-minute-prep once you get home, 100% chance decent meal)?" The problem given to the judge is "You're mildly hungry. In 25 minutes, you can have a meal that is (10% very enjoyable, 10% decent, 80% unenjoyable). Or, in 50 minutes, you can have a guaranteed decent meal."
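Under hypothetical utility numbers (pure assumptions for illustration; the post assigns none), the judge's version of the problem reduces to a one-line expected-utility comparison:

```python
# Hypothetical utilities for each meal outcome; these numbers are
# assumptions chosen only to make the comparison concrete.
utility = {"very enjoyable": 10, "decent": 5, "unenjoyable": 0}

# Option A: stay at the restaurant (meal in 25 minutes).
restaurant = (0.10 * utility["very enjoyable"]
              + 0.10 * utility["decent"]
              + 0.80 * utility["unenjoyable"])  # 1.5

# Option B: head home (guaranteed decent meal in 50 minutes).
home = 1.00 * utility["decent"]  # 5.0

# With these utilities, and ignoring the 25-minute time difference,
# heading home wins; a per-minute waiting cost could be subtracted
# from each option to make that trade-off explicit.
print(restaurant, home)
```

The exercise's point survives the specific numbers: the judge, seeing only the two outcome distributions, is doing the calculation the group should have been doing.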

I think this is a fantastic idea, with a patch that is much easier than those suggested by the other responses. Simply tell everyone that for the purposes of this exercise, only that information directly presented in the example is to be considered. People sometimes overlook relevant information or clever third options, and these situations are to be judged only based on the data being considered by the hypothetical person in the given scenario.

If there is any concern about this set up encouraging people to think about things with an insufficient amount of thoroughness, you can save some time at the end for a just-for-fun period where everyone gets to offer their clever workarounds and extra information that would have changed what the proper decision was, had it been considered.

Two details the judge isn't told about are: 1) you would have to pay for the former meal but not for the latter, and 2) if you stay at the restaurant, you gain useful information you'll be able to take into account the next time you might want to eat there.

1) is patchable by specifying that the leftovers are non-perishable, so eating them is equivalent to buying a meal.

2) Either the judge is told that the variable meal is repeatable if it's good, or we specify in the group problem that you're not going back there no matter what.

Couldn't the problems others have brought up regarding this scenario be fixed by specifying that this is your last meal ever before the world ends tomorrow morning before breakfast? Then neither information nor money is valuable anymore.

I think I'd make a decision other than "try that new restaurant on the outskirts of town" for the evening before the world ends. If I don't know the world is going to end, then my decision now mightn't be optimal in light of that additional information (maybe that still tests something interesting, but it isn't quite the same thing).

Hmm. That could be a good point. If the world were ending, I probably wouldn't waste time on a sit-down meal.

How about if it's your last day in the country and you'll be fleeing to escape religious persecution tomorrow, taking nothing with you?

If you stay, you gain information about the restaurant. There's the dollar cost of dining out. It's actually not as easy as it looks to generate a "clean" example.

How much need we worry about excluding consequences we can't consciously list and/or quantify?

I patched this example by saying "you're on vacation in another city", so the value of information is mostly negligible.

But yeah, it's still pretty hard. Also, ideally not all of our examples end up being instances of sunk-cost-fallacy.