"Irrational" implies making a bad choice when a good choice is available. If Bob was mistaken, he was just mistaken. If he knew he could easily check the store hours on his phone but decided not to and spent 15 minutes driving to the store, he was irrational.

It seems like you’re burying a lot of the work in the word “available”. Is it “available” if it wasn’t on his mind, even though he could answer “yes, it would be easy to check” when asked? Is it “available” when it’s not on his mind, and reminding him wouldn’t change his decision because he has other reasons for it? What if he doesn’t have other reasons, but would do things differently if you taught him? What if a different path were taken on any of those forks?

I can think of a lot of different ways for someone to "know he could easily check store hours" and then not do it, and I would describe them all differently - and none of them seem best described as “irrational”, except perhaps as sloppy shorthand for “suboptimal decision algorithm”.

Because she is dumb and unable to exercise self-control.

That’s certainly one explanation, and useful for some things, but less useful for many others. Again, shorthand is fine if seen for what it is. In other cases though, I might want a more detailed answer that explains why she is “unable” to exercise self control - say, for example, if I wanted to change it. The word “irrational” makes perfect sense if you think changing things like this is impossible. If you see it as a matter of disentangling the puzzle, it makes less sense.

It seems to me you just don't like the word "irrational". Are there situations where you think it applies? In what cases would you use this word?

It’s not that I “don’t like” the word - I don't “try not to use it” or anything. It’s just that I’ve noticed that it has left my vocabulary on its own once I started trying to change behaviors that seemed irrational to me instead of letting it function as a mental stop sign. It just seems that the only thing “irrational” means, beyond “suboptimal”, is an implicit claim that there are no further answers - and that is empirically false (and other bad things). So in that sense, no, I’d never use the word because I think that the picture it tries to paint is fundamentally incoherent.

If that connotation is disclaimed and you want to use it to mean more than “suboptimal”, then “driven by motivated cognition” is probably one of the closer things to the feeling I get from the word “irrational” - but as this post by Anna shows, even that can have actual reasons behind it, and I usually want the extra precision of actually spelling out what I think is happening.

If I were to use the word myself (as opposed to running with it when someone else uses it), it would only be in a case where the person I’m talking to understands the implicit “[but there are reasons for this, and there’s more that could be learned/done if this case were to become important. It’s not]”.

EDIT: I also could conceivably use it in describing someone's behavior to them if I anticipated that they'd agree and change their behavior if I did.

instead of letting it function as a mental stop sign

I don't know why you let it function as a stop sign in the first place. "Irrational" means neither "random" nor "inexplicable" -- to me it certainly does not imply that "there are no further answers". As I mentioned upthread, I can consider someone's behaviour irrational and at the same time understand why that someone is doing this and see the levers to change him.

The difference that I see from "suboptimal" is that suboptimal implies that you'll still ...

“Flinching away from truth” is often about *protecting* the epistemology

by AnnaSalamon · 6 min read · 20th Dec 2016 · 56 comments

124


Related to: Leave a line of retreat; Categorizing has consequences.

There’s a story I like, about this little kid who wants to be a writer. So she writes a story and shows it to her teacher.

“You misspelt the word ‘ocean’”, says the teacher.

“No I didn’t!”, says the kid.

The teacher looks a bit apologetic, but persists: “‘Ocean’ is spelt with a ‘c’ rather than an ‘sh’; this makes sense, because the ‘e’ after the ‘c’ changes its sound…”

“No I didn’t!” interrupts the kid.

“Look,” says the teacher, “I get it that it hurts to notice mistakes. But that which can be destroyed by the truth should be! You did, in fact, misspell the word ‘ocean’.”

“I did not!” says the kid, whereupon she bursts into tears, and runs away and hides in the closet, repeating again and again: “I did not misspell the word! I can too be a writer!”.

I like to imagine the inside of the kid’s head as containing a single bucket that houses three different variables that are initially all stuck together:

Original state of the kid's head:

The goal, if one is seeking actual true beliefs, is to separate out each of these variables into its own separate bucket, so that the “is ‘oshun’ spelt correctly?” variable can update to the accurate state of "no", without simultaneously forcing the "Am I allowed to pursue my writing ambition?" variable to update to the inaccurate state of "no".

Desirable state (requires somehow acquiring more buckets):

The trouble is, the kid won’t necessarily acquire enough buckets by trying to “grit her teeth and look at the painful thing”. A naive attempt to "just refrain from flinching away, and form true beliefs, however painful" risks introducing a more important error than her current spelling error: mistakenly believing she must stop working toward being a writer, since the bitter truth is that she spelled 'oshun' incorrectly.

State the kid might accidentally land in, if she naively tries to "face the truth":

(You might take a moment, right now, to name the cognitive ritual the kid in the story *should* do (if only she knew the ritual). Or to name what you think you'd do if you found yourself in the kid's situation -- and how you would notice that you were at risk of a "buckets error".)
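(The bucket picture above can also be sketched as a toy model in code. This is purely illustrative, and the class and attribute names are mine, not the post's: the point is just that when several questions all read off one shared variable, updating any one of them forces the others along.)

```python
# Toy model of a "buckets error": in the one-bucket head, three distinct
# questions are all answered by a single shared variable, so an accurate
# update to one question corrupts the others. In the many-buckets head,
# each question gets its own variable and can update independently.

class OneBucketKid:
    def __init__(self):
        self.bucket = True  # one variable answers every question

    def observe_misspelling(self):
        self.bucket = False  # accurate update about the spelling...

    @property
    def spelled_oshun_correctly(self):
        return self.bucket

    @property
    def can_become_writer(self):
        return self.bucket  # ...also (wrongly) wipes out the ambition


class ManyBucketsKid:
    def __init__(self):
        self.spelled_oshun_correctly = True
        self.can_become_writer = True  # tracked in its own bucket

    def observe_misspelling(self):
        self.spelled_oshun_correctly = False  # only this variable updates


kid = OneBucketKid()
kid.observe_misspelling()
assert kid.can_become_writer is False  # the buckets error

kid2 = ManyBucketsKid()
kid2.observe_misspelling()
assert kid2.can_become_writer is True  # ambition survives the accurate update
```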

More examples:

It seems to me that bucket errors are actually pretty common, and that many (most?) mental flinches are in some sense attempts to avoid bucket errors. The following examples are slightly-fictionalized composites of things I suspect happen a lot (except the "me" ones; those are just literally real):

Diet: Adam is on a diet with the intent to lose weight. Betty starts to tell him about some studies suggesting that the diet he is on may cause health problems. Adam complains: “Don’t tell me this! I need to stay motivated!”

One interpretation, as diagramed above: Adam is at risk of accidentally equating the two variables, and accidentally *assuming* that the studies imply that the diet must stop being viscerally motivating. He semi-consciously perceives that this risks error, and so objects to having the information come in and potentially force the error.

Pizza purchase: I was trying to save money. But I also wanted pizza. So I found myself tempted to buy the pizza *really quickly* so that I wouldn't be able to notice that it would cost money (and, thus, so I would be able to buy the pizza):

On this narration: It wasn't *necessarily* a mistake to buy pizza today. Part of me correctly perceived this "not necessarily a mistake to buy pizza" state. Part of me also expected that the rest of me wouldn't perceive this, and that, if I started thinking it through, I might get locked into the no-pizza state even if pizza was better. So it tried to 'help' by buying the pizza *really quickly, before I could think and get it wrong*. [1]

On the particular occasion about the pizza (which happened in 2008, around the time I began reading Eliezer's LW Sequences), I actually managed to notice that the "rush to buy the pizza before I could think" process was going on. So I tried promising myself that, if I still wanted the pizza after thinking it through, I would get the pizza. My resistance to thinking it through vanished immediately. [2]

To briefly give several more examples, without diagrams (you might see if you can visualize how a buckets diagram might go in these):

  • Carol is afraid to notice a potential flaw in her startup, lest she lose the ability to try full force on it.
  • Don finds himself reluctant to question his belief in God, lest he be forced to conclude that there's no point to morality.
  • As a child, I was afraid to allow myself to actually consider giving some of my allowance to poor people, even though part of me wanted to do so. My fear was that if I allowed the "maybe you should give away your money, because maybe everyone matters evenly and you should be consequentialist" theory to fully boot up in my head, I would end up having to give away *all* my money, which seemed bad.
  • Eleanore believes there is no important existential risk, and is reluctant to think through whether that might not be true, in case it ends up hijacking her whole life.
  • Fred does not want to notice how much smarter he is than most of his classmates, lest he stop respecting them and treating them well.
  • Gina has mixed feelings about pursuing money -- she mostly avoids it -- because she wants to remain a "caring person", and she has a feeling that becoming strategic about money would somehow involve giving up on that.

It seems to me that in each of these cases, the person has an arguably worthwhile goal that they might somehow lose track of (or might accidentally lose the ability to act on) if they think some *other* matter through -- arguably because of a deficiency of mental "buckets".

Moreover, "buckets errors" aren't just thingies that affect thinking in prospect -- they also get actually made in real life. It seems to me that one rather often runs into adults who decided they weren't allowed to like math after failing a quiz in 2nd grade; or who gave up on meaning for a couple years after losing their religion; or who otherwise make some sort of vital "buckets error" that distorts a good chunk of their lives. Although of course this is mostly guesswork, and it is hard to know actual causality.

How I try to avoid "buckets errors":

I basically just try to do the "obvious" thing: when I notice I'm averse to taking in "accurate" information, I ask myself what would be bad about taking in that information.[3] Usually, I get a concrete answer, like "If I noticed I could've saved all that time, I'll have to feel bad", or "if AI timelines are maybe-near, then I'd have to rethink all my plans", or what have you.

Then, I remember that I can consider each variable separately. For example, I can think about whether AI timelines are maybe-near; and if they are, I can always decide to not-rethink my plans anyhow, if that's actually better. I mentally list out all the decisions that *don't* need to be simultaneously forced by the info; and I promise myself that I can take the time to get these other decisions not-wrong, even after considering the new info.

Finally, I check to see if taking in the information is still aversive. If it is, I keep trying to disassemble the aversiveness into component lego blocks until it isn't. Once it isn't aversive, I go ahead and think it through bit by bit, like with the pizza.

This is a change from how I used to think about flinches: I used to be moralistic, and to feel disapproval when I noticed a flinch, and to assume the flinch had no positive purpose. I therefore used to try to just grit my teeth and think about the painful thing, without first "factoring" the "purposes" of the flinch, as I do now. But I think my new ritual is better, at least now that I have enough introspective skill that I can generally finish this procedure in finite time, and can still end up going forth and taking in the info a few minutes later.

(Eliezer once described what I take to be a similar ritual for avoiding bucket errors, as follows: When deciding which apartment to rent (he said), one should first do out the math, and estimate the number of dollars each would cost, the number of minutes of commute time times the rate at which one values one's time, and so on. But at the end of the day, if the math says the wrong thing, one should do the right thing anyway.)


[1]: As an analogy: sometimes, while programming, I've had the experience of:

  1. Writing a program I think is maybe-correct;
  2. Inputting 0 as a test-case, and knowing ahead of time that the output should be, say, “7”;
  3. Seeing instead that the output was “5”; and
  4. Being really tempted to just add a “+2” into the program, so that this case will be right.

This edit is the wrong move, but not because of what it does to MyProgram(0) — MyProgram(0) really is right. It’s the wrong move because it maybe messes up the program’s *other* outputs.
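(A minimal sketch of that anti-pattern, under assumed numbers: `my_program` and `correct_spec` are hypothetical stand-ins, not anything from the post.)

```python
# The "+2 patch" anti-pattern: bolting a correction onto the output so
# that one test case passes, instead of fixing the underlying bug.

def correct_spec(x):
    return 7 * (x + 1)   # what the program *should* compute

def my_program(x):
    return 5 * (x + 1)   # buggy implementation: the 5 should be a 7

# my_program(0) returns 5, but the expected output is 7. The tempting "fix":
def patched_program(x):
    return my_program(x) + 2  # add a "+2" so this case comes out right

assert patched_program(0) == correct_spec(0)  # the test case now passes...
assert patched_program(1) != correct_spec(1)  # ...but other inputs are still wrong
# The real fix is changing the 5 to a 7, not patching the one visible output.
```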

Similarly, changing up my beliefs about how my finances should work in order to get a pizza on a day when I want one *might* help with getting the right answer today about the pizza — it isn’t clear — but it’d risk messing up other, future decisions.

The problem with rationalization and mental flinches, IMO, isn’t so much the “intended” action that the rationalization or flinch accomplishes in the moment, but the mess it leaves of the code afterward.

[2]: To be a bit more nitpicky about this: the principle I go for in such cases isn’t actually “after thinking it through, do the best thing”. It’s more like “after thinking it through, do the thing that, if reliably allowed to be the decision-criterion, will allow information to flow freely within my head”.

The idea here is that my brain is sometimes motivated to achieve certain things; and if I don’t allow that attempted achievement to occur in plain sight, I incentivize my brain to sneak around behind my back and twist up my code base in an attempt to achieve those things. So, I try not to do that.

This is one reason it seems bad to me when people try to take “maximize all human well-being, added evenly across people, without taking myself or my loved ones as special” as their goal. (Or any other fake utility function.)

[3]: To describe this "asking" process more concretely, I sometimes do the following: I concretely visualize a 'magic button' that will cause me to take in the information. I reach toward the button, and tell my brain I'm really going to press it when I finish counting down, unless there are any objections ("3... 2... no objections, right?... 1..."). Usually I then get a bit of an answer — a brief flash of worry, or a word or image or association.

Sometimes the thing I get is already clear, like “if I actually did the forms wrong, and I notice, I’ll have to redo them”. Then all I need to do is separate it into buckets (“How about if I figure out whether I did them wrong, and then, if I don’t want to redo them, I can always just not?”).

Other times, what I get is more like a quick nonverbal flash, or a feeling of aversion without knowing why. In such cases, I try to keep “feeling near” the aversion. I might for example try thinking of different guesses (“Is it that I’d have to redo the forms?… no… Is it that it’d be embarrassing?… no…”). The idea here is to see if any of the guesses “resonate” a bit, or cause the feeling of aversiveness to become temporarily a bit more vivid-feeling.

For a more detailed version of these instructions, and more thoughts on how to avoid bucket errors in general (under different terminology), you might want to check out Eugene Gendlin’s audiobook “Focusing”.
