Why Productivity Systems Don't Stick

My experience with ideas related to this (e.g. Replacing Guilt, IFS) has been that I tend not to be able to muster compassion and understanding for whatever part of myself is putting up resistance. Rather, I just get frustrated with it for being so obviously wrong and irrational.

I think this is one of the situations where it really helps to have someone else facilitate your IFS session. What you describe often happens because you are blended with the part that wants to just "get rid" of the part creating the resistance, and it might be the anti-procrastination part which created your motivation to sit down for an IFS session in the first place. Then you get an "arguments are soldiers" thing - if you were to actually listen to the procrastinating part, then it might turn out to have some good reason for procrastinating, and the anti-procrastinating part doesn't want to hear that. It doesn't want you to get kicked out of your PhD program, so it certainly doesn't want to consider an argument for something that might get you kicked out!

So then you are trying to unblend from the anti-procrastinating part in order to have empathy for the procrastinating part. But the anti-procrastinating part is also the one which is trying to drive the session forward, and it can't unblend from you while still driving the session! So the need to unblend and the desire to fix the procrastinating part get in conflict, and the process gets stuck.

Effectively, the anti-procrastination part would need to turn itself off, and it doesn't know how to do that. But what you can do is give control of the session to somebody else, and let them tell you what to do. Once the anti-procrastinating part no longer needs to drive the session, it becomes possible for it to move to the side, and then for you to listen to both parts with empathy.

This is a "get out of the car" problem:

Suppose that one day, you happen to run into a complete stranger. You don’t think very much about needing to impress them, and as a result, you come off as relaxed and charming.

The next day, you’re going on a date with someone you’re really strongly attracted to. You feel that it’s really really important for you to make a good impression, and because you keep obsessing about this thought, you can’t relax, act normal, and actually make a good impression.

Suppose that you remember all that stuff about cognitive fusion. You might (correctly) think that if you managed to defuse from the thought of this being an important encounter, then all of this would be less stressful and you might actually make a good impression.

But this brings up a particular difficulty: it can be relatively easy to defuse from a thought that you on some level believe is, or at least may be, false. But it’s a lot harder to defuse from a thought which you believe on a deep level to actually be true, but which it’s just counterproductive to think about.

After all, if you really are strongly interested in this person, but might not have an opportunity to meet with them again if you make a bad impression... then it is important for you to make a good impression on them now. Defusing from the thought of this being important would mean that you believed less in this being important, meaning that you might do something that actually left a bad impression on them!

You can’t defuse from the content of a belief, if your motivation for wanting to defuse from it is the belief itself. In trying to reject the belief that making a good impression is important, and trying to do this with the motive of making a good impression, you just reinforce the belief that this is important. If you want to actually defuse from the belief, your motive for doing so has to come from somewhere else than the belief itself.

The Case for a Journal of AI Alignment

IMO, a textbook would either overlook big chunks of the field or look more like an enumeration of approaches than a unified resource.

Textbooks that cover a number of different approaches without taking a position on which one is the best are pretty much the standard in many fields. (I recall struggling with it in some undergraduate psychology courses, as previous schooling didn't prepare me for a textbook that would cover three mutually exclusive theories and present compelling evidence in favor of each. Before moving on and presenting three mutually exclusive theories about some other phenomenon on the very next page.)

Great minds might not think alike

I would guess that this sort of reasoning happens a lot. In concrete terms:

  1. A person (call her Alice) forms a heuristic — “I am good at X” — where X isn’t perfectly defined. (“I am good at real-world reasoning”; “I am good at driving”; “I am a good math teacher”.) She forms it because she’s good at X on a particular axis she cares about (“I am good at statistical problem solving”; “I drive safely”; “My algebraic geometry classes consistently get great reviews”).



Here's a mistake which I've sometimes committed and gotten defensive as a result, and which I've seen make other people defensive when they've committed the same mistake.

Take some vaguely defined, multidimensional thing that people could do or not do. In my case it was something like "trying to understand other people".

Now there are different ways in which you can try to understand other people. For me, if someone opened up and told me of their experiences, I would put a lot of effort into really trying to understand their perspective, to try to understand how they thought and why they felt that way.

At the same time, I thought that everyone was so unique that there wasn't much point in trying to understand them by any other way than hearing them explain their experience. So I wouldn't really, for example, try to make guesses about people based on what they seemed to have in common with other people I knew.

Now someone comes and happens to mention that I "don't seem to try to understand other people".

I get upset and defensive because I totally do, this person hasn't understood me at all!

And in one sense, I'm right - it's true that there's a dimension of "trying to understand other people" that I've put a lot of effort into, in which I've probably invested more than other people have.

And in another sense, the other person is right - while I was good at one dimension of "trying to understand other people", I was severely underinvested in others. And I had not really even properly acknowledged that "trying to understand other people" had other important dimensions too, because I was justifiably proud of my investment in one of them.

But from the point of view of someone who had invested in those other dimensions, they could see the aspects in which I was deficient compared to them, or maybe even compared to the median person. (To some extent I thought that my underinvestment in those other dimensions was virtuous, because I was "not making assumptions about people", which I'd been told was good.) And this underinvestment showed in how I acted.

So the mistake is that if there's a vaguely defined, multidimensional skill and you are strongly invested in one of its dimensions, you might not realize that you are deficient in the others. And if someone says that you are not good at it, you might understandably get defensive and upset, because you can only think of the evidence which says you're good at it... while not even realizing the aspects that you're missing out on, which are obvious to the person who is better at them.

Now one could say that the person giving this feedback should be more precise and not make vague, broad statements like "you don't seem to try to understand other people". Rather they should make some more specific statement like "you don't seem to try to make guesses about other people based on how they compare to other people you know".

And sure, this could be better. But communication is hard; and often the other person doesn't know the exact mistake that you are making. They can't see exactly what is happening in your mind: they can only see how you behave. And they see you behaving in a way which, to them, looks like you are not trying to understand other people. (And it's even possible that they are deficient in the dimension that you are good at, so it doesn't even occur to them that "trying to understand other people" could mean anything else than what it means to them.)

So they express it in the way that it looks to them, because before you get into a precise discussion about what exactly each of you means by that term, that's the only way in which they can get their impression across.

Great minds might not think alike

I would guess that this sort of reasoning happens a lot. In concrete terms:

  1. A person (call her Alice) forms a heuristic — “I am good at X” — where X isn’t perfectly defined. (“I am good at real-world reasoning”; “I am good at driving”; “I am a good math teacher”.) She forms it because she’s good at X on a particular axis she cares about (“I am good at statistical problem solving”; “I drive safely”; “My algebraic geometry classes consistently get great reviews”).


Relevant Scott Alexander (again):

5. You know what you know, but you don’t know what you don’t know. Suppose each doctor makes errors at the same rate, but about different things. I will often catch other doctors’ errors. But by definition I don’t notice my own errors; if I did, I would stop making them! By “errors” I don’t mean stupid mistakes like writing the wrong date on a prescription, I mean fundamentally misunderstanding how to use a certain treatment or address a certain disease. Every doctor has studied some topics in more or less depth than others. When I’ve studied a topic in depth, it’s obvious to me where the average doctor is doing things slightly sub-optimally out of ignorance. But the topics I haven’t studied in depth, I assume I’m doing everything basically okay. If you go through your life constantly noticing places where other doctors are wrong, it’s easy to think you’re better than them. [...]

7. You do a good job satisfying your own values. [Every doctor] wants to make people healthy and save lives, but there are other values that differ between practitioners. How much do you care about pain control? How much do you worry about addiction and misuse? How hard do you try to avoid polypharmacy? How do you balance patient autonomy with making sure they get the right treatment? How do you balance harms and benefits of a treatment that helps the patient’s annoying symptom today but raises heart attack risk 2% in twenty years? All of these trade off against each other: someone who tries too hard to minimize use of addictive drugs may have a harder time controlling their patients’ pain. Someone who cares a lot about patient autonomy might have a harder time keeping their medication load reasonable. If you make the set of tradeoffs that feel right to you, your patients will do better on the metrics you care about than other doctors’ patients (they’ll do better on the metrics the other doctors care about, but worse on yours). Your patients doing better on the metrics you care about feels a lot like you being a better doctor.

Assessing Kurzweil predictions about 2019: the results

Another review of Kurzweil's 2019 predictions: [1, 2, 3, 4].

A non-mystical explanation of "no-self" (three characteristics series)

Your metaphor doesn't quite work, because you are trying really hard to show me the color red, only to then argue I'm a fool for thinking there is such a thing as red.

No? I am trying to point you to something in your subjective experience, exactly because it is something that exists in your experience, and which seems like an integral part of how minds are organized. I'm definitely not going to argue that you are a fool for having it, because by default everyone has it.

As in, it might be that no person on Earth has such a naive concept of subjective experience, but they are not used to expressing it in language, then when you try to make them express subjective experience in language and/or explain it to them, they say

  • Oh, that makes no sense, you're right

Instead of saying:

  • Oh yeah, I guess I can't define this concept central to everything about being human after 10 seconds of thinking in more than 1 catchphrase.

But my claim is not "there's a concept in your experience that you can't define in words"... I defined it in words in my article! I even explained it in third-person terms, in the sense of "if a computer program made the same mistake, what would be the objectively-verifiable mistake in that."

I am just saying that while the mistake is perfectly easy to define in third-person terms, I cannot give you a definition that would directly link it up to your first-person experience. Because while words can be used to point at the experience, they cannot define the experience in a way that would create it.

We can see where a computer program that committed this mistake would go wrong, but we do not see ourselves from a third-person perspective, so I cannot give you a third-person explanation that would cause the third-person explanation and the first-person experience to link up directly. But I can suggest ways in which you can examine your first-person experience, and then when you have the third-person explanation, the two can link up.

(Note that I am explicitly deviating from the Buddhist writers who say that it's intrinsically impossible to understand what's going on. I get why they are saying that: the Buddhists of old didn't know about computers or simulations, so they didn't have a third-person framework in which the thing can be explained. But we do, and that's why I've explicitly given you the third-person framework, or at least tried to.)

A person who is shown red for the first time could also say "oh, right, that's red; you're right that I couldn't have defined it in words", but contrary to what your comment suggests, the "I couldn't have defined it in words" isn't the important part of the "oh". The important part is "oh, now I can assign a meaning to your sentence in a way that causes its odd syntax to make sense, and now I can think more clearly about what something like 'seeing red' means".

But again, what I'm saying above is subjective, please go back and consider my statement regarding language, if we disagree there, then there's not much to discuss (or the discussion is rather much longer and moves into other areas), because at the end of the day, I literally can not know what you're talking about.

If I may ask, how much time did you spend actually following the suggestions in the post and trying to find the thing that I'm pointing at?

It's certainly not "literally impossible". Some are lucky enough to find it the moment they are pointed towards it. Others may have difficulty, and of course, given the fact that human minds vary and some people lack universal experiences, I cannot disprove the possibility that there could be some people who naturally lack this experience altogether.

But I do expect that most people can find it - maybe it takes a minute, maybe ten, maybe a year, I have no idea of what the average and the median here might be. But you have to actually try looking for it.

A non-mystical explanation of "no-self" (three characteristics series)

If you tried solving the problem, instead of calling paradox based on a silly formulation, if you tried rescuing the self, 

Why do you say that I'm not trying to solve the problem? Solving the problem is much of what this sequence is about; see this later post in particular.

A non-mystical explanation of "no-self" (three characteristics series)

Well, suppose you had never seen the color red, and I wanted us to have a discussion of what red looks like. You would tell me that in order to know what red looks like, I need to first define it in terms of the concepts you are already familiar with.

This makes sense, but if we had to do that with every concept, it wouldn't work, because then we wouldn't have any concepts to start out from. And if you've never seen anything reddish, I can't give you an explanation that would let you derive red from the concepts you are already familiar with.

So instead I might tell you "see that color that my finger is pointing at? That's red." And then you could look, and hopefully say "oh, okay, I get it now."

I'm trying to do the same thing here. Of course, the problem is, I'm trying to point at an aspect of internal experience, rather than anything in the external world.

But I've done the best I can to give you pointers towards the thing that I expect to be found within your experience if you just know where to look. To extend the color analogy, this is as if I knew there was a line of increasingly reddish objects arrayed somewhere, and I told you to go find the first object and follow along the line and watch them getting increasingly red, and then at the end, you would know what red looks like.

You said that the Kaj/Harris/Kelly/etc. thing is a rather bad philosophy. It is if you evaluate it in terms of a philosophy that is supposed to have a self-contained argument! But that's not its purpose - or at least not the starting point. The purpose is to give you a set of instructions that are hopefully good enough to point out the thing it's talking about, and then when you've looked at your experience and found it, you'll get what the rest is trying to say.

How to reliably signal internal experience?

To answer the question of "how to describe internal experience": you could practice describing felt senses in more detail. For example, recently when I found my mind resisting the idea of doing something, I said "when I think of doing this, it feels like there's a part of my mind that says NO, and then I have a sense of there being a brick wall in front of me and it feels like if I try to push through, I'll just end up with a splitting headache". This was literally my experience.

To answer the question of "how to reliably signal internal experience": I'd say you can't. If you are looking for something that will always convince your friends of your experience, then there is no such thing: they could always believe that you were faking, or maybe not even faking but somehow subconsciously deluding yourself. Which you could be!

To believe your report, your friends have to have at least some genuine curiosity for, and openness to, your experience. If your friends don't have that, then - as others have mentioned - it would be better to look for better friends.

To answer the question of "what to do when I think I am doing my best but an outside view suggests that I am being needlessly defeatist": I think that in this case, even if the outside view was right, the best answer would not necessarily be to force yourself forward and work harder. 

Well, it depends on the circumstances - maybe you have something left undone that really needs to be done now for you to pay your rent next month, in which case, yeah probably just push yourself.

But in general, this kind of situation means that a part of your mind has information that makes it believe it is an important priority to stop you from doing whatever it is that you feel defeatist about. If you force yourself through, that may work in the short term, but the mind will react to that by noticing that you are doing something that it perceives to be dangerous, and increase the amount of resistance until you become unable to continue pushing through the thing. (If the resistance is mild, this might not be true, especially if pushing through gets you something that feels genuinely rewarding to counterbalance it; but often it is.)

In that case, what you want to do is not to push through, but take the time to find the source of that resistance and investigate why it is that your mind considers this to be a bad idea. If it's mistaken, it can be possible to reconsolidate the emotional learning that's blocking you. Though I suspect that in a lot of cases that lead to burnout, it's actually the other way around: you are doing something because a part of your mind has the mistaken belief that doing this will lead you to something that it is optimizing for, with the rest of the mind throwing up resistance because it knows that fact to be mistaken.

Craving, suffering, and predictive processing (three characteristics series)

That said, isn't the observation that binocular rivalry doesn't create suffering a pretty big point against the theory as you've described it?

It does. I think that I've figured out a better explanation since writing this essay, but I've yet to write it up in a satisfying form...

Side note, I don't experience the alternating images you described. I see both things superimposed, something like if you averaged the bitmaps together. 

Huh, that's an interesting datapoint!
