A really powerful rationality scene would, I think, systematically cause its members to dissolve their stubborn problems simply by being in the scene for a while. The dissolution would naturally happen, the way that absorbing rationalist terms naturally happens today.
In my dream fantasy, just hanging out in such a space would often be way more effective than therapy for actually solving one's problems.
Like keeping things dry and oxygenated instead of using antibiotics to stop anaerobic rot.
I've seen this work. One time I tried to help an online friend with her fear of needles, and it didn't really work. Then she flew out and visited for a week, and the issue resolved itself without ever addressing it further. My best attempt to track back what happened is we did scarier stuff for fun (like jump rock) and she got to experience trusting someone else more than her own fears.
Recently, a different friend told me her Raynaud's issue dissolved as soon as she saw me enter the room, because she was gonna ask for help and then realized that she didn't actually need help because I was just gonna say "So stop?" and that she would. The active ingredient here being that I actually knew she could, and beliefs are contagious when well grounded and embodied.
I agree that "create a space where nonsense falls apart" is the way forward, and it's something I've been thinking about myself recently. Like, how do we scale this?
Like if a man approaches a woman he’s interested in, it can be helpful for him to bias toward assuming she’s also probably into him. His bias can sometimes be kind of a self-fulfilling prophecy. Truth-seeking is actually a worse strategy for getting the result he actually cares about. That fact wouldn’t be true if his epistemic state weren't entangled with how she receives his approach. But it is.
I get that your point here is that the entangling distorts things, but I worry it comes off like the problem is inherent in him being part of the system. "Truth-seeking is actually worse" is a local property of his broken truth-seeking strategy, not one inherent to being in the system.
The guys I've seen be the most successful with women don't project "She's gonna like me!" vibes at all; they're just not worried that she won't. And yes, "Not worried that she won't" does convey some information about his likability, but it is completely compatible with "She really might not", and often the engagement with the latter possibility is the active ingredient. The problem isn't with his embeddedness in the "him+her" or "him+her+others" system, it's that he's embedded in his own mind and hasn't yet figured out what to do with "she might not like me".
Could you elaborate on the Raynaud's thing? What do you mean, "just stop"? Is there a different Raynaud's from the syndrome that reduces circulation in the fingers when your hands get cold?
It was her feet, and stress-induced, but yeah, reduced circulation. I literally mean "Just don't do that", as in "Send more blood to your [extremity]", which is how I actually worded it when my wife noticed reduced circulation in her hands due to cold air despite a warm core.
I have a long sequence about how to resolve "psychological issues" with rationality, and in it I give examples of this friend learning not to swell her injuries because I told her it's something you can just decide, and of helping my wife constrict her blood vessels (with a bit more guidance) when that was desirable for stopping bleeding. Hopefully the explanation of the latter will help it feel a little less weird, but I'm not sure how much sense it'll make as a standalone post without going through the build-up.
The problem really is as simple as "she wasn't trying", but the hard part is not getting stopped from trying by the belief that you can't do it (e.g. if I told you to "start levitating" it'd feel impossible because it is and you wouldn't know where to begin, and this makes the process of actually trying difficult. It's the same issue even when you're wrong about what you can do). So like, the question becomes "What does it take to change this belief" and the answer can be as simple as "just tell them" which is why it worked in these cases, but there's a lot going on under the hood such that when I say weird shit like that she actually believes me.
And even then there's some residual weirdness, like the fact that she thought she had tried to dilate her blood vessels before, but despite succeeding on the swelling on her own it wasn't until I actually walked into the room that she was able to expect her blood vessels to dilate... which turns out to be the same action as intending. So I'm still working on getting things to generalize as far as they "should".
One time I tried to help an online friend with her fear of needles, and it didn't really work. Then she flew out and visited for a week, and the issue resolved itself without ever addressing it further. My best attempt to track back what happened is we did scarier stuff for fun (like jump rock) and she got to experience trusting someone else more than her own fears.
Recently, a different friend told me her Raynaud's issue dissolved as soon as she saw me enter the room, because she was gonna ask for help and then realized that she didn't actually need help because I was just gonna say "So stop?" and that she would. The active ingredient here being that I actually knew she could, and beliefs are contagious when well grounded and embodied.
Extremely cool. Yes. I'd love to see more things like these happening organically.
I get that your point here is that the entangling distorts things, but I worry it comes off like the problem is inherent in him being part of the system. "Truth-seeking is actually worse" is a local property of his broken truth-seeking strategy, not one inherent to being in the system.
I want to agree and disagree with you at the same time!
I agree that there's a way for him to rearrange his relationship to the situation such that truth-seeking isn't bad anymore. Strong agree. I think developing that type of rearrangement is probably an important part of a mature art of rationality.
But I do think the problem really is that he's embedded in the system. I read your point as being that he doesn't have to be. Yep, totally. But he is, and that's the source of the problem. If he weren't, then truth-seeking would be the strategic approach!
(Not to imply that being embedded always means truth-seeking is anti-strategic. Just that it can be and quite often is, and that that never happens AFAIK in non-embedded situations.)
I like your point here:
The problem isn't with his embeddedness in the "him+her" or "him+her+others" system, it's that he's embedded in his own mind and hasn't yet figured out what to do with "she might not like me".
I'm not sure whether I agree. Maybe someone standing on the outside of the whole system might agree that yeah, the guy isn't facing an embedded problem with the girl; he's facing it with his thoughts about the girl instead. Seems abstractly plausible.
The thing is, on the inside that's not at all how it looks to him. It looks like his worries are entangled with (his perception/projections of) her judgments of him.
And I'm not sure he's totally wrong! If he gets tied up in knots about his own worry, that can be offputting to her. Which is what he's worried about! Which is to say, his worry (or at least his meta-worry) is embedded with her perceptions of him, and therefore is part of the problem he's trying to solve.
The key thing that I think causes truth-seeking to break down as a strategy for embedded problems is, your thinking and models can be part of the problem space rather than being objectively about the problem space. Which means that if having a distorted model helps solve the embedded problem, you'll probably have a distorted model of what the problem is and/or how you're solving it. So as long as the act of modeling the situation interacts directly with the situation, your models are suspect, even if you're sincerely consciously trying to prioritize having true beliefs.
I think that issue is there even if the person is unknowingly objectively wrong about whether they're embedded with the actual thing they're trying to deal with.
I'm trying to tease out where we differ, exactly.
I think it's here:
Which means that if having a distorted model helps solve the embedded problem, you'll probably have a distorted model of what the problem is and/or how you're solving it.
I think there's a tricky distinction between "model" and "meta model" being lost here. The model is distorted relative to what and according to what model?
A control system has to be a model of the system it's controlling, because of the good regulator theorem. If we take a maximally simple control system like a bang-bang controller, the structure of the control system has implications for the structure of the plant being controlled. So, for example, a bang-bang controller is not going to work as a servo position controller. It could work for the velocity control loop, or a thermostat, but if you try to use it to control position directly it will oscillate.
If the controller works then it's fit to reality. The explanation can be wrong in all sorts of ways but the controller itself has to conform to reality or it won't work.
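(To make that concrete, here's a minimal simulation sketch; the plant models and constants are made up purely for illustration, not pulled from any real system. The same bang-bang rule chatters within a narrow band when regulating a thermostat-like first-order plant, but swings back and forth indefinitely when asked to hold a position on a double-integrator plant.)

```python
def bang_bang(error, effort=1.0):
    # Full effort one way or the other, based only on the sign of the error.
    return effort if error > 0 else -effort

def simulate_thermostat(setpoint=20.0, temp=15.0, dt=0.1, steps=300):
    # First-order plant: the heater raises temperature, leakage pulls it toward ambient (10.0).
    history = []
    for _ in range(steps):
        u = bang_bang(setpoint - temp)
        temp += dt * (1.0 * u - 0.05 * (temp - 10.0))
        history.append(temp)
    return history

def simulate_position(setpoint=1.0, pos=0.0, vel=0.0, dt=0.1, steps=300):
    # Second-order plant (double integrator): force changes velocity, velocity changes position.
    history = []
    for _ in range(steps):
        u = bang_bang(setpoint - pos)
        vel += dt * u
        pos += dt * vel
        history.append(pos)
    return history

if __name__ == "__main__":
    temps = simulate_thermostat()
    positions = simulate_position()
    # The thermostat ends up chattering in a narrow band around its setpoint;
    # the position loop keeps overshooting and swinging around its setpoint instead of settling.
    print("thermostat, last 5 readings:", [round(t, 2) for t in temps[-5:]])
    print("position,   last 5 readings:", [round(p, 2) for p in positions[-5:]])
```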
Bringing this back to the guy trying to attract women, he can say whatever he wants about "She's gonna like me because I was born on a prime day, which is interesting, and chicks dig interesting guys", and if he uses that to justify the decision to stop communicating "Don't like me" to women, then that will help him out... but the "distortion" is in his meta model, according to my meta model, and his object level model is actually more fit to reality or it wouldn't work.
So when he says "I gotta believe that she'll like me, so she'll like me. Truth tracking isn't necessarily optimal when you're embedded in the system!", I see that as another distorted model. He's trying to twist his meta meta model in such a way that it doesn't untwist his meta model in such a way that twists his object level model out of congruence with reality. Instead of just untwisting the relationship between his meta model and object level model, and playing it straight which is gonna work better.
Scratching the next couple paragraphs that came after this, I think I see now that you were using the word "embedded" a bit differently than I interpreted it. Hm. Now I'm less sure I disagree about anything. Because you agree that it's possible to do truth tracking so that it works, even though "probably important" is a much weaker endorsement of this path than I'd have given. And I strong agree that occlumency is an incredibly important rationality skill (at least, in existing suboptimal cultures).
Maybe I'd just have phrased it "Irrationality is socially strategic (if you're irrational)".
Okay, maybe this is it: I think occlumency is important because figuring out how to get your meta models sufficiently clean is hard. Like, you're not gonna manage delivering harsh but accurate criticism in a way that people receive as loving and helpful the first time you notice that accurate criticism would be harsh, because you don't know how to be loving and helpful yet. You don't yet see your errors for what they are, and so you're necessarily going to have all sorts of distortions in your meta models, to where naive implementations are going to predictably fuck up your object level models if you try to play it the first way that comes to mind. So like, "Back the fuck off. I don't know what I think yet, and I need to be able to explore thoughts that won't hold up" is critical, in the same way that taking the training wheels off is. Sure, training wheels keep you upright, which is where you generally want to be, but it's only by leaning that you can learn how to regulate your lean, and that's a crucial part of balancing for yourself, let alone turning.
So we end up with a bell curve meme where the low skill end says "Have you tried not having the problem, you dummy?", the center of the curve says "Nooo! Saying this is going to cause all sorts of problems because this and that!", and the high skill end comes back to "Have you tried not having the problem, you dummy?" -- which actually works after having untangled all the distortions that lead to that coming off as unloving and un-reality-tracking the first time it comes to mind.
I like this post a lot and I'm glad you wrote it, if only for my own understanding. I also appreciate how it engages with Chesterton's Fence and suggests e.g. "For instance, the closeted homophobe should probably move out of his homophobic social context if he can."
That said, I wonder whether this post is an infohazard for people immersed in sufficiently strong social incentives. I know you acknowledge this, but still... Have you tested these tools with people in very difficult social contexts?
I would guess that in a way part of the thesis of the post is that people who are in strong social incentive contexts (relative to their capacity to differentiate and develop occlumency) will be quite likely to simply not notice how it applies to them. Like it may not work, but it wouldn't be an infohazard.
I want to believe this, but I feel like I've vaguely heard stories about people discarding load-bearing copes and suffering for it, e.g. via meditation.
I also appreciate how it engages with Chesterton's Fence and suggests e.g. "For instance, the closeted homophobe should probably move out of his homophobic social context if he can."
I'm glad you appreciate it… but I'm honestly unsure how this is a Chesterton's Fence thing. Can you explain? You mean, e.g., that I'm not proposing the closeted homophobe just force himself to become aware of his sexual orientation without getting out of Dodge?
(FWIW: if that's what you meant, then I'm accidentally applying Chesterton's Fence! It's more like, I think I see the reason behind the pattern, and if correct then he has to address the reason behind the pattern in order for the pattern to shift. The theory kind of states that he can't knock down Chesterton's Fence here.)
I wonder whether this post is an infohazard for people immersed in sufficiently strong social incentives. I know you acknowledge this, but still... Have you tested these tools with people in very difficult social contexts?
Not extensively. A little bit.
FWIW I kind of lean toward the tongue-in-cheek camp that thinks the idea of infohazards is one of the few real examples we have of an infohazard. I think a more helpful framing is that memes interact in complicated ways with memetic ecosystems (the way any organism interacts in complicated ways with ecosystems), and it's possible to have an "invasive species" kind of effect from a meme, and some of those effects can be pretty bad for the ecosystem in question. And I view a human psyche as a kind of memetic ecosystem. So it's more a matter of, what's the memetic impact? How does the ecosystem adapt? Does it become more vibrant, or less? How shall we measure that vibrance?
I think that point adds relevant nuance.
My impression is that a policy of worrying about infohazards encourages a memetically unhealthy environment. It seems like care, but in practice what I find actually happens is that critical information gets blocked, the policy becomes a vector for status games (e.g. taking a patronizing view of others, like you know better than they do what's good or bad for them), and the memetic adaptation that would organically happen in contact with the truth just… doesn't happen. So you end up with lots of bizarre fictions embedded in the memetic context in question.
Like, I don't know, it sure doesn't seem to me like what originally happened around Roko's Basilisk was all that sane. The original idea strikes me as, at most, a self-fulfilling prophecy. It encourages the thinker to embed themselves in the idea and immerse themselves in an existential horror. And sure, you can do that if you want. And sure, knowing that you could makes it more likely. But I dunno, is knowing about crystal meth a basilisk? I think Roko's Basilisk is better thought of as a training opportunity: learn how to not negotiate with terrorist memes. How do you handle self-fulfilling prophecies without bricking your ability to prioritize truth?
There was a whole section of this essay I cut that's exactly about this topic. I suspect that conscious hyperstitioning, and learning to switch which attractor one hyperstitions in a multistable space, is also a key rationality skill. You need it in order to make truth-seeking safe to do when seeing a bad possibility threatens to make that possibility more likely. Otherwise it becomes strategic to refuse to acknowledge certain possibilities. That skill helps navigate situations where you're embedded and you either don't notice or can't disembed (which is the strategy the OP focuses on: kind of disembedding the conscious mind).
Anyway. I ramble a bit. My point is that I don't think worrying about infohazards is a good general approach.
That said, there's a question of whether introducing these ideas to someone in a great deal of social stress (e.g. someone in an intensely psychologically abusive relationship, like with a full-on narcissist who's really good at reading people) would make their situation worse.
I haven't tested that very much. It's possible. It strikes me as quite unlikely. I think most such people just… won't engage with the ideas.
But sure, strong psychotechnologies can have strong effects, some of which might be negative (possibly severely negative) on some people.
There was a very early version of this material that I ran on a group back in 2019. It didn't have the "Chesterton's Fence" aspects to it basically at all. It was more like a frontal assault on self-deception. It created some intense subjective experiences for many (most?) of the people in that fairly large audience. Most such effects were extremely positive over the following year or so. But one person had a reactivation of her suicidal tendencies that left her spinning for months; she eventually pulled out of it, but mostly by distracting herself from the possibility of using that early version of these tools.
(The early version was almost nothing like the ones I've presented in the OP by the way. I don't think the OP's tools are at all easy to misuse the way that the 2019 version was almost trivial to misuse.)
The thing is, even that one woman who became suicidal again had extra factors that this post just doesn't have. She was embedded in a social context that pressured her for weeks after that lecture to use the tools. That's when she became suicidal. And she stopped being suicidal when everyone backed off and she went back to distracting herself.
So I think the concern is more like, let's not start a cult where we pressure each other to eliminate all of our protected problems. A supportive context that doesn't feed social control disorders but nonetheless respects and welcomes people despite their protected problems seems maybe extremely good. But trying to challenge one another to stop rationalizing in specific ways strikes me as setting up terrible incentives.
But I don't know. What would you like to see in terms of adequate care? I want things to be good for people, but I also don't want fear to prevent us from developing potent psychotechnologies.
Excellent. This rings true on my model of the mind and the unconscious. Of course I'd use different terms. To me it seems like this addresses the difficulties of motivated reasoning, which I see as the central problem for rationality. I'll reference this when I finish that post in a few days.
The difficulty of this Hamming problem is arguably caused by a systematic fault in the commonly shared standard world model. As explored in the Parable of Lightning, a small flaw can eventually bring down a system. I think the flaw lies in our understanding of consciousness and the self. The difficulty of analyzing or even defining it, and the high number of competing theories that make few falsifiable predictions, all point to difficulties in thinking clearly about it. And your analysis seems to imply that the faulty part of the problem is social in nature.
People keep discovering ideas I came up with years ago that weren't understood at the time. I said "Book-smarts and street-smarts" instead of "Truth-seeking and embeddedness", but yeah, there's a difference between the inside and outside perspective (or, in my view, between a global and a local optimum).
But I don't think always taking an outside perspective on things is healthy. For instance, meaning in life only exists from the inside, so if you look too much from the outside, the value of things on the inside will diminish and seem "false". No wonder philosophers are so gloomy: the problem they're trying to solve is caused by their attempt to solve it.
That truth-seeking is different from winning is the core idea of the Erhard Seminars Training from the 1970s. It was a pretty interesting program.
I'll warn against trying to solve problems though. Something being a "problem" is often not a fact, but a perspective, and all problems have their merit. Even if you consider a thing in isolation, you might forget that it's connected to a whole web of other things, and that any changes will have effects elsewhere. Modeling the entire thing is likely beyond human ability, and the road to hell is paved with good intentions.
your models are themselves a key feature of the problem’s structure
Indeed. Does this not lead to a reflexive property? How agents act in reality depends on the mental model of the agents, but the agents are modeling the reality they're currently inside, its agents included. But I think there's no singular objective truth in a system with such a property.
It seems to me that the Hamming problem for developing a formidable art of rationality is, what to do about problems that systematically fight being solved. And in particular, how to handle bad reasoning that resists being corrected.
I propose that each such stubborn problem is nearly always, in practice, part of a solution to some social problem. In other words, having the problem is socially strategic.
If this conjecture is right, then rationality must include a process of finding solutions to those underlying social problems that don’t rely on creating and maintaining some second-order problem. Particularly problems that convolute conscious reasoning and truth-seeking.
The rest of this post will be me fleshing out what I mean, sketching why I think it’s true, and proposing some initial steps toward a solution to this Hamming problem.
Truth-seeking vs. embeddedness
I’ll assume you’re familiar with Scott & Abram’s distinction between Cartesian vs. embedded agency. If not, I suggest reading their post’s comic, stopping when it mentions Marcus Hutter and AIXI.
(In short: a Cartesian agent is clearly distinguishable from the space in which the problems it's solving exist, whereas an embedded agent is not. Contrast an entomologist studying an ant colony (Cartesian) versus an ant making sense of its own colony (embedded).)
It seems to me that truth-seeking is very much the right approach for solving problems that you can view from the outside as a Cartesian agent. But often it’s a terrible approach for solving problems you’re embedded in, where your models are themselves a key feature of the problem’s structure.
Like if a man approaches a woman he’s interested in, it can be helpful for him to bias toward assuming she’s also probably into him. His bias can sometimes be kind of a self-fulfilling prophecy. Truth-seeking is actually a worse strategy for getting the result he actually cares about. That fact wouldn’t be true if his epistemic state weren't entangled with how she receives his approach. But it is.
The same thing affects prediction markets. They can be reliably oracular only if their state doesn’t interact with what they’re predicting. Which is why they can act so erratically when trying to predict (say) elections: actors using the market to influence the outcome will warp the market’s ability to reflect the truth. If actors can (or just think they can) shape the outcome this way, then the market is embedded in the context of what it's predicting, and therefore it can't reliably be part of a Cartesian model of the situation in question. Instead it just is part of the situation in question.
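(Here's a toy numerical sketch of that feedback effect; the numbers and the linear "nudge" model are made up purely for illustration. The stronger the coupling between the posted price and what actors do about the outcome, the more a distorted price validates itself, and the less it tells you about the underlying base rate.)

```python
import random

def realized_frequency(price, base_rate, feedback, trials=100_000, seed=0):
    # Actors who see the posted price nudge the outcome toward it with
    # strength `feedback` in [0, 1]; feedback=0 means a purely exogenous event.
    rng = random.Random(seed)
    effective_p = base_rate + feedback * (price - base_rate)
    return sum(rng.random() < effective_p for _ in range(trials)) / trials

if __name__ == "__main__":
    base_rate = 0.30          # what would happen absent any market influence
    manipulated_price = 0.70  # someone bids the contract up to look inevitable

    for feedback in (0.0, 0.5, 0.9):
        freq = realized_frequency(manipulated_price, base_rate, feedback)
        gap = abs(freq - manipulated_price)
        print(f"feedback={feedback:.1f}: realized frequency {freq:.3f}, "
              f"gap from manipulated price {gap:.3f}")
    # With feedback=0 the distorted price is off by ~0.4, so correcting it is
    # profitable and the market self-repairs. With feedback=0.9 the distortion
    # is nearly self-fulfilling, so there's little left to arbitrage and the
    # price no longer tells you much about the 0.30 base rate.
```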
So when facing problems you’re embedded in, there can be (and often is) a big difference between what’s truth-seeking and what actually solves your problems.
Protected problems
Evolution cares a lot about relevant problems actually being solved. In some sense that’s all it cares about.
So if there’s a problem that fights being solved, there must be a local incentive for it to be there. The problem is protected because it’s a necessary feature of a solution to some other meaningful problem.
I’m contrasting this pattern with problems that arise from some solution but aren’t a necessary feature. Like optical illusions: those often show up because our vision evolved in a specific context to solve specific problems. In such cases, when we encounter situations that go beyond our ancestral niche in a substantial way, our previous evolved solutions can misfire. And those misfirings might leave us relevantly confused and ineffective. The thing is, if we notice a meaningful challenge as a result of an optical illusion, we’ll do our best to simply correct for it. We'll pretty much never protect having the problem.
(I imagine that strange behavior looking something like: recognizing your vision is distorted, acknowledging that it messes you up in some way you care about (e.g. makes your driving dangerous in some particular way), knowing how to fix it, being able to fix it, and deciding not to bother because you just… prefer having those vision problems over not having them. Not because of a tradeoff, but because you just… want to be worse off. For no reason.)
An exception worth noting is if every way of correcting for the illusion that you know of actually makes your situation worse. In which case you'll consciously protect the problem. But in this case you won't be confused about why. You won't think you could address the problem but you "keep procrastinating" or something. You'd just be making the best choice you can given the tradeoffs you're aware of.
So if you have a protected problem but you don’t know why it’s protected, the chances are extremely good that it’s a feature of a solution to some embedded problem. We generally orient to objective problems (i.e. ones you orient to as a Cartesian agent) like in the optical illusion case: if there's a reason to protect the problem, we'll consciously know why. So if we can't tell why, and especially if it's extremely confusing or difficult to even orient to the question of why, then it's highly likely that the solution producing the protected problem is one we're enacting as embedded agents.
I think social problems have all the right features to cause this hidden protection pattern. We’re inherently embedded in our social contexts, and social problems were often dire to solve in our ancestral niche, sometimes being more important than even raw physical survival needs.
We even observe this social connection to protected problems pretty frequently too. Things like guys brushing aside arguments that they don’t have a shot at the girl, and how most people expect their Presidential candidate to win, and Newcomblike self-deception, and clingy partners getting more clingy and anxious when the problems with their behavior get pointed out.
Notice how in each of these cases the person can’t consciously orient to their underlying social problem as a Cartesian agent. When they try (coming up with arguments for why their candidate will win, talking about their attachment style, etc.), the social solution they’re in fact implementing will warp their conscious perceptions and reasoning.
This pattern is why I think protected problems are the Hamming issue for rationality. Problems we can treat objectively might be hard, but they’re straightforward. We can think about them explicitly and in a truth-seeking way. But protected problems are an overt part of a process that distorts what we consider to be real and what we can think, and hides from us that it’s doing this distortion. It strikes me as the key thing that creates persistent irrationality.
Dissolving protected problems
I don’t have a full solution to this proposed Hamming problem. But I do see one overall strategy often working. I’ll spell it out here and illustrate some techniques that help make it work at least sometimes.
The basic trick is to disentangle conscious cognition from the underlying social problem. Then conscious cognition can act more like a Cartesian agent with respect to the problem, which means we recover explicit truth-seeking as a good approach for solving it. Then we can try to solve the underlying social problem differently such that we don’t need protected problems there anymore.
(Technically this deals only with protected problems that arise from social solutions. In theory there could be other kinds of embedded solutions that create protected problems. In practice I think very close to all protected problems for humans are social though. I don’t have a solid logical argument here. It’s just that I’ve hardly been able to think of any non-social protected problems people actually struggle with, and in practice I find that assuming they all trace back to social stuff just works very well.)
I’ll lay out three techniques that I think are relevant here. I and some others actually use these tools pretty often, and anecdotally they’re quite potent. Of course, your mileage may vary, and I might be pointing you in the wrong direction for you. And even if they do work well for you, I'm quite sure these don't form a complete method. There's more work to do.
Develop inner privacy
Some people in therapy like to talk about their recent insights a lot. “Wow, today I realized how I push people away because I don’t feel safe being vulnerable with them!”
I think this habit of automatic sharing is an anti-pattern most of the time. It makes the content of their conscious mind socially transparent, which more deeply embeds it in their social problems.
One result is that this person cannot safely become aware of things that would break their social strategies. Which means, for instance, that the therapy will tend to systematically fail on problems arising from Newcomblike self-deception. It might even generate new self-deceptions!
A simple fix here is to have a policy of pausing before revealing insights about yourself. Keep what you discover totally private until you have a way of sharing that doesn’t incentivize thought-distortion. What I’ve described before as “occlumency”.
I want to emphasize that I don’t mean lying to or actively deceiving others. Moves like glomarization or simply saying “Yeah, I noticed something big, but I’m going to keep it private for now” totally work well enough quite a lot of the time. Antisocial strategies might locally work, but they harm the context that holds you, and they can also over time incentivize you to self-deceive in order to keep up your trickery. It’s much better to find prosocial ways of keeping your conscious mind private.
As to exactly what kind of occlumency can work well enough, I find it helpful here to think about the case of the closeted homophobe: the guy who’s attracted to other men but hates “those damn gays” as a Newcomblike self-deceptive strategy. He can’t start by asking what he’d need to be able to admit to himself that he’s gay, since that’d be equivalent to just admitting it to himself, which isn’t yet safe for him to do. So instead he needs to develop his occlumency more indirectly. He might ask, for instance, what general skill or policy would let him safely discover some desire he currently considers evil, and keep it private while he works out what to do about it.
I think most LW readers can tell that the specific desire this guy is struggling with isn’t actually evil. Labeling it “evil” is part of his self-deceptive strategy. Once his self-deception ends, the desire won’t look bad anymore. Just socially troublesome given his context.
But it does look like an unacceptable desire to his conscious identity right now. It won’t work for him to figure out how to conceal a desire he falsely believes is wicked, because that’s not what it feels like on the inside. The occlumency skill he needs here is one that feels to him like it’ll let him safely discover and fully embrace that he's an inherently evil creature (by his own standards), if that turns out to actually be true in some way.
So for you to develop the right occlumency skill for your situation, you need to imagine that you have some desire that you currently consider to be horrendously unacceptable to have, and ask what would give you room to admit it to yourself and embrace it. You might try considering specific hypothetical ones (without checking if they might actually apply to you) and reflecting on what general skill and/or policy would let you keep that bad desire private at first if you were to consciously recognize it.
Once you’ve worked out an occlumency policy-plus-skillset that you trust, though, the thought experiment has done its work and should stop. There's no reason to gaslight your sense of right and wrong here. The point isn't to rationalize actually bad things. It's to work out what skill and policy you need to disembed your conscious mind from some as-yet unknown social situation.
Look for the social payoff
Occlumency partly disentangles your conscious mind from the social scene. With that bit of internal space, you can then try looking directly at the real problem you’re solving.
I think this part is pretty straightforward. Just look at a problem you struggle with that has resisted being solved (or some way you keep sabotaging yourself), and ask: supposing this problem were part of a solution to some social problem, what might that social problem be?
Notice that this too has a “hypothesize without checking” nature to it. That’s not strictly necessary but I find that it makes things a little easier. It helps keep the internal search from triggering habitual subjective defenses.
If your occlumency is good enough, you should get a subjective ping of “Oh. Oh, of course.” I find the revelation often comes with a flash of shame or embarrassment that quickly dissolves as the insight becomes more apparent to me.
For example, someone who’s emotionally volatile might notice they’re enacting a social control disorder. (“Oh, whenever I want my boyfriend to do what I want, he responds more readily if I’m having an emotionally hard time. That incentivizes me to have a lot of emotionally hard times when in contact with him.”) That revelation might come with a gut-punch of shame. (“How could I be such a monster???”) But that shame reaction is part of the same (or a closely related) social strategy. If the person’s occlumency skill is good enough, they should be able to see through the shame too and arrive at a very mentally and emotionally clear place internally.
In practice I find it particularly important at this point to be careful not to immediately reveal what’s going on inside me to others. By nature I’m pretty forthright, and I also just enjoy exploring subjective structures with others. So I can have an urge to go “Oh! Oh man, you know what I just realized?” But this situation is internally Newcomblike, so it’s actually pretty important for me to pause and consider what I’d be incentivizing for myself if I were to follow that urge.
In general I find it helpful to have lots of impersonal models of social problems and corresponding solutions that might be relevant. I can flesh out my general models by analyzing social situations (including ones I’m not in, like fictional ones) using heuristics like “How is this about sex?” and “How is this about power?”. Then those models grow in usefulness for later making good educated guesses about my own motives.
Notice, though, that having occlumency you trust is a prerequisite for effectively doing this kind of modeling. Otherwise the strategies that keep you from being aware of your real motives will also keep you from being able to model those motives in others, especially if you explicitly plan on using those observations to reflect on yourself.
Change your social incentives
Once you see the social problem you’re solving via your protected problem, you want to change your social incentives such that they stop nudging you toward internal confusion.
For instance, sometimes it makes sense to keep your projects private. If you’re getting a camaraderie payoff from a cycle of starting a gym habit and then falling off of it, then the “social accountability” you keep seeking might be the cause of your lack of follow-through. If you instead start an exercise program but you don’t tell anyone, you remove a bunch of social factors from your effort.
(Not to imply that this move is the correct one for exercise. Only that it can be. Caveat emptor.)
Another example is making it socially good to welcome conscious attempts to solve social problems. For example, a wife who feels threatened by a younger woman flirting with her man might find herself suddenly “disliking” the young lady. That pattern can arise if the wife believes that letting on that she's threatened will make others think she’s insecure (and that that'd be a problem for her). So she has to protect her marriage in some covert way, possibly including lying to herself.
But suppose the wife instead has a habit of making self-aware comments to her husband: ones that openly name her own reaction while also noting something unattractive about the other woman.
Approaches like this one let the wife look self-aware (by being self-aware!) while also still making the intrasexual competitive move (i.e., still pointing out an unattractive trait in the other woman). If she expects and observes that others admire and appreciate this kind of self-aware commentary from her, she can drop pretending to herself that she dislikes the girl (which is likely socially better for both herself and the girl). She can instead consciously recognize the young lady poses a threat and make explicitly strategic moves to deal with the threat.
This makes it so that the wife’s insecurity isn’t a social problem, meaning there’s no need for her to hide the insecurity from herself. She's actually socially motivated to be consciously aware of it, since she can now both signal some positive trait about herself while still naming a negative one about her competitor.
(This kind of conscious social problem-solving can come across as distasteful. But I think it happens all the time anyway, just implicitly or subconsciously. Socially punishing people for being conscious of their social strategies seems to me like it incentivizes irrationality. I think we can consciously, and even explicitly, try to solve our social problems in ways that actually enrich communal health, versus having to pretend we're not doing something we need to. And it seems to me that it's to each individual's benefit to identify and enact those prosocial strategies, for Newcomblike reasons.)
So if she didn't already have this style of commenting, and if she notices (within an occlumency-walled garden) that she's sometimes getting jealous, she could work on adopting such a style. Perhaps initially starting with areas other than where she feels intrasexually threatened.
I think it’s generally good to aim to no longer need your occlumency shield in each given instance. You want to shift your social context (and/or your interface with your social context) such that it’s totally fine if the contents of your conscious mind “leak”. That way imperfections in your occlumency skill don’t incentivize irrationality.
For instance, the closeted homophobe should probably move out of his homophobic social context if he can. Or failing that, he should make his scene less homophobic if he can (while keeping his own sexual orientation private during the transition). If he stays in a context that would condemn his sexual desires, then even if his occlumency was initially adequate, he might not trust it’ll be perpetually adequate. So he might start questioning his earlier revelation, no matter how clear it once was to him.
The right social scene would help a lot
The technique sequence I name above is aimed at finding better solutions to specific social problems… as an individual.
Obviously it would be way more effective to be embedded in a social scene that both (a) doesn’t present you with social problems that are most easily solved by having protected problems and (b) helps you develop better social solutions than your current problem-protecting ones.
My impression is that the current rationality community embodies this setup to a nonzero degree. And a fair bit better than most scenes in many ways. For instance, I think it already does an unusually good job of reinforcing people's honesty when they explicitly note their socially competitive urges.
But I bet it could grow to become a lot more effective on this axis.
A really powerful rationality scene would, I think, systematically cause its members to dissolve their stubborn problems simply by being in the scene for a while. The dissolution would naturally happen, the way that absorbing rationalist terms naturally happens today.
In my dream fantasy, just hanging out in such a space would often be way more effective than therapy for actually solving one's problems. The community would get more and more collectively intelligent, often in implicit ways that newcomers literally cannot understand right away (due to muddled minds from protected problems), but the truth would become obvious to each person in due time as their minds clear and as they get better at contributing to the shared cultural brilliance.
I think we see a nonzero amount of this pattern, and more of it than in most other places I know of, but not nearly as much as I think we could.
I’m guessing and hoping that having some shared awareness of how social problems can induce protected irrationality, along with lots of individuals working on prosocially resolving their own protected irrationality in this light, will naturally start moving the community more in this direction.
But I don’t know. It seems to me that how to create such a potent rationality-inducing community is at best an incompletely solved problem. I'm hoping I've gestured at enough of the vision here that perhaps we can try to better understand what a full solution might look like.
Summary
It seems to me that the Hamming problem of rationality is, what to do about problems that fight being solved.
It also seems to me that problems that fight being solved arise from solutions to embedded problems (i.e. problems that you orient to as an embedded agent). Objective problems (i.e. problems you orient to as a Cartesian agent) might be challenging to solve but won’t fight your efforts to solve them.
In particular, for humans, it seems to me that overwhelmingly the most important and common type of embedded problem we face is social. So I posit that each problem that fights being solved is very likely a feature of a solution to some social problem.
In this frame, one way to start addressing this rationality Hamming problem is to find a way to factor conscious thinking out of the socially embedded context and then solve the underlying social problems differently.
I name three steps that I find help enact this strategy: develop inner privacy, look for the social payoff, and change your social incentives.
I also speculate that a community could, in theory, have a design that causes all its members to naturally dissolve their stubborn problems over time simply by their being part of that community. The current rationality community already has some of this effect, but I posit it could become quite a lot stronger. What exactly such a cultural design would look like, and how to instantiate it, remains unknown as far as I know.
(Many thanks to Paola Baca and Malcolm Ocean for their rich feedback on the various drafts of this post. And to Claude Opus 4.6 for attempting to compress one of the earlier drafts that was far too long: it didn't work, but it inspired me to see how to write a much tighter and shorter final version.)