There are a lot of senses in which I or the people around me can be considered unsafe. Many-tonned hunks of metal whiz by us on the same streets we have to navigate on foot to buy our groceries. The social infrastructure by which we have access to clean drinking water is gradually being adulterated. Our country is run by occasionally genocidal white nationalists. And, of course, The Bomb. But when I hear people talk about feeling unsafe, they are almost never describing a concrete threat to their physical well-being. (As usual, life may be different for the less privileged classes, who have reason to fear the authorities, and behave accordingly.) "Safety" does not come up as a motive for actions taken or avoided in order to mitigate such threats. Instead, it seems that "safety" nearly always means a nonjudgmental context (the exact opposite of what I would naively expect to be needed to ensure clean drinking water or keep the cars from colliding with us), and "feeling unsafe" is generally used to explain only why they're trying to withhold information (mainly "vulnerable," i.e. relevant-to-their-interests, information) in a way that seems out of proportion to actually existing risks and opportunities.

Judgment and Punishment

Consider a simple model: information about social censure consists of two parts. Each socially legible action is assigned a vulnerability score based on how often, empirically, someone responds by revealing the intent to punish the actor. Actions are sometimes defined contextually, so that talking loudly in a crowded bar or on the street is different than talking loudly in a library or theater, but it's not a different action depending on who is present - only impersonal context cues and stereotyped identities (e.g. some things are inappropriate "in mixed company"). Vulnerability is a global variable with respect to persons.

If Cato is observed to punish singing but not dancing, and Flaccus is observed to punish dancing but not singing, this is treated as unpredictable random variation - possibly just measurement error. Cato and Flaccus both acquire a reputation for judginess, and both singing and dancing start to feel like vulnerable activities, so people will feel inhibited about doing either activity in the presence of either censor.

At the same time, each known person is evaluated for their generic propensity to punish, or judginess. Some people will physically attack you for violating norms (they often wear dark blue or gray-green), others will just yell at you, still others will politely hint that others might disapprove, and a few are universal receivers, totally nonjudgmental. Revealing others' intent to punish is considered a veiled threat, and is therefore itself a mild form of intent-to-punish. To be nonjudgmental, one must deny others information about what is likely to be punished elsewhere.
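The two-part model above can be sketched as a toy computation. The variable names and the pooling-by-averaging are my own assumptions for illustration, not anything the post specifies; the point is just that a model which tracks only a global per-action vulnerability score and a per-person judginess score throws away the person-by-action structure:

```python
# Toy sketch of the two-variable censure model: vulnerability is pooled
# across persons, judginess is pooled across actions. All names here are
# illustrative assumptions, not the author's notation.

from collections import defaultdict

# Each observation: (observer, action, did the observer punish it?)
# This reproduces the Cato/Flaccus example from the post.
observations = [
    ("Cato", "singing", True),
    ("Cato", "dancing", False),
    ("Flaccus", "singing", False),
    ("Flaccus", "dancing", True),
]

def fit(obs):
    action_counts = defaultdict(lambda: [0, 0])  # action -> [punished, total]
    person_counts = defaultdict(lambda: [0, 0])  # person -> [punished, total]
    for person, action, punished in obs:
        action_counts[action][0] += punished
        action_counts[action][1] += 1
        person_counts[person][0] += punished
        person_counts[person][1] += 1
    # Vulnerability: per-action punishment rate, pooled over all observers.
    vulnerability = {a: p / t for a, (p, t) in action_counts.items()}
    # Judginess: per-person punishment rate, pooled over all actions.
    judginess = {name: p / t for name, (p, t) in person_counts.items()}
    return vulnerability, judginess

vulnerability, judginess = fit(observations)
# Both actions come out equally vulnerable (0.5) and both censors equally
# judgy (0.5): the interaction "Cato punishes singing, Flaccus punishes
# dancing" is lost, so the model predicts inhibition about either activity
# in the presence of either censor.
```

Under this pooling, perfectly systematic person-specific punishment is indistinguishable from random noise, which is exactly the "measurement error" treatment the Cato/Flaccus example describes.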

We recognize judgmental people not merely by their actual punishment behavior (in the ancestral environment, where ostracism could easily be permanent and deadly, waiting to observe actual punishment might have been cutting things quite a bit too close), but by their posture, the patterns of tension in their voice, and so on.

I think that this model fits how people in our society experience a sense of social safety or unsafety surprisingly well for something so simple. One virtue of this model is that it correctly predicts that "code-switching," i.e. adjusting to variations in standards between different cultures for the same activity in the same context, is more difficult than learning different behaviors for different contexts within a single culture. Code-switching imposes a greater cognitive load due to its strong dependence on theory of mind.

The Information-Suppression Field

One important characteristic of this setup is that it structurally advantages information-suppression tactics over clarity-creation tactics.

If I try to judge people adversely for giving me misleading information, I end up complaining a lot, and quickly acquire a reputation for being judgmental and therefore unsafe. Ironically, I get more of the behavior I punish, since being categorized as judgy leads to people avoiding all vulnerable behaviors around me, not just the ones I specifically punished. I cut myself off from a lot of very important information, and in exchange, maybe slightly improve the average punishment function - but this would provide an information subsidy to all other judgy agents, even ones whose interests conflict with mine and are trying to prevent me from learning some things. And most likely I just add to the morass of learned inhibitions.

On the other hand, if I wish to suppress some information - say, that some enterprise I'm profiting from is fraudulent - and I don't otherwise read as unsafe, then I can very slightly punish it - say, by gently discouraging people from talking about it because it seems likely to be harmful, because it hurts some people's feelings, etc. If I only need to suppress a few pieces of information, and there are other REALLY judgy people out there, then I can externalize most of the enforcement costs onto either the actual judgy people or the imaginations of the people I am manipulating.

A simple example:

Alice has a pervasive sense that she is being cheated in life somehow, and lashes out from time to time at people who seem like they're piling on. Carol has a consistently gentle, positive vibe, and owns a drugstore from which Alice regularly purchases expensive homeopathic medicines. Bob, who knows both of them, starts to tell Carol about how he's done some thinking about it, and homeopathy seems to him like it couldn't possibly work. Carol hints to Bob that this is a sensitive subject. Bob reasons, implicitly, that if even Carol doesn't like him talking about his idea, he had darn well better make sure not to talk about it around Alice.

This is an adversarial game that different secretive coalitions can play against each other, at the expense of other people trying to use censure for other reasons. All such moves, however, also benefit nonjudgmental people, who can collect a surplus from living in a society that relies on standards, while collecting a disproportionate amount of information and social capital by never contributing to attempts to track and censure misbehavior.


Comments

Downvoted because I don't want LW to be the kind of place where people casually make inflammatory political claims, in a way that seems to assume this is something we all know and agree with, without any supporting evidence.

*nods* I agree that the opening has one line which is both off-topic and predictably distracting. I strong-upvoted because I found the model in the rest of the post to be helpful and quite accurate.

I also downvoted for precisely this reason. I agree with Ben Pace’s take, but not with his voting decision, though I entirely agree that this post is strong-upvote-worthy with this sort of thing removed.

Over the past year or two, I've come to believe that I'm doing a fair bit of "freeloading off others who are willing to punish". But, because of the dynamics you describe here, I'm still unsure what the right course of action is.

My sense is that most people doing the punishment are

a) off enough in their targets (relative to my preferences), that I don't feel good endorsing them straight away, often actively disendorsing them, and

b) similar to what you note in this post, I think most of the effect of judgment is general reluctance to speak publicly at all, for fear of being randomly judged in an unpredictable fashion (while slightly tweaking people's predictive weights about what they're likely to be punished for)

So I see a lot of the punishment as anti-social rather than pro-social, and much of my default impulse is to punish the punishers.

But by now I've seen enough examples of me benefiting from something like "good cop / bad cop", or from pendulum swings where one person is trying to drag the norms way off in a direction that seems too far to me but (as you say) "maybe slightly improves the average punishment function", that I'm no longer confident in that default impulse.

Meanwhile, it is suspicious if I think the status quo is "okay" enough not to spend any social capital pushing it in some direction.

Nonetheless, it's pretty unclear to me which directions are actually helpful, and whether strategies of judgment or punishment are the way to go.

Seems to me like punishment might not accomplish what it claims to & might be harmful on net. I have a future post planned to explain that point more fully though.

Seems to me like punishment might not accomplish what it claims to & might be harmful on net

Nod. That's been my past default assumption, just noting that my overall opinion here is in flux. Looking forward to further thoughts.

(My current background strategy is something like "positive reinforcement is a better strategy than punishment", but I'm not that confident. I notice that your Oops Prize didn't amount to much. There's a chance that it could just use More Dakka. Upping the prize and doing more advertisement might be worthwhile, and I think it is likely that I'd want to contribute to that.

It runs the risk of Goodharting on Looking Like You're Not Goodharting, but maybe that's a problem to worry about when you have more than one submission)

Maybe you're not freeloading on them, you're honoring their and your comparative advantages. They're willing to take more risks than you in who and how much to punish, and the fact that you don't want to correct them in either direction indicates you'd rather accept their choices than to try to calculate the proper amount yourself. Or maybe you _should_ be supervising more closely because they're wrong.

How to determine which model (freeloading vs division of labor vs dereliction of duty) fits the situation is the tricky part.

A motivator here is that the judgers seem to feel unhappy and frustrated. Something something melting gold.

Note: something in the editing of this post broke. I copy-pasted what's currently on your home blog, lemme know if there's problems.

Thanks! Unfortunately I guess signals crossed and you missed a last-minute correction I made - changing the arguably untrue and value-laden “increasingly nasty” to the more factual “occasionally genocidal.” I’ve restored the correction, but figured I’d note it here because it was the topic of a few comments.

I worry a lot about trying to reason about very complex equilibria when only looking at one force. It's _BOTH_ an adversarial and cooperative game - there are (asymmetric, but usually same sign) benefits to clear, honest communication. And even for adversarial portions, there may be a positive sum even when one player is harmed, if other players gain more than the harm.

I can even make a model in which outsourcing the punishment - so that extra-judgey people get most of the flak for the judgement while the arrangement still provides overall value - is optimal for some utility aggregation functions. I don't currently like or claim applicability of this model, but it's not obviously wrong.
