[ Question ]

Is there a definitive intro to punishing non-punishers?

by pjeby, 31st Oct 2019


Hey LW. I've just written a draft of an article, which I'm now editing to link to useful prior notes on various concepts. One of those concepts is the idea that we evolved to "punish the non-punishers", in order to ensure the costs of social punishment are shared by everyone.

The thing is, I vividly remember reading an article from the rationalsphere that introduced me to this idea many years ago, but I cannot seem to find it anywhere. The idea is mentioned in many places where I could have seen the original article, including Overcoming Bias, Slate Star Codex, and Less Wrong, but all my searches turn up only passing mentions in comments, or throwaway mentions in articles that assume you already know what it means.

Since my article will otherwise continue this apparent tradition of assuming everyone knows what it means and not linking to an introduction, can anyone point me to the article I'm recalling, or perhaps suggest another good non-technical introduction to the idea (i.e. not this paper), so readers of the article I'm writing who aren't yet familiar with the idea can learn about it? Thanks!


1 Answer

Grr, this was harder than I expected.

I think your best bet may be Eliezer's Tolerating Tolerance, with the key quote below.

Cooperation is unstable, in both game theory and evolutionary biology, without some kind of punishment for defection.  So it's one thing to subtract points off someone's reputation for mistakes they make themselves, directly.  But if you also look askance at someone for refusing to castigate a person or idea, then that is punishment of non-punishers, a far more dangerous idiom that can lock an equilibrium in place even if it's harmful to everyone involved.
The danger of punishing nonpunishers is something I remind myself of, say, every time Robin Hanson points out a flaw in some academic trope and yet modestly confesses he could be wrong (and he's not wrong).  Or every time I see Michael Vassar still considering the potential of someone who I wrote off as hopeless within 30 seconds of being introduced to them.  I have to remind myself, "Tolerate tolerance!  Don't demand that your allies be equally extreme in their negative judgments of everything you dislike!"

Note that this is only a citation about what 'punishing non-punishers' means and why it's bad (causes bad equilibria to be sticky), not a mechanism for how such a meta-norm would come about in a society.

There is also a short Overcoming Bias post by Robin called Why Towns Conform.

In general, social norms are enforced via two key informal mechanisms:
1. When norms are usually followed, rare violators are often undesirable in objective ways. They may lack intelligence or self-control, for example. So people avoid violating such norms to avoid sending bad signals about themselves.
2. Meta-norms often require observers of norm violations to punish violators, such as by refusing to associate with them. This includes observers of a failure to punish a failure to punish, and so on.

But mostly the post is pretty oblique and doesn't give a mechanism for how such norms would have evolved.

(Robin also talks about meta-norms a lot more generally, for example his excellent series on automatic norms.)

Then you've also got Scott's epic Meditations on Moloch. I'd have to re-read all the examples to see if they fit the specific worry of punishing non-punishers, but it does introduce them with the following:

Bostrom makes an offhanded reference of the possibility of a dictatorless dystopia, one that every single citizen including the leadership hates but which nevertheless endures unconquered. It’s easy enough to imagine such a state. Imagine a country with two rules: first, every person must spend eight hours a day giving themselves strong electric shocks. Second, if anyone fails to follow a rule (including this one), or speaks out against it, or fails to enforce it, all citizens must unite to kill that person. Suppose these rules were well-enough established by tradition that everyone expected them to be enforced.
So you shock yourself for eight hours a day, because you know if you don’t everyone else will kill you, because if they don’t, everyone else will kill them, and so on. Every single citizen hates the system, but for lack of a good coordination mechanism it endures. From a god’s-eye-view, we can optimize the system to “everyone agrees to stop doing this at once”, but no one within the system is able to effect the transition without great risk to themselves.
And okay, this example is kind of contrived. So let’s run through – let’s say ten – real world examples of similar multipolar traps to really hammer in how important this is.
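
To make the logic of that example concrete, here is a minimal Python sketch of it. This is my own toy model, not something from Scott's post: the cost numbers, the `payoff` and `best_response` functions, and the simplification that "enforcing" carries a small standing cost are all assumptions chosen for illustration. It just checks what a single citizen's best move is, given what everyone else is doing.

```python
from typing import Tuple

# Illustrative, assumed numbers (not from the original post).
SHOCK_COST = 8       # disutility of shocking yourself for eight hours
ENFORCE_COST = 1     # disutility of standing ready to punish violators
DEATH_COST = 1000    # disutility of being killed by everyone else


def payoff(complies: bool, enforces: bool, others_enforce: bool) -> int:
    """Payoff to one citizen, given whether everyone else enforces the rules."""
    cost = 0
    if complies:
        cost += SHOCK_COST
    if enforces:
        cost += ENFORCE_COST
    # Failing to comply OR failing to enforce counts as a violation,
    # and violators are killed whenever the others are enforcing.
    if others_enforce and not (complies and enforces):
        cost += DEATH_COST
    return -cost


def best_response(others_enforce: bool) -> Tuple[bool, bool]:
    """Return the (complies, enforces) choice that maximizes this citizen's payoff."""
    options = [(c, e) for c in (True, False) for e in (True, False)]
    return max(options, key=lambda o: payoff(o[0], o[1], others_enforce))


if __name__ == "__main__":
    print(best_response(others_enforce=True))   # (True, True)
    print(best_response(others_enforce=False))  # (False, False)
```

Under these assumptions, if everyone else enforces, your best move is to comply and enforce too; if nobody else enforces, your best move is to do neither. Both situations are stable, even though everyone is better off in the second one, which is the sense in which punishing non-punishers can lock a bad equilibrium in place.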

The only place I found that tried to explain how such norms would come about is Eliezer's story Three Worlds Collide, which describes a possible history that would lead to... well, I'll just say (content warning: infant death, cannibalism) and leave the full quote below.

"It's a truism in evolutionary biology that group selection can't work among non-relatives.  The exception is if there are enforcement mechanisms, punishment for defectors - then there's no individual advantage to cheating, because you get slapped down.  That's what happened with the Babyeaters.  They didn't restrain their individual reproduction because the more children they put in the tribal pen, the more children of theirs were likely to survive.  But the total production of offspring from the tribal pen was greater, if the children were winnowed down, and the survivors got more individual resources and attention afterward.  That was how their species began to shift toward a k-strategy, an individual survival strategy.  That was the beginning of their culture.

"And anyone who tried to cheat, to hide away a child, or even go easier on their own children during the winnowing - well, the Babyeaters treated the merciful parents the same way that human tribes treat their traitors.

"They developed psychological adaptations for enforcing that, their first great group norm.  And those psychological adaptations, those emotions, were reused over the course of their evolution, as the Babyeaters began to adapt to their more complex societies.  Honor, friendship, the good of our tribe - the Babyeaters acquired many of the same moral adaptations as humans, but their brains reused the emotional circuitry of infanticide to do it.


The Lady Sensory spoke up.  "I don't suppose... we could convince them they were wrong about that?"


"Even if you could persuade them, it might not be a good idea," said the Xenopsychologist.  "If you convinced the Babyeaters to see it our way - that they had committed a wrong of that magnitude - there isn't anything in the universe that could stop them from hunting down and exterminating themselves.  They don't have a concept of forgiveness; their only notion of why someone might go easy on a transgressor, is to spare an ally, or use them as a puppet, or being too lazy or cowardly to carry out the vengeance.  The word for wrong is the same symbol as mercy, you see."  The Xenopsychologist shook her head.  "Punishment of non-punishers is very much a way of life, with them.  A Manichaean, dualistic view of reality.  They may have literally believed that we ate babies, at first, just because we didn't open fire on them."

I also managed to find an old comment of yours about punishing non-punishers, pjeby.