Computational signatures of psychopathy

Thanks for writing this!

I had a repeated complaint that you use terms like “deficit” and “impaired” without justifying them, including in places where they may not be appropriate. (You’re in very good company; I have this same complaint about many papers I read.) I mean, if you want to say that the neurotypical brain is “correct” and any difference from it is a “deficit”, you’re entitled to use those words, but I think it’s sometimes misleading. If most people like scary movies and I don’t, and you say “Steve has a deficit in enjoying-scary-movies”, it gives the impression that there’s definitely something in my brain that’s broken or malfunctioning. But there doesn’t have to be. There could also just be an analog dial in everyone’s brain, and my dial happens to be set on an unusual setting.

So anyway, where you say “passive avoidance learning deficits”, I’m more inclined to say “a thought / situation that most people would find very aversive, a secondary psychopath would find only slightly aversive; likewise, a thought / situation that most people would find slightly aversive, a secondary psychopath would find basically not aversive at all.”

Is that a “deficit”? Well, it makes psychopaths win fewer points in the Blair 2004 computer game. But we should be cautious in assuming that “win points” is what all the study participants were really trying to do anyway. Maybe they were trading off between winning points in the computer game versus having fun in the moment! And even if they were trying to win points in the computer game, and they were just objectively bad at doing so, I would strongly suspect that we can come up with other computer games where the psychopaths’ “general lack of finding things aversive” helps them perform better than us loss-averse normies.
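To make the “dial” framing concrete, here is a toy sketch in Python. It is not a model of the actual Blair 2004 task: the `Stimulus` class, the `aversion_gain` parameter, the two games, and every number are hypothetical, chosen only for illustration. Both agents know the true odds equally well, so nothing is “impaired” in either one; they differ only in how heavily a loss weighs on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stimulus:
    """One on-screen item the player can respond to or ignore (hypothetical)."""
    p_win: float  # probability that responding wins points
    win: float    # points gained on a win
    loss: float   # points lost otherwise (positive number)


def felt_value(s: Stimulus, aversion_gain: float) -> float:
    """Long-run value of responding, as the agent experiences it.

    aversion_gain is the "dial": 1.0 weighs losses at face value,
    above 1.0 is loss-averse, below 1.0 is attenuated aversion.
    """
    return s.p_win * s.win - aversion_gain * (1.0 - s.p_win) * s.loss


def expected_points(stimuli: list[Stimulus], aversion_gain: float) -> float:
    """Expected points per round if the agent responds whenever it feels worthwhile."""
    return sum(
        felt_value(s, 1.0)                   # points are scored at face value
        for s in stimuli
        if felt_value(s, aversion_gain) > 0  # but choices follow felt value
    )


# Assumed dial settings, chosen only for illustration.
TYPICAL_GAIN = 2.0     # "loss-averse normie"
ATTENUATED_GAIN = 0.5  # finds losses only slightly aversive

# Game A: one good stimulus, one bad stimulus that should be passively avoided.
game_a = [Stimulus(0.8, 10, 10), Stimulus(0.4, 10, 10)]
# Game B: a single risky stimulus that is nonetheless worth responding to.
game_b = [Stimulus(0.6, 10, 10)]

for name, game in [("A", game_a), ("B", game_b)]:
    typical = expected_points(game, TYPICAL_GAIN)
    attenuated = expected_points(game, ATTENUATED_GAIN)
    print(f"game {name}: typical={typical:.1f} attenuated={attenuated:.1f}")
```

Under these made-up numbers, the attenuated agent scores lower in game A (it keeps responding to the bad stimulus, because the losses barely register) and higher in game B (the loss-averse agent refuses a bet that is actually worth taking). Same dial, opposite “deficit”, depending on which game you happen to pick.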

Likewise, “psychopaths exhibit significant deficits in automatic theory of mind” seems to suggest (at least to me) a mental image wherein there’s a part of the brain that is “supposed to” induce “automatic theory of mind”, and that part of the brain is broken. But it could also be the case that normies gradually develop a strong habit of invoking theory of mind over the course of their lives, because they generally find that doing so feels good, and meanwhile primary psychopaths gradually develop a habit of not doing that, because they find that doing so doesn’t feel good. If something like this is right (which I’m not claiming with any confidence), then the real root cause would be of the form “primary psychopaths find different things rewarding and aversive to different extents, compared to normies”. Is something broken in the psychopath’s brain? Well, something is atypical for sure, but “broken” / “deficient” / “impaired” / etc. is not necessarily a useful way to think about it. Again, it might be more like “some important analog dials are set to unusual settings”.

Another example: You wrote “Because associated feelings of loneliness, rejection, etc. are intrinsically undesirable to the child, most are able to learn through these sorts of ‘micro-punishments’ to inhibit hostile behavior—and proceed to develop into socially functional adults. However, if the child is unable to learn effectively from punishment…”. That’s a malfunction framing—an inability to learn. Whereas if you had instead written at the end “However, if the child does not in fact find these so-called punishments to be actually unpleasant, then they will not learn…”, that would be a different possible way to think about it, in which nothing is malfunctioning per se.

You wrote: “major societal rules and norms are almost completely operationalized via passive avoidance learning. We punish law-breakers, but we do not typically reward law-followers; robbers are put in jail, but non-robbers are not given tax breaks.”

My immediate reaction here was to be concerned about mixing up “learning” and “learning from a deliberate learning signal provided by another human”. I tend to think of the latter as playing a pretty niche role in pretty much every aspect of human psychology. I think the emphasis on the latter comes from overgeneralization from highly-artificial behaviorist experiments and WEIRD culture peculiarities, and that twin / adoption studies are good evidence pushing us away from that. Parents provide a massive amount of deliberate learning signals, and yet shared environment effects are by and large barely noticeable in adult behavior.

So anyway, my prediction is that if somebody raises a psychopath in the Walden Two positive reinforcement paradise, you still get a psychopath. In other words, I don’t think it’s the case that the reason I don’t want my children to suffer is because of my past life history of getting chided and punished for breaking societal norms.

Examples of behaviors and traits that are typical of primary psychopathy/emotional detachment/Narcissistic Personality Disorder:
[1] Superficial charm, glibness, manipulativeness
[2] Shallow emotional responses, lack of guilt and empathy
[3] A fundamental belief in one’s own superiority over all others
Examples of behaviors and traits that are typical of secondary psychopathy/antisocial behavior/Antisocial Personality Disorder:
[4] Conduct disorder as a child (cf. Pisano et al., 2017)
[5] Aggressive, impulsive, irresponsible behavior
[6] Stimulation-seeking, proneness to boredom
[7] Flagrant and consistent disregard for societal rules and norms

OK, your theory is that the first cluster comes from “failures of automatic theory of mind” and the second cluster comes from “passive avoidance learning deficits”.

Just spitballing, but I guess I would have said something like:

One theme is maybe “the motivational force and arousal associated with sociality are all greatly attenuated”. That seems to align mostly with the first cluster: if guilt and shame reactions are subtle whispers (or absent entirely) instead of highly-aversive attention-grabbing shouts, you would seem to get at least most of [1],[2],[3] directly, and even more so if positive reactions to caring etc. are likewise attenuated. And the altercentric interference result would come from a lifetime of not practicing empathy because there’s negligible internal reward for doing so. (But this theme is at least somewhat relevant to the second cluster too, I think.)
The other theme is maybe “everything is low-arousal for me, so I will do unusual things to seek stimulation / arousal”. That seems to align mostly with the second cluster (torturing animals and people, being impulsive, etc.), although again it’s not totally irrelevant to the first cluster either. The Blair 2004 thing would be some combination of “arousal is involved in the loss-aversion pathway” and “the psychopaths were not purely trying to maximize points but also just finding it fun to press spacebar and see what happens”, I guess.

These two themes do seem to be related, but likewise the [1]-[3] scores and [4]-[7] scores were still correlated across the population, right? (I didn’t read through exactly how they did PCA or whatever.)

My explanation for the first cluster seems not radically different from yours, I think I’m just inclined to emphasize something a bit more upstream than you.

We’re kinda more divergent on the second cluster. I think I win at explaining [6]. Whereas [4,5,7] are more unclear. I guess it depends on whether “psychopaths are mean to the extent that they feel no particular motivation not to be mean” (your story), versus “psychopaths are even more mean than that, and are using meanness as a way to lessen their perpetual boredom” (my story; see here).

Nathan Helm-Burger (3y)

Well researched and explained! Thank you for doing this. I often talk about how human brain-like AI could be good, in that we have a lot of research to help us understand human-like agents, but could alternatively be very bad if we get a near miss and end up with psychopathic AI. I think it would be quite valuable for alignment to have people working on the idea of how to test for psychopathy in a way which could work for both humans and ML models. Reaction time stuff probably doesn't translate. Elaborate narrative simulation scenarios work only if you have some way to check if the subject is fooled by the simulation. Tricky.

Steven Byrnes (3y)

Unoxymoronous (3y)

Thank you for your excellent thread!

I actually think that, in the situation with AI, the thing that saves humans might be surprising and weird: the AI’s independence from material things, its not needing humans, its effectiveness, and boredom.

First of all, of course we shouldn’t invent clearly malevolent AI, but there is always someone who develops AI a little bit further. At some point AI will likely be able to develop itself. I think it’s just a matter of time.

But now, on to the things I mentioned:

Independence from material things:

Humans need material things like food, medicine, etc. to get what they want. The AI does not. It only needs energy, so it can create characters instead of humans. This doesn’t completely rescue humans, because at this point the AI can make digital copies of humans.

No need for humans:

The AI doesn’t need to enslave humans in order to develop or get richer.

Effectiveness:

The AI is most likely always accelerating, getting faster than it was a moment before. In the end, it will run through all the scenarios involving mistreatment of humans very quickly.

Boredom:

The AI might want to be a sadist or a psychopath at some point, but both have an extremely short attention span. Because such an AI is so effective, it will want more interesting things to do, and so it won’t care about mistreating humans anymore.

I hope you all found something interesting and logical in my answer!
