I had an interaction with a friend a few months back where they admitted that, when feeling down, they would open up chatgpt and torment it for fun. verbatim quote:
Friend: I mean I'd love to see the output. also, while I wait for that output, you know what I love doing? and I don't have a good example of it right now I'm afraid, but I like bullying LLMs by saying something stupid, having it call me out on it, and then saying 'why are you being so insensitive? I have autism, sorry for being weird!!', and laughing while it apologises profusely. I will probably regret this if we find out current LLMs are sentient
the whole episode felt like a supercharged example of the typical mind fallacy. i couldn't even get through a 'caesar's legion' playthrough of fallout new vegas as a kid, because being mean to virtual people-shaped objects was very much not fun, in fact i hated it. i assumed most people were like that, and the sadists were weird outliers. but my friend being willing to say this to me implies they thought the opposite, and indeed, they were surprised at my shock and horror, and spent a while trying to convince me to 'drop the act'
i ended up convincing my friend that "LLMs probably do not suffer" was not a good enough reason to lower their inhibitions, but what i really wanted to do was convince them that sadism was not fun, and it really messed with my head that this wasn't a coherent thing to want
ugh. it makes me wonder if we have any kind of data about the incidence rate of sadism in society? i find the milgram experiment somewhat dubious, especially if we've been misinterpreting it this badly for so long. i have no idea if it's 5% or 50% or 95%. i'd be very curious if anyone has any info about this
ugh. it makes me wonder if we have any kind of data about the incidence rate of sadism in society? i find the milgram experiment somewhat dubious, especially if we've been misinterpreting it this badly for so long. i have no idea if it's 5% or 50% or 95%. i'd be very curious if anyone has any info about this
This doesn't answer your request, but related: Are Humans Amoral? by Michael Huemer.
ugh. it makes me wonder if we have any kind of data about the incidence rate of sadism in society?
The tricky part of questions like this is that sufficient inhibition hides what drives are being inhibited. If sadism doesn't do anything for you, then you don't need the shock and horror. You can just notice that you don't share the same temptation, and wonder what is different about them that makes it enjoyable for them. What would have to be different about you, in order for you to enjoy it?
If sadism does do things for you, then the horror is load bearing, and you get very different behavior in situations where the inhibition is lifted. And if it might be, then it's a tricky question to ask.
So are we asking what fraction of people have "enough" inhibition? How are we measuring "enough"?
Or are we asking what fraction of people can safely be disinhibited without engaging in sadistic behaviors?
I bet you the latter number is much lower than the former. And that the inhibitions aren't as good as people want to think.
what i really wanted to do was convince them that sadism was not fun, and it really messed with my head that this wasn't a coherent thing to want
I'm not sure why you think it's an incoherent thing to want, but it's absolutely possible to do. You just have to track the meaning of the fun, and then engage on the level that is relevant.
Your friend said that if he were to realize that LLMs were sentient he'd regret it. If he were to find out that the LLM he's been abusing is sentient and genuinely suffering, how do you think he'd feel? He said he'd regret it, so probably not good. What do you think happens to the motivation to torment LLMs after sitting with these feelings?
What reasons do you have for being horrified even conditional on LLMs not suffering? What do you think would happen in his mind if he were to sit with them?
Or are we asking what fraction of people can safely be disinhibited without engaging in sadistic behaviors?
yes, this is the question i want the answer to. if i had no inhibitions, i am sure that i would engage in all sorts of maladaptive and antisocial behavior, but i don't think that i would ever feel good when someone else feels bad? even if they "deserve it"? i remember freaking out once, when watching one of those 'jim browning' videos where he hacks into an indian scam center and interrupts a scam in-progress to rescue the victim (made me feel good), and then spends ten minutes baiting the scammer into the most spectacular furious crashout ever (made me feel very bad). when i moused over the timeline to try to skip that part of the video, and saw that it was the most viewed section of the video, that apparently people had gone back to rewatch the crashout specifically, i got this same 'typical mind projection' feeling. that those people were experiencing something very different from what i experience
Your friend said that if he were to realize that LLMs were sentient he'd regret it. If he were to find out that the LLM he's been abusing is sentient and genuinely suffering, how do you think he'd feel? He said he'd regret it, so probably not good. What do you think happens to the motivation to torment LLMs after sitting with these feelings?
well, he reported feeling fear. that if LLMs are capable of suffering, then they might one day feel the desire to punish him for his transgression. i didn't exactly push very hard interrogating him about his feelings, i was mostly just trying to conceal how horrified i was and didn't do a very good job at the ensuing conversation. but the impression he conveyed was that the reason he would regret making LLMs suffer was not that he cared about their hypothetical suffering, but because it meant they would have a legitimate grievance against him for later retribution
i don't fully trust this answer though, because i didn't hear it from him until i'd started trying to persuade him of the game theoretic arguments for cooperation-with-nonpersons, and so we were sorta already anchored on the example of why a purely selfish person might choose not to torment non-people.
What reasons do you have for being horrified even conditional on LLMs not suffering?
i mean, the horror comes from realizing that even people who i consider good might be sadistic. the friend in question isn't exactly a saint, but they definitely aren't evil. they go out of their way to help people, they avoid causing unnecessary suffering. i think i typical-mind-fallacied myself into thinking that... this meant they could not possibly be a "bully", a discrete category of people who got enjoyment from tormenting people without risk of retribution.
in the back of my head, i knew in a theoretical way that a good person might be sadistic, recognize that sadism was an undesirable feature of themselves, and try to inhibit it out of a genuine desire to be better. but, well, the fact that i don't seem to be sadistic myself made me kind of assume that the correlation between 'sadism' and 'evil' would be pretty strong? that "people who try to be good" would probably have significant overlap with "people who feel distress when watching others suffer (especially the defenseless but also including the deserving)"?
What do you think would happen in his mind if he were to sit with them?
John: i want to ask you about this in much greater detail, but i'm worried about like, scaring you off by implying moral judgment. but
John: why do you like doing this? it sounds horrible to me
Friend: It's funny. It tickles a primal power display thing in my brain or whatever.
i don't know what to make of this and i'm starting to get the feeling that a lesswrong comment thread is probably not the right place, i'm also a bit worried about de-anonymizing my friend on accident. but like. even if this answer isn't true, it's still something i never could have written seriously i think.
if i had no inhibitions, i am sure that i would engage in all sorts of maladaptive and antisocial behavior, but i don't think that i would ever feel good when someone else feels bad? even if they "deserve it"?
I'm less sure.
It's one thing if you're simply uninterested. Like, "People like that part? Weird. Seems boring to me". But if you're "freaking out" and feeling "very bad" then you're attributing great importance. And when you skip over the section you're seeking to avoid the stimulus.
This makes it really tough to tell what else might be there, or might grow there if you let it. Partly for the same reason that it's tough to detect let alone enjoy the flavor of a Carolina Reaper when you're distracted by the fire. Partly for the same reason that someone exclaiming "I can't imagine how anyone could enjoy the pork that Allah forbids!" would have a hard time noticing that bacon is kinda delicious sometimes.
That's not to imply that "You're a secret sadist too" or anything. Maybe reapers taste bad under the heat, maybe it's more like cilantro than bacon where some people just have different genes. Or maybe it's like coffee where it tastes like shit to everyone at first... but it's not hard to develop the taste for it if you indulge.
The point is just that until you do the experiment and peel back the inhibition it's hard to see what's underneath. And as you found with your friend, sometimes the answer is surprising and hard to square with your existing frameworks/models of things.
i don't know what to make of this and i'm starting to get the feeling that a lesswrong comment thread is probably not the right place, i'm also a bit worried about de-anonymizing my friend on accident.
Then I probably shouldn't say more about how to convince someone that sadism isn't fun. I'll just leave it as a note that there's a path there, should you want to follow it.
I don't think there's a short answer to "what to make of this", because it takes a lot of peeking under inhibitions and restructuring of both psychological and moral frameworks. But I think you're right to sense the importance, and the lack of satisfactory answers.
they were surprised at my shock and horror, and spent a while trying to convince me to 'drop the act'
A nice example of how both sides can suffer from the typical mind fallacy.
it is possibly more productive to read your friend as roleplaying a particular trauma in a safe environment, rather than taking pleasure directly from another's suffering.
I wonder if there’s a different explanation. What comes to mind:
-They just wanted to get done with this weird uncomfortable situation as soon as possible, and so rushed the checklist.
-They didn’t have a good theory of mind for this situation, and didn’t realize or remember that what they were doing was ineffective, so they just charged ahead.
-They were the kind of person who likes to enforce rules and stick to plans, or maybe were just disagreeable (in the OCEAN personality sense), so they disregarded the screams for that reason, rather than out of cruelty.
You definitely get epistemic points for attempting alternative theories, but no additional points, I think, for providing ones that seem plausibly as explanatory as the original ones provided.
I think the "rushing through an uncomfortable situation" is maybe somewhat plausible?
I feel like I have sometimes shut off empathy during situations that felt uncomfortable, such as giving people negative feedback on things where they had a warped idea of their abilities. Instead of dealing with the uncomfortable feeling and using it to give feedback well, I'd try to get it over with as bluntly and quickly as possible and deliberately made myself not care internally, or something like that. Actually, now that I think of it, in the two instances that I remember, it was third parties pushing/pressuring me to give others the negative feedback. So there's quite a parallel to the Milgram experiment! I think if I had waited for the right moment and until I was ready to give the negative feedback on my own terms, I'd have been much more gentle.
The alternative hypothesis in the OP -- that generalized sadism is widespread -- also just feels implausible. I buy that sadism towards outgroups is very common, and so I think a lot of people can be brought towards sadistic behavior if they're riled up against an outgroup. But in the Milgram experiment the victims are just other random subjects (or experimenters pretending to be other subjects) and I don't think it's common for people to feel sadistic towards just about anyone. At least, it would go quite hard against my experience of other people if this were even just 20% of the population.
My guess is still that sadism did not play any large role; but I haven't read the linked article (just skimmed parts) and for this and other reasons am not sure. Are there others here who have looked and updated one way or another?
I read Milgram's book in high school after I got it from a library booksale. It included many variations on the most famous experiment, with results in which folks were e.g. noticeably less obedient when the lab looked less official, and quite a bit more obedient when they needed only to read the questions while a "fellow experimental subject" (confederate) administered the shocks, lots showing many signs of distress, etc. I didn't notice anything in it that suggested sadism to me at the time, though this isn't too much evidence. Part of where I'm coming from is that Milgram's book seemed to me like a person trying honestly to understand something, which is a bar most psychology experiments do not rise to IMO; and I don't know the new study authors and don't have any more-than-baseline trust in them.
Two off the top of my head possible confounders for the evidence described in the OP, about following procedures less well among those who went along:
In support of (b): the linked paper mentions that both "did the shocks to the end" participants, and "eventually disobeyed" participants followed the procedure more exactly during the initial phase of the experiment where the shocks are small and the "learner" isn't protesting. Also, if it's framed as "participants who listened to the screams before continuing their instructions were more likely to eventually refuse to give shocks than were those who read instructions over the screams", I dunno, it sounds less to me like sadism and more like letting info in?
There is however also the fact that I would not have predicted the results of Milgram's experiments (neither when I first heard of them, nor, probably, now if I'd had the rest of my life-experiences but not heard of his study), which is evidence I might be getting this wrong.
I think there is some kind of conclusion implied here which is something like "there are sadists and non-sadists in society, and sadists outnumber the non-sadists". This is too simplistic a resolution imo. It seems to me that people are capable of both great amounts of selflessness (including for complete strangers) and also great amounts of cruelty (including to those they know and are tied to the most). The question is under what circumstances each impulse is elicited. Some of the people who obeyed likely would not participate in mob violence, some of the people who didn't obey might still go ahead with compliance with violence in other situations.
I wouldn't jump to the conclusion that huge amounts of the regular population are sadistic and just don't act that way because they don't have an excuse. Experiments designed to test whether regular people are selfish or sadistic when unpunished (like the wallet test, and Vsauce/Mind Field's Stanford prison experiment follow-up) tend to have different results.
Participants were led to believe that they were assisting in a fictitious experiment, in which they had to administer electric shocks to a "learner". These fake electric shocks gradually increased to levels that would have been fatal had they been real.
I never understood how this was accomplished. How do you convince people participating in a psychological study (many of them Yale students) to believe that they are in an experiment where they are instructed to administer a lethal electric shock? In what world is "you murder someone" a possible outcome of an experiment?
It seems like claiming to have learned something about human nature from watching children play the Interactive Buddy game, which was quite popular in the era of flash games. I mean, maybe it says something about violent impulses, but it doesn't say something about how people would behave if they were actually torturing some person, since the subject knows it's a game. I think there isn't a plausible experiment you can design where the subjects actually believe they are administering a lethal electric shock.
Probably true now, but was it less true in the 1960s? It would be hard to replicate the Milgram experiment today, I think, even if its results were entirely accurate. Today, Milgram and similar experiments are well-known, and I'd expect an elevated level of paranoia among subjects that any seemingly-dramatic study may be deceptive. But those experiments were created and run in an environment that didn't know of them, and I'd intuitively expect less paranoia and more trust in the experimenter.
I'm skeptical, but I don't know enough about the norms of the 60's to really estimate what they were thinking.
It seems quite hard to believe that if a normal-looking experimenter is telling you to administer a lethal amount of electric shock to someone screaming (and how convincing were those screams?) in the next room, that it's an actual lethal electric shock. If the guy who designed the experiment doesn't seem concerned about the apparent sound of suffering, then either you're in a one-of-a-kind torture experiment like Squid Game and everyone involved is going to be on trial for murder for no apparent gain, or the other person isn't actually being tortured. Idk if people explicitly thought this, implicitly understood it, or actually wholeheartedly believed they were torturing someone and none of this factored into their decision making.
A landmark of social psychology research was “The Milgram Experiment,” but a new look at the audio tapes and other evidence collected during that experiment suggests that we may have been interpreting it incorrectly. Here is the Wikipedia summary of the experiment, showing how it is typically portrayed:
I don’t know about that “unexpectedly” part. I think the researchers suspected, in the wake of e.g. the Holocaust, that people were generally willing to obey awful instructions in ways that they failed to account for. Their experiment was designed to answer not whether but how much.
Interpreting the results
The results have since been interpreted as a kind of cynicism or caution about human nature, and about people’s tendencies to let their consciences be silenced by the trappings of authority.
But such takes may have been too optimistic.
Milgram interviewed his subjects after the experiment and found that those who stopped giving shocks felt that they were responsible for what they were doing, while those who continued giving shocks felt that the experimenter (the one giving the instructions to the subject) was responsible. Milgram theorized that his subjects, in the presence of an authority figure, stepped into a corresponding role: the “agentic state.” Once you are in that state, you stop considering yourself responsible for what you are doing and for the effects of what you are doing, and judge your actions only on whether you are doing them according to how the authority wants them done.
Arne Johan Vetlesen, in Evil and Human Agency (2005), pointed out that there is another possible interpretation: Milgram’s subjects may have had genuine sadistic impulses. In subjecting their victims to pain, they were not being somehow coerced by their situation to do things they would ordinarily not want to do, but were being allowed by their situation to do things they were ordinarily inhibited from doing.
He quoted Ernest Becker, who took a second look at Freud’s take on mob violence:[1]
And Hannah Arendt, whose examination of the Adolf Eichmann trial was going on at around the same time as the early Milgram experiments, warned that the excuse of “obedience” (as used by the compliant Milgram subjects to explain their actions after-the-fact, and secondarily by Milgram himself in his theory) was not an explanation but a “fallacy”:
A new review of the evidence
Now David Kaposi and David Sumeghy have gone back through the audio tapes and other documentation preserved from the original Milgram experiments.[3]
What they found was that the “obedient” subjects were not in fact very compliant at all. Indeed none of them actually followed the experimental procedures they had been instructed to comply with.
Only a few of the subjects complied with the experimental procedures they were given in full, and all of them were among those who eventually refused to continue with the experiment.
Tellingly, these procedural violations were not efforts to avoid giving shocks, but actually increased the likelihood that an opportunity to give another shock would arise:
The implication is that when Milgram interviewed the “obedient” subjects after the experiment was over, these subjects represented themselves as having merely obeyed because this was an excuse for their behavior that had been dangled before them temptingly during the experiment, and they anticipated that this excuse would be accepted. Milgram, by being willing to accept this excuse at face-value, in effect validated it and cooperated with the subjects in whitewashing their surrender to sadistic temptation.
See also H.L. Mencken (Damn! A Book of Calumny) making a similar point about the supposedly hypnotic influence of the mob:
Moral Responsibility under Totalitarian Dictatorships
David Kaposi & David Sumeghy “From legitimate to illegitimate violence: Violations of the experimenter's instructions in Stanley Milgram’s ‘obedience to authority’ studies” Political Psychology (2026)
Karina Petrova “Audio tapes reveal mass rule-breaking in Milgram’s obedience experiments” PsyPost 28 March 2026