This post has been recorded as part of the LessWrong Curated Podcast, and can be listened to on Spotify, Apple Podcasts, and Libsyn.

I often object to claims like "charity/steelmanning is an argumentative virtue". This post collects a few things I and others have said on this topic over the last few years.

My current view is:

  • Steelmanning ("the art of addressing the best form of the other person’s argument, even if it’s not the one they presented") is a useful niche skill, but I don't think it should be a standard thing you bring out in most arguments, even if it's an argument with someone you strongly disagree with.
  • Instead, arguments should mostly be organized around things like:
    • Object-level learning and truth-seeking, with the conversation as a convenient excuse to improve your own model of something you're curious about.
    • Trying to pass each other's Ideological Turing Test (ITT), or some generalization thereof. The ability to pass ITTs is the ability "to state opposing views as clearly and persuasively as their proponents".
      • The version of "ITT" I care about is one where you understand the substance of someone's view well enough to be able to correctly describe their beliefs and reasoning; I don't care about whether you can imitate their speech patterns, jargon, etc.
    • Trying to identify and resolve cruxes: things that would make one or the other of you (or both) change your mind about the topic under discussion.
  • Argumentative charity is a complete mess of a concept: people use it to mean a wide variety of things, and many of those things are actively bad, or liable to cause severe epistemic distortion and miscommunication.
  • Some version of civility and/or friendliness and/or a spirit of camaraderie and goodwill seems like a useful ingredient in many discussions. I'm not sure how best to achieve this in ways that are emotionally honest ("pretending to be cheerful and warm when you don't feel that way" sounds like the wrong move to me), or how to achieve this without steering away from candor, openness, "realness", etc.

I've said that I think people should be "nicer and also ruder". And:

The sweet spot for EA PR is something like: 'friendly, nuanced, patient, and totally unapologetic about being a fire hose of inflammatory hot takes'. 🙂

I have an intuition that those are pieces of the puzzle, along with (certain aspects or interpretations of) NVC tech, circling tech, introspection tech, etc. But I'm not sure how to hit the right balance in general.

I do feel very confident that "steelmanning" and "charity" aren't the right tech for achieving this goal. (Because "charity" is a bad meme, and "steelmanning" is a lot more niche than that.)


Things other people have said

Ozy Brennan wrote Against Steelmanning in 2016, and Eliezer Yudkowsky commented:

Be it clear: Steelmanning is not a tool of understanding and communication. The communication tool is the Ideological Turing Test. "Steelmanning" is what you do to avoid the equivalent of dismissing AGI after reading a media argument. It usually indicates that you think you're talking to somebody as hapless as the media.

The exception to this rule is when you communicate, "Well, on my assumptions, the plausible thing that sounds most like this is..." which is a cooperative way of communicating to the person what your own assumptions are and what you think are the strong and weak points of what you think might be the argument.

Mostly, you should be trying to pass the Ideological Turing Test if speaking to someone you respect, and offering "My steelman might be...?" only to communicate your own premises and assumptions. Or maybe, if you actually believe the steelman, say, "I disagree with your reason for thinking X, but I'll grant you X because I believe this other argument Y. Is that good enough to move on?" Be ready to accept "No, the exact argument for X is important to my later conclusions" as an answer.

"Let me try to imagine a smarter version of this stupid position" is when you've been exposed to the Deepak Chopra version of quantum mechanics, and you don't know if it's the real version, or what a smart person might really think is the issue. It's what you do when you don't want to be that easily manipulated sucker who can be pushed into believing X by a flawed argument for not-X that you can congratulate yourself for being skeptically smarter than. It's not what you do in a respectful conversation.

In 2017, Holden Karnofsky wrote:

  • I try to avoid straw-manning, steel-manning, and nitpicking. I strive for an accurate understanding of the most important premises behind someone's most important decisions, and address those. (As a side note, I find it very unsatisfying to engage with "steel-man" versions of my arguments, which rarely resemble my actual views.)

And Eliezer wrote, in a private Facebook thread:

Reminder: Eliezer and Holden are both on record as saying that "steelmanning" people is bad and you should stop doing it.

As Holden says, if you're trying to understand someone or you have any credence at all that they have a good argument, focus on passing their Ideological Turing Test. "Steelmanning" usually ends up as weakmanning by comparison. If they don't in fact have a good argument, it's falsehood to pretend they do. If you want to try to make a genuine effort to think up better arguments yourself because they might exist, don't drag the other person into it.


Things I've said

In 2018, I wrote:

When someone makes a mistake or has a wrong belief, you shouldn't "steelman" that belief by replacing it with a different one; it makes it harder to notice mistakes and update from them, and it also makes it harder to understand people's real beliefs and actions.

"What belief does this person have?" is a particular factual question. Steelmanning, like "charity", is sort of about unfocusing your eyes and tricking yourself into treating the factual question as though it were a game: you want to find a fairness-preserving allocation of points to all players, where more credible views warrant more points. Some people like that act of unfocusing because it's fun to brainstorm new arguments; or they think it's a useful trick for reducing social conflict or resistance to new ideas. But it's dangerous to frame that unfocusing as "steelmanning" or "charity" rather than explicitly flagging "I want to change the topic to this other thing your statement happened to remind me of".

In 2019, I said:

Charity seems more useful for rhetoric/persuasion/diplomacy; steel-manning seems more useful for brainstorming; both seem dangerous insofar as they obscure the original meaning and make it harder to pass someone's Ideological Turing Test.

"Charity" seems like the more dangerous meme to me because it encourages more fuzziness about whether you're flesh-manning [i.e., just trying to accurately model] vs. steel-manning the argument, and because it has more moral overtones. It's more epistemically dangerous to filter your answers to factual questions by criteria other than truth, than to decide to propose a change of topic.

[...] I endorse "non-uncharitableness" -- trying to combat biases toward having an inaccurately negative view of your political enemies and so on.

I worry that removing the double negative makes it seem like charity is an epistemic end in its own right, rather than an attempt to combat a bias. I also worry that the word "charity" makes it tempting to tie non-uncharitableness to niceness/friendliness, which makes it more effortful to think about and optimize those goals separately.

Most of my worries about charity and steelmanning go away if they're discussed with the framings 'non-uncharitableness and niceness are two separate goals' and 'good steelmanning and good fleshmanning are two separate goals', respectively.

E.g., actively focus on examples of:

  • being epistemically charitable in ways that aren't nice, friendly, or diplomatic.
  • being nice and prosocial in ways that require interpreting the person as saying something less plausible.
  • trying to better pass someone's Ideological Turing Test by focusing on less plausible claims and arguments.
  • coming up with steelmen that explicitly assert the falsehood of the claim they're the steelman of.

I also think that the equivocation in "charity" is doing some conversational work.

E.g.: Depending on context and phrasing, saying that you're optimizing for friendliness can make you seem manipulative or inauthentic, or it can seem like a boast or a backhanded attack ("I was trying to be nice when I said it that way" / "I'm trying to be friendly".) Framing a diplomatic goal as though it were epistemic can mitigate that problem.

Similarly, if you're in an intellectual or academic environment and you want to criticize someone for being a jerk, "you're being uncharitable" is likely to get less pushback, not only because it's relatively dry but because criticisms of tone are generally more controversial among intellectuals than criticisms of content.

"You're being uncharitable" is also a common accusation in a motte-and-bailey context. Any argument can be quickly dismissed if it makes your conclusion sound absurd, because the arguer must just be violating the principle of charity. It may not even be necessary to think of an alternative, stronger version of the claim under attack, if you're having an argument over Twitter and can safely toss out the "That sounds awfully uncharitable" line and then disappear in the mist.

... Hm, this comment ended up going in a more negative direction than I was intending. The concerns above are important, but the thing I originally intended to say was that it's not an accident "charity" is equivocal, and there's some risk in disambiguating it without recognizing the conversational purposes the ambiguity was serving, contra my earlier insistence on burning the whole thing down. It may be helping make a lot of social interactions smoother, helping give people more cover to drop false views with minimal embarrassment (by saying they really meant the more-charitable interpretation all along), etc.

(I now feel more confident that, no, "charity" is just a bad meme. Ditch it and replace it with something new.)

From 2021:

The problem isn't 'charity is a good conversational norm, but these people are doing it wrong'; the problem is that charity is a bad conversational norm. If nothing else, it's bad because it equivocates between 'be friendly' norms and 'have accurate beliefs about others' norms.

Good norms:

  • Keep discussions civil and chill.
  • Be wary of biases to strawman others.
  • Try to pass others' ITT.
  • Use steelmen to help you think outside the box.

Bad norms:

  • Treat the above norms as identical.
  • Try to delude yourself about how good others' arguments are.

From 2022:

I think the term 'charity' is genuinely ambiguous about whether you're trying to find the person's true view, vs. trying to steel-man, vs. some combination. Different people at different times do all of those things and call it argumentative 'charity'.

This if anything strikes me as even worse than saying 'I'm steel-manning', because at least steel-manning is transparent about what it's doing, even if people tend to underestimate the hazards of doing it.


32 comments

I like this clarification. Thank you. I think I mostly just agree.

A nuance I find helpful for one piece:

Some version of civility and/or friendliness and/or a spirit of camaraderie and goodwill seems like a useful ingredient in many discussions. I'm not sure how best to achieve this in ways that are emotionally honest ("pretending to be cheerful and warm when you don't feel that way" sounds like the wrong move to me), or how to achieve this without steering away from candor, openness, "realness", etc.

I think the core thing here is same-sidedness.

That has nothing to do directly with being friendly/civil/etc., although it'll probably naturally result in friendliness/etc.

(Like you seem to, I think aiming for cheerfulness/warmth/etc. is rather a bad idea.)

If you & I are arguing but there's a common-knowledge undercurrent of same-sidedness, then even impassioned and cutting remarks are pretty easy to take in stride. "No, you're being stupid here, this is what we've got to attend to" doesn't get taken as an actual personal attack because the underlying feeling is of cooperation. Not totally unlike when affectionate friends say things like "You're such a jerk."

This is totally different from creating comfort. I think lots of folk get this one confused. Your comfort is none of my business, and vice versa. If I can keep that straight while coming from a same-sided POV, and if you do something similar, then it's easy to argue and listen both in good faith.

This is totally different from creating comfort. I think lots of folk get this one confused. Your comfort is none of my business, and vice versa. If I can keep that straight while coming from a same-sided POV, and if you do something similar, then it's easy to argue and listen both in good faith.

I agree that same-sidedness and comfort are totally different things, and I really appreciate the bluntness of same-sidedness as a term. I also think you are undervaluing comfort here. People who are not comfortable do not reveal their true beliefs; same-sidedness doesn't appear to resolve this problem because people who are not comfortable do not reveal their true beliefs even to themselves.

I think the core thing here is same-sidedness.

The converse of this is that the maximally charitable approach can be harmful when the interlocutor is fundamentally not on the same side as you, in trying to honestly discuss a topic and arrive at truth. I've seen people tie themselves in knots when trying to apply the principle of charity, when the most parsimonious explanation is that the other side is not engaging in good faith, and shouldn't be treated as though they were.

It's taken me a long time to internalise this, because my instinct is to take what people say at face value. But it's important to remember that sometimes there isn't anything complex or nuanced going on; people can just lie.

The sense of "charity" I like is curiosity about the facts relevant to passing ITT, but exercise of charity in this sense doesn't involve intention to actually gain practical skills needed to pass ITT. It's a theoretician's counterpart to ITT, a more natural concept of curiosity that doesn't pull in the additional requirements of producing something sufficiently comprehensive to have a chance of possibly being applicable in practice.

An uncharitable attitude in this sense is lack of curiosity about the nature of someone's thinking, especially apparently illegible thinking in an unfamiliar paradigm, or illegible thinking that produces stupid or abhorrent conclusions. Intending to master this sort of thinking well enough to pass ITT sets a high bar, pointing towards what is almost always wasted effort, but glimpsing some additional elements of the unfamiliar paradigm is often enlightening, and an uncharitable attitude in this sense prevents that, doesn't keep the model-building process going.

To quote Scott Alexander, whose blog is the original source for this sense of the word as introduced to LW:

This blog does not have a subject, but it has an ethos. That ethos might be summed up as: charity over absurdity.

Absurdity is the natural human tendency to dismiss anything you disagree with as so stupid it doesn’t even deserve consideration. In fact, you are virtuous for not considering it, maybe even heroic! You’re refusing to dignify the evil peddlers of bunkum by acknowledging them as legitimate debate partners.

Charity is the ability to override that response. To assume that if you don’t understand how someone could possibly believe something as stupid as they do, that this is more likely a failure of understanding on your part than a failure of reason on theirs.

The title says 'steelmanning is niche'. I felt like the post didn't represent the main niche I see steelmanning (and charity) as useful for.

The way I see it, the main utility of steelmanning is when you are ripping apart someone's argument.

If philosophy paper A disagrees with philosophy paper B, then A had better do the work of steelmanning. I don't primarily want to know what the author of B really thinks; I primarily want to know whether (and which of) their conclusions are correct. If the argument in B was bad, but there's a nearby argument that's good, paper A needs to be able to notice that.

Otherwise, philosophy papers can get bogged down in assessing specific arguments and counterarguments and lose sight of the stuff they were trying to learn more about.

The same holds true outside of academia. Suppose I'm in a family discussion planning a vacation, and Alice is shooting down Bob's proposal of going to Fun Park. Bob keeps trying to come up with different ways it could make sense to go to Fun Park. It's useful for Alice to do some of this thinking as well. First, Alice might think of a workable plan involving Fun Park. Second, Alice can pre-empt some of Bob's proposals by pointing out why they won't work. 

This can be friendly and cooperative in some situations (like if Alice finds a way to make it to Fun Park while satisfying all the other constraints of the vacation). It can also be the opposite (when Alice is explaining why all Bob's plans are dead ends).

The OP focuses a lot on how steelmanning can make you lose focus on actually understanding the other person. My feeling is that steelmanning helps orient toward the world; that is, toward the object-level questions being discussed. It's not a tool you bring out when your bottleneck is understanding the other person.

(I agree that the term 'charity' is ambiguous and can therefore easily be misused, but my experience of 'the principle of charity' in academia is that it really primarily means steelman. For example, it would be 'charitable' in an informal sense to assume that data was collected in an unbiased way; but in my experience it's firmly on the shoulders of the research paper to specify that, and people don't suggest "charitable readings" where you assume there's no sampling bias. It wouldn't be a working steelman of an argument, since it requires unwarranted assumptions.)

I agree, in the sense that any good treatment of 'is P true?' should consider the important considerations both for believing P and for not believing P. I don't care about 'steel-manning' if you're replacing a very weak argument with a slightly less weak argument; but I do care if you're bringing in a strong argument.

(Indeed, I care about this regardless of whether there's a weaker argument that you're 'steel-manning'! So 'steel-man' is a fine reminder here, but it's not a perfect description of the thing that really matters, which is 'did I consider all the strong arguments/evidence on both sides?'.)

I'll note that 'steel-manning' isn't exclusively used for 'someone else believes P; I should come up with better arguments for P, if their own arguments are insufficient'. It's also used for:

  • Someone believes P; but P is obviously false, so I should come up with a new claim Q that's more plausible and is similar to P in some way.

In ordinary conversation, people tend to blur the line between 'argument', 'argument-step', and 'conclusion/claim'. This is partly because the colloquial word 'argument' is relatively vague; partly because people rarely make their full argument explicit; and partly because 'what claim(s) are we debating?' is usually something that's left a bit vague in conversation, and something that freely shifts as the conversation progresses.

All of this means that it's hard to enforce a strict distinction (in real-world practice) between the norm 'if you're debating P with someone, generate and address the best counter-arguments against your view of P, not just the arguments your opponent mentioned' and the norm 'if someone makes a claim you find implausible, change the topic to discussing a different claim that you find more plausible'.

This isn't a big deal if we treat steelmanning as niche, as a cool sideshow. But if we treat it as a fundamental conversational virtue, I think (to some nontrivial degree) it actively interferes with understanding and engaging with views you don't agree with, especially ones based on background views that are very novel and foreign to you.

Yes, there is a difference between "trying to understand precisely what this person actually meant" and "trying to salvage a potentially useful insight even from a generally horrible argument". The former is about learning their perspective... without necessarily believing it; it may include memorizing their mistakes. The latter is about enriching your perspective... quite likely in a way they would not approve of.

It would be factually wrong to assume that everyone is secretly a rationalist just like you, only operating with a different set of priors and having different experience. Some people truly are stupid, and you can pass their ITT just by saying "har har stupid outgroup". Most people are somewhere in between: a good argument or two, sometimes taken out of context, ignoring all arguments in the opposite direction, with a flavor of "and anyway, we are the good people (even if we occasionally make a mistake, hypothetically speaking) and they are the bad ones."

ITT is useful -- to correctly model your opponents and predict their actions; perhaps to be able to infiltrate them, and then maybe subvert towards your own goals.

Updating your map using true information that is typically only found in your opponent's maps is also useful, in the sense that having a better map is useful generally.

Then there is also a social process somewhere in between, when you placate your opponent by showing respect to certain parts of their map that you consider useful, and then you just "agree to disagree" about the rest.

I like this analysis! Makes the classification and pros and cons clear.

Just to add to it: I think steelmanning is useful as being 2 steps downstream of ITT. This is because ITT is a good model of someone else's views, not of their underlying logic or motives or desires, mostly because most of us are not very good at introspection. ITT reproduces this surface-level reaction without necessarily analyzing the reasons for it, but just based on the aggregation of the collected data. The intermediate step between ITT and steelmanning would be a "gears-level" model of someone. Once you have that model, you can analyze it and extract some useful information from it, by refining the model to be more self-consistent and reflecting the parts of the territory you may have missed.

Can you give examples of situations where "charity" has been used in ways that have the negative kinds of effects you're worried about? Maybe we just have different experiences, but to me the term is normally just used in the "non-uncharitableness" sense that you mention. So in practice it doesn't seem to cause any epistemic or communicative damage that I could tell.

ETA: See also

ETA2: Quoting a relevant bit:

I think person A often hopes that person B will either confirm that “yes, that’s a pretty accurate summary of my position,” or “well, parts of that are correct, but it differs from my actual position in ways 1, 2, and 3” or “no, you’ve completely misunderstood what I’m trying to say. Actually, I was trying to say [summary of person B’s position].”

One may hope for something like this, certainly. But in practice, I find that conversations like this can easily result from that sort of attitude:

Alice: It’s raining outside.

Bob, after thinking really hard: Hmm. What I hear you saying is that there’s some sort of precipitation, possibly coming from the sky but you don’t say that specifically.

Alice: … what? No, it’s… it’s just raining. Regular rain. Like, I literally mean exactly what I said. Right now, it is raining outside.

Bob, frowning: Alice, I really wish you’d express yourself more clearly, but if I’m understanding you correctly, you’re implying that the current weather in this location is uncomfortable to walk around in? And—I’m guessing, now, since you’re not clear on this point, but—also that it’s cloudy, and not sunny?



Alice: Dude. Just… it’s raining. This isn’t hard.

Bob, frowning some more and looking thoughtful: Hmm…

And so on.


From the first linked comment:

Yeah, sorry for being imprecise in my language. Can you just be charitable and see that my statement makes sense if you replace “VNM” by “Dutch book”?

I'm not sure if this is a case of "corruption" so much, as your interlocutor just suffering from illusion of transparency and thinking that it's obvious to everyone that a sentence that replaces "VNM" by "Dutch book" is what they originally meant. IME there's a very common failure mode of person A thinking that person B is being uncharitable and nitpicky when it's actually the case that A's intended meaning is much less clear to B than A assumes. But this would easily happen even when sticking to the "charitability as non-uncharitability" interpretation, since the problem is that A is incorrectly perceiving B to be uncharitable.

The "other side of the corruption" thing that you mention also seems like a case of someone applying the "don't dismiss an argument because you think the person presenting it is stupid/evil" rule (which I interpreted you to endorse, and which seems compatible with non-uncharitability), but in a mistaken manner.

From the second comment / your quoted dialogue: I think that that kind of an attempt at clarifying the other person's intent would also fall under the kinds of behaviors Rob endorses? He can correct me if I'm wrong, but I think it's not the kind of distortion he's concerned about. Even if it wasn't, I'm not sure that "trying to be charitable" is the problem there; rather it's that Bob literally doesn't understand what Alice is trying to say. (And it seems better to at least make that obvious and have the conversation stall there, than to miss that fact and continue the discussion in such a way that both parties are thinking they understand the other when they actually don't.)

The linked post had quite a lot of discussion of this sort of thing in the comments, and I hesitate to recapitulate it all, so please forgive the incompleteness of this reply… that said:

From the second comment / your quoted dialogue: I think that that kind of an attempt at clarifying the other person’s intent would also fall under the kinds of behaviors Rob endorses? He can correct me if I’m wrong, but I think it’s not the kind of distortion he’s concerned about.

Perhaps, but if so, then my reply would be that Rob’s view does not go far enough!

Even if it wasn’t, I’m not sure that “trying to be charitable” is the problem there; rather it’s that Bob literally doesn’t understand what Alice is trying to say.

Yes, indeed he does not, but the point I was trying to make there is that Bob’s attempts to “charitably understand” Alice’s words get him further from understanding, instead of closer to it.

I mean, how plausible is it, really, that Alice says “it’s raining outside” and Bob just doesn’t get what the heck Alice is talking about? No doubt the natural way to read this fictional dialogue is to see the depicted subject matter as metaphorical, but just try reading it literally—it’s ridiculous, right? Alice is saying something perfectly ordinary and straightforward. How can Bob not get it? Is he crazy or stupid or what?

What I’m trying to convey is that being on the receiving end of this sort of “charitableness” often feels like being Alice in the dialogue. You just want to yell “No! Stop it! Stop trying to interpret my words ‘charitably’! Just read what I’m actually saying! This isn’t complicated!”

Anyway, we’re definitely in “recapitulating old discussion” territory now, so I’ll leave it at that… most of what I could say on the matter, I already said, so by all means check out my many other comments on that post.

(And it seems better to at least make that obvious and have the conversation stall there, than to miss that fact and continue the discussion in such a way that both parties are thinking they understand the other when they actually don’t.)

Indeed. To quote yet another of my comments on that same post:

The problem, really, is—what? Not misunderstanding per se; that is solvable. The problem is the double illusion of transparency; when I think I’ve understood you (that is, I think that my interpretation of your words, call it X, matches your intent, which I assume is also X), and you think I’ve understood you (that is, you think that my interpretation of your words is Y, which matches what you know to be your intent, i.e. also Y); but actually your intent was Y and my interpretation is X, and neither of us is aware of this composite fact.

As I say in the comment, however, I think that attempts at “charity” are actually the opposite of a good solution to this!

How can Bob not get it? Is he crazy or stupid or what?


It looks to me like Bob doesn't respect Alice enough to fully listen to her, and much prefers the sound of his own voice. As a consequence, he truly doesn't understand her. A combination of status dynamics and not-concentrating humans not being general intelligences.

I would describe the dialogue by saying that Bob is steelmanning Alice's claim, while being uncharitable and remaining unaware of equivocation between the original claim and the steelmanned claim.

In terms of paradigms, Alice is making a factual claim, while Bob doesn't understand or tolerate the paradigm of factual claims, and is instead familiar with the paradigms of social implications and practical advice, uses for facts rather than facts considered in themselves. The charitable thing for Bob would be to figure out the paradigm of factual claims (and make use of it for the purposes of this conversation), the point of view that focuses on knowledge in itself while abstracting from its possible applications. Trying to find subtext is Bob's way of steelmanning the claim, recasting it into a shape that's more natural for the paradigms Bob understands (or insists on).

Equivocation between the claim in the paradigm of factual claims and the same claim in the paradigm of practical applications is unnecessary confusion that could be avoided. The steelmanning aims to improve the argument by performing a centrality-seeking translation, a way of making a concept more resilient to equivocation. And it would make the conversation more robust rather than more confusing, had Bob been aware of the issue of paradigms, another hidden argument that matters when interpreting language, in this case a difference in paradigms (ways of understanding) rather than a difference in preference (subjectively selected).

I am somewhat confused that you provide that comment thread as an example of charity having negative effects, when the thing that spawned that entire thread, or so it seems to me, was insufficient charity / civility / agreeableness (as e.g. evidenced by several negative-karma comments).

It hardly needs saying, but: I do not agree with your assessment.

I figured, which is why I moderated my statement as only "somewhat" confused :).

Can you give examples of situations where “charity” has been used in ways that have the negative kinds of effects you’re worried about?

This comment comes to mind:

But do you think he actually said [Y]?

I don't think he said it clearly, and I don't think he said anything else clearly. Believe it or not, what I am doing is charitable interpretation...I am trying to make sense of what he said. If he thinks [X], that would imply "[X], so [Y]", because that makes more sense than "[X], so don't [Y]". So I think that is what he is probably saying.

I think I basically agree with Rob about the importance of the thing he's pointing to when he talks about the importance of "Trying to pass each other's Ideological Turing Test", but I don't like using the concept of the ITT to point to this.

It's a niche framing for a concept that is general & basic. "Understand[ing] the substance of someone's view well enough to be able to correctly describe their beliefs and reasoning" is a concept that it should be possible to explain to a child, and if I was trying to explain it to a child I would not do that via the Turing Test and Bryan Caplan's variant.

The Ideological Turing Test points to the wrong part of the process. Caplan says that the ability to pass ITTs is the ability "to state opposing views as clearly and persuasively as their proponents." This means that the ITT is a way of testing whether a person has that ability to state opposing views well. But what we care about in practice is whether a person has that ability & is using it, not how they do on this particular way of testing whether they have that ability. This kind of imitation test usually doesn't come into play.

There have been several ITT contests, where authors write short essays about either their own views or the views of an ideological opponent, and guessers try to tell which essays represented the authors' own views. The vibe of these contests doesn't match the thing that matters in conversation, and 'pretend to be a proponent of X' can be pretty different than understanding a particular person's views. The contests involve pretending to be something you're not, they involve capturing the style and not just the substance, and they involve portraying a member of the group rather than getting a single person's views.

We need better terminology for disagreements, arguments, and similar conversations.

It sounds like what you're saying is:

  • Understand the other person's core claim and their argument for it. Their core claim and argument includes their beliefs about the world, and their values. It does not include their rhetoric.
  • Try to find a consensus on your key points of disagreement in the argument for the core claim. Key points of disagreement ("cruxes") are the ones that, if you changed your mind on them, you'd change your mind about the core claim.
  • Occasionally, it might be helpful to state a version of your debate partner's argument or claim that you think is more correct or compelling ("steelman"). However, this is risky. You might misrepresent the other person's claims and create confusion, or create an impression that you think they're less intelligent or informed than you are.
  • Be straightforward about what you think.
  • In terms of your tone, try to be a tolerable debate partner, without being fake-nice.

I appreciated this post and found its arguments persuasive. Thanks for writing this!

The one thing I wish had been different was that the essay extensively argues against "argumentative charity", but I never got a sense of what exactly the thing is that's being argued against.

Steelmanning and the Ideological Turing Test get extensive descriptions, while argumentative charity is described as "a complete mess of a concept". Which, fair enough, but if the concept defies definition, I'd instead appreciate a couple examples to understand what not to do.

I figured I'd at least get a sense of the problem of charity from the extensive quotes in "Things I've said", but even there I felt like the quotes expected me to already know what exactly this "charity" thing is. Unfortunately, I have a rough idea at best.

Not everybody knows.

I'm glad I've got something really concise to link to for this now that gathers the various arguments into one place; I'm really tired of trying to rehash this! The way to 'win' an argument is to understand the other person's belief; the way to 'win' knowledge is to understand the best hypotheses you can generate. These are not the same game!

  • Keep discussions civil and chill.
  • Be wary of biases to strawman others.
  • Try to pass others’ ITT.
  • Use steelmen to help you think outside the box.

Isn't there overlap in the execution of these norms that justifies the general term "charity"?

it makes it harder to notice mistakes and update from them

I'm not sure explicitly pointing out mistakes to people, instead of presenting a better version of their argument, will more often make them update.


I think the nuances of how to have good intellectual conversation are really important. I'm not sure I agree with all the points here, but do agree with many of them, and I think Rob makes at least a reasonable case for each of them. I like this post for moving the conversation forward, and look forward to debating some of the points here.

Steelmanning might be particularly useful in cases where we have reason to believe those who have engaged most with the arguments are biased toward one side of the debate.

As described in But Have They Engaged with the Arguments?, perhaps a reason many who dismiss AI risk haven't engaged much with the arguments is the selection effect of engaging more if the first arguments one hears seem true. Therefore it might be useful to steelman arguments by generally reasonable people against AI risk that might seem off due to lack of engagement with existing counterarguments, to extract potentially relevant insights (though perhaps an alternative is funding more reasonable skeptics to engage with the arguments much more deeply?).

But the Steelman use case is never "I shall hereby tell you my Steelman of your views, listen well!" The usefulness of the concept is just that it reminds you not to strawman. Were people actually unironically stopping conversations to make the other person listen to their Steelman? I've personally always used it more as an inner force pushing me towards "my interlocutor is not stupid, this easily demolished argument that I think they're making is likely not the one that they actually believe". It's also a force pushing me towards actually modelling my opponent, instead of just barely listening to them enough to hear a weakness and then proceeding to try to demolish it.

Something that I've noticed that I sometimes do in online discussions:  I look at what my interlocutor said, which is less explicit than I'd like.  I imagine a "most likely" interlocutor based on that utterance, and when that turns out to be a person I don't like, I try to be "charitable" by imagining the most sympathetic interlocutor who might have said the same thing.

I don't want to offend the hypothetical sympathetic person, but I'm also upset about what the other hypothetical person is saying.  I end up trying to craft a reply to both of them at once; e.g. something that will sound to the sympathetic interlocutor like a polite request for clarification, while simultaneously acting as a refutation of the stupid position of the fool I think I'm probably talking to.

I've noticed that this strategy has a poor success rate.  Mostly my interlocutor becomes confused.

What I'd like to do more of--but which I seem to have trouble remembering in the moment--is just honestly asking clarifying questions.

I think the "just ask questions" strategy is less salient to me because it doesn't address anything immediately.  I'm leaving the conversation in a state where, if I failed to check back, or if the other person never replied, I would never have expressed my own position.  It feels like failing to defend against an attack.

Despite this, I think asking questions is often a good strategy and I am currently under-utilizing it.  I think most people want to be understood and are happy to answer questions from someone honestly trying to understand them.

I think that steelmanning a person is usually a bad idea; rather, one should steelman positions (when one cares about the matter to which the positions are relevant).

I claim this avoids a sufficient swath of the OP's outlined problems of steelmanning for the article's claim of 'nicheness', and that the semi-tautology of 'appropriate steelmanning is appropriate' more accurately maps reality.

"The problem isn't 'charity is a good conversational norm, but these people are doing it wrong'; the problem is that charity is a bad conversational norm. If nothing else, it's bad because it equivocates between 'be friendly' norms and 'have accurate beliefs about others' norms."
Here we can see a bad use case for steelmanning (having accurate beliefs about others), which makes me wonder if it's not a question of doing it wrong (contra OP)? I also notice that I think most people should have fewer conversations about what people think, and more conversations about what is true (where steelmanning again becomes more relevant), and wonder where you fall (because such a thing might be upstream?).

I also am apparently into declaratives today. 

(meta: written without much rigor or edits rather than unwritten)

Steelmanning is useful as a technique because often the intuition of somebody’s argument is true even if the precise argument they are using is not. If the other person is a rationalist, then you can point out the argument’s flaws and expect them to update the argument to more precisely explore their intuition. If not, you likely have to do some of the heavy lifting for them by steelmanning their argument and seeing where its underlying intuition might be correct.

This post seems only focused on the rationalist case.

To me it sounds as if the "don't steelman" + ITT could be achieved by starmanning:

I'll give a quick outline of my own approach to this issue. Disclaimer: This is where I mention that I'm on the autism spectrum, so this is me neckbearding my way out of sucking at all of this.

I'm going with privacy erosion as an example: Someone On The Internet is arguing that there should be no privacy On The Internet (because of the children).

First, I assume that my opponent is arguing in good faith and try to reverse-engineer their mindset.

  • If some of my axioms were <what I guess theirs might be>, would I agree with them? 
    Is there a benefit I'm not seeing? 
    • Assuming that every human being is rotten at the core and that a paedophile is lurking in each of us, could enough surveillance actually make children safer?
    • If I was rotten at the core and a paedophile was waiting to come out, maybe enough surveillance would make me stay on the straight and narrow? (In that case, I probably wouldn't admit that to myself and be very unreasonable!)
  • If I had made <hypothetical experience>, would this standpoint be viable, cost/benefit wise? 
    • Maybe they were doxxed by a QAnon mob. Their perceived cost might be zero or net-negative, but they have a lot to gain (the police could finally track down the bad guys!).
    • Maybe they're traumatized in some way.  Traumatized people aren't very rational, and they might not care about the cost.

Second, I run the same process assuming they aren't arguing in good faith.
That doesn't mean that they are aware of that. Humans are extremely good at lying to themselves, myself not excluded, see Elephant in the Brain.

  • They could be trolling. (Is this a cry for attention? For help? For sympathy?)
    They could be trying to increase their status by lowering mine.
    (Fortunately, NVC will obliterate both of those patterns and leave me on the moral high ground)
  • They could be signalling to their ingroup by making the right noises, and this was never an argument.

In any case, they have been in a position where saying that stupid/hateful/hurtful/uncharitable/etc thing was the outcome of their optimization strategies — out of all the dialogue options available, they chose this. People in general can be incredibly stupid (again myself not excluded), but we are very good at optimizing for something. If I find what that is they've been optimizing for, I can understand their position, and probably pass their ITT.

And finally, if the stupid/hateful/hurtful/uncharitable/etc statement was directed at me and I'm emotionally compromised, I go through the final loop:

  • their statement is about them, not about me.

I find that I never need to be charitable, because I can always provide reasons (not excuses) why people would be acting the way they do.


Maybe I'm completely wrong and there are better ways to go about it. If so, I'd love to hear!

I agree that steelmanning is bad, and I don’t know what to think of the “charity” cluster of principles. (I at least think you should strive to understand what people said and respond as exactly as possible to what they said, not to what seems to you to be the strongest and most rational interpretation. Strength should only be a consideration for interpreting correctly what they said, insofar as an interpretation being stronger makes it more likely that it was theirs. Doing otherwise would not be worth it, if only because it causes misunderstandings. You’re also liable to be wrong about which interpretation is strongest, and understanding people correctly is hard enough already that any additional effort is better invested in understanding them better.) But I also generally don’t like the framing of argumentative virtues, or the concern for diplomacy, when those work against common discourse patterns. If some discourse patterns are very common in debates, then instead of working hopelessly against them, you can find ways to make use of them for your benefit. For example, you can apply specialization and trade to arguments. Two big bottlenecks on figuring out the truth through rational argument are manpower and biases, and that helps with both (especially with manpower, which I think is probably the most important bottleneck anyway).

The situation where this can benefit you is when argument spaces are large, for example when there are a lot of arguments and counterarguments on many sides of a complex issue, including often many incompatible lines of arguments for the same conclusions, and you can’t explore the full space of arguments on even a single side yourself, unless perhaps you spend weeks doing so as your main activity. So there is no way you can explore the arguments on all sides.

Instead, you can adopt the view that seems the most likely to be true to you (you can revise that as you get more information) and try to find arguments supporting that view and not try very hard to find arguments opposing it. This is the opposite of the advice usually given (the usual advice is bad in this situation). And you should argue with people who have other views. These people are more likely to focus on the weakest points in your arguments than you are to do so yourself and on the weakest assumptions you’ve made that you haven’t justified or thought that you needed to justify (I know this is not always true but only claim it’s more likely) and they’re probably going to do a better job of finding the best arguments against your position than you would yourself (also not always true; I just think these two points are more true than not when averaging over all cases). But these two points aren’t that important. The cases where they don’t apply are cases where you might be doing something wrong: if you are aware of better arguments against your position than similarly smart and rational people who disagree with your position, you’ve probably spent more time and effort than you needed to exploring arguments against your position, which you could have spent exploring arguments for your position or arguments about other things or just doing other things than exploring arguments about stuff.

The most important point is that a greater part of the space of arguments can be explored if each person only explores the arguments that support their position, and then they exchange by arguing. A deeper search can be done if each person specializes rather than both exploring the arguments on all sides. And doing a deeper search overall means getting closer to the truth in expectation. Arguing with other people allows exchanging only the best arguments so it should take less time than exploring yourself.

In this situation, you don’t need to be too worried about looking for arguments against your position, since you can just leave that to the people who disagree with you. It’s sensible to worry about being biased, but the primary motivation you should get from that worry is a motivation not to make excuses for not spending time arguing with people who disagree with you, rather than a motivation to spend time looking for arguments against your position yourself.

And you should privilege debating with people who disagree with you (so it’s people who have explored different spaces of arguments than you; but arguing with people who share your conclusions for different reasons and disagree with your reasons is very good and I count them as “people who disagree with you”: the disagreements don’t have to be about the final conclusions), who are smart and rational and have thought much about the topic (so they’ll have done a deeper and better search), who have positions that are uncommon and unpopular and that aren’t those of people you’ve already debated before (there will often be not only two positions that are mutually exclusive and there will be many incompatible lines of arguments leading to these positions; you benefit more in expectation from debating with people who have explored things you haven’t already heard about, so things that are uncommon and unpopular or that you haven’t debated people about before).

Some other things that can improve the quality of the search are debates being in written form and asynchronous, so people have time to think and can look up the best information and arguments on the Web and check things on Wikipedia. And you should redebate the same things again with the same people sometimes, because l’esprit de l’escalier is a very important thing and you should take care to make it possible for other people to use it to your benefit (including without having to admit that they’re doing so and that they didn’t think of the best response the first time around, because its being known that they didn’t think of the best response the first time around could be embarrassing to them and you don’t want them to double down on a worse line of argument because of that).
