Edit: made some small changes to prevent certain gross mischaracterizations of the argument. The core argument remains completely unchanged.

Among intelligent people with at least some familiarity with argumentative norms, surface-level disagreements tend to be ephemeral because, even if a given debate about the issue is terminated before reaching a conclusion, both parties to the disagreement will eventually encounter the full range of common arguments pertaining to the issue. Because of this, there are really only two cases where the disagreement will persist: 1. if the key arguments are sufficiently uncommon as to not be in general circulation, or 2. if mere familiarity with surface-level arguments is insufficient to bridge the inferential gap.

We will examine these cases separately.

The first case is rare, because convincing arguments, if they can be grasped with relatively low opportunity cost, have a tendency to spread and become part of general circulation. Exceptions can however be found when the arguments pertain to a niche, though only when the people interested in that niche have too little contact with one another to form a distinct social network of their own. More commonly, the first case arises when there is political or social pressure not to repeat the arguments in general company, because this creates an opportunity cost to transferring them.

Already here, the practice of steelmanning can give rise to major problems, though only if you are steelmanning a position rather than the argument offered in its support, and only in cases where the position is more socially acceptable than the argument. Consider for example the case where someone is critiquing disparate impact case law from a standpoint of HBD. Whatever you may think of the argument[1], it should not matter for the purpose of demonstrating the dynamic.

The interlocutor has an instinctual aversion to HBD and flinches away from it, but notes that an argument for the position can be built on a much less offensive basis. For example, one might argue that equity can only realistically be achieved by addressing the underlying drivers of cognitive inequality (e.g. early education, diet, etc.) and not merely by legislating your way to equal outcomes, which would merely place disadvantaged people in academic courses they can't keep up with, or get them into jobs whose demands they cannot meet, leading to impostor syndrome, etc.

Alternatively, to stay closer to the original argument and thus "obscure the deed", the interlocutor may point out that it is not necessary to demonstrate HBD in order to oppose disparate impact case law, and that we can instead just rely on agnosticism about the matter, since the contrary position to HBD, i.e. human neurological uniformity, has never been proven.

Notice how, by sticking closer to the original argument, this latter example seems even less like a strawman than the former. But notice also how it actually leads to a much weaker conclusion, since it leaves open the possibility that disparate impact case law may work straightforwardly. The conclusion supported by the argument is in fact so weak that the interlocutor is likely to have largely forgotten about it a few weeks later. The former argument supports a somewhat stronger conclusion, but leaves open the possibility that addressing those underlying drivers will make disparate impact case law workable.

The problem here arises because the argument is more offensive than the conclusion, and so our interlocutor feels the "instinctive flinch" more keenly when it comes to the argument than the conclusion. This makes him more willing to consider the proposition than the argument offered in its support, and so he will come up with alternate arguments that wind up leading to only a weaker form of that proposition.

But of the two cases of non-ephemeral disagreements, this is the one where steelmanning is least objectionable. It is the other case where steelmanning is truly insidious.

Suppose you are trying to surmount a large inferential gap over the course of a very long conversation. It is a case of totally incompatible worldviews. To make the thought experiment more palatable to LessWrongers, let us choose a scenario that conforms to the prejudices currently in fashion. Therefore, let us suppose you are Scott Alexander who has just written the Anti-Reactionary FAQ, and your interlocutor is some garden variety neo-reactionary who is not impressed by your statistics. 

The argument you are making is difficult, but not beyond the comprehension of your interlocutor. It is however likely that he will misunderstand it at several stages, call it stupid, and point out what he thinks are obvious errors. You have already resigned yourself to the somewhat tedious task of having to address those objections one by one, and thus correct your interlocutor's misunderstandings. This also has the bonus of making your interlocutor feel a bit flustered about having called you stupid, and making him do a considerable upwards update on the possibility that you are smarter than him and have a much sounder overall worldview.

Unfortunately, your interlocutor has heard of the practice of steelmanning, and likes to think of himself as being someone who debates politely and in very good faith. Thus he will not call you stupid, and if it seems to him that you have made an obvious error, he will conclude that he must have misunderstood the argument, and try to steelman it. The result is that it will be nearly impossible to get him to consider your actual arguments, i.e. those he is presently convinced are dumb. Each time he proposes another flawed steelman, you can keep trying to redirect him back to your actual argument as you originally formulated it. Since it seems obviously weak to him, he might be reluctant to conclude that that really is the argument you're making. If so, you might even go so far as to emphasise that yes, really, your argument is the one he finds dumb, and not the one that resulted from his attempt to improve it. Unfortunately, this has the effect of making him update downwards on the possibility that you are smarter than him and have a sounder worldview, since he is literally seeing you insist on an argument which to him appears much dumber than the alternative he is proposing to examine. Priding himself on his civility and politeness, he still doesn't actually call you stupid, but this only further prevents him from being flustered when proven wrong, and so makes him still less likely to change his mind.

The problem here is that if he does not understand the line of thinking underlying your actual argument, then he cannot generate it on the spot, yet if the conversation has any considerable length (which may be assumed since we are talking about deep disagreements among people too smart to think the matter can be resolved in casual chat over coffee), then he will probably have considered pretty much all the major arguments he can generate on the spot. However, what this means is that the best argument he is capable of generating on the spot is one he was not convinced by. Therefore, the actual effect of steelmanning is simply to assume that the opposition is making an unconvincing argument that will leave you unmoved — which is pretty much the exact opposite of the principle of charity.

We see then that in such a case, the attempt to steelman, far from being the epitome of charitable discourse, is pretty much its nadir, but what is insidious about it is that it makes it extremely difficult for you to convince your interlocutor that, no, he really is not being the bastion of charity and good faith that he likes to imagine himself to be.

So if steelmanning is so terrible, why has it become so popular?

Well, for starters, in the case of ephemeral disagreements, it genuinely does tend to ingratiate people, maximize civility, and even save time — all these are quite considerable benefits that should not be underestimated.

Secondly, it is very effective when talking to people not habituated to argumentative norms. They are accustomed to an outright combative interlocutor, and will be taken aback by your willingness to go to great lengths to make their arguments for them. But such people are not exactly powerhouses of the intellect. By all means, keep using steelmen in such cases, but recognise that what you are engaged in is something more like polite condescension than charitable discourse. 

Edit: fixed some typos

  1. ^

    Incidentally, I do not have very high regard for the HBD crowd — they remind me too much of scientism, technocracy, and progressive-era eugenics.


Steelmanning is not the Ideological Turing Test.

ITT = simulating the other side

SM = can I extract something useful for myself from the other side's arguments

With ITT, my goal is to have a good model of the other side. That can be instrumentally useful to predict their behavior, or to win more debates against them (because they are less likely to surprise me). That is, ITT is a socially motivated activity. If I knew that the other side would disappear tomorrow and no one would want to talk about them, ITT would be a waste of time.

With SM, my goal is to improve my model of the world. The other side is unimportant, except as a potential source of true information that may be currently in my blind spot. That is, SM is a selfishly motivated activity. Whether the other side approves of my steelman of them, is irrelevant; my activity is not aimed at them.

SM is trying to find a diamond in a heap of dung. ITT is learning to simulate someone who enjoys the dung.

I would further state that steelmanning is something you don't usually do in an active debate.  Certainly not an in-person debate; anytime you start to say "I think your argument would be better stated as [...]", in person that should immediately be checked with your interlocutor.  If it's more like a series of letters exchanged, or articles published, then that time-delay is some justification for doing interpretive work on your side... But at the very least, I think you should call out explicitly every time you're ignoring one argument in favor of one you think is better, for legibility and the opportunity for your interlocutor to say "No, that argument is a worse one."  Steelmanning is best when it's you vs a flood of statistically bogus articles by partisan hacks and you're trying to keep yourself grounded.

Peter Boghossian has been doing activities where, having found two people on opposing sides of an issue, he asks both of them to write down their best reason for their side, then asks each person to guess what the other one wrote down, and asks the other person "Is that the reason you wrote?", and, if not (which it's usually not), "Is that better than the reason you wrote?" (which I don't remember ever being a yes).  It's an interesting approach.  Example.

By the way, separately from beliefs, another issue is values.  Suppose you favor policy X, because it helps with value A, and your opponent favors policy Y, because it helps with value B; suppose you care a lot about A, and care little about B, and your opponent is vice-versa on that.  Then, when you "steelman" the argument for Y, you might say "Well, someone who favored Y might think that it helps with A, and I guess a naive look would indeed support that, but if we dig into the details we conclude strongly that X is much better for A.  Case closed."

Trouble is that even checking the steelman with the other person does not avoid the failure modes I am talking about. In fact, some moments ago, I made slight changes to the post to include a bit where the interlocutor presents a proposed steelman and you reject it. I included this because many redditors objected that this is by definition part of steelmanning (though none of the cited definitions actually included this criterion), and so I wanted to show that it makes no difference at all to my argument whether the interlocutor asks for confirmation of the steelman versus you becoming aware of it by some other mechanism. What's relevant is only that you somehow learn of the steelman attempt, reject it as inadequate, and try to redirect your interlocutor back to the actual argument you made. The precise social forms by which this happens (the ideal being something like "would the following be an acceptable steelman [...]") are only dressing, not substance.

I have in fact had a very long email conversation spanning several months with another LessWronger who kept constructing would-be steelmen of my argument that I kept having to correct.

As it was a private conversation, I cannot give too many details, but I can try to summarize the general gist

I and this user are part of a shared IRL social network, which I have been feeling increasingly alienated from, but which I cannot simply leave without severe consequences. Trouble is that this social network generally treats me with extreme condescension, disdain, patronisation, etc, and that I am constrained in my ability to fight back in my usual manner. I am not so concerned about the underlying contempt, except for its part in creating the objectionable behaviour. It seems to me that they must subconsciously have extreme contempt for me, but since I do not respect their judgement of me, my self-esteem is not harmed by this knowledge. The real problem is that situations where I am treated with contempt and cannot defend myself from it, but must remain polite and simply take it, provide a kind of evidence to my autonomous unconscious status tracking processes (what JBP claims to be the function of the serotoninergic system, though idk if this is true at all), and that this is not so easily overridden by my own contempt for their poor judgement as my conscious reasoning about their disdain for me is.

I repeatedly explained to this LessWrong user that the issue is that these situations provide evidence for contempt for me, and that since I am constrained in my ability to talk back, they also provide systematically false evidence about my level of self respect and about how I deserve to be treated. Speaking somewhat metaphorically, you could say that this social network is inadvertently using black magic against me and that I want them to stop. It might seem that this position could be easily explained, and indeed that was how it seemed to me too at the outset of the conversation, but it was complicated by the need to demonstrate that I was in fact being treated contemptuously, and that I was in fact being constrained in my ability to defend myself against it. It was not enough to give specific examples of the treatment, because that led my interlocutor to overly narrow abstractions, so I had to point out that the specific instances of contemptuous treatment demonstrated the existence of underlying contempt, and that this underlying contempt should a priori be expected to generate a large variety of contemptuous behaviour. This in turn led to a very tedious argument over whether that underlying contempt exists at all, where it would've come from, etc.

Anyway, I eventually approached another member of this social network and tried to explain my predicament. It was tricky, because I had to accuse him of an underlying contempt giving rise to a pattern of disrespectful behaviour, but also explain that it was the behaviour itself I was objecting to and not the underlying contempt, all without telling him explicitly that I do not respect his judgement. Astonishingly, I actually made a lot of progress anyway.

Well, that didn't last long, because the LW user in question took it into his own hands to attempt to fix the schism, and told this man that if I am objecting to a pattern of disrespectful behaviour, then it is unreasonable to assume that I am objecting to the evidence of disrespect, rather than the underlying disrespect itself. You will notice that this is exactly the 180 degree opposite of my actual position.  It also had the effect of cutting off my chance at making any further progress with the man in question, since it is now to my eyes impossible to explain what I actually object to without telling him outright that I have no respect for his judgement.

I am sure he thought he was being reasonable. After all, absent the context, it would seem like a perfectly reasonable observation. But as there were other problems with his behaviour that made it seem smug and self-righteous to me, and as the whole conversation up to that point had already been so maddening and led to so much disaster (it seems in fact to have played a major part in causing extreme mental harm to someone who was quite close to me), I decided to cut my losses and not pursue it any further, except for scolding him for what seemed to me like the breach of an oath he had given earlier.

Anyway, the point is not to generalise too much from this example. What I described in the post was actually inspired by other scenarios. The point of telling you this story is simply that even if you are presented with the interlocutor's proposed steelman and given a chance to reject it, this does not save you, and the conversation can still go on for literally months without getting out of the trap I described. I have had other examples of this trap being highly persistent, even with people who were more consistent in explicitly asking for confirmation of each proposed steelman, but what was special about this case was that it was the only one that lasted for literally months with hundreds of emails, that my interlocutor started out with a stated intent to see the conversation through to the end, and that he was a fairly prolific LessWrong commenter and poster, whom I would rate as being at least in the top 5% and probably top 1% of smartest LessWrongers.

I should mention for transparency that the LessWrong user in question did not state outright that he was steelmanning me, but having been around in this community for a long time, I think I am able to tell which behaviours are borne out of an attempt to steelman, or more broadly, which behaviours spring from the general culture of steelmanning and of being habituated to a steelman-esque mode of discourse. As my post indicated, I think steelmanning is a reasonable way to get to a more expedient resolution between people who broadly speaking "share base realities", but as someone with views that are highly heterodox relative to the dominant worldviews on LessWrong, I can say that my own experience with steelmanning has been that it is one of the nastiest forms of argumentation I know of.

I focused on the practice of steelmanning as emblematic of a whole approach to thinking about good faith that I believe is wrongheaded more generally, not only as it pertains to steelmanning. In hindsight, I should have stated this. I considered doing so, but decided to make it the subject of a subsequent post, and I failed to notice that planning a more in-depth post about the abstract pattern does not preclude briefly mentioning in this post that steelmanning is only one instance of a more general pattern I am trying to critique.

The pattern is simply to focus excessively on behaviours and specific arguments as being in bad faith, and paying insufficient attention to the emotional drivers of being in bad faith, which also tend to make people go into denial about their bad faith.

Indeed, that was the purpose of steelmanning in its original form, as it was pioneered on Slate Star Codex.

Interestingly, when I posted it on r/slatestarcodex, a lot of people started basically screaming at me that I am strawmanning the concept of steelmanning, because a steelman by definition requires that the person you're steelmanning accepts the proposed steelman as accurate. Hence, your comment provides me some fresh relief and assures me that there is still a vestige left of the rationalist community I used to know.

I wrote my article mostly concerning how I see the word colloquially used today. I intended it as one of several posts demonstrating a general pattern of bad faith argumentation that disguises itself as exceptionally good faith. 

But setting all that aside, I think my critique still substantially applies to the concept in its original form. It is still the case, for example, that superficial mistakes will tend to be corrected automatically just from the general circulation of ideas within a community, and that the really persistent errors have to do with deeper distortions in the underlying worldview. 

Worldviews are however basically analogous to scientific paradigms as described by Thomas Kuhn. People do not adopt a complicated worldview without it seeming vividly correct from at least some angle, however parochial that angle might be. Hence, the only correct way to resolve a deep conflict between worldviews is by the acquisition of a broader perspective that subsumes both. Of course, either worldview, or both, may be a mixture of real patterns coupled with a bunch of propaganda, but in such a case, the worldview that subsumes both should ideally be able to explain why that propaganda was created and why it seems vividly believable to its adherents. 

At first glance, this might not seem to pose much of a problem for the practice of steelmanning in its original form, because in many cases it will seem like you can completely subsume the "grain of truth" from the other perspective into your own without any substantial conflict. But that would basically classify it as a "superficial improvement", the kind that is bound to happen automatically just from the general circulation of ideas, and therefore less important than the less inevitable improvements. But if an improvement of this sort is not inevitable, it indicates that your current social network cannot generate the improvement on its own, but instead can only generate it through confrontations with conflicting worldviews from outside your main social network. That in turn means that your existing worldview cannot properly explain the grain of truth from the opposing view, since it could not predict it in advance, and so there is more to learn from this outside perspective than can be learned by straightforwardly integrating its apparent grain of truth.

This is basically the same pattern I am describing in the post, but just removed from the context of conversations between individuals, and instead applied to confrontations between different social networks with low-ish overlap. The argument is substantially the same, only less concrete.

If we take the example of the HBD argument, one issue that arises is that it centers latent variables which the interlocutor might not believe are observable (namely, innate racial differences). If these latents aren't observable, then either the argument is nonsense, or the real origin for the conclusion is something else (for the case of HBD, an obvious candidate would be a lot of work that has attempted and failed to achieve racial equity, indicating that whatever causes the gaps must be quite robust). Thus from the interlocutors' perspective, the three obvious options would be to hear how one could even come to observe these latents, to hear what the real origin for the conclusion is, or to just dismiss the conclusion as baseless speculations.

This trichotomy seems pretty common.

The second option, trying to uncover the real origin of the conclusion, is obviously the best of the three. It is also most in line with canonical works like Is That Your True Rejection?

But it belongs to the older paradigm of rationalist thinking: the one that sought to examine motivated cognition and discover the underlying emotional drives (ideally with delicate sensitivity), whereas the new paradigm merely stigmatizes motivated cognition and inadvertently imposes a cultural standard of performativity, in which we are all supposed to pretend that our thinking is unmotivated. The problems with present rationalist culture would stand out like a glowing neon sign to old-school LessWrongers, but unfortunately there are not many of these left.

I think it also depends. If you are engaging in purely political discourse, then sure, this is correct. But if, e.g., you're doing a good-faith project to measure latent variables, such that the latent variables are of primary interest and the political disputes are of secondary interest, then having people around who are postulating elaborate latent variable models in order to troll their political opponents is distracting. At best, they could indicate that there is a general interest in measuring the sort of latent variables they talk about, and so they could be used as inspiration for what to develop measures on, but at worst they could interfere with the research project by propagating myths.

They are not doing it in order to troll their political opponents. They are doing it out of scientism and loyalty to enlightenment aesthetics of reason and rationality, which just so happens to entail an extremely toxic stigma against informal reasoning about weighty matters.

Sort of true, but it seems polycausal; the drive to troll their political opponents makes them willing to push misleading and bad-faith arguments, whereas the scientism makes them disguise those bad arguments in scientific-sounding terms.

Both causes seem toxic to good-faith attempts at building good measurement though. Like you're correct that the scientism has to be corrected too, but that can be handled much more straightforwardly if they're mostly interested in the measurement project than if they are interested in political discourse.

The measuring project is symptomatic of scientism and is part of what needs to be corrected.

That is what I meant when I said that the HBD crowd is reminiscent of utilitarian technocracy and progressive-era eugenics. The correct way of handling race politics is to take an inventory of the current situation by doing case studies and field research, and to develop a no-bullshit commonsense executive-minded attitude for how to go about improving the conditions of racial minorities from where they're currently at.

Obviously, more policing is needed, so as to finally give black business-owners in black areas a break and let them develop without being pestered by shoplifters, riots, etc. Affirmative action is not working, and nor is the whole paradigm of equity politics. Antidiscrimination legislation was what crushed black business districts that had been flourishing prior to the sixties.

Whether the races are theoretically equal in their genetic potential or not is utterly irrelevant. The plain fact is that they are not equal at present, and that is not something you need statistics in order to notice. If you are a utopian, then your project is to make them achieve their full potential as constrained by genetics in some distant future, and if they are genetically equal, then that means you want equal outcomes at some point. But this is a ridiculous way of thinking, because it extrapolates your policy goals unreasonably far into the future, never mind that genetic inequalities do not constrain long-term outcomes in a world that is rapidly advancing in genetic engineering tech.

The scientistic, statistics-driven approach is clearly the wrong tool for the job, as we can see from just looking at what outcomes it has achieved. Instead it is necessary to have human minds thinking reasonably about the issue, instead of trying to replace human reason with statistics "carried on by steam" as Carlyle put it. These human minds thinking reasonably about the issue should not be evaluating policies by whether they can theoretically be extrapolated to some utopian outcome in the distant future, but simply about whether they actually improve things for racial minorities or not. This is one case where we could all learn something from Keynes' famous remark that "in the long run, we are all dead".

In short: scientism is the issue, and statistics by steam are part of it. Your insistence on the measurement project over discussing the real issues is why you do not have much success with these people. You are inadvertently perpetuating the very same stigma on informal reasoning about weighty matters that is the cause of the issue.

This is correct as an analysis of racial politics, but you end up with latent variable measurement projects for multiple reasons. In the particular case of cognitive abilities, there's also cognitive disability, military service, hiring, giftedness, cognitive decline, genomics and epidemiology, all of which have an interest in the measurement of cognitive abilities. Furthermore, the theory and tools for cognitive abilities can be informed by, and are informative for, the measurement of other latent variables, so you also end up with interest from people who study e.g. personality.

People should do serious research to inform their policy, and they should do serious research on latent variables, but they should avoid using disingenuous arguments where they talk like they've done serious research when really they're trying to troll.

No, the reasoning generalises to those fields too. The problem driving those areas' need for measurement of cognitive abilities is excessive bureaucratisation and the lack of a sensible top-down structure with responsibilities and duties in both directions. A wise and mature person can get a solid impression of an interviewee's mental capacities from a short interview, and can even find out a lot of useful details that are not going to be covered by an IQ test. For example, mental health, maturity, and capacity to handle responsibility.

Or consider it from another angle: suppose I know someone to be brilliant and extremely capable, but when taking an IQ test, they only score 130 or so. What am I supposed to do with this information? Granted, it's pretty rare — normally the IQ would reflect my estimation of their brilliance, but in such cases, it adds no new information. But if the score does not match the person's actual capabilities as I have been able to infer them, I am simply left with the conclusion that IQ is not a particularly useful metric for my purposes. It may be highly accurate, but an experienced human judgement is considerably more accurate still.

Of course, individualised judgements of this sort are vulnerable to various failure modes, which is why large corporations and organizations like the military are interested in giving IQ tests instead. But this is often a result of regulatory barriers or other hindrances to simply requiring your job interviewers to avoid those failure modes and holding them accountable to it, with the risk of demotion or termination if their department becomes corrupt and/or grossly incompetent.

This issue is not particular to race politics. It is a much more general matter of fractal monarchy vs procedural bureaucracy.

Edit: or, if you want a more libertarian friendly version, it is a general matter of subsidiarity vs totalitarianism.

> No, the reasoning generalises to those fields too. The problem driving those areas' need for measurement of cognitive abilities is excessive bureaucratisation and the lack of a sensible top-down structure with responsibilities and duties in both directions. A wise and mature person can get a solid impression of an interviewee's mental capacities from a short interview, and can even find out a lot of useful details that are not going to be covered by an IQ test. For example, mental health, maturity, and capacity to handle responsibility.

I'm not convinced about this, both from an efficiency perspective and an accuracy perspective.

Take military service as an example. The way I remember it, we had like 60 people all take an IQ test in parallel, which seems more efficient than having 60 different interviews. (Somewhere between 2x and 60x more efficient, depending on how highly one weights the testers' and testees' time.) Or in the case of genomics, you are often dealing with big databases of people who signed up a long time ago for medical research; it's not so practical to interview them extensively, and existing studies deal with brief tests that were given with minimal supervision.

From an accuracy perspective, my understanding is that the usual finding is that psychometric tests and structured interviews provide relatively independent information, such that the best accuracy is obtained by combining both. This does definitely imply that there would be value in integrating more structured interviews into genomics (if anyone can afford it...), and more generally integrating information from more different angles into single datasets, but it doesn't invalidate those tests in the first place.
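
To make the "combining both" point concrete, here is a minimal Python sketch under made-up numbers (the specific coefficients are illustrative assumptions, not figures from any study): a hypothetical test and a hypothetical structured interview each track a latent ability with largely independent errors, and an equal-weight composite of the two tracks it better than either alone.

```python
import numpy as np

# Toy latent "true ability" plus two noisy measurements with largely
# independent errors. All coefficients are made up for illustration.
rng = np.random.default_rng(0)
n = 100_000

ability = rng.normal(size=n)
test = 0.7 * ability + 0.7 * rng.normal(size=n)       # hypothetical psychometric test
interview = 0.6 * ability + 0.8 * rng.normal(size=n)  # hypothetical structured interview
combined = (test + interview) / 2                      # naive equal-weight composite

for name, x in [("test", test), ("interview", interview), ("combined", combined)]:
    print(f"{name:9s} r with ability = {np.corrcoef(x, ability)[0, 1]:.2f}")
```

With these toy numbers the composite correlates with the latent at roughly 0.77, versus about 0.71 for the test and 0.60 for the interview alone, which is all the "relatively independent information" claim amounts to.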

> Or consider it from another angle: suppose I know someone to be brilliant and extremely capable, but when taking an IQ test, they only score 130 or so. What am I supposed to do with this information? Granted, it's pretty rare — normally the IQ would reflect my estimation of their brilliance, but in such cases, it adds no new information. But if the score does not match the person's actual capabilities as I have been able to infer them, I am simply left with the conclusion that IQ is not a particularly useful metric for my purposes. It may be highly accurate, but an experienced human judgement is considerably more accurate still.

I mean there's a few different obvious angles to this.

IQ tests measure g. If you've realized someone has some non-g factor that is very important for your purposes, then by all means conclude that the IQ test missed that.

If you've concluded that the IQ test underestimated their g, that's a different issue. You're phrasing things as though your own assessment correlates at something like 0.9 with g and the residual is non-normally distributed, which I sort of doubt is true (should be easy enough to test... though maybe you already have experiences you can reference to illustrate it?)
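
For the "easy enough to test" part, here is a minimal sketch of what such a test could look like, assuming you had paired data of informal ratings and measured IQ scores. The arrays below are simulated purely as a stand-in (with an assumed 0.9 correlation baked in), and the scipy-based procedure is just one reasonable way to check the correlation and the normality of the residual, not a method anyone in the thread prescribed.

```python
import numpy as np
from scipy import stats

# Simulated stand-in for real paired observations: an unobserved general
# ability, an IQ score, and an informal rating, each correlating ~0.9 with it.
rng = np.random.default_rng(0)
n = 200
g = rng.normal(size=n)
iq = 100 + 15 * (0.9 * g + 0.436 * rng.normal(size=n))
rating = 0.9 * g + 0.436 * rng.normal(size=n)

r, _ = stats.pearsonr(rating, iq)
slope, intercept, *_ = stats.linregress(rating, iq)
residuals = iq - (slope * rating + intercept)
_, p_normal = stats.shapiro(residuals)  # Shapiro-Wilk test on the residual

print(f"rating-IQ correlation: {r:.2f}")
print(f"Shapiro-Wilk p-value for residual normality: {p_normal:.3f}")
```

With real data in place of the simulated arrays, a low p-value would support the non-normality claim; here, by construction, the residual is normal and the rating-IQ correlation comes out around 0.8.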