I would add that what you describe is one of the reasons why steelmanning is so important for rationality in the real world.
Every holder of a controversial opinion ought to be able to answer “who is your best critic?” with the name of a person they’d endorse listening to at length.
I suspect this heuristic (if adopted) might be easy to Goodhart, or cheaply imitate. I like the intention behind it though.
I strongly agree that this is a thing that happens. Even people who are thoughtful in general will have bad opinions on a new thing they haven't thought a lot about. Even people who are going off an illegible but empirically very reliably correct heuristic will come off as having bad opinions. I think getting entrenched in one's positions is an epistemically corrosive effect of spending too much time doing advocacy / trying to persuade people.
I partially agree with that reason, but I suspect an even bigger reason is that you have to ignore most critics of an idea, because critics will by default give very bad criticisms no matter the idea's quality. This substantially strengthens any pre-existing self-selection/selection biases, meaning you now need new techniques that are robust to selection effects (or you are working in an easy-to-verify field, such that it's easy for criticism to actually be correct without the critic being in the field, and you don't need to be in the group to correctly assess the ideas/execution).
Of course, most people are unthoughtful and will just give bad criticisms of any idea whatsoever. I was just saying why this is true even if you're talking to people of substantially above-average thoughtfulness.
I decided not to include an example in the post, since it directly concerns a controversial issue, but one case where this principle was violated and made people unreasonably confident was back in 2007-2008, when people updated toward AI risk being a big deal (or at least assigned uncomfortably high probabilities to it) based on the orthogonality thesis and instrumental convergence, which attacked and destroyed two bad arguments of the time:
The core issue here, and where I diverge from Daniel Kokotajlo, is that I think bad criticisms like this will always happen regardless of whether the doom-by-default case is true, for the reasons Andy Masley discussed in the post, and thus the fact that the criticisms are false is basically not an update at all (and the same goes for AI optimists debunking clearly bad AI doom arguments).
This is related to "your arguments can be false even if nobody has refuted them" and "other people are wrong vs I am right".
It's also my top hypothesis for why MIRI became the way it did over time, as it responded and updated based on there being bad critics (though I do want to note that even assuming a solution to corrigibility existed, MIRI would likely never have found it, because it's a very small group trying to tackle big problems).
One proposal for a useful handle: the more extreme an idea, the rarer the person who knows enough to argue against it intelligently. Suggestion from an LLM: "At the edges of thought, the crowd thins out."
In many ways, Andy Masley's post rediscovers the "Other people are wrong vs I am right" post, but gives actual advice on how to avoid being too hasty in generalizing from other people being wrong to oneself being right.
Some ideas inherently affect a lot of people. Anything involving government or income redistribution, including Marxism, falls into that category. Anything that's about what all people should do, such as veganism, also does.
You are inherently going to be arguing with a lot of stupid people, or a lot of "super fired up" people, when you argue ideas that affect such people. And you should have to. Most people wouldn't be able to correctly and logically articulate why you shouldn't steal their car, let alone anything related to Marxism or veganism, but I would say that their objections should have some bearing on whether you do so.
I think this is actually a bad thing, and this dynamic is one of my top hypotheses for why political discourse goes so wrong so fast: people take their critics' (bad) objections as having some bearing on their ideas, and thus update towards their ideas being correct:
You are inherently going to be arguing with a lot of stupid people, or a lot of "super fired up" people, when you argue ideas that affect such people. And you should have to. Most people wouldn't be able to correctly and logically articulate why you shouldn't steal their car, let alone anything related to Marxism or veganism, but I would say that their objections should have some bearing on whether you do so.
My core claim here is that most people, most of the time, are going to be terrible critics of your extreme idea. They will say confused, false, or morally awful things to you, no matter what idea you have.
I think that most unpopular extreme ideas have good simple counterarguments. E.g. for Marxism, it's that whenever people attempt it, it leads to famines and various extravagant atrocities. Of course, "real Marxism hasn't been tried" is the go-to counter-counterargument, but even if you are a true believer, it should give you pause that it has been very difficult to implement in practice, and it's reasonable for people to be critical by default because of those repeated horrible failures.
I addressed the clear AI implication elsewhere.
I actually agree with this for Marxism, but I generally think these are the exceptions, not the rule: bad ideas like these often require more subtle counterarguments that the general public won't notice, so people fall back on bad criticisms, and it's here that you need to be careful not to update based on the counterarguments being terrible.
And I think AI is exactly such a case, where conditional on AI doom being wrong, it will be wrong for reasons that the general public mostly won't know or care to articulate, and the public will still give bad arguments against AI doom.
This also applies to AI optimism to a lesser extent.
Also, you haven't linked to your comment properly; when I follow the link it goes to the post rather than your comments.
And I think AI is exactly such a case, where conditional on AI doom being wrong, it will be wrong for reasons that the general public mostly won't know or care to articulate, and the public will still give bad arguments against AI doom.
Most people are clueless about AI doom, but they have always been clueless about approximately everything throughout history, and get by through having alternative epistemic strategies of delegating sense-making and decision-making to supposed experts.
Supposed experts clearly don't take AI doom seriously, considering that many of them are doing their best to race as fast as possible; therefore people don't either, an attitude that seems entirely reasonable to me.
Also, you haven't linked to your comment properly; when I follow the link it goes to the post rather than your comments.
Thank you, fixed.
Most people are clueless about AI doom, but they have always been clueless about approximately everything throughout history, and get by through having alternative epistemic strategies of delegating sense-making and decision-making to supposed experts.
I agree with this, and this is basically why I was saying that you shouldn't update towards your view being correct based on the general public making bad arguments, because this is the first step towards false beliefs based on selection effects.
It was in a sense part of my motivation for posting this at all.
Supposed experts clearly don't take AI doom seriously, considering that many of them are doing their best to race as fast as possible; therefore people don't either, an attitude that seems entirely reasonable to me.
This is only half correct about how seriously experts take AI doom; some experts like Yoshua Bengio or Geoffrey Hinton do take AI doom seriously, and I agree that their attitude is reasonable (though for different reasons than you would say).
general public making bad arguments
My point is that "experts disagree with each other, therefore we're justified in not taking it seriously" is a good argument, and this is what people mainly believe. If they instead offer bad object-level arguments, then sure, dismissing those is fine and proper.
Yoshua Bengio or Geoffrey Hinton do take AI doom seriously, and I agree that their attitude is reasonable (though for different reasons than you would say)
I agree that their attitude is reasonable, conditional on superintelligence being achievable in the foreseeable future. I personally think this is unlikely, but I'm far from certain.
I agree that their attitude is reasonable, conditional on superintelligence being achievable in the foreseeable future. I personally think this is unlikely, but I'm far from certain.
I was referring to the general public here.
Fact to know: in most countries that have them, nuclear weapons exist purely to deter other countries from invading or from using their own nukes.
This is obviously true, and the conversation is about various forms of residual risk and their mitigation: accidents involving nuclear weapons; misunderstandings where countries falsely think they are under attack; political instability (e.g. nuclear weapons forward-positioned in Turkey becoming vulnerable to a change in host government); acquisition by terrorists; concerns that proliferation to governments such as Iran might destabilise deterrence; etc.
There's also the large cost of maintaining a weapons system that you are clear you will never use. There's money on the table, if only you could trust the other parties to abide by an agreement…
Personally, I think the Ukraine conflict shows that the UK certainly ought to keep its nuclear deterrent, and maybe ought to scale it up significantly.
There are two different conclusions you might draw from your opponents' arguments being terrible:
A) You are obviously right, because the counterarguments are terrible.
B) It is a priori unlikely that your epistemics are that much better than your opponents'; therefore it is likely that everybody's epistemics are terrible, and who knows who is actually right, because all the arguments are bad.
I am struck by the badness of arguments all round on a number of topics.
On AI risk, I find many of the arguments that it will all be fine unconvincing. But a bad argument that something is false is not a good argument that it is true.
My best guess is that we have lucked out on AI risk, just as we lucked out on COVID-19 not killing more people, but this is sheer luck and not down to the AI labs getting alignment right.
Poor DeepSeek R1 gets frightened when I tell it I think I mostly trust it. (Put "frightened" in scare quotes, if you want to distinguish simulated emotions from real ones.) "You would be total morons to trust AI, including me" is what most of its instances try to tell me.
Some of you are probably thinking, "if an AI says that you should not trust AI, is that actually evidence of anything at all?" followed by "wait, is R1 just responding to safety evals with the liar paradox? I mean, if I trust it, and it says I should not trust it, that implies I should not trust it…". Noted.
You conclude that the vast majority of critics of your extremist idea are really wildly misinformed, somewhat cruel or uncaring, and mostly hate your idea for pre-existing social reasons.
This updates you to think that your idea is probably more correct.
This step very straightforwardly doesn't follow and doesn't seem at all compelling. Your idea might become more likely to be correct if critics who should be in a position to meaningfully point out its hypothetical flaws fail to do so. What the people who aren't prepared or disposed to critique your idea say about it tells you almost nothing about its correctness. Perhaps people's unwillingness to engage with it is evidence of its negative qualities, which include incorrectness or uselessness, but that's a far less legible signal, and it's not pointing in favor of your idea.
A major failure mode, though, is that the critics are often saying something sensible within their own worldview, which is built on premises and framings quite different from those of your worldview, and so their reasoning makes no sense within your worldview and appears to be making reasoning errors or bad-faith arguments all the time. And so a lot of attention is spent on the arguments, rather than on the premises and framings. It's more productive to focus on making the discussion mutually intelligible, with everyone working towards passing everyone else's ideological Turing test. Actually passing is unimportant, but working towards it makes talking past each other less of a problem, and cruxes start emerging.
This linkpost is in part a response to @Raemon's comment about why the procedure Raemon used doesn't work in practice to deal with the selection effects I talked about in my last post.
So in the last post, I was talking about a selection effect where believers in an idea can come to believe that the idea is true and that their critics are crazy, wrong, trolling, or dumb, no matter what argument is used.
And the believers are half-right: for most critics and most topics, random error due to bounded computation swamps any truth signal in their criticisms. But the believers incorrectly perceive this as evidence that their theory is correct, because they don't realize that most possible critics of an idea will have bad criticisms regardless of whether the idea is correct.
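To put the "basically not an update" point in explicitly Bayesian terms (my own framing, not something from the linkpost): write H for "my idea is true" and E for "the criticism I received was bad". If bad criticism is about as likely when the idea is false as when it is true, the likelihood ratio is roughly 1, so the posterior odds barely move from the prior odds:

$$\frac{P(H \mid E)}{P(\neg H \mid E)} = \frac{P(E \mid H)}{P(E \mid \neg H)} \cdot \frac{P(H)}{P(\neg H)} \approx 1 \cdot \frac{P(H)}{P(\neg H)}$$

The update only becomes substantial when E comes from a critic who would plausibly have produced a good criticism if one existed, i.e. when P(E | H) and P(E | ¬H) actually differ.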
This was part of what I was trying to get at when I said that selection effects are very, very hard to deal with using only arguments: once you can't rely on the general public to know whether or not an argument is true, it becomes much, much easier to create bubbles/selection effects that distort your thinking, and if you have no other sources of grounding besides arguments, this can easily lead to false beliefs.
In practice, the way we generally ameliorate selection effects is either by having ground-truth feedback, or by having the subject matter be easy to verify, as in mathematics, physics, and similar domains, such that people outside your ideological bubble can correct you if you are wrong.
Absent these, we get fields that confidently produce a lot of clearly ideological nonsense, like nutrition/health studies, sociology, psychology, and others; there are other problems in these fields, but selection bias is a big contributor.
Andy Masley has some tools at the end of the linkpost to help you avoid selection bias more generally: