I’m trying to figure out what you mean: my current interpretation is that my post is an example of reasoning that will lead us astray. I could be wrong about this, and would appreciate correction, as the analogy isn’t quite “clicking” for me.
If I’m right, I think it’s generally a good norm to provide some warrant for these types of claims. I can vaguely see what you might mean, but it’s not obvious enough for me to engage in productive discourse or to change my current endorsement of my opinion. I’m open to the possibility that you’re right, but I don’t know what you’re saying. This might just be an understanding failure on my part, in which case I’d appreciate any guidance, correction, or clarification.
This post seems excellent overall, and makes several arguments that I think represent the best of LessWrong self-reflection about rationality. It also spurred an interesting ongoing conversation about what integrity means, and how it interacts with updating.
The first part of the post is dedicated to discussions of misaligned incentives, and makes the claim that poorly aligned incentives are primarily to blame for irrational or incorrect decisions. I’m a little bit confused by this, specifically because nobody has pointed out the obvious corollary: that people in a vacuum, and especially people with well-aligned incentive structures, are broadly capable of making correct decisions. This seems to me like a highly controversial statement that makes the first part of the post suspect, because it treads on the edge of proving (hypothesizing?) too much: the claim that people’s success at rationality is primarily about incentive structures is a very ambitious one worthy of further interrogation, because it assumes a model in which humans are capable of, and regularly perform, high levels of rationality. However, I can’t think of an obvious counterexample (a situation in which humans are predictably irrational despite having well-aligned incentives for rationality), and the formulation of this post has a ring of truth for me, which suggests that there’s at least something here. Conditional on this being correct, and there not being obvious counterexamples, this seems like a huge reframing that makes a nontrivial amount of the rationality community’s recent work inefficient: if humans are truly capable of behaving predictably rationally under good incentive structures, then CFAR, etc. should be working on imposing external incentive structures that reward accurate modeling, not on rationality as a skill. The post obliquely mentions this through discussion of philosopher-kings, but I think this is a case in which an apparently weaker version of a thesis actually implies the stronger form: philosopher-kings being not useful for rationality implies that humans can behave predictably rationally, which implies that rationality-as-skill is irrelevant.
This seems highly under-discussed to me, and this post is likely worthy of further promotion solely for its importance to this issue.
However, the second broad part of the post, examining (roughly) epistemic incentive structures, is also excellent. I strongly suspect that a unified definition of integrity with respect to behavior in line with ideology would be a significant advance in understanding how to effectively evaluate ideology that’s only “viewable” through behavior, and I think that this post makes a useful first step in laying out the difficulties of punishing behavior unmoored from principles while avoiding enforcing old, unupdated beliefs. The comment section also has several threads that I think are worth revisiting: while the suggestion of allowing totally free second-level updating was found untenable due to the obvious hole of updating ideology to justify in-the-moment behavior, the discussion of ritual around excessive vows and Zvi’s (I believe) un-followed-up suggestion of distinguishing beliefs from principles both seem to have real promise: my guess would be that some element of ritual is necessary to avoid cheapening principle and allowing for sufficient contradictory principles to justify any behavior.
Finally, the discussion of accountability seems the least developed, but also a useful hook for further discussion. I especially like the suggestion of “mandatory double-crux” powers: I’ve informally tried this system by double-cruxing controversial decisions before action, and upon reflection, I believe it’s the right level and type of impediment: likely to induce reflection, a non-trivial inconvenience, but not a setting that’s likely to shake well-justified beliefs and cause overcorrection.
Overall, I support collation of this post, and would strongly support collation if it was updated to pull more on the many potential threads it leaves.
That’s a fair point; see my comment to Raemon. The way I read it, the mod consensus was that we can’t just curate the post, meaning that comments are essentially the only option. To me, this means an incorrect/low-quality post isn’t disqualifying, which doesn’t decrease the utility of the review, just the frame under which it should be interpreted.
That’s fair; I wasn’t disparaging the usefulness of the comment, just pointing out that the post itself is not actually what’s being reviewed. That matters because it means a low-quality post that sparks high-quality discussion isn’t disqualifying.
Note that this review is not of the content that was nominated; nomination justifications strongly suggest that the comment suggestion, not the linkpost, was nominated.
(Epistemic status: I don’t have much background in this. Not particularly confident, and attempting to avoid making statements that don’t seem strongly supported.)
I found this post interesting and useful, because it brought a clear unexpected result to the fore, and proposed a potential model that seems not incongruent with reality. On a meta-level, I think supporting these types of posts is quite good, especially because this one has a clear distinction between the “hard thing to explain” and the “potential explanation,” which seems very important to allow for good discussion and epistemology.
While reading the post, I found myself wishing that more time had been spent discussing the hypothesis that IQ tests, while intelligence-loaded in general, are not a great way to analyze intelligence for autistic people. The post briefly touches on this, but “mutations positively correlate with intelligence but negatively with test-taking ability through some mediator, meaning that at first, increased intelligence outweighs the negative effects, but depending on exact circumstance, intelligence is not possible to express on a standard IQ test after enough mutations accumulate” seems like a natural hypothesis that deserves more analysis. However, upon further reflection, I think that neglecting this hypothesis isn’t actually a problem, because it conceals a regress: why does intelligence outweigh lack of test-taking ability at first, only to bring significant costs eventually? I think there are several just-so stories that could explain an inflection point, but I’d prefer not to posit them unless someone with more background/knowledge in this subject suggests that this is viable, so as to prevent harmful adoption.
I think a more serious issue is the selection bias mentioned in the discussion of autism. Because IQ is positively correlated with good outcomes writ large (https://www.gwern.net/Embryo-selection, see an early section), including functionality, and autism in the DSM-5 is defined as requiring various deficits and significant impairment (https://www.autismspeaks.org/autism-diagnosis-criteria-dsm-5), it would be somewhat shocking if autism were not negatively correlated with measured IQ. Even if we assume the two variables are completely independent, higher-IQ people would still be less likely to be diagnosed as autistic, because they are nearly definitionally less likely to meet the diagnostic criteria. This suggests a much simpler model, given the apparent correlation between autism and IQ: autism mutations push up intelligence in the vast majority of cases, and lower-IQ autistic people are far more likely to be diagnosed. I wonder whether this could even explain some of the diverse harms associated with autism: if autism mutations push up “technical” intelligence/performance on IQ tests relative to general intelligence, then could, e.g., social skills appear to suffer because they’re correlated with a lower general intelligence? (Obviously way over-simplified, and entirely speculative.)
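The selection effect described above can be illustrated with a toy simulation. Everything below (the trait threshold, the impairment model, the noise levels) is hypothetical, chosen only to demonstrate the mechanism, not a claim about real diagnostic practice:

```python
import random

random.seed(0)
n = 100_000
diagnosed_iqs, all_iqs = [], []

for _ in range(n):
    iq = random.gauss(100, 15)    # IQ, drawn independently of autism traits
    traits = random.gauss(0, 1)   # autism-trait loading, independent of IQ

    # Toy diagnostic rule: diagnosis requires both elevated traits AND
    # functional impairment. Higher IQ buffers against impairment, so it
    # gates diagnosis even though IQ and traits are independent here.
    impaired = (iq + random.gauss(0, 10)) < 95
    diagnosed = traits > 1.5 and impaired

    all_iqs.append(iq)
    if diagnosed:
        diagnosed_iqs.append(iq)

# Mean IQ of the diagnosed group comes out well below the population mean,
# despite zero true correlation between IQ and the underlying traits.
print(sum(diagnosed_iqs) / len(diagnosed_iqs))
print(sum(all_iqs) / len(all_iqs))
```

The diagnosed group ends up with a substantially depressed mean IQ purely because the impairment criterion selects against high IQ, which is the shape of the bias the paragraph above describes.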
Overall, I’d appreciate if this post was more comprehensive, but I think it’s a good category of post to promote as is. I’d weakly advocate for inclusion, and strongly advocate for inclusion conditional on editing to spend more time discussing selection effects.
I strongly oppose collation of this post, despite thinking that it is an extremely well-written summary of an interesting argument on an interesting topic. I oppose it because I believe the post represents a substantial epistemic hazard, both because of the way it was written and because of the source material it comes from. I think this is particularly harmful because both justifications for nominations amount to "this post was key in allowing percolation of a new thesis unaligned with the goals of the community into community knowledge," which is a justification that necessitates extremely rigorous thresholds for epistemic virtue: if the nominators are correct, a poor-quality argument both risks spreading false or under-supported ideas into a healthy community and creates the conditions for an over-correction caused by the tearing down of a strongman. When assimilating new ideas and improving models, extreme care must be taken to avoid incorporating non-steelmanned parts of the model, and this post does not represent that care. In this case, isolated demands for rigor are called for!
The first major issue is the structure of the post. A more typical book review includes critique, discussion, and critical analysis of the points made in the book. This book review forgoes these, instead choosing to situate the thesis of the book within the fabric of anthropology and to discuss the meta-level implications of its contributions at the beginning and end of the review. The rest of the review is dedicated to extremely long, explicitly cherry-picked block quotes of anecdotal evidence and accessible explanations of Henrich's thesis. Already, this poses a problem: it's not possible to evaluate the truth of the thesis, or even the merit of the arguments made for it, from evidence that's explicitly chosen to be maximally persuasive, with favorable summaries of the parts glossed over. Upon closer examination, even setting aside that this is filtered evidence, the post is an attempt to prove a thesis using exclusively intuitive anecdotes disguised as a compelling historical argument. The flaws in this approach are suggested by the excellent response to this post: it totally neglects the possibility that the anecdotes are being framed in a way that makes other potentially correct explanations not fit nicely. Once one considers that this evidence is filtered to be maximally charitable, the anecdotal strategy offers little-to-no information. The problem is actually even worse than this: because the information presented in the post does not prove the thesis in any way, shape, or form, but the author presents it as well-argued by Henrich, the implication is that the missing parts of the book do the rigorous work. However, because those parts weren't excerpted, a filtered-evidence view suggests that they are even less compelling than the examples discussed in the post.
The second major issue is that, according to a later SSC post, the book is likely factually incorrect in several of its chosen anecdotes, or at the very least exaggerates examples to prove a point. Again, this wouldn't necessarily reflect badly on the post, except that the post (a) does not point this out, which suggests a lack of fact-checking, and (b) quotes Henrich so extensively that Henrich's inaccurate arguments are presented as part of the thesis of the post. This is really bad on the naive object level: it means that parts of the post are actively misleading and increase the chance of spreading harmful anecdotes, and it also means that, beyond the evidentiary issues presented in the previous paragraph (which assumed good-faith, correct arguments), the filtered evidence is actively wrong. However, it actually gets worse from here: there are two layers of Gell-Mann-amnesia-type issues. First, the fact that the factual inaccuracies were not discovered at the time of writing suggests that the author of the post did not spot-check the anecdotes, meaning that none of Henrich's writing should be considered independently verified. Scott even makes this explicit in the follow-up post, where he passes responsibility for the factual inaccuracies to Henrich while continuing to support the thesis of his own post. This seems plausibly extremely harmful, especially because of the second layer of implicated distrust: none of Henrich's writing can be taken at face value, which, combined with the previous issue, means that the thesis of both this post and the book should be viewed as totally unsupported, because, as mentioned above, they are supported entirely by anecdotes. This is particularly worrying given that at least one nominator appreciated the "neat factoids" that this post presents.
I would strongly support not including this post in any further collections until major editing work has been done. I think the present post is extremely misleading, epistemically hazardous, and has the potential for significant harm, especially in the potential role of "vaccinating" the community against useful external influence. I do not think my criticism of this post applies to other book reviews by the same author.
This seems to me like a valuable post, both on the object level, and as a particularly emblematic example of a category ("Just-so-story debunkers") that would be good to broadly encourage.
The tradeoff view of manioc production is an excellent insight, and an important objection to encourage: the original post and book (which I haven't read in its entirety) appear to have leaned too heavily on what might be described as a special case of a just-so story: the phenomenon (a difference in behavior) is explained as an absolute using a post-hoc framework, and the meaning of the narrative is never evaluated beyond its intended explanatory effect.
This is incredibly important, because just-so stories have a high potential to deceive a careless agent. Let's look at the recent example of AstraZeneca's vaccine. Due to a mistake, one section of the vaccine arm of the trial was dosed with a half dose followed by a full dose. Science isn't completely broken, so the possibility that this is a fluke is being considered, but potential causes for why a half-dose/full-dose regime (HDFDR) would be more effective have also been proposed. Figuring out how much to update on these pieces of evidence is somewhat difficult, because of a selection effect that is normally not crucial to evaluating hypotheses in the presence of theory: explanations proposed after a surprising result are drawn from a much larger pool than explanations proposed before it.
To put it mathematically, let A be "HDFDR is more effective than a normal regime," B be "AstraZeneca's HDFDR groups were more COVID-safe than the standard treatment group," C be "post-B, an explanation that predicts A is accepted as fact," and D be "pre-B, an explanation that predicts A is accepted as the scientific consensus."
We're interested in P(A|B), P(A|(B&C)), and P(A|(B&D)). P(A|B) is fairly straightforward: By simple application of Bayes's theorem, P(A|B)=P(B|A)*P(A)/(P(A)*P(B|A)+P(¬A)*P(B|¬A). Plugging in toy numbers, let P(B|A)=90% (if HDFDR was more effective, we're pretty sure that the HDFDR would have been more effective in AstroZeneca's trial), P(A)=5% (this is a weird result that was not anticipated, but isn't totally insane). P(B|¬A)=10% (this one is a bit arbitrary, and it depends on the size/power of the trials, a brief google suggests that this is not totally insane). Then, P(A|B)=0.90*0.05/(0.9*0.05+0.95*0.1)=0.32
Next, let's look at P(A|B&C). We're interested in the updated probability of A after observing B and then observing C, meaning we can use our updated prior: P(A|B&C) = P(C|A&B)*P(A|B) / (P(C|A&B)*P(A|B) + P(C|¬A&B)*P(¬A|B)). If we slightly exaggerate how broken the world is for the sake of this example, and say that P(C|A&B) = 0.99 and P(C|¬A&B) = 0.9 (if there is a real scientific explanation, we are almost certain to find it; if there is not, we'll likely still find something that looks right), then this comes to 0.99*0.32/(0.99*0.32 + 0.9*0.68) ≈ 0.34: a post-hoc explanation adds very little credence in a complex system in which there are enough effects that any result can be explained.
This should not, however, be taken as a suggestion to disregard all theories or scientific explorations of complex systems as evidence. Pre-hoc explanation is very valuable: P(A|B&D) can be evaluated by first computing P(A|D) = P(D|A)*P(A) / (P(D|A)*P(A) + P(D|¬A)*P(¬A)). As before, P(A) = 0.05. Filling in the other values with roughly reasonable numbers: P(D|¬A) = 0.05 (coming up with an incorrect explanation with no motivation is very unlikely) and P(D|A) = 0.5 (there's a fair chance we'll find a legitimate explanation with no prior motivation). This gives a 34% chance of A before we even observe the trial result, which already demonstrates the value of pre-registering trials and testing hypotheses.
P(A|B&D) then equals P(B|A&D)*P(A|D) / (P(B|A&D)*P(A|D) + P(B|¬A&D)*P(¬A|D)). Notably, D has no impact on B (assuming a well-run trial, which allows further generalization), meaning P(B|A&D) = P(B|A), simplifying this to P(B|A)*P(A|D) / (P(B|A)*P(A|D) + P(B|¬A)*P(¬A|D)) = 0.9*0.34/(0.9*0.34 + 0.1*0.66) ≈ 0.82. This is a stark difference from the previous case, and suggests that the timing of theories is crucial in determining how a Bayesian reasoner ought to evaluate statements. Unfortunately, this information is often hard to acquire, and must be carefully interrogated.
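As a sanity check, the four updates above can be reproduced with a short script, using the toy probabilities from the text:

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: P(H|E) = P(E|H)P(H) / (P(E|H)P(H) + P(E|~H)P(~H))."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))

p_a_given_b = posterior(0.05, 0.90, 0.10)           # P(A|B)   ~ 0.32
p_a_given_bc = posterior(p_a_given_b, 0.99, 0.90)   # P(A|B&C) ~ 0.34
p_a_given_d = posterior(0.05, 0.50, 0.05)           # P(A|D)   ~ 0.34
p_a_given_bd = posterior(p_a_given_d, 0.90, 0.10)   # P(A|B&D) ~ 0.83
                                                    # (0.82 with the rounded
                                                    # intermediates in the text)
```

The contrast is easy to read off: the post-hoc explanation C barely moves the posterior (0.32 to 0.34), while the pre-registered explanation D lets the trial result push it above 0.8.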
In case the analogy isn't clear: the equivalent of an unexpected regime being more effective is reason apparently breaking down and yielding severely suboptimal results. The hypothesis that reason is actually less useful than culture for problems whose rewards do not increase monotonically as the solution progresses is a possible one, but because it was likely arrived at to explain the results of the manioc story, the existence of this hypothesis is only weak evidence to prefer it over the hypothesis with more prior probability mass: that different cultures value time in different ways.
Obviously, this Bayesian approach isn't particularly novel, but I think it's a useful reminder as to why we have to be careful about the types of problems outlined in this post, especially in the case of complex systems where multiple strategies are potentially legitimate. I strongly support collation on a meta-level to express approval for the debunking of just-so stories and allowing better reasoning. This is especially true when the just-so story has a ring of truth, and meshes well with cultural narratives.
I think this post benefits significantly in popularity, and suffers in rigor and epistemic value, from being written in English. The assumptions the post makes in some parts contradict the judgements reached in others, and the post as a whole, in my eyes, does not support its conclusion. I have two main issues with the post, neither of which involves the title or the concept, which I find excellent:
First, the concrete examples presented in the article point towards a different definition of optimal takeover than the one eventually reached. All of the potential corrections that the “Navigating to London” example proposes are cases where the horse is incapable of competently performing the task you ask of it, and needs additional human brainpower to do so. This suggests an alternate model of “let the horse do what the horse was intended to do; let the human treat the horse as a black box.” However, this alternate model is pretty clearly not total horse takeover, which suggests to me that total takeover is not optimal for sensorily constrained humans. One could argue that the model in the article, “horse-behaved by default, human-behaved when necessary,” is a general case of the specific model suggested by the example, which I think brings up another significant issue with the post:
The model chosen is not a valuable one. The post spends most of its length discussing the merits of different types of horse-control, but the model endorsed does not take any of this deliberation into account: all physically permitted types of horse-control remain on the table. This means the definition for total control ends up being “you have total control when you can control everything that you can control” which, while not exactly false, doesn’t seem particularly interesting. The central insight necessary to choose the model that the post chooses is entirely encompassed in the first paragraph of the post.
Finally, I think the agent-level modeling applied in the post is somewhat misleading. The bright line on what you can “tweak” with this model is very unclear, and seems to contradict itself: I’m pretty sure a horse could put holes in its thighs if you have total control over its movements, for example. Are you allowed to tweak the horse’s internal steroid production? Neurotransmitters? The horse doesn’t have conscious control over blood flow, but it’s regulatable: do you get control over that? These seem like the kind of questions this post should address: what makes a horse a horse, and what does controlling that entity mean. I think this issue is even more pronounced when applied to government: does controlling the government as an entity mean controlling constituent parts? Because of these types of questions, I suspect that the question “what does total control of a horse mean” is actually more complex, not less, than it is for a government, and it worries me that the simplifying move occurs from government to horse.
In its current form, I would not endorse collation, because I don’t feel as though the post addresses the questions it sets out to answer.
Oops, you're correct.