I mostly agree with this post. I wrote some notes, trying to understand and extend this idea. Apologies for the length, which ended up being longer than your post while having less important content. I was a bit rushed and therefore spent less time making my post succinct than the topic deserved.
What model?
I think this discussion is very hard to have because people vary substantially in how good their base intuitions (including social intuitions) are, and also in how good their rational minds (S2s) are.
Most good forecasters start off with pretty good base intuitions for probabilities, and forecasting/calibration practice helps refine their intuitions to be even better. Some good forecasters start with terrible probabilistic intuitions and the formal exercises help improve their intuitions. (I suppose it's also theoretically possible for some good forecasters to start with terrible probabilistic intuitions, continue to have terrible probabilistic intuitions, but forecast well because their formal practice and analysis allow them to override their intuitions. I've just never heard of this happening in practice.)
Many mental health conditions come from forming a wrong intuition/prior about the world, which rational-adjacent therapy methods like CBT teach you to override. Depression is the most famous/common example. When I'm depressed, it's natural to fixate on specific high-noise signals and automatically assume bad interpretations of the evidence (eg someone didn't respond to a text because I've upset them, or they don't like me for other reasons). Presumably there's the inverse problem as well (eg people in manic states have a very positive "rose-colored goggles" prior for everything), but I've never heard of this as a serious problem in practice.
In the realm of social intuitions, I think there's large variation in how good different people's intuitions are. Autism in particular is a disorder with a core manifestation of being bad at parsing social cues. I imagine most young autistic people would benefit from learning higher-order rationality and explicit principles about social questions to counteract/override their own poor social intuitions ("vibes"), and to a lesser extent from training/calibrating those intuitions to become better. On the flip side, neurotypical people who are prone to ideological capture should potentially learn to trust their hearts more, and be less inclined to trust higher-order ideologies and "rationality" (which often manifests as rationalization), in favor of their base intuitions.
But even all of these comments are overly high-order glosses. Autism does not make you immune to rationalization, nor does ideological capture suddenly make you incapable of rationality. Reverse all advice you hear, etc.
My own background is in academic social science and national security, for whatever that’s worth.
Why should we assume the AI wants to survive? If it does, then what exactly wants to survive?
...
Why should we assume that the AI has boundless, coherent drives?
Are you familiar with the "realist" school of international relations, and in particular their theoretical underpinnings?
If so, I think it'd be helpful to consider Yudkowsky and Soares's arguments in that light. In particular, how closely does the international order for emerging superintelligences resemble the anarchic international order for realist states? What are the weaknesses of the realist school of analysis, and do they apply to AIs?
I'll just hop on the bandwagon and say that I'll be posting my thoughts over at inchpin.substack.com!
Do you want to come up with some other "obvious exceptions" to your "Nobody says X" claim?
Tbh, I find this comment kinda bizarre.
Popular belief analogizes internet arguments to pig-wrestling: "Never wrestle with a pig because you both get dirty and the pig likes it." But does the pig, in fact, like it? I set out to investigate.
Nobody writes a story whose moral is that you should be selfish and ignore the greater good,
This seems obviously false. Ayn Rand comes to mind as the most iconic example, but eg Camus' The Stranger also has this as a major theme, as do various self-help books. It is also the implicit moral of Judith Jarvis Thomson's violinist thought experiment. My impression from reading summaries is that it's also a common theme in early 20th century Japanese novels (though I don't like them, so I've never read one myself).
I have a number of notes on questions I'm interested in. Would love to see some feedback and further thoughts from people!