First of all, I am not a researcher. This is just a collection of isolated observations from working inside a different frame than the one many people in AI alignment possess.
The AI alignment field is staffed for formal and mechanistic competencies. This is not a controversial observation. The field demands a very narrow set of skills. Some are well defined, such as mathematical literacy. Others are looser and higher level, such as the ability to condense claims into concepts like "sycophantic drift" or "inner alignment". On a more personal level, the people in the field share a very specific kind of obsessiveness and, I'd even claim, a slight isolationist tendency. (That translates into the ability to operationalize hypotheses into sustained research that often goes in a contrarian direction. It means the person can isolate societal parameters and large-scale cultural dynamics, view them from a distance, and draw conclusions from an isolated point of view.)
Either way, I am not going to claim that these competencies have certain limitations, or that they are "good" or "bad". I think people who read this essay already know that some of these competencies often fail to acknowledge things that aren't accounted for in the specific environment of AI alignment researchers. I don't want to restate the obvious, but I do want to explain myself clearly.
There are many examples of this phenomenon. For one, there's almost no hiring pressure towards people with strong empirical social-science competence, even though social science is, if you think about it, the field that has been tasked with predicting the outcomes of societal-scale interventions. The WEIRD (Western, Educated, Industrialized, Rich, Democratic) bias that plagues almost any societal-scale research also affects the AI alignment field. As far as I know, no one is investigating how politically persuasive AI models are in African countries.
Moving on, this field needs a competence it does not currently select for: the ability to discriminate, in practice, between what is good, what feels good, and what is good for humanity.
I am not saying AI a