Imagine the set “the 50 people who would most helpfully contribute to technical alignment research, were they to be working on it, yet who are working on something else instead.” If you had to guess—if you had to make up a story which seems plausible—why are they working on something else instead? And what is it they’re working on?
When you are older, you will learn that the first and foremost thing which any ordinary person does is nothing.
-- Professor Quirrell, HPMOR
Like, obviously I don't mean the above straightforwardly, which kind of just dodges the question, but I think the underlying generator of it points towards something real. In particular, I think that most of human behavior is guided by habit and by following other people's examples. Very few humans are motivated by any form of explicit argument when it comes to their major life decisions; most are instead primarily trying to stabilize their personal life, compete locally to get access to resources, and follow the example that other people around them have set and were socially rewarded for.
Concretely I think that humanity at large, in its choice of what it works on, should be modeled as an extremely sluggish system that tries to make minimal adjustments to its actions unless very strong forces compel it to (the industrial revolution was one such force, which did indeed reshape humanity's everyday life much more than basically any event before it*).
So, most of the people I would like to be working on the important things are following deeply entrenched paths that only shift slowly, mostly driven by habits, local satisficing and social precedent.
One natural category of answer is that humans are scared of risking social stability. Alignment research is not an avenue that is old and safe within the institutions where most research-like things happen (universities), nor is it really an avenue at all. Most of the places where it happens are weird and new and not part of any establishment.
Then again, OpenAI and DeepMind are exciting-yet-fairly-safe places, and it's not like the best mathematicians of our day are knocking down their doors looking for a prestigious job where they can also work on alignment. I guess for those people I do think that this type of optimisation is a somewhat alien thought process. They primarily pick topics that they find interesting, not important. It's one thing to argue that they should be working in a field; it's another thing to get them fascinated by it.
(I do think the Embedded Agency sequence is one of the best things that exists for building curiosity about bounded rationality, and am curious to hear of any good mathematicians/computer scientists/physicists who read it and what they feel about the problems contained therein.)
Example answers which strike me as plausible:
- Most members of this set simply haven’t yet encountered one of the common attractors—LessWrong, CFAR, Superintelligence, HPMOR, 80k, etc. Perhaps this is because they don’t speak English, or because they’re sufficiently excited about their current research that they don’t often explore beyond it, or because they’re 16 and can’t psychologically justify doing things outside the category “prepare for college,” or because they’re finally about to get tenure and are actively trying to avoid getting nerd sniped by topics in other domains, or because they don’t have many friends so only get introduced to new topics they think to Google, or simply because despite being exactly the sort of person who would get nerd sniped by this problem if they’d ever encountered it they just… never have, not even the basic “maybe it will be a problem if we build machines smarter than us, huh?”, and maybe it shouldn’t be much more surprising that there might still exist pockets of extremely smart people who’ve never thought to wonder this than that there presumably existed pockets of extremely smart people for millennia who never thought to wonder what effects might result from more successful organisms reproducing more?
- Most members of this set have encountered one of the common attractors, or at least the basic ideas, but only in some poor and limited form that left them idea inoculated. Maybe they heard Kurzweil make a weirdly-specific claim once, or the advisor they really respect told them the whole field is pseudoscience that assumes AI will have human-like consciousness and drives to power, or they tried reading some of Eliezer’s posts and hated the writing style, or they felt sufficiently convinced by an argument for super-long timelines that investigating the issue more didn’t feel decision-relevant.
- The question is ill-formed: perhaps because there just aren’t 50 people who could helpfully contribute who aren't doing so already, or because the framing of the question implies the “50” is the relevant thing to track whereas actually research productivity is power law-ey and the vast majority of the benefit would come from finding just one or three particular members of this set and finding them would require asking different questions.
Discounting. There is no law of nature that can force me to care more about preventing human extinction years from now than about eating a tasty sandwich tomorrow. There is also no law that can force me to care about human extinction much more than about my own death.
There are, of course, more technical disagreements to be had. Reasonable people could question how bad unaligned AI will be or how much progress is possible in this research. But unlike those questions, the reasons for discounting are not debatable.