People fiercely argue about whether AGI is likely to be an existential threat.
Most of the arguments explore conceptual, technical, or governance aspects of the topic, and are based on reason, logic, and predictions. But things are so complex and uncertain that sometimes I can hardly tell who is right, especially when I'm not an expert on the topic. Because of this complexity, people often make judgements based on their personal preferences and emotions, and this post is a deep dive into the psychological reasons why people might dismiss existential risks from AGI. I don't mean that there are no rational arguments for this position, because there are. They are just outside the scope of this post.
Alan is a high-ranking manager working on the LLM project at a tech giant. He believes that AI development is a great opportunity for him to climb up the corporate ladder and to earn a lot of money.
Alan is a speaker at a tech conference, and after his speech a journalist asks him about his thoughts on existential risks.
There are several things going on in Alan's mind at that moment. AI is his gateway to a better career than any of his friends have. Also, with all the money he will earn, he'll be able to buy the house of his wife's dreams and send his daughter to any university she wants without worrying about the cost.
Thoughts about x-risks threaten his dreams, so he tries to avoid them and to prove to himself that he is actually doing a good thing. "It's not just me who will benefit from the technology. It will make the world a way better place for everyone, and those AI doomers are trying to prevent it."
Alan also can't publicly say that AI is a threat. The PR department at his company won't like it, and this will cause a lot of problems for him. So the only thing that is safe for him to say is that his company has an excellent cybersecurity team, and that they do extensive testing before they deploy their models, so there is no reason to worry.
Joep is a respected machine learning scientist, and he has an unhealthy habit of coping with stress by denying it. He has a narcissistic mother who didn't show him any affection when he was a kid. This was painful for him, and he learned to cope with the problem by denying it and telling himself that everything was fine. Because of this, every time he experiences fear or anxiety, he tries to convince himself that the cause of his anxiety doesn't actually exist. So, even though thinking about x-risks makes him anxious, he tries to convince himself that these risks don't actually exist.
At the same tech conference at which Alan, the hero of the previous story, presented his work, a journalist approaches Joep and asks him whether he is worried about the existential risks from AI.
Joep becomes visibly annoyed and tells the journalist that the only people who believe in these risks are fear-mongering Luddites. He tells her that his team spent 6 months on thorough security testing before releasing their last LLM, and that in any critical application they always keep people in the loop. He also recalls an important prediction made by Yudkowsky that turned out to be completely false. "We come up with new safety ideas every day. We'll sort out all the upcoming problems." It seems like he wants to prove this to himself more than to the journalist.
After the interview he is still angry and annoyed, and he repeats in his head all the arguments against existential risks that he just made.
Ada, a young AI ethics researcher, felt a mix of excitement and nerves as she prepared to present her work on AI risks at a high-profile tech conference, the same conference at which we met Alan and Joep from the previous stories. She respected both of them and was excited to hear their talks.
Alan spoke about his company's robust cybersecurity measures, radiating confidence that AI posed no threat. Joep followed, highlighting the responsible steps his team was taking in their work with AI. The audience was visibly reassured, and Ada started to doubt part of her own research that was focused on existential risks from AGI.
When it was her turn, Ada hesitated. Her presentation included slides about x-risks, but recalling Alan's and Joep's confidence, she was so anxious about looking foolish that she skipped over them. Instead, she echoed optimism about the future of AI, and said that there are many talented and responsible people working on it, so there is no reason to worry.
As she left the stage, Ada felt a sense of relief but also an unsettling feeling that she hadn't been entirely honest. To feel better, she tried to convince herself that these experts know the AI field way better than she does, so her fears about x-risks are probably mistaken.
Dario and Claude are cohosts of an AI podcast. The subject of their newest episode is existential risks from AI. Claude is concerned about the risks, while Dario remains skeptical.
Once recording begins, Claude outlines the arguments that we won't be able to control AI that is way smarter than humans, and this might lead to a disaster. Dario can't fully agree. In his mind, AGI is still far off, and all these threats feel too distant and abstract, so he just can't imagine how it can pose a threat to humanity.
Dario says, "It's not like AI can control nuclear weapons or something. I think that high confidence that AGI will want to destroy us is way too speculative." In order to believe, Dario needs a vision: something vivid and convincing, not abstract principles. But each time he asks Claude about concrete scenarios of doom, Claude's answers are vague and uncertain. The answers rely on high-level ideas like the orthogonality thesis or "chimps can't control humans, and we'll be like chimps for AGI". Claude acknowledges the difficulty of envisioning concrete doomsday scenarios but insists that the absence of clear examples doesn't negate the risk. He says, "Just because we can't draw a picture doesn't mean the danger isn't real."
Dario is not convinced. He believes that some people are too certain about their abstract ideas and ignore the reality in which we are making incremental progress in alignment.
Eli is a software developer who keeps up with the latest tech trends. Recently he started following AI developments and got interested in the discussions around existential risks from AI. He noticed that these discussions are often emotional.
As he scrolled through Twitter, he found a thread written by an emotional doomer who had no doubts about his views. The author reminded Eli of environmental activists, and he thought, "There are always some people who are convinced the sky is falling." Eli believes that climate change is a complicated problem that requires a lot of thought and effort to solve, but that overly emotional and overconfident activists do more harm than good. They annoy people and poison any thoughtful discussion.
Eli sees AI alarmists as similar people. He believes that they spoil the image of the AI safety community, and also make it hard to discuss more real and near-term problems like the spread of misinformation, or concentration of power in the hands of AI labs.
A friend recommended that Eli listen to the x-risk-themed episode of the AI podcast hosted by Dario and Claude. Eli decided to give it a listen during his commute. Alarmist Claude sounded terrified by the existential risks, and even though he seemed like a smart person, Eli immediately classified him as an activist and didn't take his arguments too seriously. "Ah, another one of those anxious Luddites," he thought.
I think most of these can be viewed as separate types of motivated reasoning. People believe what sounds good to believe. That's an evaluation of both the logic, if they've thought about it, and the value of that belief for that particular person. Belief formation involves a series of decisions, and those decisions are made by reinforcement learning mechanisms involving the dopamine system influencing related brain systems.
The definition of motivated reasoning (MR) overlaps with that of confirmation bias. I think MR is the reason confirmation bias is so strong. Scott Alexander has talked about this a good deal; he says:
Of the fifty-odd biases discovered by Kahneman, Tversky, and their successors, forty-nine are cute quirks, and one is destroying civilization. This last one is confirmation bias - our tendency to interpret evidence as confirming our pre-existing beliefs instead of changing our minds. This is the bias that explains why your political opponents continue to be your political opponents, instead of converting to your obviously superior beliefs. And so on to religion, pseudoscience, and all the other scourges of the intellectual world. (source)
So, to apply the motivated reasoning lens to your categories:
The last one seems like the most important. And it's also the loosest connection to motivated reasoning. You need to wrap in the halo/horns effect. This is what Scott Alexander calls "The noncentral fallacy - the worst argument in the world?" or undefined arguments.
There's more explanation and exploration to be done, but I haven't gotten to it yet, so I'm putting my brief thoughts here, since you're addressing this important topic. I want to write a post called something like "people don't believe in x-risk because rationality isn't rational", exploring this topic. Motivated reasoning makes our thinking locally optimal for our survival (one meaning of rational), at the cost of being logically wrong in some out-of-distribution cases of pure logical induction of the truth (another meaning of rational).
I agree with your sentiment that most of this is influenced by motivated reasoning.
I would add that "Joep" in the Denial story is motivated by cognitive dissonance, or rather the attempt to reduce cognitive dissonance by discarding one of the two ideas "x-risk is real and gives me anxiety" and "I don't want to feel anxiety".
In the People Don't Have Images story, "Dario" is likely influenced by the availability heuristic, where he is attempting to estimate the likelihood of a future event based on how easily he can recall similar past events.
Thanks for your thoughtful answer. It's interesting how I just describe my observations, and people draw conclusions from them that I didn't think of.
Thank you for writing this, Igor. It helps highlight a few of the biases that commonly influence people's decision-making around x-risk. I don't think people talk about this enough.
I was contemplating writing a similar post to this around psychology, but I think you have done a better job than I was going to. Your description of 5 hypothetical people communicates the idea more smoothly than what I was planning. Well done. The fact that I feel a little upset that I didn't write something like this sooner, and the fact that the other comment has talked about motivated reasoning, produces an irony that is not lost on me.
Thanks for your feedback. It's always a pleasure to see that my work is helpful for people. I hope you will write articles that are way better than mine!
People don't have images of AI apocalypse
Worse yet, and probably more common, is having an image of an AI apocalypse that came from irrational or distorted sources.
Having a very clear image of an obviously fictional AI apocalypse, which your mind very easily jumps to whenever you hear people talking about X-risks, is often far more thought-limiting than having no preconceived image at all.
This was the main hurdle I had to believing in AI doom - I didn't have any coherent argument against it, and I found the doomy arguments pretty convincing. But the conclusion just sounded silly. I'd fall back on talking points like "Well in the 1800s, people who believed in sci-fi narratives like you do, thought that electricity would resurrect the dead, and we'd be punished for playing god. You shouldn't take these paranoias so seriously."
(This is why I, and several other people I know, intentionally avoid evoking sci-fi-associated imagery when talking about AI)