I'm exploring the possibility of changing my belief that "I should act on risks from powerful AI". I haven't heard good arguments or reasons for not worrying about advanced AI, but that might just be bias and stubbornness on my part. This is mostly an exercise in epistemics.

Given the premise "I should be doing something about risks from powerful AI", the question is: what observations of the world would cause me to believe that much less? What would have to happen?

Additional assumptions:

  • The likelihood of AI doom within the next 50 years is 5 to 20 percent.
    • by doom, I mean human extinction or AI-controlled dystopia
  • Almost nobody works on it, though recently a lot more people have been talking about it.
  • If AI kills us all, it will be something like a paperclip maximizer rather than something possibly valuable like a transhuman super-consciousness.
  • This is the most important problem.


The following categories are roughly in order of plausibility/sanity. No ordering within categories.

EA 101

If any of the following happened, the EA part of me might deprioritize AI safety. These points mostly relate to the Importance × Tractability × Neglectedness framework (a rough sketch follows after this list).

  1. Very good arguments that P(Doom | we try really hard) ≈ 1: well, yeah. If we are doomed for sure, why worry at all? Just enjoy things while they last. Or actually not: if timelines are short (a few decades) and we are still EAs, then the most good we can do is probably helping existing people and animals to have better lives.
  2. Proof that P(Doom) ≈ 0
  3. No longer neglected 
    1. I can imagine a few scenarios where AI alignment is a finite problem. We just need to throw enough manpower at it and then it will get solved. Yay, we are good for eternity. 
      Unfortunately, not much collective human effort has gone into alignment so far. For roughly ten years, at most a few hundred people have been working on it, which amounts to a few thousand person-years. It feels like we haven't even really tried to solve it. 
    2. More likely than the problem being finite, we have something resembling a rocket equation: the more capability research we do, the more alignment research we need, so the required alignment effort keeps growing. At some point, people smarter than me would say AI alignment is no longer neglected.
  4. Some other cause area suddenly becoming more pressing, e.g. being on the brink of developing self-replicating non-AI nanobots.
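Since the points above all cash out in the Importance × Tractability × Neglectedness framework, here is a minimal sketch of the multiplication. The causes, scores, and 0-10 scale are made-up placeholders for illustration, not real estimates:

```python
# Minimal sketch of the Importance x Tractability x Neglectedness (ITN) framework.
# All scores are made-up placeholders on an arbitrary 0-10 scale, purely for
# illustration; they are not real estimates.

def itn_score(importance: float, tractability: float, neglectedness: float) -> float:
    """Rough cause-priority score: the product of the three factors."""
    return importance * tractability * neglectedness

causes = {
    # cause: (importance, tractability, neglectedness)
    "AI alignment":              (9.0, 3.0, 8.0),
    "hypothetical nanobot risk": (8.0, 2.0, 9.0),
    "global health":             (7.0, 7.0, 2.0),
}

# Print causes from highest to lowest score.
for name, (i, t, n) in sorted(causes.items(), key=lambda kv: -itn_score(*kv[1])):
    print(f"{name:28s} score = {itn_score(i, t, n):6.1f}")
```

The point is just that items 1-4 above each attack one factor: a proof that P(Doom) ≈ 0 collapses importance, P(Doom | we try really hard) ≈ 1 collapses tractability, "no longer neglected" collapses neglectedness, and a more pressing cause simply raises a competitor's score above AI alignment's.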

Reasonable possibilities

  1. Someone solving AI alignment in an easily replicable way. Not just a theoretical possibility of provably aligned AI, but a practical, easy-to-implement, not resource-intensive approach to alignment.
  2. No smart people warning us about AI anymore. Especially if people who currently consider AI an x-risk stopped worrying, and even more so if they had smart arguments for why they no longer worry about it.
  3. Hitting another AI winter. As of now (Nov 2023), huge news every few weeks is normal. Even a 6-month period in which nothing important happened (not for lack of trying) would be great; if such a period lasted years, even better.
    1. Currently, we have exponential trends in training data size, training compute, and algorithmic improvements. If they ALL started hitting diminishing returns, that would be good.
  4. Worldwide stigma against AI research. We don't do human germline DNA editing, for example. If there were a huge stigma and AI research were illegal, there would be less to worry about. (Assumption: AGI cannot be developed very quietly; it needs a lot of researchers, compute, etc.)
  5. Starting to partially solve alignment, e.g. provably and easily solving at least some sub-problems such as specification gaming, or getting really good at mechanistic interpretability.
  6. A sufficiently predictive theory of deep learning capabilities suggesting that we definitely need some unknown breakthrough to reach x-risk-level capabilities. 
  7. Someone deploying AGI. Shortly after that happens, I expect we will be in a "why worry at all" situation.
  8. Strong international cooperation. Something like a Manhattan Project or a CERN for alignment.
  9. More forecasting going against my current position, like the Existential Risk Persuasion Tournament organized by the Forecasting Research Institute (FRI) in 2022.

Personal

  1. Shifts in my personal philosophy. 
    1. Becoming a moral nihilist. I already think the most likely case is that there are no true values, but I am basically Pascal's-mugging myself in the following way: something valuable might exist, and therefore moral nihilism can be ignored. Also, extinction is bad because if you are dead, it is harder to find out what has value. (This is an extreme tl;dr, though.)
      Also, my intuitions about ethics completely crumble if there is an infinite amount of value in the universe/multiverse. 
  2. Signals that my contribution is likely to be counterproductive. 
    1. E.g. I engaged in some discourse with a Czech AI ethicist. If he and a few other smart people told me that, after reading what I wrote, they are now even more convinced that AI safety is implausible rubbish.
      On the other hand, this would be a signal either that I should not engage in public discourse at all, or that I need to become better at it if I can.
  3. Signals that my contribution is not likely to be useful.
    1. Finding it hard to get a job, even though everyone says the field is neglected.
    2. Finding it hard to come up with relevant personal projects.
  4. Losing trust in the EA/AI safety communities' ability to do net good.
    If all those communities do is accelerate AI, create drama, and annoy people in power to the point that they don't want to have anything to do with "AI safety", we have a problem. 
    But would that mean only the community is irredeemable, or the cause itself?
    "Beware Systemic Change" shows there have been do-gooder movements that plausibly ended up net negative. Case in point: plenty of high-integrity Marxists dropped the ideology altogether when they saw where it was leading in practice.
    It is hard for me to imagine circumstances where actions by members of the AI safety community would discredit the cause itself. But if those actions discredited the movement, it would definitely become harder to contribute to the cause. 

Aliens

  1. Direct divine/alien intervention.
  2. Discovering enough advanced alien civilizations and, at the same time, not discovering many signs of civilizations destroyed by advanced AI. This would be great: actual empirical data. But that data would be very hard to get even if there is something to see out there. My guess is that the next-gen telescope (the Habitable Worlds Observatory, planned to launch in 2041) won't be enough, so we would need at least two more generations of large telescopes specialized in exoplanet analysis, which realistically means the 2050s or 2060s. Unless someone like Musk decides this is a good way to spend tens of billions of dollars and does it in something like 5 years. 
    (For context, Metaculus gives 9 percent to at least a single technosignature detection before 2050. Also, a shoutout to a high-quality YouTube channel about exoplanet and exomoon astronomy: Cool Worlds.)
  3. If there are enough aliens, humanity is comparatively less valuable, and some of them might have similar enough values. The best argument that we live in such a universe is Hanson's grabby aliens model.

Conclusion

Looking at this list, the most reasonable options seem to be some combination of "alignment is easy", "we will make good progress if we work on it", and "truly dangerous AI is hard", plus "it might be very hard for me to contribute". 
Other points seem more far-fetched, and I originally hoped to come up with more realistic scenarios. If you have some, share them :)

In the meantime, I continue to think that "I should act on risks from powerful AI", and I don't see any signs that this should change in the near future. 

