The risks of unaligned AGI are usually couched in terms of existential risk, where the outcome is explicitly or implicitly human extinction. However, given what an AGI would be able to do, it seems as though there are plausible scenarios worse than human extinction. Just as on an individual level it is possible to imagine fates we would prefer death to (e.g. prolonged terminal illness or imprisonment), the same can be extended to humanity as a whole, where we end up in hell-like fates. This could arise from a technical error, wilful design by the initial creators (say, religious extremists), or some other unforeseen mishap.

How prominently do these concerns feature for anyone? How likely do you think worse-than-extinction scenarios are?

Dagon

Aug 01, 2022

I give those scenarios a much smaller probability weight than simple extinction or irrelevance, small enough that their expected-utility contribution, while negative, is not much of a factor compared to the big zero of extinction (IMO the biggest probability weight) and the big positive of galactic flourishing (low probability, but not as low as the torture/hell outcomes).

The reverse-causality scenarios (cf. Roko's Basilisk) rely on some pretty tenuous commitment and repeatability assumptions; I think it's VERY unlikely that resurrection for torture purposes is worth the resources to any goal-driven agent.

Even today, there are very few cases of an individual truly and unambiguously wanting death and being unable to achieve it (perhaps not as trivially, quickly, or free of consequences for survivors as hoped, but not fully prevented). Adversarial life extension is pretty much horror- or action-movie territory.

Though I should be clear: I would call it positive if today's environment were steady-state. If you feel otherwise, you may have different evaluations of possible outcomes (including whether "flourishing" that includes struggle and individual unhappiness is net positive).

Mundane adversarial life extension happens all the time: https://slatestarcodex.com/2013/07/17/who-by-very-slow-decay/ 

But I agree that Roko's-basilisk-type scenarios are probably very, very unlikely.

Dagon (2y)

Uncaring and harmful life extension, yes. Not actually adversarial, where the purpose is the suffering. Still horrific, even if I don't take it as evidence that shifts my likelihood of AGI torture scenarios.

I don't actually know the stats, but this also seems less common than it used to be. When I was younger, I knew a few old people in care facilities that I couldn't imagine would have been their choice, but in the last 10 years I've had more relatives and acquaintances die, and very few of them were kept beyond what I'd expect is their preference. I've participated in a few very direct discussions, and in all cases the expressed wishes were honored (once after a bit of heated debate, including the sentiment "I'm not sure I can allow that", but in the end it was allowed).
Uncaring and harmful life extension, yes.  Not actually adversarial, where the purpose is the suffering.  Still, horrific, even if I don't take it as evidence that shifts my likelihood of AGI torture scenarios. I don't actually know the stats, but this also seems less common than it used to be.  When I was younger, I knew a few old people in care facilities that I couldn't imagine would be their choice, but in the last 10 years, I've had more relatives and acquaintances die, and very few of them kept beyond what I'd expect is their preference.  I've participated in a few very direct discussions, and in all cases, the expressed wishes were honored (once after a bit of heated debate, including the sentiment "I'm not sure I can allow that", but in the end it was allowed).