Should we be worried about being preserved in an unpleasant state?
I’ve seen surprisingly little discussion of the risk of everyone being “trapped in a box for a billion years”, or something to that effect. There are many plausible reasons why keeping us around could be worth it, such as selling us to aliens in the future. Even if it turns out not to be worth it for an AI to keep us around, it may take a long time for it to realise this.
Should we not expect to be kept alive, at least until an AI has extremely high confidence that we aren’t useful? If so, is our state of being likely to be bad while we are preserved?
This seems like one of the most likely s-risks to me.
In a similar vein, I think that calling AIs “tools” is likely to be harmful. The word downplays the risks while also objectifying the AIs. Objectifying something that may actually be conscious seems like an obvious step in a bad direction.
Takeover speeds?
For the purpose of this shortform, I am considering “takeover” to start when crazy things begin happening, or when it is clear that one or more unaligned AGIs are attempting to take over. I consider “takeover” to have ended when humanity is extinct or similarly subjugated. This is all under the assumption that a takeover does happen.
From my understanding of Eliezer’s views, he believes takeover will be extremely fast (possibly seconds). Extremely fast takeovers make a lot more sense if you assume that a takeover will be more like a sneak attack.
How fast do you think takeover will be? (if it happens)
Do you expect to just suddenly drop dead? Do you expect to have enough time to say goodbye to your loved ones? Or do you expect to see humanity fight for months or years before we lose?
Your response does illustrate that there are holes in my explanation. Bob 1 and Bob 2 do not exist at the same time. They are meant to represent one person at two different points in time.
A separate way I could try to explain what kind of resurrection I am talking about is to imagine a married couple. An omniscient husband would have to care as much about his wife after she was resurrected as he did before she died.
I somewhat doubt that I could patch all of the holes that could be found in my explanation. I would appreciate it if you try to answer what I am trying to ask.
I seem to remember your P(doom) being 85% a short while ago. I’d be interested to know why it has dropped to 70%, or, put another way, why you believe our odds of non-doom have doubled.
I have edited my shortform to try to better explain what I mean by “the same”. It is kind of hard to do so, especially as I am not very knowledgeable on the subject, but hopefully it is good enough.
Do you believe that resurrection is possible?
By resurrection I mean the ability to bring back people, even long after they have died and their body has decayed or been destroyed. I do not mean simply bringing someone back who has been cryonically frozen. I also mean bringing back the same person who died, not simply making a clone.
I will try to explain what I mean by “the same”. Let’s call the person before they died “Bob 1” and the resurrected version “Bob 2”. Bob 1 and Bob 2 are completely selfish and only care about themselves. In the version of resurrection I am talking about, Bob 1 cares as much about Bob...
I just want to express my surprise at the fact that it seems that the view that the default outcome from unaligned AGI is extinction is not as prevalent as I thought. I was under the impression that literally everyone dying was considered by far the most likely outcome, making up probably more than 90% of the space of outcomes from unaligned AGI. From comments on this post, this seems to not be the case.
I am now distinctly confused as to what is meant by “P(doom)”. Is it the chance of unaligned AGI? Is it the chance of everyone dying? Is it the chance of generally bad outcomes?
Is there something like a pie chart of outcomes from AGI?
I am trying to get a better understanding of the realistic scenarios and their likelihoods. I understand that the likelihoods are very disagreed upon.
My current opinion looks a bit like this:
30%: Human extinction
  10%: Fast human extinction
  20%: Slower human extinction
30%: Alignment with good outcomes
20%: Alignment with at best mediocre outcomes
20%: Unaligned AGI, but at least some humans are still alive
  12%: We are instrumentally worth not killing
  6%: The AI wireheads us
  2%: S-risk from the AI having the production of suffering as one of its terminal goals
I decided to break down the extinction and unaligned-AGI scenarios a step further; the indented figures sum to their parent category.
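A minimal sketch of the breakdown above as a nested distribution (the category names are just shorthand labels I made up for this check), verifying that the top-level figures sum to 100% and each sub-breakdown sums to its parent:

```python
# My outcome estimates: {label: (percent, sub-breakdown)}.
outcomes = {
    "human extinction": (30, {"fast": 10, "slower": 20}),
    "alignment, good outcomes": (30, {}),
    "alignment, at best mediocre outcomes": (20, {}),
    "unaligned AGI, some humans alive": (20, {
        "instrumentally worth not killing": 12,
        "wireheading": 6,
        "suffering as a terminal goal": 2,
    }),
}

# Top-level categories must cover the whole outcome space.
assert sum(p for p, _ in outcomes.values()) == 100

# Each sub-breakdown must sum to its parent category.
for label, (p, subs) in outcomes.items():
    if subs:
        assert sum(subs.values()) == p, label
```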
If there are any resources specifically for refining my understanding of the possible outcomes and their likelihoods, please point me to them. Additionally, if you have any other relevant comments, I’d be glad to hear them.
Quick question:
How likely is AGI within 3 months from now?
For the purpose of this question I am basically defining AGI as the point at which, if it is unaligned, stuff gets super weird. By “super weird” I mean things that are obvious to the general public, such as everybody dropping dead or all electronics being shut down, or something of similar magnitude. For the purposes of this question, the answer can’t be “already happened”, even if you believe we already have AGI by your definition.
I get the impression that the general opinion is “pretty unlikely”, but I’m not sure. I’ve been feeling kind of panicked recently about the possibility of extremely imminent AGI, so I want to see how close to reality my level of concern is in the extremely short term.
This seems like a good way to reduce S-risks, so I want to get this idea out there.
This is copied from the r/SufferingRisk subreddit here: https://www.reddit.com/r/SufferingRisk/wiki/intro/
As people get more desperate in attempting to prevent AGI x-risk, e.g. as AI progress draws closer & closer to AGI without satisfactory progress in alignment, the more reckless they will inevitably get in resorting to so-called "hail mary" and more "rushed" alignment techniques that carry a higher chance of s-risk. These are less careful and "principled"/formal theory based techniques (e.g. like MIRI's Agent Foundations agenda) but more hasty last-ditch ideas that could have more unforeseen consequences or fail in nastier ways, including s-risks. This is a...
Why is it assumed that an AGI would just kill us for our atoms, rather than using us for other means?
I understand several reasons why this is considered a likely outcome. If we pose a threat, killing us is an obvious solution, although I’m not convinced that killing literally everyone is the easiest way to remove that threat. It seems to me that the primary reason to assume an AGI will kill us is simply that we are made of atoms which can be used for another purpose.
If there is a period where we pose a genuine threat to an AGI, then I can understand the assumption that it will kill us,...
Can someone please tell me why this S-risk is unlikely?
It seems almost MORE likely than extinction to me.
https://www.reddit.com/r/SufferingRisk/comments/113fonm/introduction_to_the_human_experimentation_srisk/?utm_source=share&utm_medium=ios_app&utm_name=iossmf
Is it possible that the fact we are still alive means that there is a core problem to the idea of existential risk from AI?
There are people who think that we already have AGI, and their number has only grown with the recent Bing situation. Maybe we have already passed the threshold for recursive self-improvement (RSI); maybe we passed it years ago.
Is there something to the idea that you can slightly decrease your P(doom) for every day we are still alive?
It seems possible to me that AI will just keep getting better and better, and we’ll keep raising the bar for when it is going to kill us, not realising that we have already passed that point and everything is fine for some reason.
I’m not saying I think this is the case, but I do consider it a possibility.
Can someone explain to me why this idea would not work?
This is a proposal of a way to test if an AGI has safeguards active or not, such as allowing itself to be turned off.
Perhaps we could essentially manufacture a situation in which the AGI has to act fast to prevent itself from being turned off. For example, we could make it automatically turn off after one minute; if it is not aligned properly, it has no choice but to try to prevent that. No time for RSI, no time to bide its time.
Basically if we put the AGI in a situation where it is forced to take high...
First of all, I basically agree with you. It seems to me that in scenarios where we are preserved, preservation is likely to be painless, and most likely not experienced at all by those being preserved.
But my confidence that this is the case is not that high. As a general comment, I am concerned that a fair amount of pushback on the likelihood of s-risk scenarios is based on what “seems” likely.
I usually don’t disagree about what “seems” likely, but it is difficult for me to know whether “seems” means a confidence level of 60% or 99%.