Droopyhammock

Comments

Your response does illustrate that there are holes in my explanation. Bob 1 and Bob 2 do not exist at the same time. They are meant to represent one person at two different points in time.

A separate way I could try to explain what kind of resurrection I am talking about is to imagine a married couple. An omniscient husband would have to care as much about his wife after she was resurrected as he did before she died.

I somewhat doubt that I could patch all of the holes that could be found in my explanation. I would appreciate it if you would try to answer the question I am trying to ask.

I seem to remember your P(doom) being 85% a short while ago. I’d be interested to know why it has dropped to 70% or, looked at another way, why you believe our odds of non-doom have doubled (from 15% to 30%).

I have edited my shortform to try to better explain what I mean by “the same”. It is kind of hard to do so, especially as I am not very knowledgeable on the subject, but hopefully it is good enough.

Do you believe that resurrection is possible?

By resurrection I mean the ability to bring back people, even long after they have died and their body has decayed or been destroyed. I do not mean simply bringing someone back who has been cryonically frozen. I also mean bringing back the same person who died, not simply making a clone.

I will try to explain what I mean by “the same”. Let’s call the person before they died “Bob 1” and the resurrected version “Bob 2”. Bob 1 and Bob 2 are completely selfish and only care about themselves. In the version of resurrection I am talking about, Bob 1 cares as much about Bob 2’s experience as Bob 1 would care about Bob 1’s future experience, had Bob 1 not died.

It is kind of tricky to articulate exactly what I mean when I say “the same”, but I hope the above is good enough.

If you want to, an estimate of the percentage chance of this being possible would be cool, but if you just want to give your thoughts I would be interested in that as well.

I just want to express my surprise that the view that extinction is the default outcome of unaligned AGI seems to be less prevalent than I thought. I was under the impression that literally everyone dying was considered by far the most likely outcome, making up probably more than 90% of the space of outcomes from unaligned AGI. From comments on this post, this seems to not be the case.

I am now distinctly confused as to what is meant by “P(doom)”. Is it the chance of unaligned AGI? Is it the chance of everyone dying? Is it the chance of just generally bad outcomes?

Is there something like a pie chart of outcomes from AGI?

I am trying to get a better understanding of the realistic scenarios and their likelihoods. I understand that the likelihoods are heavily disputed.

My current opinion looks a bit like this:

30%: Human extinction
    10%: Fast human extinction
    20%: Slower human extinction

30%: Alignment with good outcomes

20%: Alignment with at best mediocre outcomes

20%: Unaligned AGI, but at least some humans are still alive
    12%: We are instrumentally worth not killing
    6%: The AI wireheads us
    2%: S-risk from the AI having the production of suffering as one of its terminal goals

I decided to break down the unaligned AGI scenarios a step further.
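
As a quick sanity check on the estimates above, the following minimal Python sketch tallies them. Only the percentages come from the list. The dictionary keys and the function name `check_consistency` are hypothetical labels chosen for illustration.

```python
# Top-level outcome estimates, in percent (values taken from the list above).
# The key names are illustrative labels, not established terminology.
top_level = {
    "human_extinction": 30,
    "aligned_good_outcomes": 30,
    "aligned_mediocre_outcomes": 20,
    "unaligned_some_humans_alive": 20,
}

# Sub-scenarios for the two categories that were broken down further.
breakdown = {
    "human_extinction": {"fast": 10, "slower": 20},
    "unaligned_some_humans_alive": {
        "instrumentally_worth_not_killing": 12,
        "wireheading": 6,
        "s_risk_terminal_goal": 2,
    },
}


def check_consistency(top_level, breakdown):
    """Return the top-level total and any sub-totals that disagree with their parent."""
    total = sum(top_level.values())
    mismatches = {
        name: sum(parts.values())
        for name, parts in breakdown.items()
        if sum(parts.values()) != top_level[name]
    }
    return total, mismatches


total, mismatches = check_consistency(top_level, breakdown)
print(f"Top-level total: {total}%")
print(f"Sub-totals that do not match their parent: {mismatches}")
```

With these numbers the top-level categories sum to 100% and each sub-breakdown adds up to its parent category, so the estimates are at least internally consistent.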

If there are any resources specifically to refine my understanding of the possible outcomes and their likelihoods, please tell me of them. Additionally, if you have any other relevant comments I’d be glad to hear them.

I have had more time to think about this since I posted this shortform. I also posted a shortform after that which asked pretty much the same question, but with words, rather than just a link to what I was talking about (the one asking why it is assumed that an AGI would just use us for our atoms and not for something else).
 

I think that there is a decent chance that an unaligned AGI will do some amount of human experimentation or study, but it may well be on a small number of people, and hopefully not for very long.
To me, one of the most concerning ways this could be a lot worse is if we contain some valuable information which takes a long time for an AGI to gain through studying us. The worst-case scenario would then probably be if the AGI thinks there is a chance that we contain very helpful information, when in fact we don’t, and so endlessly continues studying and experimenting on us in order to potentially extract that information.

I have only been properly aware of the alignment problem for a few months, so my opinions and understanding of things are still forming. I am particularly concerned by s-risks and I have OCD, so I may well overestimate the likelihood of s-risks. I would not be surprised if a lot of the s-risks I worry about, especially when they are things which decrease the probability of AGI killing everyone, are just really unlikely. From my understanding, Eliezer and others think that literally everyone dying makes up the vast majority of the bad scenarios, although I’m not sure how much suffering is expected before that point. I know Eliezer said recently that he expects our deaths to be quick, assuming an unaligned AGI.

Quick question:

How likely is AGI within 3 months from now?

For the purposes of this question I am basically defining AGI as the point at which, if it is unaligned, stuff gets super weird. By “super weird” I mean things that are obvious to the general public, such as everybody dropping dead or all electronics being shut down or something of similar magnitude. For the purposes of this question, the answer can’t be “already happened” even if you believe we already have AGI by your definition.

I get the impression that the general opinion is “pretty unlikely” but I’m not sure. I’ve been feeling kinda panicked about the possibility of extremely imminent AGI recently, so I want to just see how close to reality my level of concern is in the extremely short term.

This seems like a good way to reduce S-risks, so I want to get this idea out there. 


This is copied from the r/SufferingRisk subreddit here: https://www.reddit.com/r/SufferingRisk/wiki/intro/

As people get more desperate in attempting to prevent AGI x-risk, e.g. as AI progress draws closer & closer to AGI without satisfactory progress in alignment, the more reckless they will inevitably get in resorting to so-called "hail mary" and more "rushed" alignment techniques that carry a higher chance of s-risk. These are less careful and "principled"/formal theory based techniques (e.g. like MIRI's Agent Foundations agenda) but more hasty last-ditch ideas that could have more unforeseen consequences or fail in nastier ways, including s-risks. This is a phenomenon we need to be highly vigilant in working to prevent. Otherwise, it's virtually assured to happen; as, if faced with the arrival of AGI without yet having a good formal solution to alignment, most humans would likely choose a strategy that at least has a chance of working (trying a hail mary technique) instead of certain death (deploying their AGI without any alignment at all), despite the worse s-risk implications. To illustrate this, even Eliezer Yudkowsky, who wrote Separation from hyperexistential risk, has written this (due to his increased pessimism about alignment progress):

At this point, I no longer care how it works, I don't care how you got there, I am cause-agnostic about whatever methodology you used, all I am looking at is prospective results, all I want is that we have justifiable cause to believe of a pivotally useful AGI 'this will not kill literally everyone'.

The big ask from AGI alignment, the basic challenge I am saying is too difficult, is to obtain by any strategy whatsoever a significant chance of there being any survivors. (source)

If even the originator of these ideas now has such a singleminded preoccupation with x-risk to the detriment of s-risk, how could we expect better from anyone else? Basically, in the face of imminent death, people will get desperate enough to do anything to prevent that, and suddenly s-risk considerations become a secondary afterthought (if they were even aware of s-risks at all). In their mad scramble to avert extinction risk, s-risks get trampled over, with potentially unthinkable results.
One possible idea to mitigate this risk would be, instead of trying to (perhaps unrealistically) prevent any AI development group worldwide from attempting hail mary type techniques in case the "mainline"/ideal alignment directions don't bear fruit in time, we could try to hash out the different possible options in that class, analyze which ones have unacceptably high s-risk to definitely avoid, or less s-risk which may be preferable, and publicize this research in advance for anyone eventually in that position to consult. This would serve to at least raise awareness of s-risks among potential AGI deployers so they incorporate it as a factor, and frontload their decisionmaking between different hail marys (which would otherwise be done under high time pressure, producing an even worse decision).

 

I believe that identifying which Hail Mary strategies are particularly bad from an s-risk point of view is a good idea. I think that this may be a very important piece of work that should be done, assuming it has not been done already.

Not necessarily

Suicide will not save you from all sources of s-risk and may make some worse, for example if quantum immortality is true. If resurrection is possible, things become more complicated still.

The possibility for extremely large amounts of value should also be considered. If alignment is solved and we can all live in a Utopia, then killing yourself could deprive yourself of billions+ years of happiness.

I would also argue that choosing to stay alive when you know of the risk is different from inflicting the risk on a new being you have created.

With that being said, suicide is a conclusion you could come to. To be completely honest, it is an option I heavily consider. I fear that LessWrong and the wider alignment community may have underestimated the likelihood of s-risks by a considerable amount.
