Well, let's start with the conditional probability, given that humans don't find some other way to kill ourselves or end civilization before it comes to this. Eliezer seems to argue the following:
A. Given we survive long enough, we'll find a way to write a self-modifying program that has, or can develop, human-level intelligence. (The capacity for self-modification follows from 'artificial human intelligence,' but since we've just seen links to writers ignoring that fact I thought I'd state it explicitly.) This necessarily gives the AI the potential for greater-than-human intelligence due to our known flaws. (I don't know how we'd give it all of our disadvantages even if we wanted to. If we did, then someone else could and eventually would build an AI without such limits.)
B. Given A, the intelligence would improve itself to the point where we could no longer predict its actions in any detail.
C. Given B, the AI could escape from any box we put it in. (IIRC this excludes certain forms of encryption, but I see no remotely credible scenario in which we sufficiently encrypt every self-modifying AI forever.)
D. Given B and C, the AI could wipe out humanity if it 'wanted' to do so.
My estimate for the probability of some of these fluctuates from day to day, but I tend to give them all a high number. Claim A in particular seems almost undeniable given the evidence of our own existence. (I only listed it separately so that people who want to argue can do so more precisely.) And as for the implicit final claim, that if you tell a computer to kill you it will try to kill you, I don't think the alternative has enough evidence to even consider. So I find it hard to imagine anyone rationally getting a total lower than 12%, or just under 1/8.
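The "total" here is just the product of the per-claim conditional probabilities. A minimal sketch of that arithmetic, using illustrative numbers that are my own assumption rather than ones the author states:

```python
# Hypothetical per-claim probabilities for claims A-D (not from the text),
# each conditional on the previous claims holding.
p = {"A": 0.95, "B": 0.7, "C": 0.6, "D": 0.5}

total = 1.0
for claim, prob in p.items():
    total *= prob  # chain the conditionals: P(A) * P(B|A) * ...

print(total)        # 0.1995 with these numbers
print(total > 0.12) # True: even these modest estimates clear the 12% floor
```

The point is only that the per-claim numbers have to be pushed quite low before the product drops under 1/8.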
Now that all applies to the conditional probability (if human technological civilization lives that long). I don't know how to evaluate the timescale involved or the chance of us killing ourselves before the issue would come up. The latter certainly feels like less than 11/12.
The question would grow in importance if we found out that we needed to convince a politically significant number of people to pay attention to the issue before someone creates a theory of Friendly AI, including AI goal stability. I really hope that doesn't apply, because I suspect that if it does we're screwed.
Link: johncarlosbaez.wordpress.com/2011/04/24/what-to-do/
His answer, as far as I can tell, seems to be that his Azimuth Project trumps working directly on friendly AI, or supporting it indirectly by earning and contributing money.
It seems that he, and other people who understand all the arguments in favor of friendly AI yet decide to ignore it or dismiss it as infeasible, are rationalizing.
I myself took a different route: rather than coming up with justifications for why it would be better to work on something else, I tried to prove to myself that the whole idea of AI going FOOM is somehow flawed.
I still have some doubts, though. Is it really enough to observe that the arguments in favor of AI going FOOM are logically valid? When should one disregard tiny probabilities of vast utilities and wait for empirical evidence instead? Yet compared to the alternatives, the arguments in favor of friendly AI seem water-tight.
The reason I and other people seem reluctant to accept that it is rational to support friendly AI research is that the consequences are unbearable. Robin Hanson recently described the problem:
I believe that people like me feel that fully accepting the importance of friendly AI research would deprive us of the things we value and need.
I feel that I wouldn't be able to justify what I value on the grounds of needing such things. It feels like I could and should overcome everything that doesn't either directly contribute to FAI research or help me earn more money to contribute.
Some of us value and need things that consume a lot of time...that's the problem.