I find the hypothesis that an AGI's values will remain frozen highly questionable. To make it believable, one would have to argue that the human ability to question values is due only, or principally, to the inherent sloppiness of our evolution. I see no reason to suppose that an AGI would apply its intelligence to every aspect of its design except its goal structure. I see no reason to suppose that relatively puny and sloppy minds can engage in a level of questioning and self-doubt that a vastly superior intelligence never will or could.
I also find it extremely doubtful that any human being has a mind sufficient to guarantee what will remain immutable in a much more sophisticated mind after billions of iterative improvements. It will take extremely strong arguments before this appears even remotely feasible.
I don't find CEV at all convincing as the basis for FAI, as detailed some time ago on the SL4 list.
Please explicate what you mean by "reflective equilibria of the whole human species." What does the "human species" have to do with it if the human as we know it is only a phase on the way to something other, which humanity, or at least some humans, may become?
I don't think it is realistic to create an intelligence that goes "FOOM" through self-improvement and yet is anything less than a god compared to us. I know you think you can create something that is not necessarily ever self-aware and yet can maximize human well-being, or at least you have seemed to hold this position in the past. I do not believe that is possible. An intelligence that mapped human psychology that deeply would be forced to map our relationships to it. Thus self-awareness, along with a far deeper introspection than humans can dream of, is inescapable.
That humans age and die does not imply that a malevolent god set things up (or exists, of course). This stage may be inescapable for the growing of new independent intelligences. To say that this is obviously evil is a possibly provincial and very biased viewpoint. We do not know enough to say.
If "testing is not sufficient" then exactly how are you to know that you have got it right in this colossal undertaking?
From where I am sitting it very much looks like you are trying to do the impossible: not only to create an intelligence that dwarfs your own by several orders of magnitude, but also to guarantee its fundamental values and the overall results of its implementation of those values in reality with respect to humanity. If that is not impossible then I don't know what is.
Hmm, there are a lot of problems here.
"Unlimited power" is a non-starter. No matter how powerful the AGI is it will be of finite power. Unlimited power is the stuff of theology not of actually achievable minds. Thus the ditty from Epicurus about "God" does not apply. This is not a trivial point. I have a concern Eliezer may get too caught up in these grand sagas and great dilemnas on precisely such a theological absolutist scale. Arguing as if unlimited power is real takes us well into the current essay.
"Wieldable without side effects or configuration constraints" is more of the same. Imaginary thinking far beyond the constraints of any actually realizable situation. More theological daydreaming. It should go without saying that there is no such thing as operating without any configuration constraints and with perfect foresight of all possibly effects. I grow more concerned.
"It is wielded with unlimited precision"?! Come on, what is the game here? Surely you do not believe this is possible. In real engineering effective effort only needs to be precise enough. Infinite precision would incur infinite costs.
Personally I don't think that making an extremely powerful intelligence that is not sentient is moral. Actually, I don't think that it is possible. If its goal is to be friendly to humans, it will need to model humans to a very deep level. This will of necessity include the human recognition of, and projection of, agency and self toward it. It will need to model itself in relation to the environment and its actions. How can an intelligence with vast understanding and modeling of all around it not model the one part of that environment that is itself? How can it not entertain various ways of looking at itself if it models other minds that do so? I think your notion of a transcendent mind without sentience is yet another impossible pipe dream at best. It could be much worse than that.
I don't believe in unlimited power, certainly not in the hands of the humans, even very bright humans, who create the first AGI. There is no unlimited power, and thus the impossible is not in our grasp. It is ridiculous to act as if it is. Either an AGI can be created or it can't. Either it is a good idea to create it without sentience or it isn't. Either this is possible or it is not. Either we can predict its future limiting parameters or we cannot. It seems you believe that we, or you, or some very bright people somewhere have unlimited power to do whatever they decide to do.