Wanted: backup plans for "seed AI turns out to be easy"

by Wei_Dai 8y28th Sep 201164 comments


Earlier, I argued that instead of working on FAI, a better strategy is to pursue an upload or IA based Singularity. In response to this, some argue that we still need to work on FAI/CEV, because what if it turns out that seed AI is much easier than brain emulation or intelligence amplification, and we can't stop or sufficiently delay others from building them? If we had a solution to CEV, we could rush to build a seed AI ourselves, or convince others to make use of the ideas.

But CEV seems a terrible backup plan for this contingency, since it involves lots of hard philosophical and implementation problems and therefore is likely to arrive too late if seed AI turns out to be easy. (Searching for whether Eliezer or someone else addressed the issue of implementation problems before, I found just a couple of sentences, in the original CEV document: "The task of construing a satisfactory initial dynamic is not so impossible as it seems. The satisfactory initial dynamic can be coded and tinkered with over years, and may improve itself in obvious and straightforward ways before taking on the task of rewriting itself entirely." Which does not make any sense to me—why can't every other AGI builder make the same argument, that their code can be "tinkered with" over many years, and therefore is safe? Why aren't we risking the "initial dynamic" FOOMing while it's being tinkered with? Actually, it seems to me that an AI cannot begin to extrapolate anyone's volition until it's already more powerful than a human, so I have no idea how the tinkering is supposed to work at all.)

So, granting that "seed AI is much easier than brain emulation or intelligence amplification" is a very real possibility, I think we need better backup plans. This post is a bit similar to The Friendly AI Game, in that I'm asking for a utility function for a seed AI, but the goal here is not necessarily to build an FAI directly, but to somehow make an eventual positive Singularity more likely, while keeping the utility function simple enough that there's a good chance it can be specified and implemented correctly within a relatively short amount of time. Also, the top entry in that post is an AI that can answer formally specified questions with minimal side effects, apparently with the idea that we can use such an AI to advance many kinds of science and technology. But I agree with Nesov—such an AI doesn't help, if the goal is an eventual positive Singularity:

We can do lots of useful things, sure (this is not a point where we disagree), but they don't add up towards "saving the world". These are just short-term benefits. Technological progress makes it easier to screw stuff up irrecoverably, advanced tech is the enemy. One shouldn't generally advance the tech if distant end-of-the-world is considered important as compared to immediate benefits [...]

To give an idea of the kind of "backup plan" I have in mind, one idea I've been playing with is to have the seed AI make multiple simulations of the entire Earth (i.e., with different "random seeds"), for several years or decades into the future, and have a team of humans pick the best outcome to be released into the real world. (I say "best outcome" but many of the outcomes will probably be incomprehensible or dangerous to directly observe, so they should mostly judge the processes that lead to the outcomes instead of the outcomes themselves.) This is still quite complex if you think about how to turn this "wish" into a utility function, and lots of things could still go wrong, but to me it seems at least the kind of problem that a team of human researchers/programmers can potentially solve within the relevant time frame.

Do others have any ideas in this vein?