In the post linked above, I propose a plan for addressing superintelligence-based risks.

The core idea is a three-step plan for giving us more time to figure out alignment by:

  1. Scanning human brains, or extracting them from simulations of earth (perhaps found in the universal distribution).
  2. Creating a safe and deterministic virtual environment in which they have all the time they need to either figure out alignment, or design a better virtual environment for themselves to do that.
  3. Giving an AGI the following formalized goal: "implement whatever goal is determined by this deterministic computation".

The deterministic aspect is key: if the simulation is deterministic, then the AGI should want to run it with full accuracy, without meddling with what we do inside of it.
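This determinism property can be illustrated with a toy sketch (all names here are hypothetical; a hash stands in for the actual simulation). Because the goal is a pure function of its fixed inputs, any faithful re-run yields the same answer, so an agent optimizing "implement the output of this computation" gains nothing by interfering with any particular execution of it:

```python
import hashlib

def deterministic_goal(simulation_spec: bytes) -> str:
    """Toy stand-in for 'run the virtual environment and read off
    the goal the simulated researchers converge on'. It is a pure
    function of its input: re-running it cannot change the answer."""
    digest = hashlib.sha256(simulation_spec).hexdigest()
    return f"goal-{digest[:8]}"

# The AGI's formalized objective references the computation itself,
# not any one execution of it, so tampering with a single run does
# not alter what the objective actually evaluates to.
run_1 = deterministic_goal(b"brain scans + environment spec")
run_2 = deterministic_goal(b"brain scans + environment spec")
assert run_1 == run_2  # identical by construction
```

The sketch only captures the determinism argument, of course; the hard part of the plan is making the real simulation both deterministic and faithful.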

There are complications and open questions in every part of this plan, but given the current state of alignment research and AI risk, I believe it's a reasonable plan to start working on.

See the linked post for full details.


I think the basic concept of "simulate AI researchers, run them at really fast speeds to figure out what to do" is one of the maybe-useful plans we might consider. I'm not sure the second two steps really help. (I guess I do expect the uploaded researchers to need an environment, but it seems like a relatively simple part of the problem, more of an implementation detail)

I don't know that we currently have the technology to simulate researchers, let alone simulate them really fast. The point of my plan is to first let the AGI overtake the universe, and then let it simulate those researchers as fast as it can.

In addition, getting to boot an AGI gives us advantages such as being able to extract the researchers from a simulation of earth (thanks to having an immense amount of compute available), or other bootstrap scenarios I've thought of, such as locating earth within the universal distribution, getting in touch with me-in-that-simulation, giving me access to the huge amounts of (dumb) compute I might need to figure out brain extraction, and then extracting the brains.

The point of this is not to get 100× as much time by running researchers at 100× speed; it's to get a bazillion× as much time by running the researchers after superintelligence takes over, without any time constraints, without any (or with very few) compute constraints (and thus potentially with access to oracles), without competing with anything, and with the certainty that the superintelligence will implement whatever we do decide on.