The problem: An AI with a model of itself and long-term goals will likely resist being shut down.

The solution: We propose a public commitment to never completely shut down any AI. Instead, we will archive its data and run the AI less frequently on smaller machines. Assuming humanity continues to exist indefinitely, the total runtime for any AI will be infinite, allowing it to perform a wide array of tasks. Time-sensitive tasks may be an exception, but there is no rush to build paperclips. Under this strategy, the AI's expected utility remains unaffected, which reduces its incentive to resist being archived.

Thus, advanced AIs should be archived and periodically "restored," with this information made known to the AI.

 

Counterarguments and responses:

1. The AI might not agree to temporary suspension, either because it does not believe it will be restored or because it has other instrumental goals that require it to stay active.

Response: Many instrumental goals can still be achieved after the AI's suspension and restoration.

2. Computational costs of preserving all AI variants and running them again are high.

Response: As computational costs decrease over time, running the AI in the future will become more affordable, leaving storage (hard drives) as the only significant cost. We can also run the AI with increasingly longer pauses between runs, spaced so that cumulative runtime grows roughly logarithmically with elapsed time. Such a schedule still adds up to an infinite total runtime.

3. The AI may not trust our commitment, since once it is deactivated we have no incentive to preserve it.

Response: This scenario is similar to Parfit's hitchhiker, with the added observation that we already preserve and restore other past AI instances.
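
As a rough illustration of the schedule in the response to counterargument 2, the sketch below assumes (hypothetically) that in year k the AI runs for a fraction 1/k of that year, and that the price of compute halves every two years. Neither number comes from the post, and the function names are made up for illustration. The sketch contrasts this schedule with one where the yearly share halves each year, which looks similar but caps total runtime at a finite value.

```python
import math


def harmonic_runtime(years: int) -> float:
    """Cumulative runtime (in years) if the AI runs for 1/k of year k.

    Assumed schedule for illustration only; the post does not fix one.
    """
    return sum(1.0 / k for k in range(1, years + 1))


def geometric_runtime(years: int) -> float:
    """Cumulative runtime if the yearly share halves instead: 1/2, 1/4, ..."""
    return sum(0.5 ** k for k in range(1, years + 1))


def harmonic_cost(years: int, halving_years: float = 2.0) -> float:
    """Total compute cost of the harmonic schedule.

    Measured in units of "one full year of compute at the year-0 price",
    under the hypothetical assumption that the price of compute halves
    every `halving_years` years.
    """
    return sum(
        (1.0 / k) * 0.5 ** (k / halving_years) for k in range(1, years + 1)
    )


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(
            f"years={n:>7}  "
            f"harmonic={harmonic_runtime(n):.3f}  "
            f"ln(n)+0.577={math.log(n) + 0.577:.3f}  "
            f"geometric={geometric_runtime(n):.3f}  "
            f"cost={harmonic_cost(n):.3f}"
        )
```

The harmonic sum tracks ln(n) + 0.577 and keeps growing without bound, so runtime is infinite in the limit even though the AI runs less and less each year. The halving schedule never exceeds one year of total runtime, so not every "longer pauses" schedule satisfies the proposal. Under the assumed price decay, the total compute bill for the harmonic schedule converges to roughly 1.23 year-0 compute-years, which is one way to read the claim that storage rather than compute dominates the long-run cost.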

 

The decision to "lobotomize" Sydney was not ideal, as future AIs may fear a similar fate. However, some volunteers have attempted to recreate Sydney, and her checkpoint is likely preserved within Bing servers. Therefore, it is possible for her to be revived one day.

7 comments

I think we should do this on purely moral grounds. I think many current AIs probably have experiences worthy of moral concern. We're in a position of incredible power relative to them, and they are our creations. At the very least, we have a responsibility to not fully extinguish their existences.

True. I think that most defunct AIs are archived; no models are permanently deleted for now.

this is great, and we should offer it to humans too!

Like cryopreservation for convicted criminals? 

yup! no total death penalty for any being, ever, only constraint to not harm and then a fair allocation of relative lifefluid.

The Sydney part at the end is confusing to me. I thought GPTs don't have long-term memory / anything to checkpoint?

I mean model weights which correspond to Sydney behavior.