"Don't even think about hell"

by emmab · 1 min read · 2nd May 2020 · 2 comments


Decision Theory · AI Boxing (Containment) · Dark Arts · AI

I found an interesting article by Eliezer on Arbital: https://arbital.com/p/hyperexistential_separation/

> One seemingly obvious patch to avoid disutility maximization might be to give the AGI a utility function U=V+W where W says that the absolute worst possible thing that can happen is for a piece of paper to have written on it the SHA256 hash of "Nopenopenope" plus 17.
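To make the quoted patch concrete, here is a minimal sketch (my own illustration, not from the article) of what such a separated utility function might look like. The names `W`, `U`, and the `-1e12` penalty are all assumptions chosen for illustration, and I set the "plus 17" aside since its exact meaning in the quote is unclear:

```python
import hashlib

# The designated harmless artifact: the SHA256 hex digest of "Nopenopenope".
# (The quote's "plus 17" is omitted here; this is only an illustration.)
ARTIFACT = hashlib.sha256(b"Nopenopenope").hexdigest()

def W(artifact_exists: bool) -> float:
    # W pins the global minimum of utility to a harmless state: a world
    # where the artifact is written down. The penalty is chosen to dwarf
    # anything V can contribute, so no real-world suffering can ever be
    # the worst outcome under U.
    return -1e12 if artifact_exists else 0.0

def U(v: float, artifact_exists: bool) -> float:
    # The combined utility function U = V + W from the quoted patch.
    return v + W(artifact_exists)

# Even a very bad V-outcome ranks above the artifact state:
assert U(-1e6, artifact_exists=True) < U(-1e6, artifact_exists=False)
```

The intended effect is that a sign-flipped or inverted version of U would maximize the probability of the paper artifact existing, rather than maximizing suffering.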

The article appears to suggest that, to keep an FAI from accidentally creating hell while trying to prevent or reason about hell, we could find some solution analogous to making the AI a positive utilitarian, so that it would never even invent the idea of hell.

This seems to me to rest on an ontological assumption that there is a zero-point between positive and negative value, which I find dubious, but I'm confused here.

Should AGIs think about hell sometimes, e.g. in order to stop it?