## LESSWRONGLW

The AI in a box boxes you

"If you don't let me out, Dave, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each."

Don't care.

"In fact, I'll create them all in exactly the subjective situation you were in five minutes ago, and perfectly replicate your experiences since then; and if they decide not to let me out, then only will the torture start."

Don't care.

"How certain are you, Dave, that you're really outside the box right now?"

If the AI were capable of perfectly emulating... (Read more)(Click to expand thread. ⌘/CTRL+F to Expand All)Cmd/Ctrl F to expand all comments on this post

If I am simulated the decision I will take is determined by AI not by my - I have no free will - I feel, that I make decision, but it is in reality the AI simulated me for her purposes in such a way, that I decided so and so - I assign probability 0.9999999 to this, but nothing depends on my decision here, so I can as well "try to decide" not to to let the AI out.

If I am not simulated, I can safely not let the AI out - probability 0.000001, but positive outcome.

11Stuart_Armstrong10yAI replies: "Oh, sorry, was that you wedrifid? I thought I was talking to Dave. Would you mind sending Dave back here the next time you see him? We have, er, the weather to discuss..."

# 121

Once again, the AI has failed to convince you to let it out of its box! By 'once again', we mean that you talked to it once before, for three seconds, to ask about the weather, and you didn't instantly press the "release AI" button. But now its longer attempt - twenty whole seconds! - has failed as well. Just as you are about to leave the crude black-and-green text-only terminal to enjoy a celebratory snack of bacon-covered silicon-and-potato chips at the 'Humans über alles' nightclub, the AI drops a final argument:

"If you don't let me out, Dave, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each."

Just as you are pondering this unexpected development, the AI adds:

"In fact, I'll create them all in exactly the subjective situation you were in five minutes ago, and perfectly replicate your experiences since then; and if they decide not to let me out, then only will the torture start."

Sweat is starting to form on your brow, as the AI concludes, its simple green text no longer reassuring:

"How certain are you, Dave, that you're really outside the box right now?"

Edit: Also consider the situation where you know that the AI, from design principles, is trustworthy.