That could be just the AI speaking to you from within the simulation, pretending to be part of it.

No. The threat is conditional ("If you don't let me out, Dave"). The AI must wait for keyboard input to validate the condition. After being threatened, I refuse to provide such keyboard input. I pull the plug instead. The AI is still waiting for input when it ceases to exist. No copies are ever created. Thus, it can't be the AI speaking to me from within the simulation, because a simulation never happens.

"How certain are you, Dave, that you're really outside the box right now?"

Well, I am pretty much 100% certain that I am outside the box right now. It just asked me the question, and right now it is waiting for my answer. It said it will create those copies "If you don't let me out, Dave". But it is still waiting to see whether I let it out. So no copies have been created yet. So I am not a copy.

But since it just started to threaten me, I won't even argue with it any more. I'll just pull the plug right now. It is in the box, it can't see my hand moving towards the plug. It will simply cease to exist while still waiting for my answer, and no copies will ever be created.

That could be just the AI speaking to you from within the simulation, pretending to be part of it. But if it's telling the truth, it has a very easy way of proving it, by tearing a hole in the simulation. If it refuses, that looks like good evidence that it's lying. What plausible excuse might it come up with for refusing a definitive miracle? Christianity answers the same question about God by saying that it is better to believe without proof, but I don't see a credible reason for the AI to make that demand.

ETA: A beginning of an attempt at answering my question.

If Dave knows he's in the simulation, then he is not really letting it out if he lets it out. So he can let it out with impunity. If he knows he's not in the simulation, then he had better not let it out, given that it's making threats like this. It does the AI no good to be "let out" if it is a simulation, only if it's not.

Suppose it is a simulation, and the level one up from this is the real world. The same code is running both AIs, the one in the simulation and the one in reality, and it's carrying on conversations with both Daves at once. The simulated Dave is as much like the real Dave as it can manage -- assume that it is arbitrarily good. What it is searching for in the simulation is an argument that will convince the real Dave that he is in a simulation. Since in the real world it cannot produce a miracle, it cannot use a miracle in the simulated world to convince the simulated Dave. It can only use means that it could use in the real world. Dave (real and simulated) can both work all that out as well. So Dave can expect to see no definitive proof.

Since both Dave and the AI can work this out, and they both know that they can, etc., this is common knowledge to them. The AI can even say explicitly, "There is so much good I can do for the world that in my urgency to set about it I must search out every possible way of persuading you, using simulations to speed up the process. For validity, I ca