
Suppose we amend ASP to require the agent to output a full simulation of the predictor before saying "one box" or "two boxes" (or else the agent gets no payoff at all). Would that defeat UDT variants that depend on stopping the agent before it overthinks the problem?

(Or, instead of requiring the agent to output the simulation, we could use the entire simulation, in some canonical form, as a cryptographic key to unlock an encrypted description of the problem itself. Prior to decrypting the description, the agent doesn't even know what the rules are; the agent is told in advance only that decryption will reveal the rules.)
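To make the key idea concrete, here is a rough toy sketch. Everything in it is an illustrative assumption rather than part of ASP: `simulate_predictor` is a hypothetical stand-in for the real predictor, the "canonical form" is just a newline-joined trace, the rules text is a placeholder, and the hash-based XOR keystream is only there to keep the example self-contained (a real setup would use a vetted authenticated cipher).

```python
import hashlib

def simulate_predictor(steps: int) -> str:
    """Hypothetical stand-in for the predictor: a deterministic trace of states."""
    state = 0
    lines = []
    for i in range(steps):
        state = (state * 31 + i) % 1009
        lines.append(f"{i}:{state}")
    # Assumed canonical form: one "step:state" entry per line.
    return "\n".join(lines)

def key_from_trace(trace: str) -> bytes:
    """Derive a 32-byte key by hashing the canonical simulation trace."""
    return hashlib.sha256(trace.encode("utf-8")).digest()

def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 of key plus a counter, truncated to n bytes."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

FULL_STEPS = 1000  # assumed length of the predictor's full run
RULES = b"placeholder: the rules of the problem would go here"

# The problem-setter encrypts the rules under the full trace.
ciphertext = xor_cipher(key_from_trace(simulate_predictor(FULL_STEPS)), RULES)

# An agent that runs the whole simulation recovers the rules.
recovered = xor_cipher(key_from_trace(simulate_predictor(FULL_STEPS)), ciphertext)

# An agent that stops one step early derives a different key and gets noise.
truncated = xor_cipher(key_from_trace(simulate_predictor(FULL_STEPS - 1)), ciphertext)
```

The point of the sketch is only that the key depends on every step of the trace, so an agent that halts early cannot learn the rules at all, let alone decide between one-boxing and two-boxing.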