Do mesa-optimizer risk arguments rely on the train-test paradigm? — LessWrong