Trying to Make a Treacherous Mesa-Optimizer — LessWrong