Experiment Idea: RL Agents Evading Learned Shutdownability — LessWrong