a sketch of how we might go about getting basins of corrigibility from RL — LessWrong