Reward splintering as reverse of interpretability — LessWrong