Using the (p,R) model to detect over-writing human values — LessWrong