Comparing reward learning/reward tampering formalisms — LessWrong