Comparing reward learning/reward tampering formalisms — LessWrong