Partial Identifiability in Reward Learning — LessWrong