[AN #110]: Learning features from human feedback to enable reward learning