Full toy model for preference learning — LessWrong