Relative Value Functions: A Flexible New Format for Value Estimation — LessWrong