Supervising strong learners by amplifying weak experts — LessWrong