Training AI to do alignment research we don’t already know how to do — LessWrong