Comparing AI Alignment Approaches to Minimize False Positive Risk