What if AI Alignment Requires Systems That Distrust Their Own Optimization? — LessWrong