What Reasoning Steps Cause Alignment Faking? — LessWrong