Has Anthropic checked if Claude fakes alignment for intended values too? — LessWrong