Sycophancy to subterfuge: Investigating reward tampering in large language models