Thoughts on refusing harmful requests to large language models — LessWrong