Preventing Language Models from hiding their reasoning — LessWrong