Can Reasoning Models Obfuscate Reasoning? Stress-Testing Chain-of-Thought Monitorability — LessWrong