Prompting Models to Obfuscate Their CoT
by Josh Engels and Felix Tudose
Authors: Felix Tudose*, Joshua Engels**
(* primary contributor, ** advice and mentorship)

Summary:
* Models can sometimes obfuscate their CoT when prompted to do so on a basic reasoning task
* We can increase the rate of obfuscation by telling the model it failed previously
* Obfuscation doesn’t significantly decrease task...
Dec 8, 2025