unRLHF - Efficiently undoing LLM safeguards — LessWrong