Covert Malicious Finetuning — LessWrong