Poisoning Fine-tuning Datasets of Constitutional Classifiers — LessWrong