Document-tuning instills durable animal compassion in LLMs (and generalizes to humans)
Note: This post focuses on the alignment implications. Our EA Forum post, which focuses on the implications for animal welfare, is here.

Jasmine Brazilek & Miles Tidmarsh: Compassion Aligned Machine Learning

Preprint, March 2026: Full paper | HuggingFace resources | ANIMA benchmark

TL;DR

Instruction-tuning and reinforcement learning are effective in certain specific...