How to reduce capability degradation from off-model SFT — LessWrong