Gradient routing is better than pretraining filtering — LessWrong