Train first VS prune first in neural networks. — LessWrong