resolving some neural network mysteries — LessWrong