Modelling and Understanding SGD — LessWrong