Results from the interpretability hackathon — LessWrong