Polysemanticity and Capacity in Neural Networks — LessWrong