One-dimensional vs multi-dimensional features in interpretability — LessWrong