Weight-Sparse Circuits May Be Interpretable Yet Unfaithful — LessWrong