Limitations on the Interpretability of Learned Features from Sparse Dictionary Learning — LessWrong