[Replication] Conjecture's Sparse Coding in Small Transformers — LessWrong