Small foundational puzzle for causal theories of mechanistic interpretability — LessWrong