Your Causal Variables Are Irreducibly Subjective
Mechanistic interpretability needs its own shoe leather era. Reproducing the labeling process will matter more than reproducing the Github. Crossposted from Communication & Intelligence substack When we try to understand large language models, we like to invoke causality. And who can blame us? Causal inference comes with an impressive toolkit:...