Interpretability through two lenses: biology and physics
> Interpretability is the nascent science of making the vast complexity of billion-parameter AI models more comprehensible to the human mind. Currently, the mainstream approach is reductionist: dissecting a model into many smaller components, much like a biologist mapping cellular pathways. Here, I describe and advocate for the complementary perspective...
Aug 12, 2025