Introspective Interpretability: a Definition, Motivation, and Open Problems — LessWrong