Hello, World of Mechanistic Interpretability
This post introduces a series dedicated to mechanistic interpretability, defined broadly as a set of approaches and tools for better understanding the processes that lead to certain AI-generated outputs. There is an ongoing debate on what to consider a part...
Mar 155