Part 12 of 12 in the Engineer’s Interpretability Sequence.
TAISIC = “the AI safety interpretability community”
MI = “mechanistic interpretability”
There might be some addenda later, but for now, this is the final post in the Engineer’s Interpretability Sequence. I hope you have found it interesting and gotten some useful ideas from it. I am always happy to talk about the topics from this sequence in the comments or via email. The last thing I will do here is offer a post-by-post summary of key points :)
I hope you enjoyed this sequence and found some useful ideas. Let me know if you’d like to talk about interpretability, adversaries, etc. sometime.
I am very thankful to TAISIC and others in the AI safety space for doing important and interesting work. For me personally, TAISIC members have been excellent sources of inspiration and collaboration, and I’m glad to be a part of this community.