LESSWRONG

Kriz Tahimic

CS Undergraduate at De La Salle University | Interested in AI and Mechanistic Interpretability

Comments

Sorted by
Newest
No wikitag contributions to display.
Attribution-based parameter decomposition
Kriz Tahimic (6mo)

I think renaming "circuit" to "mechanism" is the right call. Prior to reading this post, whenever circuits came up, I thought the term only meant cross-layer features.

Sparsify: A mechanistic interpretability research agenda
Kriz Tahimic (6mo)

This is a paradigm shift for me. When I entered the field, I thought what made MechInterp unique was that it treated AI like a natural science (e.g., biology); hence, I assumed the field primarily did hypothesis testing. However, I agree that augmenting it with a big data-driven approach is the way forward.
