Jannik Brinkmann
Posts
38
Evaluating Sparse Autoencoders with Board Game Models
10mo
1
75
Interpreting Preference Models w/ Sparse Autoencoders
1y
12
50
Finding Backward Chaining Circuits in Transformers Trained on Tree Search
1y
1
26
Improving SAE's by Sqrt()-ing L1 & Removing Lowest Activating Features
1y
5