Jannik Brinkmann — LessWrong
Posts
Evaluating Sparse Autoencoders with Board Game Models (38 karma, 1y, 1 comment)
Interpreting Preference Models w/ Sparse Autoencoders (Ω, 75 karma, 1y, 12 comments)
Finding Backward Chaining Circuits in Transformers Trained on Tree Search (52 karma, 1y, 1 comment)
Improving SAE's by Sqrt()-ing L1 & Removing Lowest Activating Features (Ω, 26 karma, 2y, 5 comments)