LESSWRONG

Tommy Xie

Posts

Comments

From Messy Shelves to Master Librarians: Toy-Model Exploration of Block-Diagonal Geometry in LM Activations
Tommy Xie, 1mo

Thanks for the post! I like the idea, though I'm not sure I understood it correctly: aligning the learned dictionary with the geometry produces a "separation" between features that is closer to the "true" dictionary. Maybe it's just a paraphrase of the same thing, but what did you mean by "semantic continuity"?

Run-time Steering Can Surpass Post-Training: Reasoning Task Performance, 1mo