LESSWRONG
1892
Tommy Xie
4120

Posts

Comments

Hidden Reasoning in LLMs: A Taxonomy
Tommy Xie 1d 10

Alien language

Could this be viewed as a possible compression of human language? I would change my mind if it proves to have no consistency, or to be not interpretable/decodable by an LLM.

From Messy Shelves to Master Librarians: Toy-Model Exploration of Block-Diagonal Geometry in LM Activations
Tommy Xie 2mo 10

Thanks for the post, I like it! I'm not sure I correctly understood the idea of alignment between the geometry and the learned dictionary: as I read it, this produces "separation" between features that is closer to the "true" dictionary. Maybe it's just a paraphrase of the same thing, but what did you mean by "semantic continuity"?

5 Run-time Steering Can Surpass Post-Training: Reasoning Task Performance
2mo
2