Tommy Xie

Posts

Run-time Steering Can Surpass Post-Training: Reasoning Task Performance (3mo)

Comments
From Organized Shelves to Layered Catalogs: Architectural Explorations for Sparse Autoencoders -- Crosscoders & Ladder SAEs Towards Hierarchical Data Structure
Tommy Xie, 1mo

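$$\mathcal{L}_{\mathrm{ELBO}} = -\mathbb{E}_{q_\phi(z \mid y)}\left[\log p_\theta(x \mid z)\right] + \beta\, \mathrm{KL}\left(q_\phi(z \mid y) \,\|\, p(z)\right)$$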

Question:
If maximising the ELBO means (1) learning to reconstruct the data faithfully and (2) regularising the latent space so that it generalises to similar but new data, are the two terms in this formula doing (1) and (2) separately? If so, how?
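
To make the two terms concrete, here is a minimal sketch that computes them separately. It rests on assumptions that are not stated in the comment: a unit-variance Gaussian decoder (so the reconstruction term reduces to squared error up to a constant), a diagonal-Gaussian encoder q(z|y), and a standard-normal prior p(z). The names `beta_elbo_loss`, `x_hat`, `mu` and `log_var` are hypothetical and chosen only for illustration.

```python
import numpy as np

def beta_elbo_loss(x, x_hat, mu, log_var, beta=1.0):
    """Return (total, reconstruction, kl), each averaged over the batch.

    Assumptions (not from the original comment):
      - decoder p(x|z) is Gaussian with unit variance, so -log p(x|z)
        equals 0.5 * ||x - x_hat||^2 up to an additive constant;
      - encoder q(z|y) = N(mu, diag(exp(log_var))), with mu and log_var
        produced from y by some encoder that is not shown here;
      - prior p(z) = N(0, I).
    """
    # First term: -E_q[log p(x|z)], estimated from a single decoded sample x_hat.
    recon = 0.5 * np.sum((x - x_hat) ** 2, axis=1)
    # Second term: closed-form KL between a diagonal Gaussian and N(0, I).
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)
    total = recon + beta * kl
    return total.mean(), recon.mean(), kl.mean()

# Toy batch: 2 samples, 3 data dimensions, 2 latent dimensions.
x = np.zeros((2, 3))
x_hat = np.ones((2, 3))
mu = np.zeros((2, 2))
log_var = np.zeros((2, 2))
total, recon, kl = beta_elbo_loss(x, x_hat, mu, log_var, beta=1.0)
```

Under these assumptions, the first term depends only on how well `x_hat` matches `x`, while the second depends only on how far the encoder's distribution is from the prior. Tracking `recon` and `kl` separately is one way to see how `beta` trades one against the other.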

Hidden Reasoning in LLMs: A Taxonomy
Tommy Xie, 2mo

Alien language

Could this be viewed as a possible compression of human language? I could change my mind if these languages prove to have no consistency, or turn out not to be interpretable/decodable by an LLM.

From Messy Shelves to Master Librarians: Toy-Model Exploration of Block-Diagonal Geometry in LM Activations
Tommy Xie, 3mo

Thanks for the post, I like it! I'm not sure I understood the idea of alignment between the geometry and the learned dictionary correctly: that it produces a "separation" between features that is closer to the "true" dictionary. Maybe it's just a paraphrase of the same thing, but what did you mean by "semantic continuity"?
