LESSWRONG

Yuxiao
12300

I'm an AI safety researcher — mostly working on ways to see inside the systems we’ve built and understand what moves them. My background runs through statistical inference, machine learning, and generative models; lately I’ve been in the borderlands between mechanistic interpretability and probabilistic thinking, trying to make large language models a little less opaque.

I’ve moved between academia, industry, and independent research, but the constant thread is the same: bridging abstract mathematics with the hidden structures of deep networks. I view this blog as a place to keep a scientific diary, maintain emotional balance, and make friends :)

Posts

2
From Organized Shelves to Layered Catalogs: Architectural Explorations for Sparse Autoencoders -- Crosscoders & Ladder SAEs Towards Hierarchical Data Structure
25d
0
5
From Messy Shelves to Master Librarians: Toy-Model Exploration of Block-Diagonal Geometry in LM Activations
2mo
1
8
From Unruly Stacks to Organized Shelves: Toy Model Validation of Structured Priors in Sparse Autoencoders
2mo
0