LESSWRONG

Julian Minder

PhD @ EPFL with Robert West. MATS 7 Scholar with Neel Nanda. Interested in mechanistic interpretability and in what the process of finetuning does to models.

Posts

Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences [24 karma, 7h, 1 comment]
What We Learned Trying to Diff Base and Chat Models (And Why It Matters) [104 karma, 2mo, 2 comments]