LESSWRONG

84
Daniel Lee
78210

Posts

Comments
Open Thread Summer 2024
Daniel Lee 1y 8 0

Hi, excited to learn more about Mech Int!

54 Finding Features Causally Upstream of Refusal
8mo
5
28 Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs
1y
0