LESSWRONG
Jiaxing Wu

Undergraduate student currently researching mechanistic interpretability. Feel free to reach out.

Addressing Feature Suppression in SAEs
Jiaxing Wu · 10mo

Hi, thanks for your work. I was wondering why scaling is used to modify the activation here, rather than an analytical correction that compensates for the −cd/2 term.
