LESSWRONG

GDM Mech Interp Progress Updates

Apr 19, 2024 by Neel Nanda
[Summary] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda, Arthur Conmy, lewis smith, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma
73 karma · 1y · 0 comments
[Full Post] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda, Arthur Conmy, lewis smith, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma
80 karma · 1y · 10 comments
The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research
Arthur Conmy, Neel Nanda
48 karma · 7mo · 1 comment
Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)
lewis smith, Senthooran Rajamanoharan, Arthur Conmy, CallumMcDougall, Tom Lieberum, János Kramár, Rohin Shah, Neel Nanda
113 karma · 6mo · 15 comments