Mech Interp Project Advising Call: Memorisation in GPT-2 Small

Neel Nanda

4th Feb 2023

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

I've recently been having advising calls with REMIX teams (REMIX is Redwood's interpretability sprint), giving advice and feedback on their projects. As an experiment, I've published a recording of one advising call (with Tessa Barton & Kushal Jain, on memorisation in GPT-2 Small). I'm curious whether this is useful to anyone! IMO getting detailed feedback from a more experienced researcher is one of the best ways to improve at research, but I have no idea whether seeing someone else get feedback is comparably useful, or whether my advice is good enough lol. Thanks to the team for being down to publish this, and for the work!

https://youtu.be/39hDx25qsS8
