Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

I've recently been having advising calls with REMIX teams (Redwood's interpretability sprint), trying to give advice & feedback on their projects. As an experiment, I've published a recording of one advising call (with Tessa Barton & Kushal Jain, on memorisation in GPT-2 Small). I'm curious whether this is useful to anyone! IMO getting detailed feedback from a more experienced researcher is one of the best ways to improve at research, but I have no idea whether feedback given to someone else is comparably useful, or whether my advice is good enough lol. Thanks to the team for being down to publish this, and for the work!
