LLMs are pre-trained on a large fraction of the internet. As a result, they can regurgitate private, copyrighted, and potentially hazardous information, posing deployment and safety challenges.
Lev McKinney will guide us through machine unlearning in LLMs: how models retain facts, methods for identifying influential training data, and techniques for suppressing unwanted predictions. Finally, we'll assess the current state of the research and its effectiveness in addressing policy and safety concerns.