Machine Unlearning

Edited by NickyP last updated 23rd Oct 2023

In machine unlearning, the aim is to reduce a model's performance on some "unlearned" tasks while preserving its performance on some "retained" tasks. Though traditionally motivated by privacy preservation and GDPR compliance, some of the research is relevant to the field of AI interpretability. The following terminology is common in the machine unlearning literature (note that usage can vary slightly):

  • Forgotten/Unlearned task: the task or knowledge you want the model to forget.
  • Retained task: the task or knowledge you want the model to stay good at (i.e. the entire dataset except for the unlearned task).
  • Original model: the base model you start with.
  • Unlearned model: the model after the unlearning technique is applied. It should be worse at the "unlearned" task but still good at the "retained" task.
  • Relearned model: the unlearned model after being trained on the unlearned task again.
  • Retrained model: a model trained from random initialisation on the whole dataset excluding the task you want removed (i.e. only on retained tasks). This can be very expensive for large models.
  • Streisand effect: parameter changes so severe that the unlearning itself becomes detectable (related to Goodharting the unlearning metrics).


For an overview, see the survey paper "A Survey of Machine Unlearning".

Posts tagged Machine Unlearning
  • Machine Unlearning Evaluations as Interpretability Benchmarks, by NickyP, Nandi (2y)
  • Deep Forgetting & Unlearning for Safely-Scoped LLMs, by scasper (2y)
  • Machine Unlearning in Large Language Models: A Comprehensive Survey with Empirical Insights from the Qwen 1.5 1.8B Model, by Rudaiba (7mo)
  • Distillation Robustifies Unlearning, by Bruce W. Lee, Addie Foote, alexinf, leni, Jacob G-W, Harish Kamath, Bryce Woodworth, cloud, TurnTrout (3mo)
  • Gradient Routing: Masking Gradients to Localize Computation in Neural Networks, by cloud, Jacob G-W, Evzen, Joseph Miller, TurnTrout (9mo)
  • The case for unlearning that removes information from LLM weights, by Fabien Roger (11mo)
  • Unlearning via RMU is mostly shallow, by Andy Arditi, bilalchughtai (1y)
  • Breaking Circuit Breakers, by mikes, tbenthompson (1y)
  • Unlearning Needs to be More Selective [Progress Report], by Filip Sondej, Yushi Yang, Marcel Windys (2mo)