Thanks for the excellent note. I wanted to offer an alternative hypothesis on the unlearning of the random birthday (RB) dataset: RB might be unlearned effectively mainly because its labels are generated completely at random.
Because there are no semantic relationships across the data points, the model is forced to memorize each fact in strict isolation. This highly localized memorization likely makes unlearning structurally easier and more thorough. Furthermore, because the memorized facts are isolated and share no latent heuristics, performing RTT has no common structure through which updates could propagate, so it cannot affect or recover performance on set V.
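To make the intuition concrete, here is a toy sketch in Python. It is an analogy only, not an experiment on an actual model or on the RB dataset: the per-group lookup table stands in for "whatever shared structure relearning could pick up", and the names `make_dataset`, `relearn_and_eval`, the grouping feature, and the split called T are all assumptions introduced for illustration.

```python
import random
from collections import Counter


def make_dataset(n, structured, rng):
    """Build hypothetical (group, label) pairs; this is toy data, not RB itself."""
    data = []
    for i in range(n):
        group = i % 10  # a made-up shared feature
        if structured:
            label = (group * 7) % 365  # label is determined by the shared feature
        else:
            label = rng.randrange(365)  # label is independent of everything else
        data.append((group, label))
    return data


def relearn_and_eval(data, rng):
    """Fit a per-group majority label on half the data (T), score it on the rest (V)."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    t_split, v_split = shuffled[:half], shuffled[half:]

    # "Relearning" stand-in: remember the most common label seen per group in T.
    counts = {}
    for group, label in t_split:
        counts.setdefault(group, Counter())[label] += 1
    predict = {g: c.most_common(1)[0][0] for g, c in counts.items()}

    correct = sum(1 for group, label in v_split if predict.get(group) == label)
    return correct / len(v_split)


if __name__ == "__main__":
    rng = random.Random(0)
    print("structured labels:", relearn_and_eval(make_dataset(2000, True, rng), rng))
    print("random labels:    ", relearn_and_eval(make_dataset(2000, False, rng), rng))
```

With structured labels, what is learned from T transfers to V; with random labels it stays near chance (about 1/365), which is the sense in which I would expect relearning on one split not to help on V.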