In my previous post I went over some common approaches for updating LMs with fresh knowledge. Here, I detail a specific approach that has gained popularity in recent years - locating and editing factual associations in language models. I do not believe in this approach; still, in this post I try to summarize it fairly and explain why I don't quite like it.
Do LMs Store Facts in Their Weights?
Language models built on the transformer architecture include Feed-Forward Networks (FFNs) as an important subcomponent. Within each transformer layer, the FFN itself consists of two sublayers.
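To make that structure concrete, here is a minimal PyTorch sketch of a transformer FFN block; the dimensions (d_model, d_ff) and the GELU activation are illustrative assumptions, not details of any particular model.

```python
import torch
import torch.nn as nn

class FFN(nn.Module):
    """A minimal transformer feed-forward block: two linear sublayers
    with a nonlinearity in between (dimensions are illustrative)."""

    def __init__(self, d_model: int = 768, d_ff: int = 3072):
        super().__init__()
        self.w_in = nn.Linear(d_model, d_ff)    # first sublayer: project up
        self.w_out = nn.Linear(d_ff, d_model)   # second sublayer: project back down
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: the dense representation arriving at this layer, shape (..., d_model)
        return self.w_out(self.act(self.w_in(x)))


# Example: one token representation passing through the FFN
h = torch.randn(1, 768)
out = FFN()(h)  # shape (1, 768), same as the input
```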
During the forward pass of an LM, the FFN takes as input a dense vector representation from the previous...